[Pharo-dev] [Cuis] Sorting Unicode strings (Re: [Unicode]
collation sequences (Re: [squeak-dev] Unicode Support))
EuanM
euanmee at gmail.com
Tue Dec 8 23:52:19 UTC 2015
Equally old are the NeXTSTEP Objective-C functions, which are now embodied
within Mac OS X.
It might also be informative for us to see what's done there.
On 8 December 2015 at 23:50, EuanM <euanmee at gmail.com> wrote:
> Dale,
>
> yes - sorting by raw codepoint value is almost always wrong. Sorting is
> an application-specific issue, not a technical Unicode issue, as there
> can be more than one canonical sort order per culture, and there is
> often more than one culture per writing system.
>
> e.g. ISO Latin 1 / Latin 9
> covers these cultures (amongst others)
> English (2 sort orders); Spanish; French (2 sort orders); German (2
> sort orders); Swedish; etc
>
> German sort order differs from Swedish for the same characters, etc
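
To make the point above concrete, here is a minimal sketch comparing a raw codepoint sort with two locale-style orderings. The key functions `german_dictionary_key` and `swedish_key` are hypothetical toy tables written for this illustration only; they are not the ICU or CLDR collation rules, and real collation handles far more cases.

```python
words = ["Zoo", "ärger", "apfel", "zebra"]

# Naive sort: compares raw code points, so "Z" (U+005A) sorts before
# "a" (U+0061), and "ä" (U+00E4) sorts after "z" (U+007A).
codepoint_order = sorted(words)


def german_dictionary_key(word):
    """Toy German dictionary-style key (assumption, not the real rules):
    ignore case and treat umlauted vowels like their base letters."""
    table = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
    return word.lower().translate(table)


def swedish_key(word):
    """Toy Swedish-style key (assumption, not the real rules):
    ignore case and treat å, ä, ö as separate letters after z."""
    alphabet = "abcdefghijklmnopqrstuvwxyzåäö"
    rank = {ch: i for i, ch in enumerate(alphabet)}
    # Characters outside the toy alphabet sort after everything else.
    return [rank.get(ch, len(alphabet)) for ch in word.lower()]


german_order = sorted(words, key=german_dictionary_key)
swedish_order = sorted(words, key=swedish_key)

print(codepoint_order)
print(german_order)
print(swedish_order)
```

The same four strings come out in three different orders, which is why the choice of sort order has to be made by the application rather than by the string representation.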
>
> Todd,
>
> My thinking is that if we implement fully-composed strings as
> heterogeneous arrays, we sidestep a lot of the complexity of the ICU.
>
> If it turns out that the performance is terrible, we can then seek to
> incorporate the ICU.
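
One possible reading of the "heterogeneous array" idea, sketched under stated assumptions: each array element is either a single code point or a tuple holding a base character plus its combining marks. The function name `to_elements` is hypothetical, and the grouping rule (attach any character with a non-zero canonical combining class to the preceding element) is a simplification, not full Unicode grapheme cluster segmentation.

```python
import unicodedata


def to_elements(text):
    """Split text into a list whose items are either a one-character str
    or a tuple of characters (a base followed by its combining marks)."""
    elements = []
    for ch in text:
        # unicodedata.combining() returns the canonical combining class,
        # which is 0 for ordinary (non-combining) characters.
        if unicodedata.combining(ch) and elements:
            prev = elements[-1]
            base = prev if isinstance(prev, tuple) else (prev,)
            elements[-1] = base + (ch,)
        else:
            elements.append(ch)
    return elements


# "e" followed by U+0301 COMBINING ACUTE ACCENT, then a plain "a".
sample = to_elements("e\u0301a")
print(len(sample))
```

With this layout, indexing and length operate on user-visible characters rather than on individual code points, at the cost of a mixed element type.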
>
>
> On 8 December 2015 at 22:36, Todd Blanchard <tblanchard at mac.com> wrote:
>> I just want to second Dale's endorsement of the ICU library. It has been
>> around a long time (originally developed by Taligent) and it provides the
>> base Unicode capabilities for an awful lot of software.
>>
>> I think it would make more sense to bring ICU into Smalltalk as a
>> NativeBoost library than to spend resources reimplementing and maintaining
>> it.
>>
>> -Todd Blanchard
>>
>> On Dec 8, 2015, at 11:20, Dale Henrichs <dale.henrichs at gemtalksystems.com>
>> wrote:
>>
>> On 12/07/2015 11:31 PM, H. Hirzel wrote:
>>
>> Dale
>>
>> Thank you for your answer with links to the ICU library and the notes
>> about classes in Gemstone. It is noteworthy that you have a class Utf8
>> as a subclass of ByteArray.
>>
>> I understand that Gemstone uses the ICU library and thus does not
>> implement the algorithms in Smalltalk.
>>
>> I am currently looking into what the ICU library provides.
>>
>>