[Pharo-dev] [Cuis] Sorting Unicode strings (Re: [Unicode] collation sequences (Re: [squeak-dev] Unicode Support))

EuanM euanmee at gmail.com
Tue Dec 8 23:52:19 UTC 2015


Equally old are the NeXTSTEP Objective-C functions which are now embodied
within Mac OS X.

It might also be informative for us to see what's done there.

On 8 December 2015 at 23:50, EuanM <euanmee at gmail.com> wrote:
> Dale,
>
> yes - sorting based on the value of codepoints is almost always
> wrong.  Sorting is an application-specific issue, not a technical
> Unicode issue, as there is often more than one canonical sort order
> per culture, and there is often more than one culture per writing
> system.
>
> e.g. ISO Latin 1 / Latin 9 covers these cultures (amongst others):
> English (2 sort orders); Spanish; French (2 sort orders); German (2
> sort orders); Swedish; etc.
>
> German sort order differs from Swedish for the same characters, etc
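
A minimal sketch of the point above, in Python rather than Smalltalk. The
two key functions are hand-written, hypothetical tailorings that cover only
lowercase a-z plus å, ä, ö, ü and ß; they are not taken from the ICU or
from any Smalltalk image. The German one assumes dictionary order (umlauts
sort as their base vowel, ß as ss), which is only one of the two German
orders mentioned. The Swedish one assumes å, ä, ö are separate letters
after z.

```python
# Hypothetical, hand-written tailorings for illustration only.
# A real implementation would take these rules from collation data.

def german_key(word):
    # German dictionary order: umlauts sort as their base vowel, ß as "ss".
    table = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
    return word.lower().translate(table)

def swedish_key(word):
    # Swedish: å, ä, ö are distinct letters placed after z, in that order.
    order = "abcdefghijklmnopqrstuvwxyzåäö"
    rank = {ch: i for i, ch in enumerate(order)}
    # Characters outside the tailored alphabet sort after everything else.
    return [rank.get(ch, len(order)) for ch in word.lower()]

words = ["zebra", "ärger", "år", "apfel", "ost", "öl"]

by_codepoint = sorted(words)                 # raw codepoint comparison
by_german = sorted(words, key=german_key)    # same characters, German rules
by_swedish = sorted(words, key=swedish_key)  # same characters, Swedish rules
```

The same six strings come out in three different orders, and the codepoint
order matches neither tailoring (it puts ä before å, which is the reverse
of the Swedish alphabet).
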
>
> Todd,
>
> My thinking is that if we implement fully-composed strings as
> heterogeneous arrays, we sidestep a lot of the complexity of the ICU.
>
> If it turns out that the performance is terrible, we can then seek to
> incorporate the ICU.
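
One possible reading of "fully-composed strings as heterogeneous arrays",
sketched in Python rather than Smalltalk. The function name composed_units
is hypothetical, and the grouping rule (a base character plus any
combining marks that follow it) is an assumption about what a
"fully-composed" element would be, not something specified in this thread.
It is also simpler than full Unicode grapheme cluster segmentation.

```python
import unicodedata

def composed_units(text):
    """Split text into units: one base character plus trailing combining marks."""
    # NFC composes base + mark pairs wherever a precomposed codepoint exists.
    nfc = unicodedata.normalize("NFC", text)
    units = []
    for ch in nfc:
        # combining() returns a non-zero class for combining marks.
        if units and unicodedata.combining(ch):
            units[-1] += ch       # attach the mark to the preceding base
        else:
            units.append(ch)      # start a new unit
    return units

# "e" + combining acute has a precomposed form; "q" + combining tilde does not.
units = composed_units("e\u0301q\u0303")
```

The result is heterogeneous in the sense that the first element is a
single codepoint (U+00E9) while the second still needs two codepoints
("q" followed by U+0303), yet each element is one user-visible character.
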
>
>
> On 8 December 2015 at 22:36, Todd Blanchard <tblanchard at mac.com> wrote:
>> I just want to second Dale's endorsement of the ICU library.  It has been
>> around a long time (originally developed by Taligent) and it provides the
>> base Unicode capabilities for an awful lot of software.
>>
>> I think it would make more sense to bring ICU into Smalltalk as a
>> NativeBoost library than to spend resources reimplementing and maintaining
>> it.
>>
>> -Todd Blanchard
>>
>> On Dec 8, 2015, at 11:20, Dale Henrichs <dale.henrichs at gemtalksystems.com>
>> wrote:
>>
>> On 12/07/2015 11:31 PM, H. Hirzel wrote:
>>
>> Dale
>>
>> Thank you for your answer with links to the ICU library and the notes
>> about classes in Gemstone. Noteworthy that you have a class Utf8 as a
>> subclass of ByteArray.
>>
>> I understand that Gemstone uses the ICU library and thus does not
>> implement the algorithms in Smalltalk.
>>
>> I am currently looking into what the ICU library provides.
>>
>>

