[squeak-dev] Re: [Pharo-dev] String >> #=

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Wed May 28 06:14:10 UTC 2014


2014-05-28 4:50 GMT+02:00 Yoshiki Ohshima <Yoshiki.Ohshima at acm.org>:

> At Tue, 27 May 2014 19:23:09 -0700,
> Andres Valloud wrote:
> >
> > String encoding is perpendicular to my point.  I'm referring to
> > canonical equivalence as defined in section 1.1 of the document
> > referenced by the URL I sent.  For instance, the Hangul example in the
> > first table shows that a combination of two characters (regardless of
> > encoding) is to be considered canonically equivalent to a single
> > character.  From the document (which claims to be Unicode Standard Annex
> > #15),
> >
> > "Canonical equivalence is a fundamental equivalency between characters
> > or sequences of characters that represent the same abstract character,
> > and when correctly displayed should always have the same visual
> > appearance and behavior."
> >
> > How do you propose that a size check is appropriate in the presence of
> > canonical equivalence?  What is string equivalence supposed to mean?  I
> > think more attention should be given to those questions.
>
> I think that the single equal message (=) in the Smalltalk language
> should not really worry about canonical equivalence.  For those who
> need it, it'd be fine to define a new selector that does the real
> work; such a method could track the Unicode standard revisions and
> do the right thing.  But something as fundamental as String>>#= does
> not need to depend on an external standard.
>
> -- Yoshiki
>
>
If the internal representation is not canonical, we are heading down a path
of maximum complexity.
All comparison functions (=, <, >, <=, >=) and hash will have to
canonicalize first.
So I tend to agree with Yoshiki: let these kernel methods perform their
dumb task, and push this complexity outside the kernel.
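
To make the trade-off concrete, here is a minimal sketch in Python rather than
Smalltalk, using the standard unicodedata module. The name canonical_equal is
hypothetical; it plays the role of the "new selector" suggested above, while
the built-in == stays a dumb code point comparison. It uses the Hangul case
from UAX #15: two conjoining jamo versus one precomposed syllable.

```python
import unicodedata


def canonical_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Two conjoining jamo (U+1100, U+1161) versus the precomposed syllable U+AC00.
decomposed = "\u1100\u1161"
precomposed = "\uac00"

# Plain equality compares code points, so the strings differ and so do
# their sizes: a size check cannot be used as a shortcut here.
raw_equal = decomposed == precomposed
same_size = len(decomposed) == len(precomposed)

# The separate, opt-in comparison treats them as the same text.
equivalent = canonical_equal(decomposed, precomposed)
```

Under these assumptions raw_equal and same_size are both false while
equivalent is true, so only callers who need canonical equivalence pay for
the normalization.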

Well, beyond the complexity of Unicode, the cr-lf mess already creates the
same problem.
There is no semantic difference between cr and cr-lf, yet I had to insert a
few withSqueakLineEndings sends in Monticello when playing with
GitFileTree.
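
The line-ending case can be sketched the same way, again in Python and with a
hypothetical helper name. It assumes the conversion is "turn every cr-lf or
lone lf into cr", which is my understanding of what withSqueakLineEndings
does; the point is only that the caller normalizes explicitly before
comparing.

```python
def with_cr_line_endings(text: str) -> str:
    """Hypothetical helper: convert cr-lf and lone lf line endings to cr."""
    # Replace cr-lf first so the lf half of a pair is not converted twice.
    return text.replace("\r\n", "\r").replace("\n", "\r")


from_git = "line one\r\nline two\n"
in_image = "line one\rline two\r"

# Plain equality sees different characters and reports a difference.
raw_equal = from_git == in_image

# Normalizing both sides first makes the comparison succeed.
normalized_equal = with_cr_line_endings(from_git) == with_cr_line_endings(in_image)
```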