[squeak-dev] The Trunk: Collections-topa.806.mcz
Das.Linux at gmx.de
Thu Sep 13 15:11:31 UTC 2018
> On 13.09.2018, at 16:35, Levente Uzonyi <leves at caesar.elte.hu> wrote:
> You're opening a can of worms with this. There are several other separator/white space characters missing from that list.
Yeah, that's listed below in a comment. I am hesitant to add the others because of WideString, so I just put them in a comment.
> Also, this change makes the various #*separator* implementations (e.g. #isSeparator) inconsistent, so I strongly disagree with this change.
Hmm. But then isSeparator is wrong… because nbsp _is_ a separator, right?
See the discussion with Ron.
On a related note, is a very fast #isSeparator important?
Otherwise I'd just propose
^ #( 9 10 12 13 32 160 ) includes: self asInteger
All other *separator* messages fall back to either Character>>#isSeparator or #separators from CharacterSet, which in turn is based on Character class>>#separators.
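
To see where the two definitions diverge, something like the following workspace sketch could be evaluated. It only looks at the 8-bit range, in line with the method comment in this commit, and the temporary name listed is just an illustrative choice. An empty result would mean that #isSeparator and Character class>>#separators agree for every code point checked.

```smalltalk
"Workspace sketch (assumption: only code points 0..255 are of interest).
 Collects the code points for which Character>>#isSeparator and
 Character class>>#separators give different answers."
| listed |
listed := Character separators.
(0 to: 255) select: [:code | | ch |
	ch := Character value: code.
	ch isSeparator ~= (listed includes: ch)]
```

Evaluating this with "print it" shows the disagreeing code points, if any, as an Array of integers.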
> On Wed, 12 Sep 2018, commits at source.squeak.org wrote:
>> Tobias Pape uploaded a new version of Collections to project The Trunk:
>> ==================== Summary ====================
>> Name: Collections-topa.806
>> Author: topa
>> Time: 12 September 2018, 3:28:40.687052 pm
>> UUID: 46b95db5-a773-4113-92f0-5ee905404b49
>> Ancestors: Collections-cmm.805
>> Fix separators to include U+00A0 (no break space)
>> Thanks Ron!
>> =============== Diff against Collections-cmm.805 ===============
>> Item was changed:
>> ----- Method: Character class>>separators (in category 'instance creation') -----
>> + "Answer a collection of space-like separator characters.
>> + Note that we do not consider spaces in >8bit code points yet.
>> + "
>> - "Answer a collection of the standard ASCII separator characters."
>> + ^ #(9 "tab"
>> - ^ #(32 "space"
>> - 13 "cr"
>> - 9 "tab"
>> 10 "line feed"
>> + 12 "form feed"
>> + 13 "cr"
>> + 32 "space"
>> + 160 "non-breaking space, see Unicode Z general category")
>> + collect: [:v | Character value: v] as: String
>> + " To be considered:
>> + 16r1680 OGHAM SPACE MARK
>> + 16r2000 EN QUAD
>> + 16r2001 EM QUAD
>> + 16r2002 EN SPACE
>> + 16r2003 EM SPACE
>> + 16r2004 THREE-PER-EM SPACE
>> + 16r2005 FOUR-PER-EM SPACE
>> + 16r2006 SIX-PER-EM SPACE
>> + 16r2007 FIGURE SPACE
>> + 16r2008 PUNCTUATION SPACE
>> + 16r2009 THIN SPACE
>> + 16r200A HAIR SPACE
>> + 16r2028 LINE SEPARATOR
>> + 16r2029 PARAGRAPH SEPARATOR
>> + 16r202F NARROW NO-BREAK SPACE
>> + 16r205F MEDIUM MATHEMATICAL SPACE
>> + 16r3000 IDEOGRAPHIC SPACE
>> + "!
>> - 12 "form feed")
>> - collect: [:v | Character value: v] as: String!
>> Item was changed:
>> + (PackageInfo named: 'Collections') postscript: 'CharacterSet cleanUp: false.'!
>> - (PackageInfo named: 'Collections') postscript: 'Character initializeClassificationTable'!