Unicode support
Andrew C. Greenberg
werdna at gate.net
Wed Sep 22 06:09:06 UTC 1999
I agree with Peter's rough cuts:
(1) Strings are somewhat homogeneous;
(2) Strings are composed of elements of a Flyweight pattern class;
(3) Many of the "usual" things we do with a string are
determined by the underlying character class.
Perhaps we can study what things we really do with strings, at least
to enumerate them, and see how they matter.
What do strings do? What must they do? Which of the following are
necessary, mandatory, or even useful? Which depend upon the class?
(A) support a (partial;total) linear ordering (=,<,>)
(B) substring
(C) catenation
(D) indexing
(E) collecting, selecting, allAre, someIs, based upon blocks
that are passed (conceptually) to individual objects.
(F) clumping and declumping (word-based, delimiter-based, token-based)
(G) random access out (not as a special case of substring, but pulling
the character object out of the class as well)
(H) random access in (not as a special case of catenation, but
validating and/or coercing the character on the way in)
(I) notion and creation of a null string as the identity element for
many of the preceding operations
(J) searching for substrings.
(K) sizing
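
As a rough reference point for the list above, here is how Python's
built-in str lines up against it. This is only a sketch of one existing
design, not a proposal for Squeak; the sample string and variable names
are made up for illustration. Note that Python has no separate character
class (indexing yields a length-1 str), and that (H) has no counterpart
because str is fully immutable.

```python
s = "the quick brown fox"

# (A) ordering: str compares by code point, which gives a total order
assert "apple" < "banana" and "a" == "a"

# (B) substring and (C) catenation
sub = s[4:9]
cat = sub + "!"

# (D)/(G) indexing: the result is a length-1 str, not a character object
ch = s[0]

# (E) collecting, selecting, allAre, someIs over the elements
collected = "".join(c.upper() for c in sub)
selected = "".join(c for c in sub if c in "aeiou")
all_alpha = all(c.isalpha() for c in sub)
some_space = any(c.isspace() for c in s)

# (F) declumping on whitespace, then clumping back together
words = s.split()
rejoined = " ".join(words)

# (I) the null string is the identity for catenation
assert s + "" == s and "" + s == s

# (J) searching for a substring, (K) sizing
pos = s.find("brown")
size = len(s)
```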
I don't know that I agree with those who believe that a string must
be "growable/shrinkable". Perhaps we should consider making strings
length-immutable, per Python. Doing so can facilitate other operations
that I have now come to think of as string-like, in particular slicing
and index-shifting, by creating proxy objects on the original string
that share changes to the underlying content. Size-changing operations
can be permitted, but they create copies of the original, and changes
to those copies do not affect the content of already-taken slices.
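
To make the proxy idea concrete, here is a minimal sketch in Python. The
class names FixedString and SliceView are hypothetical, invented for this
illustration; Python's own str is fully immutable, so the sketch keeps its
characters in a list whose length never changes after creation.

```python
class FixedString:
    """Hypothetical string whose characters may change but whose size may not."""

    def __init__(self, text):
        self._chars = list(text)

    def __len__(self):
        return len(self._chars)

    def __getitem__(self, i):
        return self._chars[i]

    def __setitem__(self, i, ch):
        # Integer indices only: slice assignment could change the length.
        if not isinstance(i, int):
            raise TypeError("index must be an integer")
        if not (isinstance(ch, str) and len(ch) == 1):
            raise ValueError("expected a single character")
        self._chars[i] = ch

    def slice(self, start, stop):
        """Return a proxy onto self[start:stop]; no characters are copied."""
        return SliceView(self, start, stop)

    def __add__(self, other):
        # A size-changing operation: the result is an independent copy.
        return FixedString(str(self) + str(other))

    def __str__(self):
        return "".join(self._chars)


class SliceView:
    """Proxy that reads and writes through to the underlying FixedString."""

    def __init__(self, base, start, stop):
        if not 0 <= start <= stop <= len(base):
            raise IndexError("slice out of range")
        self._base, self._start, self._stop = base, start, stop

    def __len__(self):
        return self._stop - self._start

    def _index(self, i):
        if not 0 <= i < len(self):
            raise IndexError("index out of range")
        return self._start + i

    def __getitem__(self, i):
        return self._base[self._index(i)]

    def __setitem__(self, i, ch):
        self._base[self._index(i)] = ch

    def __str__(self):
        return "".join(self[i] for i in range(len(self)))


s = FixedString("hello world")
w = s.slice(6, 11)
s[6] = "W"            # a change to the original shows through the slice
seen_through_slice = str(w)
w[0] = "w"            # and a change through the slice reaches the original
t = s + "!"           # catenation copies
t[0] = "J"            # so this does not touch s or w
```

Whether the index arithmetic in such a proxy is cheap enough in practice
is exactly the sort of thing that would need measuring.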
A newbie recently asked how to compute the equivalent of:
word 4 of line 7
and
set word 4 of line 7 to "foobar"
Should these sorts of operations, presumably defined somehow in terms
of operations on the underlying Character class, be abstracted, or are
there orthogonal operations that provide access to the functionality
that should be better preserved?
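
For comparison, here is roughly what the two expressions cost when spelled
out by hand in Python. This is a sketch under stated assumptions: a "word"
is taken to be a run of non-whitespace characters, lines are separated by
"\n", counting is 1-based, and the function names are made up for the
example. Since Python strings are immutable, the "set" form returns a new
text rather than changing the old one.

```python
import re


def word_of_line(text, word_no, line_no):
    """Return word number word_no of line number line_no (both 1-based)."""
    if word_no < 1 or line_no < 1:
        raise IndexError("word and line numbers start at 1")
    line = text.split("\n")[line_no - 1]
    return line.split()[word_no - 1]


def set_word_of_line(text, word_no, line_no, new_word):
    """Return a copy of text with the given word replaced; spacing is kept."""
    if word_no < 1 or line_no < 1:
        raise IndexError("word and line numbers start at 1")
    lines = text.split("\n")
    line = lines[line_no - 1]
    # Locate each word's (start, end) so the surrounding whitespace survives.
    spans = [m.span() for m in re.finditer(r"\S+", line)]
    start, end = spans[word_no - 1]
    lines[line_no - 1] = line[:start] + new_word + line[end:]
    return "\n".join(lines)


text = "\n".join("line%d one two three four" % n for n in range(1, 9))
before = word_of_line(text, 4, 7)
text2 = set_word_of_line(text, 4, 7, "foobar")
after = word_of_line(text2, 4, 7)
```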
Is the AppleScript container notion, however fluffy, perhaps a better
model than we have considered to date? Is it important to a
programming language? Can it ever be efficient?
It's late, and I'm babbling. Time to go to bed.