[squeak-dev] String & Text

Jakob Reschke jakres+squeak at gmail.com
Sat Jul 16 10:13:28 UTC 2022

I scrolled through the protocols of String and Text two days ago. The main
categories of methods (which don't overlap well with the classification in
the image) on String that I remember are:
- general string parts enumeration (finding, lines)
- general string manipulation (trimming, splitting, slicing, character
substitution, substring substitution, case changes, padding, ...)
- tokenization and parsing (e.g. CSV/TSV), some of it Smalltalk
language-specific (getters, setters, arguments/keywords count)
   - natural language conversions (e.g. camelCase <-> words)
   - MIME
- formatting (interpolation, line wrapping or joining, indentation)
- collation and (equality) comparison
- pattern matching
- similarity computations
- character classification (digits, letters, whitespace, ...)
- graphics display
- arithmetic with numbers
- multilingual stuff (among others: dealing with leadingChar)
- character set conversions
- VM paths <-> Squeak paths conversion
- HTML and HTTP conversions (e.g. URL %-decoding)
- crc16
- extension methods from various packages like Etoys, JSON, Monticello,
Morphic, Regex, Network, ...
   - chronology conversions

Next to "format:" there seems to be yet another interpolation mechanism in

Overall, some methods treat strings as text, others treat them as technical
data (markup, file paths, URLs, encodings, i.e. ByteArrays with characters).
The leadingChar stuff sits strangely in between the two. Text functionality
like interpolation can also be used to produce technical data, of course.

Which selectors Text also understands is at times chaotic. For some groups
of related selectors, Text understands just one of them. When it comes
to Text's "core domain" of displaying, it does not understand any of the
displayAt:/displayOn:* selectors...

On Sat, Jul 16, 2022 at 03:28, Chris Muller <
ma.chris.m at gmail.com> wrote:

> Hi Marcel,
> > > But until we do that, and whole hog like Eliot suggested,
> > > what we will have are *some* domain things that String
> > > can do that Text can't -- a partial overlap.
> >
> > It's rather easy. Once we have a CharacterCollection, we can
> > finally see the special cases on String. The common stuff can be
> > moved up to then benefit both String and Text.
> I understand.  CharacterCollection would make it so we "could" do it, but
> it's still worth scrutinizing heavily first whether we should.
> > > In other words, an incomplete mess for an indefinite period of time.
> >
> > Disagree. This kind of refactoring does not look too difficult.
> I was speaking about the state of the system until such time as that
> refactoring was completed, which is indefinite.  Until then, the API would
> be, by definition, incomplete.  We can disagree about it being a "mess",
> though.   :)
> > > By removing all the domain stuff [...]
> >
> > I think that we have a different understanding of the term "domain"
> > here. Maybe you are worried about Magma. If so, please
> > elaborate your concerns from that perspective.
> No, I'm speaking strictly in terms of good OO design, where giving a
> class too many responsibilities is considered poor design.
> In the other thread, Tim just wrote this about formatting comments.  It
> stuck out to me as an example relevant to the question of this thread.
> (Tim wrote:)
> > Oh, I'm not wanting to have any tabs or spaces inserted - I want the
> > formatting to be live and use the left indent.
> > Shout does all that work to colour (etc) the text so why not use the
> > fact that it detects comments.
> I agree with him 100%, and this continues to remind us of the numerous
> responsibilities that can be considered "presentation" only.  Before I had
> only mentioned fonts, colors, and attributes (bold, italic, center, right
> justify, left justify, etc.), but Tim reminds us that indentation and layout
> are in there, too!
> This is already a nice collection of behaviors that are completely distinct
> from the ones concerned with the _contents_ of the Text (e.g., its
> 'string'), which is what I mean when I refer to the "domain" vs.
> presentation responsibilities of Text.
> Maybe bloating up Text with such a huge API (domain + presentation) MIGHT
> be the best way out.  I don't know for sure, and I trust this brilliant
> community will come down on the right choice.  I'm only saying that some of
> the usual OO design quality metrics (e.g., number of methods per class,
> among others) will get blown out of the water by this, and that this is a
> sign that it's really worth being cautious.  It also looks to be a one-way
> ticket -- eliminating this delineation of responsibility and piling on
> hundreds of domain accessing / mutating methods onto Text's API will be a
> lot easier than going the other way.  Once we have 5 years of accumulated
> dependency on its domain-accessing responsibilities, it would be a lot
> harder to untangle that in 2027 (in case it became unmanageable) than it
> was to mash them together in 2022.  I'm not necessarily against mashing
> them, I just think we should give it some heavy scrutiny first.
> > > #format: was introduced to Text in 2019.
> >
> > And long overdue since at least 2015. ;-P Thanks again,
> > Christoph (ct) for adding it! It made GUI programming much
> > easier. I had that one in mind for many years now.
> >...
> > > I don't think updates to Text will or should occur except
> > > when driven by specific need.
> Christoph chose to add that one, #format:, and not hundreds of others that
> day.  His decision was based on _something_ which could be considered
> akin to a "need".  Maybe the lack of need until that "overdue" time
> expresses that Text, at the least, *didn't*, in fact, need that
> responsibility, if not "doesn't", going forward.
> In summary, IMO, if there's any way to keep Text's responsibilities
> separate, it's at least worth considering.
>  - Chris