[squeak-dev] The Trunk: Collections-pre.857.mcz

Thiede, Christoph Christoph.Thiede at student.hpi.uni-potsdam.de
Mon Sep 6 10:55:10 UTC 2021


Hi Dave,


would you (or someone else) mind merging Collections-ct.956 which flushes the caches in CharacterSet? At the moment, CharacterSet separators still does not contain the SOH character. :-)
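
A minimal workspace sketch of the check I mean, evaluating each line with print-it. It assumes Collections-dtl.954 is loaded, and it assumes that CharacterSet class>>cleanUp: resets the cached sets in your image; please check the method before relying on it.

```smalltalk
"Workspace sketch, not a definitive fix. Evaluate line by line with print-it."
| soh |
soh := Character value: 1.	"SOH, same as Character startOfHeader from Collections-pre.857"

"The uncached list in Character class>>separators:"
Character separators includes: soh.

"The cached set; it can lag behind the list above until the cache is rebuilt:"
CharacterSet separators includes: soh.

"ASSUMPTION: cleanUp: drops the cached sets so they are rebuilt lazily on next access."
CharacterSet cleanUp: true.

"Compare this answer with the one before the flush:"
CharacterSet separators includes: soh.
```

If the last two answers differ, the stale cache is the only thing keeping SOH out of CharacterSet separators.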


Best,

Christoph

________________________________
From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of Thiede, Christoph
Sent: Monday, 30 August 2021 11:30:47
To: squeak-dev at lists.squeakfoundation.org; lewis at mail.msen.com
Subject: Re: [squeak-dev] The Trunk: Collections-pre.857.mcz

Great, thank you, Dave! :-)

Best,
Christoph

---
Sent from Squeak Inbox Talk <https://github.com/hpi-swa-lab/squeak-inbox-talk>

On 2021-08-26T17:10:40-04:00, lewis at mail.msen.com wrote:

> Done. Updated in Collections-dtl.954.
>
> Dave
>
>
> On Sat, Aug 21, 2021 at 02:34:52PM -0400, David T. Lewis wrote:
> > Hi Christoph,
> >
> > I just tried this again, but it results in a new test failure for
> > CharacterSetTest>>testIntersectionOfLazy. I'm not sure I understand
> > the implications, but I am attaching the change in case someone wants
> > to have a look at it.
> >
> > BTW, it's very nice reviewing issues like this in your Squeak Inbox Talk
> > utility :-)
> >
> > Dave
> >
> >
> >
> > On Thu, Aug 19, 2021 at 06:11:14PM +0000, Thiede, Christoph wrote:
> > > Hi all,
> > >
> > > two years later, I would still love to see Patrick's proposal accepted into the Trunk.
> > >
> > > My concrete problem with SOH (start of header) not being in Character separators is that text anchors in Smalltalk source code currently confuse the Shout styler, because SHParserST80 scanWhitespace sends CharacterSet nonSeparators. Now one might argue that we could introduce a separate CharacterSet notAtAllSeparators/nonUnicodeSeparator etc. (which would also exclude SOH), but I would dislike that option because it would force us to maintain multiple different definitions of the term "character" and increase the overall domain complexity. I can't see what would be wrong with treating all character instances according to the Unicode standard (as other frameworks such as .NET seem to do, too).
> > >
> > > I have been using Dave's version of Character separators from above for the past few months and have not experienced any unintended side effects of the change.
> > >
> > > Could we please integrate Patrick's change, or are there any major objections? It would be great to get this kind of stuff working in Babylonian & Co. :-)
> > >
> > > Best,
> > > Christoph
> > >
> > > PS: See also: http://forum.world.st/ENH-isSeparator-td5129517.html
> > >
> > > ________________________________
> > > From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of Thiede, Christoph
> > > Sent: Monday, 21 June 2021 15:47:49
> > > To: The general-purpose Squeak developers list
> > > Subject: Re: [squeak-dev] The Trunk: Collections-pre.857.mcz
> > >
> > >
> > > > It is! but do we really need a collect for this static list?
> > >
> > > > We could put that code in a comment and just return the resulting string?
> > >
> > >
> > > My suggestion would be
> > >
> > > ^ Separators ifNil: [Separators := self allCharacters select: [:ea | ea isSeparator]]
> > >
> > > Then we won't need to duplicate the logic.
> > >
> > > Best,
> > > Christoph
> > > ________________________________
> > > From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of Tobias Pape <Das.Linux at gmx.de>
> > > Sent: Monday, 21 June 2021 12:37:13
> > > To: The general-purpose Squeak developers list
> > > Subject: Re: [squeak-dev] The Trunk: Collections-pre.857.mcz
> > >
> > > Hi Dave
> > >
> > >
> > > > On 4. Oct 2019, at 15:26, David T. Lewis <lewis at mail.msen.com> wrote:
> > > >
> > > > +1
> > > >
> > > > This sounds like a good approach to me. If I understand correctly, it
> > > > amounts to this:
> > > >
> > > > Character class>>separators
> > > > "Answer a collection of the standard ASCII separator characters."
> > > >
> > > > ^ #(32 "space"
> > > > 13 "cr"
> > > > 9 "tab"
> > > > 10 "line feed"
> > > > 12 "form feed"
> > > > 1 "text separator")
> > > > collect: [:v | Character value: v] as: String
> > > >
> > > > This seems simple and clear to me.
> > >
> > > It is! but do we really need a collect for this static list?
> > > We could put that code in a comment and just return the resulting string?
> > >
> > > Best regards
> > > -Tobias
> > >
> > > >
> > > > There are a lot of senders of #separators in the image, so it is
> > > > possible that it might have some unintended side effect. But that
> > > > seems unlikely.
> > > >
> > > > Dave
> > > >
> > > >
> > > > On Fri, Oct 04, 2019 at 01:01:30PM +0200, patrick.rein at hpi.uni-potsdam.de wrote:
> > > >> Hi everyone,
> > > >>
> > > >> in the context of the new text anchor layout infrastructure there is still one thing missing. Currently, start of header is not included in Character class>>#separators. This leads to problems with text editing Morphs. As start of header is not printed at all (not even as white space), I would rather classify it as a separator and add it to the list in Character class>>#separators and to Character>>#isSeparator. The advantage of this approach is that we would not need any special case in the text editing morphs. The disadvantage is that the list of separators will be less obvious to understand.
> > > >>
> > > >> Any thoughts about this?
> > > >>
> > > >> Bests
> > > >> Patrick
> > > >>
> > > >> P.S.: Conceptually, start of header (or heading) is a control character. The IETF says: "A communication control character used at the beginning of a sequence of characters which constitute a machine-sensible address or routing information. Such a sequence is referred to as the "heading." An STX character has the effect of terminating a heading."
> > > >> (https://tools.ietf.org/html/rfc20#section-5.2)
> > > >>
> > > >>> Patrick Rein uploaded a new version of Collections to project The Trunk:
> > > >>> http://source.squeak.org/trunk/Collections-pre.857.mcz
> > > >>>
> > > >>> ==================== Summary ====================
> > > >>>
> > > >>> Name: Collections-pre.857
> > > >>> Author: pre
> > > >>> Time: 4 October 2019, 11:04:30.363303 am
> > > >>> UUID: 5ef00b65-3884-c445-b276-0cc01f0b10a1
> > > >>> Ancestors: Collections-pre.856
> > > >>>
> > > >>> Adds startOfHeader to Character, adds empty abstract implementations of scanFrom:, writeScanOn: to TextAttribute to allow for Texts which include TextAttributes which do not implement serialization to still be serialized, adds a comment to these methods.
> > > >>>
> > > >>> =============== Diff against Collections-pre.856 ===============
> > > >>>
> > > >>> Item was added:
> > > >>> + ----- Method: Character class>>startOfHeader (in category 'accessing untypeable characters') -----
> > > >>> + startOfHeader
> > > >>> +
> > > >>> + ^ self value: 1 !
> > > >>>
> > > >>> Item was added:
> > > >>> + ----- Method: TextAttribute class>>scanFrom: (in category 'fileIn/Out') -----
> > > >>> + scanFrom: strm
> > > >>> + "Read the text attribute properties from the stream. When this method has
> > > >>> + been called the concrete TextAttribute class has already been selected via
> > > >>> + scanCharacter. (see TextAttribute class>>#newFrom:).
> > > >>> + For writing the format see TextAttribute>>#writeScanOn:"!
> > > >>>
> > > >>> Item was added:
> > > >>> + ----- Method: TextAttribute>>writeScanOn: (in category 'fileIn/fileOut') -----
> > > >>> + writeScanOn: strm
> > > >>> + "Implement this method for a text attribute to define how it it should be written
> > > >>> + to a serialized form of a text object. The form should correspond to the source
> > > >>> + file format, i.e. use a scan character to denote its subclass.
> > > >>> + As TextAttributes are stored in RunArrays, this method is mostly called from RunArray>>#write scan.
> > > >>> + For reading the written information see TextAttribute class>>#scanFrom:"
> > > >>> +
> > > >>> + "Do nothing because of abstract class"!
> > > >>>
> > > >>>
> > > >>
> > > >
> > >
> > >
> > >
> >
> > >
> >
>
> > 'From Squeak6.0alpha of 13 August 2021 [latest update: #20601] on 21 August 2021 at 2:25:03 pm'!
> >
> > !Character class methodsFor: 'instance creation' stamp: 'dtl 8/21/2021 14:16'!
> > separators
> > 	"Answer a collection of the standard ASCII separator characters."
> >
> > 	^ {
> > 		Character value: 32. "space"
> > 		Character value: 13. "cr"
> > 		Character value: 9. "tab"
> > 		Character value: 10. "line feed"
> > 		Character value: 12. "form feed"
> > 		Character value: 1. "start of heading"
> > 	} as: String! !
>
>