Chris Muller uploaded a new version of Collections to project The Inbox: http://source.squeak.org/inbox/Collections-cmm.1022.mcz
==================== Summary ====================
Name: Collections-cmm.1022
Author: cmm
Time: 18 November 2022, 11:14:01.223447 pm
UUID: 9a05d251-4f02-43de-94e8-0f077ec51680
Ancestors: Collections-nice.1021
Allow empty fields when using #subStrings:.
=============== Diff against Collections-nice.1021 ===============
Item was changed:
  ----- Method: String>>subStrings: (in category 'converting') -----
  subStrings: separators
      "Answer an array containing the substrings in the receiver separated by the elements of separators."
      | char result sourceStream subString |
-     #Collectn.
-     "Changed 2000/04/08 For ANSI <readableString> protocol."
      (separators isString or:[separators allSatisfy: [:element | element isCharacter]]) ifFalse:
          [^ self error: 'separators must be Characters.'].
      sourceStream := ReadStream on: self.
      result := OrderedCollection new.
      subString := String new.
      [sourceStream atEnd] whileFalse:
          [char := sourceStream next.
          (separators includes: char)
+             ifTrue: [result add: subString copy.
+                 subString := String new]
-             ifTrue: [subString notEmpty
-                 ifTrue:
-                     [result add: subString copy.
-                     subString := String new]]
              ifFalse: [subString := subString , (String with: char)]].
      subString notEmpty ifTrue: [result add: subString copy].
      ^ result asArray!
I don't know why subStrings: should treat empty fields differently from non-empty ones.
Before this change, empty fields are ignored:
'a:b::d' subStrings: ':' ==> " #('a' 'b' 'd') "
'::::' subStrings: ':' ==> #()
After this change, empty fields are honored:
'a:b::d' subStrings: ':' ==> " #('a' 'b' '' 'd') "
'::::' subStrings: ':' ==> #('' '' '' '')
Thoughts?
- Chris
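For anyone who wants to compare the two behaviors without loading the package, a workspace sketch along these lines may help. It is not the code from the diff: it restates the loop as a block, the names split and keepEmpty are made up for illustration, and it collects characters in a WriteStream instead of concatenating strings.

```smalltalk
"Workspace sketch only. 'split' and 'keepEmpty' are illustrative names, not part of Squeak."
| split |
split := [:string :separators :keepEmpty |
	| result field |
	result := OrderedCollection new.
	field := WriteStream on: String new.
	string do: [:char |
		(separators includes: char)
			ifTrue: [
				"Old behavior adds only non-empty fields; the proposed one adds every field."
				(keepEmpty or: [field contents notEmpty])
					ifTrue: [result add: field contents].
				field := WriteStream on: String new]
			ifFalse: [field nextPut: char]].
	"Unchanged by the proposal: a trailing field is added only if it is non-empty."
	field contents notEmpty ifTrue: [result add: field contents].
	result asArray].

split value: 'a:b::d' value: ':' value: false.	"old behavior: #('a' 'b' 'd')"
split value: 'a:b::d' value: ':' value: true.	"proposed behavior: #('a' 'b' '' 'd')"
split value: '::::' value: ':' value: false.	"old behavior: #()"
split value: '::::' value: ':' value: true.	"proposed behavior: #('' '' '' '')"
```

One thing the last line makes visible: four separators delimit five fields, but the proposed version answers four, because the check after the loop still drops a trailing empty field.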
Hm ... don't we have #splitBy: for this? This seems like a possibly dangerous breaking change to me. No strong opinion, though. :-)
Best,
Christoph
Hi,
Were the method new, I wouldn't mind if it treated empty fields differently. But since it is not new, I think this is indeed a change that may break applications that rely on the previous behavior. Therefore I vote against it unless it can be shown that Squeak is the only Smalltalk that implements subStrings: in the current way. The ANSI standard is unfortunately vague and offers little guidance:
Synopsis
    Answer an array containing the substrings in the receiver separated by the elements of separators.
Definition: <readableString>
    Answer an array of strings. Each element represents a group of characters separated by any of the characters in the list of separators.
Return Values
    <Array> unspecified
@Chris: does splitBy:, as recommended by Christoph, cover your requirements?
Should we add a message to split on _any_ of the elements of the argument (like subStrings:, unlike splitBy:), preserving empty groups (like splitBy:, unlike subStrings:)? For example, #splitByAnyOf:
Kind regards, Jakob
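To make the suggestion concrete, here is a minimal sketch of what such a message could look like if compiled in class String. The selector splitByAnyOf: is only the name floated above; nothing here assumes it exists in any image. Keeping a trailing empty field is an assumption of this sketch, not something the thread settles.

```smalltalk
splitByAnyOf: separators
	"Hypothetical method, sketched for class String.
	Answer an Array of the fields in the receiver delimited by any element of separators.
	Empty fields are kept, including a trailing one (an assumption of this sketch)."
	| result field |
	result := OrderedCollection new.
	field := WriteStream on: String new.
	self do: [:char |
		(separators includes: char)
			ifTrue: [
				"Every separator closes the current field, empty or not."
				result add: field contents.
				field := WriteStream on: String new]
			ifFalse: [field nextPut: char]].
	"The last field is always added, so n separators yield n + 1 fields."
	result add: field contents.
	^ result asArray
```

With this definition, 'a:b,,d' splitByAnyOf: ':,' would answer #('a' 'b' '' 'd'), and '::::' splitByAnyOf: ':' would answer five empty strings.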
On Sat, Nov 19, 2022 at 07:09:10PM +0100, Jakob Reschke wrote:
Should we add a message to split on _any_ of the elements of the argument (like subStrings:, unlike splitBy:), preserving empty groups (like splitBy:, unlike subStrings:)? For example, #splitByAnyOf:
I would say add it only if it is really going to be used, otherwise no.
With #splitBy:, #subStrings:, and #findTokens:, the protocol already seems a bit cluttered.
Dave
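For readers who want to see how the three existing selectors differ, a few expressions like the following can be evaluated one at a time with "print it" in a workspace. The comments only repeat what is said in this thread; the remaining results are deliberately left for the image to answer.

```smalltalk
"Single-character separator."
'a:b::d' subStrings: ':'.	"per the thread: #('a' 'b' 'd'), empty field ignored"
'a:b::d' splitBy: ':'.	"per the thread: empty groups are preserved"
'a:b::d' findTokens: ':'.

"Multi-character argument."
'a:b,,d' subStrings: ':,'.	"per the thread: splits on any element of the argument"
'a:b,,d' splitBy: ':,'.	"per the thread: does not split on any element, unlike subStrings:"
'a:b,,d' findTokens: ':,'.
```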
So does that throw away ANSI compat? -t
You're all 100% right. I had forgotten about #splitBy:, but changing subStrings: certainly didn't feel right. Interestingly, the in-image Method Finder doesn't find either one. Thank you, community! :)
I moved this to Treated.
- Chris
Interestingly, the in-image Method Finder doesn't find either one.
Simulation Method Finder does. :-)
Best, Christoph
FYI
https://github.com/LinqLover/SimulationStudio#simulationstudio-tools
Best, Marcel
Hi Chris,
why “add: subString copy” and not just “add: subString”? subString’s values are all instantiated locally to subStrings:, don’t escape other than through the result sequence, and are not duplicated. The send of copy looks to me to have been a misunderstanding on the part of a prior author.
I’d also love to see the argument be able to be either a string or a character, with two separate loops in ifTrue:ifFalse: for efficiency, avoiding instantiating a singleton collection if the arg is a character.
_,,,^..^,,,_ (phone)
Hi Eliot,
why “add: subString copy” and not just “add: subString”? subString’s values are all instantiated locally to subStrings:, don’t escape other than through the result sequence, and are not duplicated. The send of copy looks to me to have been a misunderstanding on the part of a prior author.
I actually noticed and wondered about that, but didn't want to distract too much from my main question in case there was some magical need for it. It's so peculiar it was proposed to survive its third developer initials, ha ha. :)
Looks like we have a minor optimization opportunity there the next time someone swings around to subStrings:.
I’d also love to see the argument be able to be either a string or a character, with two separate loops in ifTrue:ifFalse: for efficiency, avoiding instantiating a singleton collection if the arg is a character.
+1.
- Chris
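A rough sketch of how Eliot's two suggestions might be combined, for whoever next touches subStrings:. This is not the code in the image: the selector subStringsSketch: and the helper block addField are hypothetical names, the validation of the argument is left out for brevity, and a WriteStream replaces the string concatenation. The existing behavior of skipping empty fields is kept.

```smalltalk
subStringsSketch: separatorOrSeparators
	"Hypothetical variant, sketched for class String.
	Accepts either a single Character or a collection of Characters.
	No #copy is sent: WriteStream>>contents already answers a fresh String."
	| result field addField |
	result := OrderedCollection new.
	field := WriteStream on: String new.
	addField := [
		"Add the current field only if it is non-empty, then start a new one."
		field position > 0 ifTrue: [
			result add: field contents.
			field := WriteStream on: String new]].
	separatorOrSeparators isCharacter
		ifTrue: [
			"Single separator: compare directly, no singleton collection needed."
			self do: [:char |
				char = separatorOrSeparators
					ifTrue: [addField value]
					ifFalse: [field nextPut: char]]]
		ifFalse: [
			"Several separators: test membership in the collection."
			self do: [:char |
				(separatorOrSeparators includes: char)
					ifTrue: [addField value]
					ifFalse: [field nextPut: char]]].
	addField value.
	^ result asArray
```

Under these assumptions, both 'a:b::d' subStringsSketch: $: and 'a:b::d' subStringsSketch: ':' would answer #('a' 'b' 'd').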