[squeak-dev] The Inbox: Collections-ul.844.mcz

Tobias Pape Das.Linux at gmx.de
Fri Jul 19 19:57:20 UTC 2019


> On 19.07.2019, at 15:54, Levente Uzonyi <leves at caesar.elte.hu> wrote:
> 
> Hi Tobias,
> 
> On Fri, 19 Jul 2019, Tobias wrote:
> 
>> 
>> Hi Levente
>> 
>> On Thu, 2019-07-18 at 22:13 +0000, commits at source.squeak.org wrote:
>>> Levente Uzonyi uploaded a new version of Collections to project The
>>> Inbox:
>>> http://source.squeak.org/inbox/Collections-ul.844.mcz
>>> 
>>> ==================== Summary ====================
>>> 
>>> Name: Collections-ul.844
>>> Author: ul
>>> Time: 19 July 2019, 12:08:51.94435 am
>>> UUID: df5ebfa9-4ebf-4505-8031-afd892a1061c
>>> Ancestors: Collections-mt.843
>>> 
>>> - added String >> #atOrNil: which uses primitive 63 and returns
>>> either the character at the given index or nil when the primtiive
>>> fails. This is a faster alternative to #at:ifAbsent: when the absent
>>> block would yield nil.
>>> 
>> 
>> Could you tell me the use case here?
> 
> ShoutCore-ul.66 in the Inbox.

I see. I'd rather have Shout to make sure the index is valid _before_ accessing the string.
at:ifAbsent: is more concise and still idiomatic. It the out-of-range access a frequent case?
If so, we should really tackle _that_ instead of making pouring out nils fast.

> 
>> This sounds like a convoluted range check.
> 
> Yes, it's a way to avoid the range check being done by the image.
> 
>> I mean, yes it would be faster to do
>> 
>> knorz := 'foo'
>> index := 32
>> ^ knorz atOrNil: index
>> 
>> but I would rather see
>> 
>> knorz := 'foo'
>> index := 32
>> ^ index <= knorz size ifTrue:  [knorz at: size]
> 
> That's not enough. You have to check the other end of the range as well.
> And that's what this method tries to avoid.

But its telling a different story.
We're going pretty low-level there, and I frankly don't see the benefit. Why should performance be paramount?


> 
>> 
>> or more robust even
>> 
>> 
>> knorz := 'foo'
>> index := 32
>> ^ knorz atPin: 32
> 
> You want to be able to distinguish invalid input and correct input in most cases. With Shout that's mandatory, so #atPin: wouldn't work.

Well, I don't know. Rinard says otherwise[1]. In most cases, you want some valid data when accessing some collection; atPin doest that quite well.

The case at hand seems just to indicate that writing parsers is boring and hard, and we probably shouldn't have 2 parsers (3, if your have refactorings installed) in the image, that all will disagree slightly and have different characteristics each.

It's cool to make shout faster, however, we shouldn't introduce easy-to-misuse low-level API for that.

Best regards
	-Tobias :)



[1]: https://people.csail.mit.edu/rinard/paper/osdi04.pdf
> 
> Levente
> 
>> 
>> (even though that changes the semantic)
>> 
>> So, can you tell me the use case here?
>> Best regards
>> 	-Tobias
>> 
>> PS: That said, I think using #at: for String in the first place will
>> bite us down the road, whether we want to change the String
>> representation at some point (ropes, immutables, values, ...) or just
>> because Unicode is hard: ('Lèon' at: 3) can surely yield $'̀ ', $e
>> or $'è'




More information about the Squeak-dev mailing list