[squeak-dev] The Trunk: Collections-eem.792.mcz

Eliot Miranda eliot.miranda at gmail.com
Fri May 4 20:10:49 UTC 2018


On Fri, May 4, 2018 at 12:44 PM, Nicolas Cellier <
nicolas.cellier.aka.nice at gmail.com> wrote:

>
>
> 2018-05-04 0:50 GMT+02:00 Eliot Miranda <eliot.miranda at gmail.com>:
>
>> Hi Tobias, Hi All,
>>
>>
>> > On May 3, 2018, at 3:08 PM, Levente Uzonyi <leves at caesar.elte.hu>
>> wrote:
>> >
>> >> On Thu, 3 May 2018, Tobias Pape wrote:
>> >>
>> >>
>> >>> On 03.05.2018, at 22:48, Nicolas Cellier <
>> nicolas.cellier.aka.nice at gmail.com> wrote:
>> >>> But WideString requires another hack...
>> >>
>> >> Like
>> >>
>> >>    ^false
>> >>
>> >> ? :D
>> >
>> > Not really: ((WideString new: 2) first: 1) isAsciiString
>> >
>> > Levente
>>
>> Note that this is a common issue in Smalltalk, where we can have
>> different implementations (classes) with the same interface.  Take
>> LargeInteger and SmallInteger.  The arithmetic system and the VM are both
>> implemented to almost never represent something in the SmallInteger range
>> as a LargeInteger (there are rare circumstances but it's safe to assume
>> that the invariant is always maintained, and the invariant is depended
>> upon).  This allows the VM to only ever check for SmallIntegers for things
>> like indices, never having to waste code bloat or cycles checking for
>> denormalised LargeIntegers.
>>
>> Why can we do this with SmallInteger & LargeInteger, but not with
>> ByteString and WideString?  Because ByteString and WideString are mutable
>> (and because of the FFI).  Were the system to maintain the invariant that
>> strings containing characters in the range 0 to 255 were always represented
>> by ByteString, then, Tobias, your WideString>>isAsciiString ^false would
>> work.  But the cost of maintaining that invariant would be scanning the rest
>> of the string every time at:put: deposited a byte character, to see if we
>> had just replaced the last wide character by a byte one and hence needed to
>> do a become:.  We'd also potentially spend a lot of time doing becomes, and
>> we'd also have to allow for denormalisation when passing an ascii string
>> through the FFI to code requiring a wide string.  And even worse we'd have
>> to avoid WideString new: n like the plague since new strings are always
>> ascii, being full of nuls.  So only WideString with:... forms would make
>> sense.
>>
> In such case we would maintain a count of non-byte characters and avoid
> scanning...
>

Only possible if the representation makes room for a count, which could
easily require more than 24 bits.  It is non-trivial to implement, and of
course slows down access.
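
To make the trade-off concrete, here is a rough model of the counting idea.
It is only a sketch in Python, not Squeak code: the class CountedString and
its method names are made up for illustration, indices are 0-based (unlike
Smalltalk), and a plain list stands in for the string's storage.  It assumes
the representation has room for an extra count field.

```python
class CountedString:
    """Toy model of a string that tracks how many of its characters are wide.

    Hypothetical illustration only; this is not how Squeak's WideString works.
    """

    BYTE_MAX = 255  # code points 0..255 fit in a byte string

    def __init__(self, size):
        # Like `WideString new: n`: every slot starts as NUL, so nothing is wide.
        self.codes = [0] * size
        self.wide_count = 0  # the extra field the representation must make room for

    def at_put(self, index, code):
        """Store a code point, adjusting the wide count without scanning."""
        old = self.codes[index]
        if old > self.BYTE_MAX:
            self.wide_count -= 1  # a wide character is being overwritten
        if code > self.BYTE_MAX:
            self.wide_count += 1  # a wide character is being stored
        self.codes[index] = code

    def needs_wide_representation(self):
        # Constant-time answer from the maintained count.
        return self.wide_count > 0

    def needs_wide_by_scanning(self):
        # The alternative without a count: look at every element.
        return any(code > self.BYTE_MAX for code in self.codes)


s = CountedString(3)
s.at_put(0, 0x3B1)   # store a wide character (Greek alpha)
s.at_put(1, 65)      # store a byte character
wide_after_first_stores = s.needs_wide_representation()
s.at_put(0, 66)      # replace the only wide character with a byte one
wide_after_overwrite = s.needs_wide_representation()
```

The count avoids the scan, but every store now pays for two comparisons and
a possible update of the count, which is the access slowdown I mean.  A real
implementation would still need the become: at the point where the count
drops to zero.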

>> So this kind of multiple implementation approach only works well with
>> certain types and access patterns.  Interesting, no?
>>
>
_,,,^..^,,,_
best, Eliot