[squeak-dev] Letting Set contain nils?

Igor Stasenko siguctua at gmail.com
Sat Aug 9 02:07:57 UTC 2008


2008/8/9 Eliot Miranda <eliot.miranda at gmail.com>:
>
>
> On Fri, Aug 8, 2008 at 6:09 PM, Igor Stasenko <siguctua at gmail.com> wrote:
>>
>> 2008/8/9 Eliot Miranda <eliot.miranda at gmail.com>:
>> >
>> >
>> > On Fri, Aug 8, 2008 at 5:46 PM, Igor Stasenko <siguctua at gmail.com>
>> > wrote:
>> >>
>> >> 2008/8/9 Eliot Miranda <eliot.miranda at gmail.com>:
>> >> >
>> >> >
>> >> > On Fri, Aug 8, 2008 at 3:53 PM, Igor Stasenko <siguctua at gmail.com>
>> >> > wrote:
>> >> >>
>> >> >> Sets which not allowing contain nil as element is a point of
>> >> >> inconvenience.
>> >> >>
>> >> >> There are two ways how get around that:
>> >> >> - initialize an array which contains set elements with unique object
>> >> >> ~~
>> >> >> nil.
>> >> >> and fix methods which testing for an empty slots to compare against
>> >> >> this object, not nil.
>> >> >> This can cause a slowdown during rehashing, because VM initially
>> >> >> creating arrays filled with nils, while we need to fill them with
>> >> >> another object.
>> >> >> There is also a problem with preserving it 'uniqueness' by not
>> >> >> giving
>> >> >> this object outside a set.
>> >> >>
>> >> >> - add an instVar 'containsNil'
>> >> >> then when set receiving 'add: nil' , it simply sets this flag to
>> >> >> true.
>> >> >> modify #collect: , #do: , #remove: to be aware of flag value.
>> >> >>
>> >> >> I find the second way is more appropriate. While it costs additional
>> >> >> memory per Set/IdentitySet instance, it costs almost nothing in
>> >> >> speed.
>> >> >>
>> >> >> What do you think about supporting Sets to contain nils in general,
>> >> >> and about methods how to achieve that?
>> >> >
>> >> > Here's a third approach (a variation on your 2nd approach).  Have an
>> >> > instVar
>> >> > 'includesSelf' and fill the array with the Set itself.  So add a new
>> >> > primitive new:fillWith: (primitiveNewWithArgAndFillValue?) and use
>> >> > this
>> >> > to
>> >> > create the empty array filled with the Set itself.  Check for adding
>> >> > the
>> >> > set
>> >> > itself, and itself being the null entry.  The advantage over the flag
>> >> > for
>> >> > nil approach is that you kill two birds with one stone.
>> >> > 1. You need a unique value anyway, and the Set can nicely be its own
>> >> > unique
>> >> > value
>> >> > 2. recursive collections are a problem to print and with the explicit
>> >> > flag
>> >> > this becomes much easier to deal with.
>> >>
>> >> In math domain, any set includes itself , not as element of course,
>> >> but as subset :)
>> >> And i don't see how this is better comparing to having 'containsNil'
>> >> ivar?
>> >> You still have to test this flag in each method which deals with
>> >> elements , so be it containsFoo or containsBar - no real difference.
>> >
>> > One difference is the use of self for the unique object instead of
>> > another.
>> > Another difference is that recursive sets no longer fail to print,
>> > inspect,
>> > etc.
>> > I think those are real differences.
>> >
>>
>> Valid point.
>>
>> As a bytecode proofy, can you tell how much different a bytecode will be
>> for:
>
> what's a proofy?  ;)
>
an expert :)

>>
>> obj == nil
>> versus
>> obj == self
>> versus
>> obj isNil
>
> Depends.  In an Interpreter obj == self likely to be slightly faster than
> obj == nil since nil must be fetched either form the specialObjectsArray (if
> bytecode set has pushNl, as it does) or from the method's literal frame.  In
> a JIT they're liely the same because self is a read through the frame
> pointer and nil is a constant embedded in the instruction stream.  These are
> likely to be of similar cost.
> obj isNil will either be the same cost as == nil or slower depending on
> whether the bytecode compiler inlines it.
> But the difference between obj == self & obj == nil/obj isNil will be in the
> noise.
> You should make the decision on convenience.
>>
>> what is faster/slower?
>
> anArray asSet size is faster than any of the alternatives because it is
> easier to thunk about ;)  Faster thought is much more valuable than faster
> processing, c.f. Smalltalk programs vs C++ programs ;)
>

yes, i raised this topic exactly from this reason.
What i'm not sure that is this change is so badly needed.
Maybe its only me who get stuck with a problem how to deal with sets
where nils are meaningful and useful value.

-- 
Best regards,
Igor Stasenko AKA sig.



More information about the Squeak-dev mailing list