[squeak-dev] Letting Set contain nils?

Eliot Miranda eliot.miranda at gmail.com
Sat Aug 9 02:10:00 UTC 2008


On Fri, Aug 8, 2008 at 7:07 PM, Igor Stasenko <siguctua at gmail.com> wrote:

> 2008/8/9 Eliot Miranda <eliot.miranda at gmail.com>:
> >
> >
> > On Fri, Aug 8, 2008 at 6:09 PM, Igor Stasenko <siguctua at gmail.com>
> wrote:
> >>
> >> 2008/8/9 Eliot Miranda <eliot.miranda at gmail.com>:
> >> >
> >> >
> >> > On Fri, Aug 8, 2008 at 5:46 PM, Igor Stasenko <siguctua at gmail.com>
> >> > wrote:
> >> >>
> >> >> 2008/8/9 Eliot Miranda <eliot.miranda at gmail.com>:
> >> >> >
> >> >> >
> >> >> > On Fri, Aug 8, 2008 at 3:53 PM, Igor Stasenko <siguctua at gmail.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> Sets which not allowing contain nil as element is a point of
> >> >> >> inconvenience.
> >> >> >>
> >> >> >> There are two ways how get around that:
> >> >> >> - initialize an array which contains set elements with unique
> object
> >> >> >> ~~
> >> >> >> nil.
> >> >> >> and fix methods which testing for an empty slots to compare
> against
> >> >> >> this object, not nil.
> >> >> >> This can cause a slowdown during rehashing, because VM initially
> >> >> >> creating arrays filled with nils, while we need to fill them with
> >> >> >> another object.
> >> >> >> There is also a problem with preserving it 'uniqueness' by not
> >> >> >> giving
> >> >> >> this object outside a set.
> >> >> >>
> >> >> >> - add an instVar 'containsNil'
> >> >> >> then when set receiving 'add: nil' , it simply sets this flag to
> >> >> >> true.
> >> >> >> modify #collect: , #do: , #remove: to be aware of flag value.
> >> >> >>
> >> >> >> I find the second way is more appropriate. While it costs
> additional
> >> >> >> memory per Set/IdentitySet instance, it costs almost nothing in
> >> >> >> speed.
> >> >> >>
> >> >> >> What do you think about supporting Sets to contain nils in
> general,
> >> >> >> and about methods how to achieve that?
> >> >> >
> >> >> > Here's a third approach (a variation on your 2nd approach).  Have
> an
> >> >> > instVar
> >> >> > 'includesSelf' and fill the array with the Set itself.  So add a
> new
> >> >> > primitive new:fillWith: (primitiveNewWithArgAndFillValue?) and use
> >> >> > this
> >> >> > to
> >> >> > create the empty array filled with the Set itself.  Check for
> adding
> >> >> > the
> >> >> > set
> >> >> > itself, and itself being the null entry.  The advantage over the
> flag
> >> >> > for
> >> >> > nil approach is that you kill two birds with one stone.
> >> >> > 1. You need a unique value anyway, and the Set can nicely be its
> own
> >> >> > unique
> >> >> > value
> >> >> > 2. recursive collections are a problem to print and with the
> explicit
> >> >> > flag
> >> >> > this becomes much easier to deal with.
> >> >>
> >> >> In math domain, any set includes itself , not as element of course,
> >> >> but as subset :)
> >> >> And i don't see how this is better comparing to having 'containsNil'
> >> >> ivar?
> >> >> You still have to test this flag in each method which deals with
> >> >> elements , so be it containsFoo or containsBar - no real difference.
> >> >
> >> > One difference is the use of self for the unique object instead of
> >> > another.
> >> > Another difference is that recursive sets no longer fail to print,
> >> > inspect,
> >> > etc.
> >> > I think those are real differences.
> >> >
> >>
> >> Valid point.
> >>
> >> As a bytecode proofy, can you tell how much different a bytecode will be
> >> for:
> >
> > what's a proofy?  ;)
> >
> an expert :)
>
> >>
> >> obj == nil
> >> versus
> >> obj == self
> >> versus
> >> obj isNil
> >
> > Depends.  In an Interpreter obj == self likely to be slightly faster than
> > obj == nil since nil must be fetched either form the specialObjectsArray
> (if
> > bytecode set has pushNl, as it does) or from the method's literal frame.
>  In
> > a JIT they're liely the same because self is a read through the frame
> > pointer and nil is a constant embedded in the instruction stream.  These
> are
> > likely to be of similar cost.
> > obj isNil will either be the same cost as == nil or slower depending on
> > whether the bytecode compiler inlines it.
> > But the difference between obj == self & obj == nil/obj isNil will be in
> the
> > noise.
> > You should make the decision on convenience.
> >>
> >> what is faster/slower?
> >
> > anArray asSet size is faster than any of the alternatives because it is
> > easier to thunk about ;)  Faster thought is much more valuable than
> faster
> > processing, c.f. Smalltalk programs vs C++ programs ;)
> >
>
> yes, i raised this topic exactly from this reason.
> What i'm not sure that is this change is so badly needed.
> Maybe its only me who get stuck with a problem how to deal with sets
> where nils are meaningful and useful value.


I've certainly run into it before and I know colleagues have in the past.
 What's hard to tell is how much code out there is working around the
limitation.


>
> --
> Best regards,
> Igor Stasenko AKA sig.
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20080808/ce03b94c/attachment.htm


More information about the Squeak-dev mailing list