[squeak-dev] Re: includes: does linear search for SortedCollection?

Ralph Boland rpboland at gmail.com
Mon Dec 20 16:20:58 UTC 2010


> From: James Foster <Smalltalk at JGFoster.net>
> Subject: Re: [squeak-dev] Re: includes: does linear search for SortedCollection???????

In GemStone we used to do a binary search on #'includes:' but a
customer complained that they had a domain object where #'=' was
implemented using different instance variables than #'<=' so that a
linear search would find an equivalent object but a binary search
would miss it. We changed #'includes:' to use a linear search and
added another method, #'binarySearchIncludes:' to have the old (fast)
behavior.

I think you should have instead implemented #linearSearchIncludes:
and told your customer to use that instead of the regular #includes.
Your customer may not have been happy (or s(he) might of been since
this message flags the special nature of the method and is thus
better self documenting code) but I think it is better to provide all
of your customers with the expected O(log n) worst case cost of
#includes: than to punish them for the sake of one customer.
I must admit though that a quick search of Squeak 10.2 found no
examples of includes: applied to SortedCollections.
Perhaps I should set a breakpoint and see what happens!

Ralph

P.S.    #linearSearchIncludes: really needs a good comment as to why it is
needed.  Perhaps some method  #firstSatisfying: aBlock might be better still.
Wait:  isn't this like #detect: ?

P.P.S.  I don't really understand the nature of your customers needs.
I assume that the problem is that the sorted collection could contain
a series of elements that were equivalent and consecutive values
(say  3,3,3,3)  but where each 3 had some additional property that
distinguished them.  In this case one would have to be careful since
for example a binary search would typically find only the first or the
last.  If what you really want in this case is to compare all of the 3's
to some value to see if any match then a special binary search
could be done that found the range of 3's and a compare could
then be done on each of the 3's. This would still be much faster
than linear search in most circumstances.
If this is not what your customer needs then please tell me why the
customer needs to use a sorted collection?
And if there really is a reason then perhaps he should subclass
SortedCollection for his particular domain and simply override
#includes for a version of his choosing!
Sorry, but the more I think about this the more convinced I become
that your solution is not the correct one.



More information about the Squeak-dev mailing list