[Newbies] Re: Another extension proposal -> subsets

Zulq Alam me at zulq.net
Thu Jul 24 18:09:53 UTC 2008


Klaus D. Witzel wrote:

> Squeak looks to be already fast in this case, your routine is almost 
> optimal :) Implementing it in Set so that Set's internal can be of 
> benefit won't bring these figures down much -- what remains is (aSet 
> includes: another), times #copyWithout: loop, and that seems to be 
> unavoidable.

It's significantly faster with a more appropriate hashing function:

{2->0 . 3->0 . 4->1 . 5->1 . 6->3 . 7->9 . 8->22 . 9->59 . 10->138 . 
11->305 . 12->686 . 13->1640 . 14->4366 . 15->10550 . 16->28104 . 
17->93373 . 18->303425}

combinations
   | combinations |
   combinations := (PluggableSet new: (2 raisedTo: self size))
     hashBlock:
       [:aSet |
       aSet
         inject: aSet size
         into: [:hash :each | hash * each hash + each]];
     yourself.
   self combinationsInto: combinations.
   ^ combinations

combinationsInto: aSet
   (aSet includes: self) ifTrue: [^ self].
   aSet add: self.
   self do:
     [:each |
     (self copyWithout: each) combinationsInto: aSet].	

I think the problem is that most of the sets contain similar data so the 
hashes (calculated with bitXor) tend towards to a small set of values. 
This means there are a lot of collisions and each #add: and #includes: 
is very expensive.

> 
> How about naming
> 
> Collection>>#asPowerset
>     ^ self asSet powersetInto: (Set new: (2 raisedTo: self size))
> 
> with Set>>#powersetInto: and putting that into the next release.
> 

Even now, I'm still not clear on how one takes a change like this and 
drives it into a squeak release - Tests, Mantis, then what?

Z.



More information about the Beginners mailing list