collect: vs. to:do:

Lex Spoon lex at cc.gatech.edu
Fri Oct 20 20:06:31 UTC 2000


But an OC (OrderedCollection) only takes something like 8-12 more bytes
than an Array.  A typical BookMorph takes kilobytes.  Anyway, surely you
should only optimize
*after* you get the thing working.  Array versus OC is unlikely to make
a critical difference, so why not do the convenient thing first?  FWIW,
I've only had to optimize for memory once that I remember, and when I
did it, I ended up chucking a certain amount of data completely, rather
than encoding anything more compactly.


Anyway, it's not real Smalltalky to use parallel arrays to represent
associations; normally, one would either add instance variables to the
relevant objects, or one would use dictionaries mapping objects to the
new attribute.  In this method, the pages would probably be the natural
place to put a lot of this data, but I'm guessing there are
complications because BookMorph pages can be stored on ftp sites or out
on disk.
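
A rough workspace sketch of the dictionary approach, just to show the
shape of it.  Everything here is made up for illustration: the symbols
stand in for real pages, and textByPage / urlByPage are hypothetical
names, not anything in BookMorph.  It uses := for assignment, which
Squeak accepts alongside _.

```smalltalk
"Stand-in data: symbols play the role of BookMorph pages."
| pages textByPage urlByPage |
pages := OrderedCollection withAll: #(#cover #intro #index).

"One dictionary per attribute, keyed by the page itself,
 instead of parallel arrays indexed by page position."
textByPage := IdentityDictionary new.
urlByPage := IdentityDictionary new.
pages do: [:pg | textByPage at: pg put: OrderedCollection new].

"Attach data to a page without caring where it sits in 'pages'."
(textByPage at: #intro) add: 'some text found on the intro page'.
urlByPage at: #intro put: 'a hypothetical url string'.

"Pages with no URL simply have no entry."
urlByPage at: #cover ifAbsent: [nil]
```

The lookups no longer depend on the pages staying in the same order, so
inserting or removing a page doesn't require shifting two arrays in step.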


-Lex





Dan Winkler <fendidog at yahoo.com> wrote:
> > Dunno, what method are you looking at? 
> 
> I'm looking at BookMorph>>getAllText.  
> 
> Actually, the comment says the author thought he was getting Arrays in both
> cases but I think he gets an OrderedCollection in the first line:
> 
> 	allText _ pages collect: [:pg | OrderedCollection new].
> 	allTextUrls _ Array new: pages size.
> 
> > Is the compactness worth it?
> 
> Yup, because I'm thinking about writing for a Palm Pilot (Pocket Smalltalk)
> where memory is scarce.
> 
> 
> --- Bijan Parsia <bparsia at email.unc.edu> wrote:
> > On Fri, 20 Oct 2000, Dan Winkler wrote:
> > 
> > > Thank you.
> > > 
> > > That's interesting that collect: returns different classes of
> > > collections based on the class of the source collection.  When would
> > > you want that behavior?
> > 
> > Er..all the time? #collect:, #select:, etc. return a collection that is
> > "like" the receiver. Think of them as being like map and filter in Scheme.
> >  
> > > I see in this particular method, the next line allocates an Array
> > > explicitly.  Is there a reason we want to let collect: decide the
> > > class of collection in the first line but use an Array in particular
> > > in the second line?
> > 
> > Dunno, what method are you looking at? Offhand, it's a little hard to
> > tell. I don't see any *harm* in using #collect: to initialize the Array in
> > the first case. As I wrote, maybe you'll want to modify it in the future.
> > 
> > You can't use Array class>>#new:withAll: because you'd end up with only
> > one OrderedCollection, which is clearly not what you want. You could have
> > a #new:withAllValuesFrom: or something, that took a block as its second
> > arg, but I'm not seeing a huge gain here.
> >  
> > > 	allText _ pages collect: [:pg | OrderedCollection new].
> > > 	allTextUrls _ Array new: pages size.
> > > 
> > > Would you say we should change the second line to:
> > > 
> > >         allTextUrls _ pages collect: [:pg | nil].
> > 
> > No, that's silly, since all slots are nil by default. I *might* say
> > 
> > 	 allTextUrls _ pages species new: pages size
> > 
> > *if* I thought that keeping all the collections the same type was
> > important. For all I know, there's a reason to have allTextUrls be an
> > Array.
> > 
> > > As an old C programmer I'd be tempted to go the other way and force them
> > > both to be Arrays for compactness since I know they're not going to grow.
> > 
> > Do you? Is the compactness worth it?
> > 
> > Even if so, I'm still inclined to prefer the #collect: to any more verbose
> > version. I mean, what's the *clarity* gain?
> > 
> > 
> > 	 allText _ pages species new: pages size.
> > 	"or, allText _ Array new: pages size."
> > 	"Gah! I immediately want to do a collect :))"
> > 	1 to: pages size do:  [:i | allText at: i put: OrderedCollection new].
> > 
> > 
> > I *never* see this. So, I would *definitely* say the #collect: version is
> > the Smalltalk idiom. Frankly, I don't see a bit of clarity gained, and
> > much lost.
> > 
> > After all, it's not like one is *confused* by the collect into thinking
> > it's somehow attending to the pages. It's perfectly *clear* what it's
> > doing and why (to initialize allText).
> > 
> > In the end, when it gets down to it, my rewrite is basically the
> > implementation of #collect:!
> > 
> > 	| newCollection |
> > 	newCollection _ self species new: self size.
> > 	1 to: self size do:
> > 		[:index |
> > 		newCollection at: index put: (aBlock value: (self at: index))].
> > 	^ newCollection
> > 
> > Why not reuse it? (Follow the once and only once rule ;))
> > 
> > Cheers,
> > Bijan Parsia.
> > 
> 
>
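
For reference, a small workspace sketch of two points from the quoted
discussion: collect: answers a collection of the receiver's species, and
new:withAll: puts the *same* object in every slot while the collect:
block runs once per element.  The variable names are made up for
illustration, and it uses := for assignment, which Squeak accepts
alongside _.

```smalltalk
| fromArray fromOC shared separate |

"collect: answers a collection 'like' the receiver."
fromArray := #(1 2 3) collect: [:x | x * 2].
fromOC := (OrderedCollection withAll: #(1 2 3)) collect: [:x | x * 2].
fromArray class.	"Array"
fromOC class.	"OrderedCollection"

"new:withAll: stores one shared object in every slot..."
shared := Array new: 3 withAll: OrderedCollection new.
(shared at: 1) == (shared at: 2).	"true"

"...whereas the collect: block is evaluated for each element,
 so every slot gets its own fresh OrderedCollection."
separate := #(1 2 3) collect: [:x | OrderedCollection new].
(separate at: 1) == (separate at: 2).	"false"
```

Evaluate each expression with "print it" to see the answers noted in the
comments.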




