Cheap updates

Dan Ingalls Dan.Ingalls at disney.com
Wed Jun 7 03:59:09 UTC 2000


>Dan Ingalls pointed to String>>compressWithTable: and its senders, but said
>"It helps a bit, but I don't think it's really worth the added complexity."

"Richard A. O'Keefe" <ok at atlas.otago.ac.nz> wrote...
>compressWithTable: doesn't do quite the same thing.  It replaces _tokens_
>in the string with their index in a given table of tokens, should they be
>there.  Amongst other things, this means that if there is an instance
>variable "rate", the method name "rate" will be compressed to 1 byte, but
>the method name "rate:" will not be compressed at all.
>
>My suggestion was to prime the table of a dictionary-based compressor
>with the class and instance variable names (I didn't say "as one long string"
>but that's what I meant) and let it compress the text as a string, in which
>case "rate" and "rate:" would _both_ benefit, and so would "primeRate"
>(because "ate" would be found).  It seems likely that this would do rather
>better than compressWithTable: followed by gzipping.  It would also be
>faster, because the tokenising step would be eliminated.  It'd just be a
>matter of starting the compressor with a non-empty dictionary (the same
>for all methods in the class) rather than an empty one.
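
For anyone who wants to try the idea outside the image first, here is a minimal sketch of priming a dictionary-based compressor, assuming Python's zlib module and its preset-dictionary (zdict) option as a stand-in for whatever compressor Squeak would actually use. The variable names, the method source, and the two helper functions are hypothetical, made up for illustration. The names are joined into one long string and handed to the compressor as its starting dictionary, and the same string has to be supplied again to decompress.

```python
import zlib


def compress_with_dictionary(text: bytes, dictionary: bytes) -> bytes:
    """Deflate `text`, starting the compressor with a non-empty dictionary."""
    comp = zlib.compressobj(level=9, zdict=dictionary)
    return comp.compress(text) + comp.flush()


def decompress_with_dictionary(data: bytes, dictionary: bytes) -> bytes:
    """Inflate `data`; the same dictionary used for compression is required."""
    decomp = zlib.decompressobj(zdict=dictionary)
    return decomp.decompress(data) + decomp.flush()


# Hypothetical instance variable names for one class, joined into one long
# string. The same dictionary would be shared by all methods in the class.
inst_var_names = ["rate", "principal", "term"]
dictionary = " ".join(inst_var_names).encode("ascii")

# Hypothetical method source. No tokenising step: the compressor simply
# looks for substring matches against the dictionary and the text so far.
method_source = (
    b"primeRate: aNumber\n"
    b"\trate := aNumber.\n"
    b"\t^principal * rate * term"
)

primed = compress_with_dictionary(method_source, dictionary)
unprimed = zlib.compress(method_source, 9)
restored = decompress_with_dictionary(primed, dictionary)

print("original:", len(method_source), "bytes")
print("primed:  ", len(primed), "bytes")
print("unprimed:", len(unprimed), "bytes")
```

Note that the zlib container spends a few extra header bytes identifying the dictionary, so on a method this short the primed result is not guaranteed to come out smaller. A fair comparison would run over the real method sources of a whole class.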

Go for it!  I just found that the smarter I tried to get about this, the better gzip's "work with what you've got" philosophy looked by comparison.  My reason for pointing you at my experiments was that you could easily reassemble the pieces to try out some of the things you suggest (and it IS the case that you can do better than gzip alone by putting the two together, even as is).

Good luck!

	- Dan





