[Challenge] large files smart compare (Includes new CSOTD extension to Celeste! :-)

goran.hultgren at bluefish.se
Tue Jan 29 09:43:45 UTC 2002


Yoel Jacobsen <yoel at emet.co.il> wrote:
[SNIP good description]
> Questions
> =========
> 0. Any good idea about how to make it practical for 450K entries (18M
> lines)? What should I use for persistence?

My first thought (since I am playing with them currently) was that
perhaps you could use ImageSegments and swap in/out different parts of
these things in order to work with them.

If you know how to do it (I posted a small howto yesterday) it's easy,
and I think it works quite fast. I swapped 200k objects (about 4-5 MB)
back into the Squeak memory in about 220 ms, and I am guessing that
100-150 ms of that is overhead. Writing them out takes a little more time.
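
As a rough back-of-the-envelope check of those figures, here is a small
Python sketch. It only redoes the arithmetic: it takes the guessed
100-150 ms overhead at face value, and the function name load_rate is
made up for illustration, not anything from Squeak.

```python
def load_rate(objects, total_ms, overhead_ms):
    """Objects per second once the assumed fixed overhead is subtracted."""
    net_ms = total_ms - overhead_ms
    return objects * 1000 / net_ms


# Figures from the message: 200k objects in about 220 ms,
# with a guessed overhead of 100-150 ms.
for overhead in (100, 150):
    rate = load_rate(200_000, 220, overhead)
    print(f"overhead {overhead} ms: about {rate:,.0f} objects/s")
```

So if the overhead guess holds, the net load rate would be somewhere
between roughly 1.7 and 2.9 million objects per second.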

Now - remember, I haven't read through your description so I have no
idea if this would be interesting...

> 1. I have some suggestions for additions to the String and FileStream
> classes (like String formatting and looping over a file). What is the
> right way to share my implementation?

The right way is probably to first present/discuss them here on the list,
and if you get a good response you should package them as described on the
swiki. See:

http://minnow.cc.gatech.edu/squeak/1385

...and related pages.

> 2. Has anyone tried to make a Berkeley DB plugin?

Actually, I happen to know that Stephen Pair has been playing with that
with some cool results. Stephen? How's it going? It looked très cool at
OOPSLA. :-)

For scalable persistence there are also the interfaces for
MySQL/PostgreSQL, which I haven't tried myself. And I have always
thought that perhaps we could produce a neat interface/plugin for GOODS:
http://www.geocities.com/kknizhnik/goods.html

regards, Göran

PS Attaching a new version of my silly Celeste "CSOTDPITA extension".
This one is satisfied with one <CSOTD> per day. Thank god. And it adds a
Preference to turn it on/off. Cheat-enabled in other words. ;-) DS
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CSOTDPITA-gh.cs.gz
Type: application/octet-stream
Size: 1045 bytes
Desc: not available
Url : http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20020129/5c083bd8/CSOTDPITA-gh.cs.obj

