I like this scheme.
This solves or eliminates a lot of the problems associated with the various other methods. I think it could have nice performance implications also.
With all of the other schemes you have issues with deleting messages and compacting the file(s).
With the multi-file system, reorganizing your mailboxes by moving messages from one mailbox to another has its own set of issues. In Communicator, when a mailbox reaches a certain size, its performance when opening the message list, etc. starts to degrade. I'll often create an archive folder (Squeak-dev-archive, etc.) and put messages of a certain age in it. You can watch Communicator as it transfers messages to the new files in batches of 200. :( With the messages in their own files, all that needs updating is the index. Sweet. :)
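As a rough illustration of the "only the index changes" point (a sketch, not how Celeste or Communicator actually work), assume a hypothetical JSON index file that maps each mailbox name to a list of relative message paths. The names `load_index` and `move_message` and the JSON layout are made up for this example. Moving a message then never touches the message file itself:

```python
import json
from pathlib import Path

def load_index(index_path):
    """Return the index as a dict: mailbox name -> list of relative message paths."""
    index_path = Path(index_path)
    if index_path.exists():
        return json.loads(index_path.read_text())
    return {}

def move_message(index_path, rel_path, src, dst):
    """Move a message between mailboxes by rewriting the index only.

    The message file at rel_path is never opened, copied, or renamed.
    """
    index_path = Path(index_path)
    index = load_index(index_path)
    index[src].remove(rel_path)                 # ValueError if it is not in src
    index.setdefault(dst, []).append(rel_path)  # create dst mailbox on demand
    index_path.write_text(json.dumps(index, indent=2))
    return index
```

Archiving a batch of old messages would be a loop over `move_message`, with the cost depending on the size of the index rather than the size of the messages.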
The only drawbacks I see are space, as Tim mentioned, which I agree is a minimal issue in light of current hard disks, and the performance difference between searching one large file and searching thousands of files. This may not be an issue at all. After all, grep, swish, and htdig all seem to do okay.
While on that subject... What are thoughts about an indexing search engine? That way we wouldn't have to scan each file on a search. Or does Celeste already have such a thing? Forgive my ignorance on Celeste; I haven't spent much time there yet. Oops. As I reread Tim's message below, I see him mentioning a system at Interval. Would be nice. :)
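To make the indexing idea concrete, here is a minimal sketch of an inverted index over a directory of one-file-per-message mail. It is only an illustration under assumptions: the function names `build_index` and `search` are hypothetical, words are split naively on ASCII letters and digits, and nothing here reflects what Celeste, swish, or htdig actually do.

```python
import re
from collections import defaultdict
from pathlib import Path

def build_index(message_dir):
    """Map each lowercase word to the set of message file names containing it."""
    index = defaultdict(set)
    for path in sorted(Path(message_dir).iterdir()):
        if path.is_file():
            text = path.read_text(errors="replace")
            for word in set(re.findall(r"[a-z0-9]+", text.lower())):
                index[word].add(path.name)
    return index

def search(index, query):
    """Return the sorted names of messages that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())   # intersect: all query words must match
    return sorted(hits)
```

The files are scanned once when the index is built; each search afterwards is a handful of dictionary lookups and set intersections instead of a pass over every message.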
Jimmie Houchin
Tim Rowledge wrote:
Another possibility is to make each message a file; my mail system on the Acorn does this. It has costs on some platforms, I imagine; most seem to have a minimum file size of some sort, so you can waste space easily. With disk prices as they are these days, I don't imagine that is a particularly worrisome matter. It does offer an interesting sort of security in that it is quite difficult to end up deleting parts of messages, and there is something vaguely transactional about it. If the index file(s) stored the relative paths of the messages then that would presumably map to message IDs reasonably. It might even help with avoiding the redundant loading of messages left on the server.
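One way to picture the path-to-message-ID mapping is sketched below. It is an assumption-laden illustration, not the Acorn mail system's design: the SHA-1 naming scheme, the `.eml` suffix, and the names `path_for` and `store_if_new` are all invented for the example. The relative path is derived from the Message-ID, so a message already on disk is never fetched from the server again, and writing to a temporary file before renaming gives the "vaguely transactional" flavour.

```python
import hashlib
from pathlib import Path

def path_for(message_id):
    """Derive a stable relative path from a Message-ID (hypothetical scheme)."""
    digest = hashlib.sha1(message_id.encode("utf-8")).hexdigest()
    return Path(digest[:2]) / (digest[2:] + ".eml")

def store_if_new(root, message_id, fetch):
    """Write the message to its own file unless it is already stored.

    `fetch` is a caller-supplied function that returns the raw message text;
    it is only called when the file is missing. Returns True if it was called.
    """
    target = Path(root) / path_for(message_id)
    if target.exists():
        return False                      # already have it: skip the download
    target.parent.mkdir(parents=True, exist_ok=True)
    tmp = target.with_suffix(".tmp")
    tmp.write_text(fetch(message_id))     # write the whole message first...
    tmp.replace(target)                   # ...then rename it into place
    return True
```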
Now of course if we had machines able to use the file system we designed at Interval, indexable by content and tags etc, this would all be much simpler. Sigh.
tim
Tim Rowledge, tim@sumeru.stanford.edu, http://sumeru.stanford.edu/tim
Supercomputer: Turns CPU-bound problem into I/O-bound problem. - Ken Batcher