Celeste (was: Celeste and IRC stuff)

Jecel Assumpcao Jr jecel at merlintec.com
Mon Apr 18 19:34:07 UTC 2005


Lex Spoon wrote on Sat, 16 Apr 2005 11:23:04 -0400
> Cool, Jecel!  Be sure to send your Celeste patches to Giovanni, and your
> IRC patches to Frank, so they can consider including them in the next
> release.

Sure, though most would really have to be rewritten to be usable by
anybody else. For example, the ones that deal with badly formed emails
don't make much of an effort to get usable results because I noticed
these were always spam, which I didn't care to read anyway. They just do
enough so Celeste doesn't crash and I can go on to the following emails.

> Do you mean MIMEDocument, or MailMessage?  If you meant MailMessage,
> then bravo!  If you meant MIMEDocument, then please let's not keep
> patching this method, but instead try to get rid of it.  Its own comment
> is begging for the misery to end:
> 
> 	"Return the parts of this message.  There is a far more reliable
> implementation of parts in MailMessage, but for now we are continuing to
> use this implementation"
> 
> The MIMEDocument version doesn't *really* know what the separator is,
> and instead has to guess.  Further, I can't think of a reason that a
> MIMEDocument would need to be asked for its parts directly; if it's
> multi-part, then it's a mail message, and thus the MailMessage object
> ought to be handy.

All very good points, and I did notice the comment, but the fact is that
the debugger popped up in the MIMEDocument method and that is what I had
to patch to keep going. I should have traced back a bit to see why this
was being invoked instead of the MailMessage code but didn't have time
to do so.
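
To illustrate Lex's point that the enclosing message knows the separator
while a bare body has to guess, here is a small sketch. It is Python's
standard email package, not Celeste code, and the sample message and the
"XYZ" boundary are made up for the example:

```python
from email import message_from_string

# Hypothetical two-part message; the boundary is declared in the header.
raw = (
    "From: a@example.org\n"
    "To: b@example.org\n"
    "Subject: demo\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "preamble\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "first part\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "second part\n"
    "--XYZ--\n"
)

msg = message_from_string(raw)

# The separator is read from the Content-Type header, not guessed
# by scanning the body for lines that look like boundaries.
boundary = msg.get_boundary()

# For a multipart message, get_payload() returns the list of sub-messages.
parts = [part.get_payload().strip() for part in msg.get_payload()]
```

The part bodies are only split correctly because the header is available,
which is the argument for asking the message object rather than the body.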
 
> > To load the database from an older version I forced some files to open
> > as a StandardFileStream.
> 
> Certain minds think alike!  I sent such a patch to Giovanni already. 
> Some day, it would be nice to fix Celeste up to use international
> characters, but for now this at least keeps it working.

I had hoped to do the patch, read in the files, undo the patch and then
write out the files again so that from now on the default stuff would
work. A proper method so that this could be redone whenever needed would
be much nicer, of course. 
 
> > To deal with international characters in general, removed #isoToSqueak
> > and #squeakToIso from all methods in Celeste, CelesteComposition and
> > MailMessage. The ones in CelesteComposition were already causing
> > problems in Squeak 3.7 since they were being applied twice.
> 
> Ummm, are you sure these are all correct?  The on-disk messages file
> should be in MacRoman for compatibility's sake.  Is Squeak now switching
> to latin-1 encodings internally?  If so, then yes, some of these sends
> should go away.  But step carefully.  Be sure to test with a
> pre-existing email database that has MacRoman 8-bit characters in it.

I am using the same VM in the exact same environment, so I am sure that
nothing has changed except for the image (3.7->3.8). Yet it does seem
that MacRoman is no longer being used internally. I don't know if this
is something specific to my own setup or if it is a general change,
which is another reason I hesitate to recommend my patches.

The CelesteComposition problem was that isoToSqueak was being called
when reading from the database into a string, and then once again when
rendering to the screen. The result was that any international
characters in the quoted text from the original message were messed up.
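
A rough sketch of why the second call does damage. This is Python rather
than Squeak, and it assumes that isoToSqueak amounts to recoding Latin-1
bytes as MacRoman; iso_to_mac is a hypothetical stand-in, not the real
method:

```python
def iso_to_mac(data: bytes) -> bytes:
    """Stand-in for isoToSqueak (assumption): treat the input as Latin-1
    and re-encode the same characters as MacRoman."""
    return data.decode("latin-1").encode("mac_roman", errors="replace")

# What the database holds: Latin-1 bytes as they came off the wire.
wire = "café".encode("latin-1")

once = iso_to_mac(wire)    # conversion when reading into a string
twice = iso_to_mac(once)   # same conversion again when rendering

# Interpret both results the way a MacRoman display would.
shown_once = once.decode("mac_roman")
shown_twice = twice.decode("mac_roman")
```

After one pass the text still reads correctly on a MacRoman display. The
second pass misreads bytes that are already MacRoman as Latin-1, so the
accented character does not survive.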

> Another thing to think about, is that we probably want to switch to
> unicode mail databases in the long run.  That will be yet another format
> change.  I can live with one format change even if it breaks my existing
> files, but it would be nice to do that as rarely as possible.

I think the current system of storing the "wire representation" of the
email in the database is the best option. The untouched original
information is always there for you to process again as much as you need
to.

> > Something I have done messed up HTML rendering, which was working
> > before.
> 
> Ack!  Not that I love HTML emails, but it should still work....  Maybe
> look in #bodyTextFormatted and see if you see anything funny looking?

I eliminated a #isoToSqueak from there, but that is not it. Ah! It seems
that #collect: doesn't do such a good job of preserving the attributes
when applied to a Text, so it is the quick Unicode->$? fix that is
ruining the HTML.
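
The effect can be sketched with a toy model. This is Python, not Squeak's
Text class, and the list-of-runs representation and both helper names are
assumptions made for the illustration:

```python
# Toy model (assumption): a styled text is a list of (string, attributes) runs.
styled = [("Hello ", ("bold",)), ("w\u00f6rld", ("italic",))]

def to_ascii(ch):
    """Replace any non-ASCII character with '?'."""
    return ch if ord(ch) < 128 else "?"

def collect_plain(runs, fn):
    """Map over the characters only; the result is a bare string,
    so the attribute runs are gone."""
    return "".join(fn(ch) for text, _attrs in runs for ch in text)

def collect_keeping_runs(runs, fn):
    """Map over the characters inside each run and keep its attributes."""
    return [("".join(fn(ch) for ch in text), attrs) for text, attrs in runs]

plain = collect_plain(styled, to_ascii)
kept = collect_keeping_runs(styled, to_ascii)
```

Both versions replace the same characters. Only the second still carries
the formatting, which is what the HTML rendering depends on.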

-- Jecel


