[BUG]Celeste: Endless loop in #tokensIn: with special characters

Lex Spoon lex at cc.gatech.edu
Sun Sep 15 18:21:10 UTC 2002


Bernhard Pieber <bernhard at pieber.com> wrote:
> Sigh. As usual I outsmarted myself. Try the following statement to
> really get the endless loop.
> 
> MailAddressTokenizer tokensIn: 'B' , 154 asCharacter asString , 'rnhard
> Pieber <bernhard at pieber.com>'

Well, first, the tokenizer shouldn't just loop.  However, the comment in
MailAddressTokenizer class>>initialize suggests that only a select class
of characters is allowed in an atom.  Probably the nextToken method
should check whether the character is a legal atom character before
calling nextAtom, and signal an error if it is not.  This would be
easiest if there were a CSAtoms set in addition to CSNonAtoms.  (That
statement will make sense if you look into the class's code.)
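
The check described above might look roughly like the following workspace
sketch.  It is not the actual MailAddressTokenizer code: atomChars merely
stands in for the hypothetical CSAtoms set, and its contents are assumed
from the RFC 822 definition of an atom (printable ASCII minus the
specials) rather than taken from the class's own tables.

```smalltalk
"Workspace sketch, not the actual MailAddressTokenizer code.
 atomChars stands in for the hypothetical CSAtoms set."
| specials atomChars input bad |
specials := '()<>@,;:\".[]'.
atomChars := CharacterSet new.
"Printable ASCII, minus the specials, per the RFC 822 definition of atom."
33 to: 126 do: [:code |
	(specials includes: (Character value: code))
		ifFalse: [atomChars add: (Character value: code)]].

input := 'B' , 154 asCharacter asString , 'rnhard'.
"Find the first character that could not be part of an atom."
bad := input detect: [:ch | (atomChars includes: ch) not] ifNone: [nil].
bad isNil ifFalse: [
	self error: 'Illegal character in address, code ' , bad asInteger printString].
```

Inside the tokenizer, the same membership test would presumably run in
nextToken just before the call to nextAtom, so that an illegal character
raises an error instead of leaving the stream position unchanged.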

This raises the question of what to do about illegal characters.  It
seems questionable to handle them, because they may get dumped on the
wire and passed on to other software that will break.  Does anyone know
the mail RFCs well enough to say what should be done in this case?  For
example, what should happen if you get mail from someone with a
malformed address?  I don't have time to dig through it in the near
future.


(Of course, newer RFCs may allow 8-bit addresses anyway.  That would
leave the general problem, but would help with this specific case: we
could just add more characters to the set of allowed atom characters.)


Lex


