[squeak-dev] #isBreakableAt:in:

Tue Sep 24 22:14:24 UTC 2013

At Tue, 24 Sep 2013 23:21:00 +0200,
Nicolas Cellier wrote:
> 
> 2013/9/21 tim Rowledge <tim at rowledge.org>
> 
> >
> > a) There are {language}environment classes and encoding classes. There is
> > #isBreakableAt:in: implemented in both but seemingly unused in the encoding
> > classes because it is just plain broken there. Should it be removed from
> > the encoders? In the language environment classes it is implemented to
> > return true for space and cr by default, but space, cr & lf in Latin1 and
> > Latin2. Is that as expected?
> >
> >
> From what I understand:
> - no need to answer true for space, cr, lf since these are already handled
> in the CharacterScanner stopConditions, so default answer should be ^false
> (unless one of these is removed from stopConditions, I thought I saw that,
> but cannot remember...)
> - whether it should be in EncodedCharSet or LanguageEnvironment, I don't
> know...
> 
> I don't completely like the Multi* version...
> For example, when the last breakable char is not a space, there is no
> adjustment of space width.
> Maybe Justified makes no sense in Japanese?
> I'd very much like to have tests describing the expectations...
> 

Having tests would have been good, yes.  For reference, the page below
might help a bit.  It rightly mentions contradicting "House Rules", so
the matter is not clear-cut.

http://en.wikipedia.org/wiki/Line_breaking_rules_in_East_Asian_languages
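
As a starting point for such tests, the two behaviours tim describes
above (space and cr by default; space, cr and lf in Latin1 and Latin2)
could be pinned down roughly as follows.  This is only a workspace
sketch: the blocks isBreakable and isBreakableLatin are made-up
stand-ins for #isBreakableAt:in:, not the code in the image, and they
say nothing about the East Asian rules on the page above.

```smalltalk
"Hypothetical sketch: these blocks stand in for #isBreakableAt:in:.
 They are NOT the implementations found in the image."
| isBreakable isBreakableLatin text withLf |

"Default behaviour as described in the thread: break at space or cr."
isBreakable := [:index :aString | | ch |
	ch := aString at: index.
	ch = Character space or: [ch = Character cr]].

"Latin1/Latin2 behaviour as described in the thread: also break at lf."
isBreakableLatin := [:index :aString |
	(isBreakable value: index value: aString)
		or: [(aString at: index) = Character lf]].

text := 'ab cd'.
withLf := 'a', (String with: Character lf), 'b'.

isBreakable value: 3 value: text.          "true: index 3 is the space"
isBreakable value: 1 value: text.          "false: $a is not a break character"
isBreakable value: 2 value: withLf.        "false: lf is not in the default set"
isBreakableLatin value: 2 value: withLf.   "true: lf is in the Latin set"
```

If the default answer is changed to ^false as Nicolas suggests, the
first expectation would have to move to whatever exercises the
CharacterScanner stopConditions instead.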

I'd support a rewrite of the whole thing, and would perhaps take more
of a "total rewrite" approach...

-- Yoshiki