[ANN] Yet Another TrueType support for Squeak

Tue May 6 18:24:51 UTC 2003

  Andreas,

> Yup. I've often been thinking about some "completely different" character
> scanning primitive which takes stuff like overhang, underhang and kerning
> pairs into account. It would be really good if this could be mocked up in
> Squeak first (screw the speed! ;-) and then, once we understand how it has
> to work, be brought down to a primitive implementation. 
> 
> Hm ... perhaps it is even worthwhile to make the actual scanning method
> depending on the font you use ... this could mean that we have different
> scanning methods based on the fonts we use and would make a nice pairing
> with the BitBlt support required... something along the lines of calling
> aFont>>scanCharactersIn: aScanner which then double dispatches back into the
> scanner (scanStrikeFont: etc). Yoshiki, what do you think? Would this make a
> nice match for some of the multi-lingual stuff?

  *Conceptually*, the way it works now is that the Scanner divides a
long string into chunks, and a call to
#displayString:on:from:to:at:kern:, or a future member of this family
of methods in StrikeFont or TTCFont, renders a chunk into a Form.  This
may sound like the opposite order from what you're proposing, but it
actually isn't.  Each of those #displayString:... methods must contain
a bit of layout logic of its own.  But I would not call this inner
thing a scanner, mainly to avoid confusion :-)

  The idea is that a "font" object takes responsibility for rendering
the graphical representation of the characters the font knows, and the
Scanner takes care of where the rendered graphics should go.

  Above this Scanner sits the scanner dispatch mechanism, based on the
script block/code page/encoding tag.  So, there are three levels: the
dispatch, the Scanner, and the rendering method inside the font.
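
  To make the division of labor concrete, here is a rough sketch, in
Python rather than Squeak, just to show the shape.  All names in it
(FixedWidthFont, display_string, the "latin"/"wide" tags) are made up
for illustration and are not the actual StrikeFont/TTCFont protocol;
the "canvas" is a plain list standing in for a Form.

```python
class FixedWidthFont:
    """Hypothetical font: renders a chunk and reports how far it advanced."""

    def __init__(self, name, char_width):
        self.name = name
        self.char_width = char_width

    def display_string(self, text, canvas, x):
        # "Render" by recording what was drawn and where.
        canvas.append((self.name, text, x))
        # The font knows its own metrics, so it returns the advance width.
        return len(text) * self.char_width


class Scanner:
    """Hypothetical scanner: picks a font per chunk and tracks the position."""

    def __init__(self, fonts_by_tag):
        self.fonts_by_tag = fonts_by_tag

    def display(self, tagged_chunks):
        canvas = []
        x = 0
        for tag, text in tagged_chunks:
            font = self.fonts_by_tag[tag]              # top level: dispatch on the tag
            x += font.display_string(text, canvas, x)  # inner level: the font renders
        return canvas


scanner = Scanner({"latin": FixedWidthFont("L", 8),
                   "wide": FixedWidthFont("W", 16)})
drawn = scanner.display([("latin", "ab"), ("wide", "日本"), ("latin", "c")])
```

  The point is only that the font never decides where a chunk goes; it
renders and reports its width, and the Scanner moves the position.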

> > > >> * Not Unicode
> > > >
> > > >   What do you mean by this?
> > > The primitives go only to 256 characters as well as the 
> > > font reading and converting it wouldn't be too hard to support
> > > more chars but this would depend on fall-back code.
> > 
> >   The hard part is the scanning rule implementation, I think.
> 
> Really? I wouldn't think so. For example, consider that we organize all
> multi-byte character fonts into some sort of "code pages" (8 or 16 bit for
> example [*]) where the primitive would only work on the set of characters
> within the "current page". If it encounters any other it merely fails. Then,
> we back up on the ST side by handling the failure code (in this situation:
> look up the new code page for the prefix in question) and resume the
> primitive. This scheme could be implemented today (just use prefix zero
> which I think is the default in your multi-byte strings anyways) and would
> allow us to organize fonts efficiently around the pages which are actually
> used (e.g., you wouldn't even have to load parts of some TTF if you don't
> use Korean or somesuch).

  And yes, the dividing-into-chunks part is done in this manner.  A
particular scanner method for a script *fails* when it encounters a
character it doesn't know, and the upper loop continues the scanning
with another appropriate scanner method.

  What I meant by the hard part was implementing the "inner" things I
mentioned above one by one and testing them.  We need to see whether
the implementation works in a context where many scripts are combined.
Maybe I shouldn't have said *hard*; it is time-consuming, though.

-- Yoshiki