Thank you; you should put this on a wiki or a blog somewhere. If you don't have time, I would gladly do that for you (and of course put your name on it.) This is some really good writing, and I really appreciate the work. It's easy to lose track of historical context amongst all of the ones and zeros.

Thank you.

On Sun, Aug 30, 2009 at 3:43 AM, K. K. Subramaniam <subbukk@gmail.com> wrote:
On Saturday 29 Aug 2009 8:48:18 pm Juan Vuletich wrote:
> I'll have time for this on Monday. I guess the first thing to do is to
> understand what is the correct font for '12 point'. I mean, the correct
> ascent/descent and line grid, and the correct shape of the glyphs. Some
> specification of mean kerning / length of strings would be nice too.
In traditional typography, the point referred to the size of the metal block
used to carve shapes (glyphs).  The size of a Point depended on
regional/political affiliations :-). The Anglo-American point was about
1/72.27 of an inch. Not all shapes would be contained within their metal block
base. Some (see g in the picture
http://ilovetypography.com/2008/03/23/sunday-type-bright-type/ ) would also
extend (kern) beyond the base. String length would be
the sum of metal block widths. The sum of shape widths in a run of characters
could differ from the string length due to kerns.

Digital typography uses co-ordinate grids instead of a metal block. It defines
Point (aka DTP point or PostScript point) as 1/72 of an inch. Grid size
varies between fonts; TTF commonly uses grids of 512, 1024 or 2048. The point
size used in font names is the 'design size', i.e. a modern 12pt TTF contains glyphs
drawn inside a 2048x2048 grid that will *look like* a 12pt metal typeface when
scaled to various digital canvases like a 96dpi screen or 1200 dpi printer.
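
As a rough sketch of the arithmetic in the paragraph above: since a DTP point
is 1/72 of an inch, a point size maps to a pixel size as size * dpi / 72, and a
coordinate on the em grid scales by that pixel size divided by the grid size.
The function names and the 2048 default are assumptions for illustration, and
no hinting is applied.

```python
def pixels_per_em(point_size, dpi):
    """Size of the em-square in device pixels (1 DTP point = 1/72 inch)."""
    return point_size * dpi / 72.0


def font_units_to_pixels(units, point_size, dpi, units_per_em=2048):
    """Scale a distance on the em grid to device pixels (unhinted)."""
    return units * pixels_per_em(point_size, dpi) / units_per_em


if __name__ == "__main__":
    print(pixels_per_em(12, 96))               # 12pt on a 96 dpi screen
    print(pixels_per_em(12, 1200))             # 12pt on a 1200 dpi printer
    print(font_units_to_pixels(1024, 12, 96))  # half an em on the screen
```

So the same 12pt glyph occupies a 16-pixel em on a 96 dpi screen and a
200-pixel em on a 1200 dpi printer.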

Though we continue using the term 'glyph' (carving), images are computed on
the grid using a 'pictal' process and then scaled to the target canvas. A
TrueType font is actually a bytecode program interpreted by a font engine
(e.g. FreeType) to scale glyphs at run time. Internally, glyphs are defined in
terms of lines and curves in a grid (called em-square) of size 512, 1024 or
2048. Given a canvas (dpi, depth), a glyph code and a point size "hint", the
font engine will scale the glyph and tweak it using 'hints' expressed in
bytecodes. For instance, stretching a '(' vertically may scale only the middle
part and leave the tips alone. The o in "xo" will slightly overshoot x for
good visual flow.

However, aesthetic rendering requires a context and a run of characters. The
spacing between one character and the next is dictated by kern and direction.
Sometimes, adjacent glyphs may coalesce into a more compact or even different
shape (e.g. ffi) called a ligature. Runs are handled by a text rendering engine
(e.g. Pango, Qt, ghostscript). Of course, the ligated glyph shape should exist
in the font.
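
A minimal sketch of what handling a run involves, assuming made-up metrics:
substitute ligatures where the font has the ligated glyph, then sum the
advance widths and add the kern for each adjacent pair. The tables, glyph
names and numbers below are hypothetical; a real engine reads them from the
font and also deals with direction, which this ignores.

```python
# Hypothetical metrics in font units; real values come from the font.
ADVANCE = {"f": 600, "i": 500, "A": 1300, "V": 1300, "f_f_i": 1500}
KERN = {("A", "V"): -150}       # adjustment between adjacent glyph pairs
LIGATURES = {"ffi": "f_f_i"}    # character sequence -> ligature glyph name


def shape(text):
    """Greedy left-to-right ligature substitution; returns glyph names."""
    glyphs, i = [], 0
    while i < len(text):
        for seq in sorted(LIGATURES, key=len, reverse=True):
            # Only ligate if the ligated glyph actually exists in the font.
            if text.startswith(seq, i) and LIGATURES[seq] in ADVANCE:
                glyphs.append(LIGATURES[seq])
                i += len(seq)
                break
        else:
            glyphs.append(text[i])
            i += 1
    return glyphs


def run_width(text):
    """Sum of advance widths plus pairwise kerning for the shaped run."""
    glyphs = shape(text)
    width = sum(ADVANCE[g] for g in glyphs)
    width += sum(KERN.get(pair, 0) for pair in zip(glyphs, glyphs[1:]))
    return width
```

With these numbers, "AV" is narrower than the sum of its two boxes because of
the kern, and "ffi" collapses to a single glyph.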

Squeak's text printing algorithms only consider boxes and kerns - no
ligatures, no hyphenation, no direction. Building a true multilingual layout
engine is a non-trivial task. Mac and Wintel each have only one shaping engine
while Linux has multiple options (Pango, Qt, ICU, m17n, Graphite). Pango and
Qt are widely used. Squeak should be able to detect and load shaping engines
on the fly on Linux or allow command line options to pick a wrapper plugin.

There is also the blue plane approach proposed in the FoNC paper - treat text
boxes like graphic objects. A font engine is just a specialized vector graphic
editor that avoids intensive geometric computations by caching precomputed
shapes in a glyph table. Allow graphic objects to have class level editors.
Then when a glyph is missing for a character code, open a glyph editor and
allow a new shape to be defined on the fly or imported from a public font
definition file. Beats displaying $?.
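
The glyph-table idea can be sketched as a cache with a fallback hook. This is
only an illustration: GlyphTable, rasterize and on_missing are hypothetical
names, and the "editor" here is a plain callback standing in for whatever
interactive glyph editor would be opened.

```python
class GlyphTable:
    """Caches computed glyph shapes; asks a fallback when one is missing."""

    def __init__(self, rasterize, on_missing):
        self._rasterize = rasterize    # hypothetical: code -> shape or None
        self._on_missing = on_missing  # hypothetical: code -> new shape
        self._cache = {}

    def glyph_for(self, code):
        if code not in self._cache:
            shape = self._rasterize(code)
            if shape is None:
                # No glyph in the font: let the user define or import one
                # instead of falling back to a placeholder like $?.
                shape = self._on_missing(code)
            self._cache[code] = shape
        return self._cache[code]


if __name__ == "__main__":
    font = {65: "outline of A"}
    table = GlyphTable(font.get, lambda code: "user-drawn glyph %d" % code)
    print(table.glyph_for(65))
    print(table.glyph_for(0x2603))
```

Each shape is computed or drawn once and served from the table afterwards.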

Subbu




--
Ron