Document Crafting, Objectively

Marcel Weiher marcel at metaobject.com
Tue Mar 12 12:37:58 UTC 2002


On Monday, March 11, 2002, at 03:50 PM, Hannes Hirzel wrote:

>> At 12:57 PM +0100 3/11/02, goran.hultgren at bluefish.se wrote:
>>> I haven't followed the thread but I have tried Lout as an alternative to
>>> LaTeX:
>>>
>>> -TeX algorithms rewritten in C (and reportedly even better).
>>> -Much simpler setup and IMHO to use overall.
>>> -Very small to download.
>>> -Produced PDF directly, no problem to use TrueType fonts.
>>> -Fast.
>>> -Has support for graphing, tables etc.
>>>
>>> Worth a look, I used it on NT4 to write a report for a company and I
>>> learned it very fast.
>>>
>>> regards, Göran
>
> Couldn't we make better use of the already existing
> Encapsulated PostScript (EPS) export support in Squeak?

Yes, sort of.

What is currently missing is some of the great typography that is in 
TeX.  For starters, the great line-breaking algorithm, the encoded 
know-how on what sort of space can stretch how much, how hanging 
punctuation works, how punctuation interacts with spacing (different 
rules for different type-setting traditions) etc.  I haven't even 
touched on Math.
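
To give a rough idea of what choosing breaks for a whole paragraph at once 
means, here is a minimal sketch in Python.  It is not TeX's actual 
algorithm: there is no stretchable glue, no hyphenation and no penalties.  
The cost function (squared leftover space per line, last line free) and 
the name break_lines are assumptions made purely for illustration.

```python
def break_lines(words, width):
    """Return lines chosen to minimise the total squared slack.

    Assumptions: widths are counted in characters, one space separates
    words, and the last line of the paragraph costs nothing.
    """
    n = len(words)
    inf = float("inf")
    # best[i] = minimal cost of setting words[i:]
    # nxt[i]  = index (exclusive) where the line starting at word i ends
    best = [inf] * (n + 1)
    nxt = [n] * (n + 1)
    best[n] = 0.0
    for i in range(n - 1, -1, -1):
        length = -1  # cancels the space counted before the first word
        for j in range(i, n):
            length += 1 + len(words[j])
            if length > width:
                # A single over-long word still gets a line of its own.
                if j == i and best[j + 1] < best[i]:
                    best[i], nxt[i] = best[j + 1], j + 1
                break
            slack = width - length
            cost = (0 if j == n - 1 else slack ** 2) + best[j + 1]
            if cost < best[i]:
                best[i], nxt[i] = cost, j + 1
    lines, i = [], 0
    while i < n:
        lines.append(" ".join(words[i:nxt[i]]))
        i = nxt[i]
    return lines
```

With the words "aaa bb cc ddddd" and a width of 6, a greedy breaker would 
set "aaa bb" and leave "cc" alone on a line; the sketch above prefers 
"aaa" / "bb cc" / "ddddd", which spreads the leftover space more evenly.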

Before you can even think about doing that, you need, at the very least, 
decent font support.  Bitmap fonts don't cut the custard, nor do integer 
character metrics.

The current Postscript output routines punt by just substituting 
'reasonably similar' (using the word 'reasonable' in the loosest 
possible manner) Postscript fonts, hacking those up with additional 
characters and then applying some nasty Postscript trickery to fudge the 
spacing so it doesn't look completely awful.  It's been a while, but I 
think I actually pretty much reformat fully justified lines of text 
(maybe centered ones as well) in Postscript, and just let non-justified 
lines of text do what they will.

At the very minimum, what we need is font support that doesn't just 
import fonts as bitmaps, but keeps the connection to the original fonts 
and their metrics.  If you want your screen output to look good, you use 
the integer metrics, if you want printed output to look good, you use 
the fractional metrics.  If you want *both* to look good, you have two 
choices for how you deal with text output to the screen:  you can either 
use fractional metrics with anti-aliasing and sub-pixel positioning, or 
keep track of both sets of metrics, using integer metrics for character 
layout, and adjusting spacing to account for the error relative to the 
true metrics.

You should probably also do something about character encodings.

Once you've got that licked, you have a solid basis for implementing 
device-independent output that will work without Postscript trickery.  
On top of that, you can then implement wonderful algorithms such as the 
ones TeX is using.


> For most people a "presentation" is just an OrderedCollection of
> slides (projects in Squeak) with an index so you can directly jump to a
> specific slide.
>
> A nice thing to have would be a Portable Document Format (PDF) export
> accessible from InternalThreadNavigationMorph to output a thread as a 
> PDF.

All you'd need is to (a) hook that up to the same logic that does 
multi-page Postscript printing of book-morphs and (b) implement a 
PDFCanvas similar to the PostscriptCanvas.
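
The idea behind (b), reduced to a sketch in Python: the drawing code talks 
to one canvas protocol, and each back end emits its own operators.  The 
classes below only borrow the names PostscriptCanvas and PDFCanvas; they 
are hypothetical stand-ins and do not reflect the actual Squeak classes 
or their methods.

```python
class Canvas:
    """Hypothetical minimal canvas protocol; subclasses emit operators."""

    def __init__(self):
        self.ops = []

    def line(self, x1, y1, x2, y2):
        raise NotImplementedError

    def output(self):
        return "\n".join(self.ops)


class PostscriptCanvas(Canvas):
    def line(self, x1, y1, x2, y2):
        self.ops.append(f"newpath {x1} {y1} moveto {x2} {y2} lineto stroke")


class PDFCanvas(Canvas):
    def line(self, x1, y1, x2, y2):
        # PDF uses short operator names: m = moveto, l = lineto, S = stroke.
        self.ops.append(f"{x1} {y1} m {x2} {y2} l S")


def draw_page(canvas):
    """Drawing code only knows the canvas protocol, not the back end."""
    canvas.line(0, 0, 100, 100)
    canvas.line(0, 100, 100, 0)
    return canvas.output()
```

Calling draw_page(PDFCanvas()) and draw_page(PostscriptCanvas()) runs the 
same drawing code and yields page-description text for either format.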

> PDF is AFAIK Postscript minus some things plus some other
> things like hyperlinking between pages.

Sort of.  PDF has a lot of additional structural information, a more 
compactly encoded operator set, but lacks a programming language.  So it 
has a large and growing list of built-in operators and additional 
features... ;-)

>  A partial solution would probably
> just be to wrap up the EPS output differently.

The PostscriptCanvas has actually been written with alternate output 
methods (such as PDF) in mind.  I've written PDF generators in 
Objective-C before, it really isn't that hard.  However, keep in mind 
the issues pointed out above.  Right now, you can't easily duplicate 
what the Postscript hacks are doing Squeak-side, because the actual 
Postscript font-metric information it needs to reformat lines is 
missing.  Getting the special characters into PDF is also a wee bit more 
difficult.

> But to do that one should
>
> 1) have an understanding of how the current Postscript export encoders work
>    (hint: Use Robert Hirschfeld's AspectS to discover what's really
>    happening)

Or ask me ;-)

> 2) know the difference between EPS and PDF code-wise.

Me too ;-)

Some of the main differences:

PDF has a parseable syntactic structure of 'objects' such as pages, 
fonts, etc.  These objects are cross-referenced, so you need to keep 
track of their precise byte offset in the file.  Actual content (such as 
the graphics that make up a page) is in special PDF 'streams', which 
encode arbitrary binary data within the overall structural framework of a 
PDF file.  Streams are usually compressed, filtered, sliced and diced...
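
The bookkeeping this implies can be sketched in a few lines of Python.  
This is a minimal illustration, not a complete PDF writer: build_pdf is a 
hypothetical name, object 1 is assumed to be the document catalog, and 
the output has not been checked against a validator.

```python
def build_pdf(objects):
    """Serialise numbered objects followed by a cross-reference table.

    `objects` is a list of bytes bodies; object number = index + 1, and
    object 1 is assumed to be the document catalog.
    Returns the file contents and the recorded byte offsets.
    """
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # exact byte position of this object
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 heads the free list
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # each entry is 20 bytes long
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out), offsets
```

A one-page, blank document would pass three bodies: a catalog 
(<< /Type /Catalog /Pages 2 0 R >>), a page tree 
(<< /Type /Pages /Kids [3 0 R] /Count 1 >>) and a page 
(<< /Type /Page /Parent 2 0 R /Resources << >> /MediaBox [0 0 612 792] >>).  
Any change to an object's length moves every later offset, which is why 
the offsets are recorded while writing rather than computed up front.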

A page-contents stream contains actual drawing instructions, encoded in 
a textual format very similar to Postscript (stack oriented, same 
imaging model).  However, one thing to be aware of is that some 
operators that seem to do the same have subtly different semantics!
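
As an illustration, here is a small Python sketch that wraps a few 
drawing operators in a stream object, optionally compressed.  The names 
stream_object and PAGE_CONTENT are hypothetical; the operators used 
(w, m, l, S) are standard PDF path operators, and the compression is 
plain zlib data declared as /FlateDecode.

```python
import zlib


def stream_object(content, compress=True):
    """Wrap drawing operators in a PDF stream object body (bytes)."""
    data = content.encode("ascii")
    entries = b""
    if compress:
        data = zlib.compress(data)
        entries = b" /Filter /FlateDecode"
    # /Length counts the bytes between 'stream\n' and '\nendstream'.
    header = b"<< /Length %d%s >>\n" % (len(data), entries)
    return header + b"stream\n" + data + b"\nendstream"


# Operands come first, then the operator, as in Postscript.
PAGE_CONTENT = "\n".join([
    "2 w",         # set the line width to 2 units
    "72 72 m",     # begin a path at (72, 72)
    "540 720 l",   # add a straight line to (540, 720)
    "S",           # stroke the path
])
```

The result of stream_object(PAGE_CONTENT) is the body of one indirect 
object, which a page would reference through its /Contents entry.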

> 3) Some additional time

ASMOP ;-)

Marcel


--
Marcel Weiher				Metaobject Software Technologies
marcel at metaobject.com		www.metaobject.com
Metaprogramming for the Graphic Arts.   HOM, IDEAs, MetaAd etc.



