internationalisation of squeak: Bitmap font SofijaUC

Boris Gaertner Boris.Gaertner at gmx.net
Wed Dec 22 23:14:02 UTC 2004


"Shalabh Raizada" <shalabhraizada2004 at yahoo.co.in> wrote:


> Hi,
>
> Thanks for your pointers. I tried to test the Unicode features. It worked
> perfectly well.
Nice to read that.
> However, when I tried to do it for Devanagari/Telugu fonts I
> faced problems.
Yes, that is not a surprise. The internationalization features
in Squeak 3.8 are currently restricted to scripts that do
not require script shaping. Script shaping is required
for all Indic scripts and for Arabic and Syriac (and possibly
also for Tibetan and Mongolian).
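
To illustrate one small part of what script shaping involves: in Devanagari
the VOWEL SIGN I (16r093F) is stored after its consonant but drawn to the
left of it. The following is only a simplified sketch of that reordering
step on an array of codepoints. It is not code from Squeak 3.8 or from any
Unicode package, the variable names are made up for the illustration, and
a real shaper must also handle conjuncts and whole consonant clusters:

```smalltalk
| logical visual |
"Logical (stored) order: KA (16r0915) followed by VOWEL SIGN I (16r093F)."
logical := #(16r0915 16r093F).
visual := logical copy.
"Simplification: move the vowel sign in front of the single
 codepoint that precedes it. Clusters are ignored here."
2 to: visual size do: [:i |
    (visual at: i) = 16r093F
        ifTrue: [visual swap: i with: i - 1]].
"visual now holds the vowel sign first, then KA."
Transcript show: visual printString; cr.
```

A renderer that simply draws the stored codepoints from left to right
skips this step, which is one reason why the output looks wrong for
Indic text.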

> Can you or Boris send me the right code or tell me the
> changes for Devanagari/Telugu required in the code used for rendering
> Xiaoing.txt?
That code is still not written, as far as I understand the situation.
A year ago I had an experimental Unicode package for
Squeak 3.4 where I tried to do something for Devanagari and
other Indic scripts. The results were not very encouraging.


> Also, I do not really know how to read the font ASCII codes
> (i.e. 16rFF etc.) for Unicode glyphs. Could you help me?

I do not really understand that question. Do you mean
that you have trouble finding out that 16r0924
is the Unicode codepoint that is associated with the
Devanagari letter TA?

16rFF (or x00FF or u00FF, there are various notations in
use; 16rXXXX is Smalltalk notation) is the codepoint for the
LATIN SMALL LETTER Y WITH DIAERESIS.
The Devanagari letters are allocated to the codepoints
from 16r0901 to 16r097F.
You can use the Unicode Glyph Browser for Squeak 3.8
to find out such things, but if you do not have the glyph
charts from http://www.unicode.org you should download
at least the parts that describe the scripts you want to use.
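
The notation can also be checked in a workspace. This is only a minimal
sketch that uses plain integer arithmetic (printString, printStringBase:
and between:and:). It does not touch fonts or rendering, and the variable
names are chosen just for the illustration:

```smalltalk
| ta inDevanagariRange |
"16r0924 is Smalltalk radix notation for hexadecimal 0924."
ta := 16r0924.
"Range as given above: 16r0901 to 16r097F."
inDevanagariRange := ta between: 16r0901 and: 16r097F.
Transcript show: ta printString; cr.               "displays 2340"
Transcript show: (ta printStringBase: 16); cr.     "displays 924"
Transcript show: 16rFF printString; cr.            "displays 255"
Transcript show: inDevanagariRange printString; cr. "displays true"
```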

For those who are not familiar with the Indic scripts, I
recommend this page: http://tdil.mit.gov.in/uni.htm
On this page you find seven editions of the periodical
'Vishwa Bharat' in PDF format. These seven brochures
contain detailed information about the implementation
of Unicode support for all Brahmi scripts. The problem
of script shaping, which is quite difficult to understand
for Europeans, is also explained in helpful detail.

I will try to review my work for Squeak 3.4 as soon as
possible, but for now I cannot give you a time schedule.


Attached you find an archive that contains a single
text file: ScriptShaping.txt.
This file should be read with Squeak 3.8 with
the Unicode font SofijaUC installed.
To read the file, please copy it into your
working directory and evaluate the following
code in a workspace:

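| f c txt |
"Open the file in the working directory; f is nil if the file is missing."
f := FileDirectory default oldFileOrNoneNamed: 'ScriptShaping.txt'.
c := f contents.
f close.
"Show the whole text in the first font of the SofijaUC text style."
txt := Text string: c
            attribute:
               (TextFontReference
                    toFont: ((TextConstants at: #SofijaUC) fontArray first)).
Workspace new
   contents: txt;
   openLabel: 'Script Shaping'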

The file explains and demonstrates what we can and what we
cannot do at this very moment for the Indic scripts.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ScriptShaping.zip
Type: application/octet-stream
Size: 2009 bytes
Desc: not available
Url : http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20041223/5bd3b62c/ScriptShaping.obj