[HACK] Unicode keyboard input and fonts

Bert Freudenberg bert at impara.de
Tue Jun 13 20:38:06 UTC 2006


We are "unicodized", sort-of ;-)

My changeset is rather trivial. It works without a VM modification on
the Mac, which provides Unicode keycodes in addition to the old
MacRoman ones. Someone should try it on a Windows VM; I think it does
the same. I hadn't seen your work before, but my changes to
TTFontReader and TTCFont are quite minimal. I changed the glyph cache
to use a 64K WeakArray instead of 256 glyphs, and an array that big
is way too inefficient, I guess. The clipboard does not work; it's
not Unicode-aware.
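
For anyone wondering what the glyph cache change amounts to, here is a
rough sketch of the idea. It is in Python rather than Squeak, and it is
not the code from the changeset: Glyph, GlyphCache and render are
made-up names, and a weak-valued dictionary stands in for the 64K
WeakArray. The point is only that the cache is keyed by code point
(0..65535) rather than by an 8-bit character, and that cached glyphs
can be reclaimed once nothing else holds on to them.

```python
import weakref


class Glyph:
    """Hypothetical rendered glyph; a real one would hold a bitmap."""

    def __init__(self, code_point):
        self.code_point = code_point


class GlyphCache:
    """Cache keyed by code point; entries are held only weakly."""

    MAX_CODE_POINT = 0xFFFF  # 64K slots, as opposed to the old 256

    def __init__(self, render):
        self._render = render  # callable: code point -> Glyph (assumed)
        self._glyphs = weakref.WeakValueDictionary()

    def glyph_for(self, code_point):
        if not 0 <= code_point <= self.MAX_CODE_POINT:
            raise ValueError("code point outside the cached range")
        glyph = self._glyphs.get(code_point)
        if glyph is None:
            # Cache miss (or the glyph was already collected): render again.
            glyph = self._render(code_point)
            self._glyphs[code_point] = glyph
        return glyph

    def __len__(self):
        return len(self._glyphs)
```

As long as something still references a glyph, glyph_for returns the
same object. Once the last reference is gone, the entry may be dropped
and the glyph is rendered again on next use.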

I have no idea how the Japanese folks do their input. I think they
use specific code tables, not Unicode; their m17n stuff supports
encodings other than Unicode, too. I also don't know which fonts they
use, or how to use TTCFontSets, the TTFontReader subclasses, etc.

The advantage of using Unicode is that it's nicely generic for many
languages: I can type Japanese, Cyrillic, and Greek text immediately,
as the screenshot demonstrates. Maybe I should've posted the
changeset header before:

"Change Set:		unicodeHack-bf
Date:			13 June 2006
Author:			Bert Freudenberg

Enables Unicode input on VMs supporting the utf32 field in stroke
events. Patches TTFontReader to read the Unicode glyph map and retain
all chars. Patches TTCFont to allow more than 256 characters.

You need to install Truetype fonts after loading this Changeset. This  
one is free with many glyphs: http://www.nongnu.org/freefont/"
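
To make the "utf32 field" part concrete, here is a small sketch of the
decision involved, again in Python and not taken from the changeset.
That the Unicode value sits in the sixth slot of the event buffer
follows the description quoted further down. That the legacy character
code sits in the third slot, that a zero means "no Unicode value", the
fallback to MacRoman decoding, and the name char_from_key_event are
all assumptions made for illustration.

```python
def char_from_key_event(evt_buf):
    """Return the typed character for a key event buffer.

    evt_buf is a sequence whose third element is the legacy 8-bit
    (MacRoman) character code and whose sixth element is the UTF-32
    code point, or 0 if the VM does not provide one (assumed layout).
    """
    char_code = evt_buf[2]   # "third" in 1-based Smalltalk terms
    utf32 = evt_buf[5]       # "sixth" in 1-based Smalltalk terms
    if utf32:
        # The VM supplied a Unicode code point: use it directly.
        return chr(utf32)
    # Otherwise fall back to interpreting the old code as MacRoman.
    return bytes([char_code]).decode("mac_roman")
```

With a MacRoman code of 16rD2 and no UTF-32 value, the fallback yields
the same left double quote (16r201C) that a Unicode-aware VM would
report directly.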

- Bert -

On 13.06.2006 at 20:47, danil osipchuk wrote:

> Here is another one:
> http://map.squeak.org/account/package/2c1a81e1-4e86-40c8-90b5-824adc4263c5
> (Russian language support using it:
> http://minnow.cc.gatech.edu/squeak/5773 )
>
> I've been wondering for quite a long time: are we 'unicodized' or
> not yet? Is the current state of m17n kind of final and I just don't
> get how to use it, or should something else be done? What do other
> non-Latin folks think about it? Does the fact that nobody but me
> complains mean that all is OK?
> A lot of questions, and this is not sarcasm.
>
> As for me, the stock image and VMs are definitely not enabled for
> Russian.
>
> Danil
>
>> I just hacked a bit of Unicode support into the truetype fonts.  
>> See attachment.
>>
>> - Bert -
>>
>> On 12.06.2006 at 15:35, Bert Freudenberg wrote:
>>
>>> Jim,
>>>
>>> Squeak switched from MacRoman encoding to Unicode in 3.8. The
>>> 8-bit subset of Unicode, ISO-8859-1 (a.k.a. Latin1), is what can
>>> be reached by old-style keyboard mappings and fonts. While
>>> MacRoman *did* have English typographic quotes, Latin1 only has
>>> French ones («...»):
>>>
>>> http://en.wikipedia.org/wiki/MacRoman
>>>
>>> http://en.wikipedia.org/wiki/Latin1
>>>
>>> Indeed, I can type French quotes fine in a 3.8 Squeak.
>>>
>>> Note that Windows puts English typographic quotes into the
>>> reserved slots of Latin1. That makes people use them, which in
>>> turn makes browser developers show them as if they were valid
>>> Latin1 chars, which leads to severe confusion:
>>>
>>> http://en.wikipedia.org/wiki/Windows-1252
>>>
>>> While we could employ a similar hack, the *proper* way for us
>>> would be to use Unicode text, which Squeak should be able to do
>>> since 3.8 thanks to the fine Japanese folks:
>>>
>>> WideString from: #(16r2018 16r2019 16r201C 16r201D)
>>>
>>> This is displayed as '????', because we lack unicode fonts  
>>> containing the right glyphs.
>>>
>>> The Mac VM already provides Unicode keyboard input, as the 6th
>>> element of the event buffer. It's unused so far. You can use it
>>> by inserting the following line into
>>> MacUnicodeInputInterpreter>>nextCharFrom:firstEvt:
>>>
>>> evtBuf fourth = EventKeyChar ifTrue: [^Unicode value: evtBuf sixth].
>>>
>>> Leave the rest of that method in place, even though it is  
>>> incorrect (#macToSqueak should produce Unicode chars nowadays,  
>>> but for historical reasons still produces an 8-bit mapping).
>>>
>>> Also, to make Morphic use that interpreter temporarily:
>>>
>>> LanguageEnvironment classPool at: #InputInterpreterClass put:  
>>> MacUnicodeInputInterpreter.
>>> ActiveHand clearKeyboardInterpreter.
>>>
>>> With that, Option-[ actually inserts the character valued  
>>> 16r201C, though it still is displayed as "?". You can verify that  
>>> using
>>>
>>> $? asUnicode hex
>>>
>>> (where the "?" is typed as "Option-[") which gives '16r201C'.
>>>
>>> There are free bitmapped (BDF) and vector (TTF) fonts, but I am  
>>> not sure if anyone has tried to make them available in Squeak, yet:
>>>
>>> http://freeunifont.sourceforge.net/
>>> http://www.nongnu.org/freefont/
>>>
>>> Also, I'm unsure about the unicode support of VMs other than John's.
>>>
>>> - Bert -
>>>
>>> On 11.06.2006 at 07:34, John M McIntosh wrote:
>>>
>>>> Works fine in a 3.5-5180 image, likely an artifact of:
>>>>
>>>> a) switch from original apple fonts to accufonts
>>>> b) multiple language support.
>>>>
>>>> likely someone else on the list can explain which of these two  
>>>> items is the culprit.
>>>>
>>>> I'll note if you type option-] in a workspace you see nothing,  
>>>> but if you copy/paste that nothing to TextEdit why you get the  
>>>> expected characters.
>>>>
>>>> On 10-Jun-06, at 4:01 PM, Jim Rosenberg wrote:
>>>>
>>>>>> You should indicate which macintosh VM and version you are using.
>>>>>
>>>>> VM is 3.8.6Beta6.app.
>>>>>
>>>>> Squeak is 3.8 update #6665.
>>>>>
>>>>>> You need to give a bit more information about the keyboard you
>>>>>> have set in os-x preferences, which keystrokes you are using,
>>>>>> what you expect that keystroke to give you say in TextEdit, and
>>>>>> what happens in Squeak.
>>>>>
>>>>> I'm showing US keyboard script Roman.
>>>>>
>>>>> The "normal" keys for typographic quotes I expect are:
>>>>>
>>>>> Double-quotes: option-[, shift-option-[
>>>>>
>>>>> Single-quotes: option-], shift-option-]
>>>>>
>>>>> They work fine in every app except Squeak. (They work fine in  
>>>>> TextEdit.)
>>>>>
>>>>> When I bring down the character palette, highlight the right  
>>>>> single quote and click insert, Squeak shows it as simply a  
>>>>> question-mark. When I type shift-option-], instead of getting  
>>>>> the right single quote, I'm getting the character at position  
>>>>> 92 in the MorphicFontEditor.
>>>>>
>>>>> It sounds like my keyboard mapping has gotten confused somehow  
>>>>> -- if you can straighten me out I'd be much obliged!
>>>>>
>>>>> -Thanks, Jim




