[etoys-dev] Re: [squeak-dev] Localization for Squeak products

Bert Freudenberg bert at freudenbergs.de
Tue Jan 19 09:50:30 UTC 2010


On 19.01.2010, at 09:14, Yoshiki Ohshima wrote:
> 
> At Mon, 18 Jan 2010 23:20:17 -0800,
> Andreas Raab wrote:
>> 
>> Hi Yoshiki -
>> 
>> A couple more questions:
>>>  The created .pot file is uploaded to an online translation site that
>>> uses Pootle.  The volunteers provide the translations, and at
>>> release build time we collect the translations and compile them to
>>> .mo files.  GetTextTranslator, a subclass of
>>> NaturalLanguageTranslator, opens the .mo file and looks up the
>>> translation of given string from it.
>> 
>> Can you point me to the site where the translations are hosted?
> 
> It is here:
> 
> http://translate.sugarlabs.org/
> 
>>> Also, a phrase can be used in different ways, and not being able to
>>> translate it differently is a problem.  We thought of several ways to
>>> solve this problem...  One was to modify these words in the source
>>> code (e.g. a phrase like "start" to "start (verb)" and "start (noun)")
>>> but it would have resulted in invalidating a lot of volunteer work and
>>> having to provide the English translation.  If Pootle were more
>>> flexible, we could have added an annotation to each phrase to indicate
>>> its use (that would still have required source changes), but it didn't
>>> happen.  Splitting the phrases into different text domains was another
>>> possibility, and it is good for other reasons (korakurider even has
>>> the code), but it didn't happen for various reasons.
>> 
>> I'm not sure what "splitting this into different text domains" means in 
>> this context. How much of a problem is the issue of translating phrases 
>> differently in practice? Does it happen to pretty much everyone right 
>> away or is that an occasional gotcha that people just work around the 
>> best they can?
> 
>  The text domain is a feature of gettext.  We still put all phrases
> into one domain, but if there had been a good boundary along which to
> split the phrases, it would have been useful for easing the Pootle
> server workload, improving volunteers' perception, and implicitly
> disambiguating some phrases.
> 
>  But this kind of disambiguation was not a big problem, it seems,
> compared to the lack of inflection support.  Workarounds included
> changing one occurrence of 'start' to ' start ' and providing different
> translations for 'start' and ' start ', etc., but real support for
> plurals (and gender... which is much harder) would have been good.

See here for an example of context use:

http://www.gnu.org/software/gettext/manual/gettext.html#Contexts

In fact, you should skim the whole gettext manual. It's the de facto standard for localization in the open-source world.
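
To make the idea of contexts concrete, here is a minimal sketch in Python (not Squeak code, and not something GetTextTranslator does today). GNU gettext stores a context entry in the compiled catalog as the context, an EOT character (\x04), and the msgid joined together. The catalog contents and the function names below are made up for illustration.

```python
# Separator GNU gettext places between msgctxt and msgid in a compiled catalog.
CONTEXT_SEP = "\x04"

# Hypothetical German catalog: the single English source string "start"
# gets two entries that differ only in their context.
CATALOG = {
    "verb" + CONTEXT_SEP + "start": "starten",
    "noun" + CONTEXT_SEP + "start": "Start",
    "stop": "anhalten",
}


def pgettext(context, msgid):
    """Look up msgid within a context; fall back to the untranslated string."""
    return CATALOG.get(context + CONTEXT_SEP + msgid, msgid)


def gettext(msgid):
    """Plain lookup without a context; falls back to the untranslated string."""
    return CATALOG.get(msgid, msgid)
```

With this, pgettext("verb", "start") and pgettext("noun", "start") can return different translations without changing the English source string to something like "start (verb)".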

>>>  (I thought we started out from 4,000 or such phrases for Etoys, but
>>> now it appears to have 27,000 or such.  Not sure what it means...)
>> 
>> It's probably just more coverage.
> 
>  IIRC, it was around 10,000 or so at one point.  Since the
> extraction has been automatic, it is not clear to me why it increased...

It's even simpler than that - some tools count strings, some words. Etoys currently has 4412 translatable strings with 27454 words in them. 
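
As a rough illustration of the two ways of counting, the Python sketch below counts both strings and words in a tiny, made-up .pot fragment. It only handles single-line msgid entries, which is an assumption made to keep it short; real .pot files also contain multi-line entries.

```python
import re

# Made-up .pot fragment; the first entry is the usual empty header msgid.
POT_SAMPLE = '''
msgid ""
msgstr ""

msgid "start"
msgstr ""

msgid "Choose a new name for this object"
msgstr ""
'''


def count_strings_and_words(pot_text):
    """Return (number of translatable strings, total words in them)."""
    msgids = re.findall(r'^msgid "(.*)"$', pot_text, flags=re.MULTILINE)
    msgids = [m for m in msgids if m]  # skip the empty header entry
    words = sum(len(m.split()) for m in msgids)
    return len(msgids), words
```

For the sample above this gives 2 strings and 8 words, the same kind of gap as between 4412 and 27454.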

>>>> * How does localized deployment work with Etoys? I.e., what are the 
>>>> options for providing localized downloads vs. downloading all supported 
>>>> locales and switching dynamically upon startup?
>>> 
>>>  In general, we bundle these .mo files in the single release.  Upon
>>> startup, GetTextTranslator scans the specified directory for available
>>> languages and shows them in the menu.  (And tries to switch to the
>>> system language.)
>> 
>> What are .mo files? How do they relate to .pot files?
> 
> .pot is the template file in text format.  A volunteer edits the file
> with a generic or special editor for editing .pot to make a .po file,
> which is in the same text format but with translation interleaved.
> Then a command compiles a .po file to a binary file (.mo file) for
> faster access.

The "msgfmt" program compiles these. For example, it puts a table at the file's beginning for rapid lookup of strings. We don't read the whole file at startup, but load strings on demand.
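
To show why the table makes on-demand lookup cheap, here is a minimal Python sketch of the .mo layout: a 28-byte header, a table of (length, offset) pairs for the original strings sorted by msgid, a matching table for the translations, and then the NUL-terminated strings. It is not the GetTextTranslator code. The helper names build_mo and lookup are invented, the optional hash table that msgfmt normally writes is left out, and only the little-endian variant is handled.

```python
import struct

MO_MAGIC = 0x950412DE  # magic number of a little-endian GNU .mo file


def build_mo(catalog):
    """Pack a dict of bytes -> bytes into the .mo layout (no hash table)."""
    keys = sorted(catalog)  # originals must be sorted for binary search
    n = len(keys)
    orig_table = 28                  # right after the 7-word header
    trans_table = orig_table + 8 * n
    pos = trans_table + 8 * n        # where the string data starts
    entries, blob = [], b""
    for text in keys + [catalog[k] for k in keys]:
        entries.append((len(text), pos + len(blob)))
        blob += text + b"\0"
    out = struct.pack("<7I", MO_MAGIC, 0, n, orig_table, trans_table, 0, pos)
    for length, offset in entries:
        out += struct.pack("<2I", length, offset)
    return out + blob


def lookup(data, msgid):
    """Binary-search the original-string table, reading only what is needed."""
    magic, _rev, n, orig_table, trans_table = struct.unpack_from("<5I", data, 0)
    if magic != MO_MAGIC:
        raise ValueError("not a little-endian .mo file")
    lo, hi = 0, n - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        length, offset = struct.unpack_from("<2I", data, orig_table + 8 * mid)
        candidate = data[offset:offset + length]
        if candidate == msgid:
            tlen, toff = struct.unpack_from("<2I", data, trans_table + 8 * mid)
            return data[toff:toff + tlen]
        if candidate < msgid:
            lo = mid + 1
        else:
            hi = mid - 1
    return msgid  # untranslated strings fall back to the original


# Hypothetical three-entry catalog used for demonstration.
MO = build_mo({b"start": b"starten", b"stop": b"anhalten", b"go": b"los"})
```

Each lookup touches only a handful of table entries plus the one translation it returns, so nothing forces the whole file to be read at startup.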

>>>> * Generally speaking, how do people feel about localization in Etoys? Is 
>>>> it considered to work well, or is it considered to be a painful process? 
>>>> Are there any obvious alternatives one should look at?
>>> 
>>>  This is subjective, but I thought the process overall worked pretty
>>> well, given that the volunteers are all over the place.
>> 
>> That's kind of what I was asking for. Put differently, if you had the 
>> choice between the current approach and some alternative, would you drop 
>> the current version no questions asked, or would you likely say "you 
>> know what, it's worked for us". From the sound of it it's the
>> latter.
> 
>  Yes.  For Etoys, it would have been better if we had incorporated
> korakurider's other attempts to disambiguate phrases earlier, but even
> without them it provided a usable system.  I had some reservations
> earlier about going to gettext; I didn't see getting translations
> from people who don't know Etoys as a good idea, but the scalability of
> the workflow paid off.
> 
>  Again for Telespace, the translation will be done by people who know
> the application, but not necessarily people who can open the Squeak
> System Browser and look at the code.  In that setting, going to an
> external tool makes sense and then gettext is pretty solid.
> 
>  Supporting more gettext features such as plural support etc. would
> be a plus.
> 
>> Thanks this is all very useful info!
> 
>  No problem!

Have to agree with Yoshiki, overall it worked pretty well for us.

One thing we tried and abandoned, but might reinstate, is splitting the single large Etoys .pot file into several. Support for that already exists: a string is first looked up in the domain file named after the Squeak package of the method sending #translated. If it is not found there, the default translation file is used (in our case, the single Etoys file). This allows per-package translation files, and IIRC it worked fine for Hilaire's DrGeoII.

It would be very beneficial for Etoys because it would allow translators to prioritize their work. We could have a file with all the tile translations, one with the UI strings easily visible in Etoys, and then "the rest" which covers all Smalltalk tools and hidden dialogs etc. The single file we have is too daunting.
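
The per-package lookup with fallback described above can be sketched roughly like this. It is Python rather than Squeak, the catalogs are made-up examples, and the function name translated only mirrors the #translated selector; the real logic lives in the translator classes.

```python
# Hypothetical catalogs, one per text domain (named after Squeak packages).
DOMAINS = {
    "DrGeoII": {"Create point": "Punkt erzeugen"},
    "Etoys": {"start": "starten", "Create point": "Punkt erstellen"},
}

# The single large file that acts as the fallback for every package.
DEFAULT_DOMAIN = "Etoys"


def translated(string, package):
    """Try the sender's package domain first, then the default domain."""
    for domain in (package, DEFAULT_DOMAIN):
        catalog = DOMAINS.get(domain, {})
        if string in catalog:
            return catalog[string]
    return string  # no translation anywhere: show the original
```

A package without its own file simply falls through to the default one, so the big file can be split up piece by piece without breaking existing translations.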

- Bert -




