Need for a message catalog framework per application

Pascal Grossé pascal.grosse at gmail.com
Wed May 17 16:51:56 UTC 2006


Hilaire Fernandes wrote:

> The message catalogues -- for translation -- as handled right now come
> in the form of a single dictionary per language in the Image. When
> developing a specific application whose interface you want translated
> into several languages, this is a very big problem.
> Indeed there is *no standard* way to transport the message catalogues
> related to your package. Right now the application source code and the
> translation are located in different places: the first in a
> Monticello package and the other in the Image. Therefore it is
> very easy to mix up or, even worse, lose part of an application's
> translation.
> 
> I read here and there about possible hacks to do it, but could we try to
> define, once and for all, *a standard way* to transport such message
> catalogues?
> 
> From my point of view -- application developer -- I will need:
> 
> - for one specific application, the message catalogues should come with
> the source code (as is done with the GNU Gettext system).
> - the message catalogues should be transportable with the source code.
> For example, I read they could be integrated into the Monticello package
> and the SqueakMap package.
> - it should be easy for translators to translate one specific
> application, then commit the translated messages back to the
> application repository.
> 
> I don't think I can define such a framework alone, but if other people
> are interested in such a solution I am willing to help.
> 

Hi,

I completely support what Hilaire said: the catalog framework is one of the
most urgent problems to solve with translations.

But while we are talking about defining a standard way, why not also address
all the other issues of the current translation scheme and do the right
thing once and for all?

1. Using the English string as the lookup key for translation can be a
problem:

- What about words that have different translations depending on context?
Of course, this is much more likely to happen with short words than with
long phrases, but there are many such words in the current message catalog.

- What about English words that are both verbs and nouns? "start",
for example, must be translated (in French) as "début" if a noun, but
as "démarrer" if a verb.

- When you change the English phrase (either for a genuine change or to
correct an error), the translations are still associated with the old
phrase, which remains in the catalog. As things stand, the catalog is
cluttered with old phrases that are no longer used, often differing only
by a letter or, worse, by some spaces.

Using special symbols as lookup keys is very tiresome (especially for
English speakers, who do not see the benefit of such complexity), but IMHO
necessary, at least for ambiguous phrases (no need to change all the keys
at once).
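
To make the idea concrete, here is a minimal sketch of a catalog whose keys
carry an optional context. It is written in Python rather than Smalltalk, and
the catalog layout and the translate function are hypothetical, not anything
that exists in the image. Phrases without a context keep working, so keys
could be migrated gradually.

```python
# Hypothetical catalog: keys are (context, source phrase) pairs.
# A context of None stands for "no disambiguation needed".
FRENCH = {
    ("noun", "start"): "début",
    ("verb", "start"): "démarrer",
}


def translate(catalog, phrase, context=None):
    """Look up phrase under the given context.

    Falls back to the untranslated phrase when the catalog has no entry.
    """
    return catalog.get((context, phrase), phrase)


print(translate(FRENCH, "start", context="noun"))
print(translate(FRENCH, "start", context="verb"))
print(translate(FRENCH, "stop"))  # no entry, so the source phrase is returned
```

Only the ambiguous phrases need a context; everything else can stay keyed by
the plain English string.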

2. Constructing phrases programmatically is not very translation-friendly.
Words are not placed in the same order in every language. I know it is
already possible to use strings such as "The file contains {1} lines" in
Smalltalk, but maybe we should generalize the usage of such constructs in
the image. And the next point cannot be solved simply by this neat trick.
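
As an illustration of why numbered placeholders help, here is a small Python
sketch (not Squeak code; the fill helper and the example phrases are made up
for this purpose). The translator can move {1} and {2} around freely, which
is impossible when the phrase is built by concatenation.

```python
import re


def fill(template, *args):
    """Replace {1}, {2}, ... in template with the matching 1-based argument."""
    return re.sub(
        r"\{(\d+)\}",
        lambda match: str(args[int(match.group(1)) - 1]),
        template,
    )


# The same two arguments appear in a different order in each language.
ENGLISH = "{1} deleted {2} files"
FRENCH = "{2} fichiers ont été supprimés par {1}"

print(fill(ENGLISH, "Alice", 3))
print(fill(FRENCH, "Alice", 3))
```

Note that the word "files" is still hard-coded as a plural here.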


3. Plural handling can be very tricky. Not all languages use the
singular/plural distinction the way English does. Look for example at
Russian (or other Slavic languages), where plural categories are more
numerous and subtle than just "1"->singular, ">1"->plural. Have a look at
http://lists.kde.org/?l=kde-i18n-doc&m=98623103801458&w=2 for an overview
of how difficult it is to do the right thing.
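
A rough Python sketch of the problem, assuming each language supplies a
function that maps a number to the index of the plural form to use (the
approach GNU Gettext takes). The function names and the tiny catalog are
hypothetical; the Russian rule is the one commonly used with Gettext.

```python
def plural_index_english(n):
    """English: one form for exactly 1, another for everything else."""
    return 0 if n == 1 else 1


def plural_index_russian(n):
    """Russian: three forms, chosen by the last one or two digits."""
    if n % 10 == 1 and n % 100 != 11:
        return 0  # 1, 21, 31, ... but not 11
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1  # 2-4, 22-24, ... but not 12-14
    return 2      # everything else: 0, 5-20, 25-30, ...


# Hypothetical catalog entry: all plural forms of one phrase, in index order.
ENGLISH_FILE = ("{} file", "{} files")
RUSSIAN_FILE = ("{} файл", "{} файла", "{} файлов")


def pluralize(forms, index_fn, n):
    """Pick the plural form for n and insert the number."""
    return forms[index_fn(n)].format(n)


for n in (1, 2, 5, 11, 21, 22):
    print(pluralize(ENGLISH_FILE, plural_index_english, n),
          "|", pluralize(RUSSIAN_FILE, plural_index_russian, n))
```

The point is that the number of forms and the selection rule both have to
come from the language, not from the application code.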


I know those changes would be very large and far-reaching, but not doing
i18n right is a major obstacle to the adoption of Squeak in
non-English-speaking countries.

Of course, I'm offering my help to Hilaire and others who would like to join.

Pascal



