Localization in code

Thu Sep 27 19:23:18 UTC 2001

Folks,

Goeran wrote:
> I know nothing about the scripting system, but it looks neat. I did
> notice though that this translation so far does not cover all
> texts that show up, far from it.

I have thought about this a number of times, and there's one big problem
associated with localization: you don't want it to get in your way when you
work, yet it still has to be very flexible about what you can do. I really
hate a working style where you write something like:

	self inform: MSG_VALUE_IS, value printString.

and then go into some external resource where you define

	MSG_VALUE_IS: 'The value is '.

So here's a different proposal (which reflects my latest thoughts on the
problem and I'd like to get some feedback on it). I'm deliberately not
looking at localizing strings that are created "free-style" such as the
"Welcome to" window but exclusively those that are embedded in code and
therefore - at least in theory - harder to translate. My proposal is simple:

What we need first of all is some sort of "named printf" - i.e., a print
operation that inserts formatted arguments by name rather than by position
(unlike the C printf, where printf("The value of %s is %d", "cargo", cargo)
results in "The value of cargo is 5"). The reason why I really want to have
named arguments is that different languages may place the arguments in a
different order, and it may well be that an appropriate localization of the
above would be "5 is the value of cargo". Thus I'm looking for something
that can be used along the lines of

	'The value of <name> is <value>'
		printWith: {#name. 'cargo'. #value. cargo}.

[Note: The above is ugly as hell and one of the reasons why I haven't done
anything yet - I need help and creative ideas how this can be expressed in
an easier way. Please chime in!]
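
To make the "named printf" idea a bit more concrete, here is a minimal
sketch of the substitution step. It is written in Python rather than Squeak,
purely for illustration; the name print_with, the regular expression and the
choice to leave unknown placeholders untouched are all assumptions of this
sketch, not an existing API.

```python
import re

# Hypothetical helper (not an existing API): matches placeholders like <name>.
_PLACEHOLDER = re.compile(r"<(\w+)>")

def print_with(template, args):
    """Replace each <name> in template with str(args[name])."""
    def expand(match):
        name = match.group(1)
        # Assumption: unknown placeholders are left as-is instead of failing.
        return str(args[name]) if name in args else match.group(0)
    return _PLACEHOLDER.sub(expand, template)

cargo = 5
args = {"name": "cargo", "value": cargo}

# The same arguments work with either word order, which is the whole point.
english = print_with("The value of <name> is <value>", args)
reordered = print_with("<value> is the value of <name>", args)
```

Because the arguments are looked up by name, a translator can move the
placeholders around freely without anyone touching the calling code.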

Then, using such a scheme (or a similar one) we could localize the system
based on some sort of localization browser. What I'm imagining here is that
we have a browser showing us all places where a) #printWith: and (at least
for a transition phase) b) literal strings are used. To localize a method
you'd just "rewrite" that method in the browser but rather than recompiling
it, the system would remember that "String literal A in method X with date
stamp Y was changed to B". The browser saves that information in some
localization file. When you want to switch from one language to another, the
system would load the appropriate localization file and replace all those
literals that it knows about (that is, those where the recorded stamp matches
the method's current stamp, so that we don't accidentally localize methods
that have changed in the meantime).
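
The stamp check described above could look roughly like the following. This
is again a Python sketch for illustration only: the Method and Entry
records, the "Class>>selector" keys, the class name Foo and the stamp strings
are all made up here, and a real localization file would of course need its
own format.

```python
from dataclasses import dataclass

@dataclass
class Method:
    stamp: str      # stamp of the method's last change (made-up format)
    literals: list  # string literals embedded in the method

@dataclass(frozen=True)
class Entry:
    method: str      # hypothetical key of the form "Class>>selector"
    stamp: str       # stamp the method had when it was localized
    original: str    # literal A
    translated: str  # literal B

def apply_localization(methods, entries):
    """Swap literals in place; return the entries skipped as stale."""
    stale = []
    for e in entries:
        m = methods.get(e.method)
        if m is None or m.stamp != e.stamp:
            # The method is gone or has changed since it was localized.
            stale.append(e)
            continue
        m.literals = [e.translated if lit == e.original else lit
                      for lit in m.literals]
    return stale

methods = {
    "Foo>>report": Method("ar 9/27/2001 19:23",
                          ["The value of <name> is <value>"]),
    "Foo>>warn": Method("ar 9/28/2001 10:00", ["Careful!"]),
}
entries = [
    Entry("Foo>>report", "ar 9/27/2001 19:23",
          "The value of <name> is <value>",
          "Der Wert von <name> ist <value>"),
    # Recorded against an older version of Foo>>warn, so it must be skipped.
    Entry("Foo>>warn", "ar 9/20/2001 08:15", "Careful!", "Vorsicht!"),
]
stale = apply_localization(methods, entries)
```

Here Foo>>report is translated, while Foo>>warn keeps its original literal
and its entry comes back in the stale list.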

Once an initial localization is complete we can use the localization browser
to show us the methods that have changed in the meantime (based on their
stamps) and see if we need to do something about it (more often than not we
won't have to, and based on the prior method versions we should even be able
to determine this). Which means that after an initial phase, localization is
really an incremental thing with (hopefully) little manual effort.
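
One way the incremental step might be automated, sketched in Python for
illustration: for every entry whose stamp no longer matches, check whether
the original literal is still present in the changed method. If it is, the
translation can simply be carried over with the new stamp; if not, a human
has to look at it. Note that this sketch looks at the current method rather
than comparing against prior versions, and the review_changed name, the tuple
layout and the sample data are assumptions of the sketch.

```python
def review_changed(methods, entries):
    """Split out-of-date entries into carried-over and manual ones.

    methods: {name: (stamp, [literals])}
    entries: [(name, stamp, original, translated)]
    """
    carried, manual = [], []
    for name, stamp, original, translated in entries:
        current = methods.get(name)
        if current is None:
            # The method was removed; someone has to decide what to do.
            manual.append((name, stamp, original, translated))
            continue
        current_stamp, literals = current
        if current_stamp == stamp:
            continue  # still up to date, nothing to review
        if original in literals:
            # Method changed but the literal did not: re-stamp the entry.
            carried.append((name, current_stamp, original, translated))
        else:
            manual.append((name, stamp, original, translated))
    return carried, manual

# Made-up sample data: both methods changed after they were localized.
methods = {
    "Foo>>warn": ("ar 9/28/2001 10:00", ["Careful!"]),
    "Foo>>greet": ("ar 9/28/2001 11:30", ["Hello there"]),
}
entries = [
    ("Foo>>warn", "ar 9/20/2001 08:15", "Careful!", "Vorsicht!"),
    ("Foo>>greet", "ar 9/20/2001 08:20", "Hello", "Hallo"),
]
carried, manual = review_changed(methods, entries)
```

In this example only Foo>>greet ends up needing manual attention, because
its literal itself was edited.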

What do you think?!

Cheers,
  - Andreas