Multilingual Squeak

Joachim Durchholz joachim.durchholz at munich.netsurf.de
Thu Mar 25 06:21:30 UTC 1999


Russell Allen wrote:
> 
> If you have a look at the babelfish translator that altavista uses
> (http://babelfish.alstavista.com) you will see the limits and
> possibilities of this approach...
> 
> In Squeak the strict "Object message: Object" grammar certainly
> simplifies the parsing of Squeak sentences. However, a translating
> browser would still have to cope with translating all of the symbols
> in the system (class names, messages, temp variables etc), and would
> be better if it could make at least a credible stab at translating
> comments by the user.

Forget this as fast as you can. I have worked with automatic
translators, and it just won't work; translating natural language is
actually *easier* than translating program texts (which are, from a
linguistic perspective, rudimentary natural language).

Long explanation:
Parsing is not the real problem in modern automatic translation. The
main problem is (and has always been from the beginning) disambiguation.
Automatic translators need the context to determine which of a dozen
possible meanings is the right one. Smalltalk messages provide less of
that context, so the problems are worse.
It *might* be feasible to translate the comments on a semi-automatic
basis. They are close to optimal for such a task: they (usually) use a
limited word set (cutting down on dictionary size), and they (usually)
have a limited universe of discourse (cutting down on the number of
ambiguous meanings to consider).
Automated translation technology isn't ready for prime time yet. Both
the EU and the UN spend millions trying to get a working system, and the
best they have got so far is a system that makes a professional
translator more productive. The gain was a factor of about two the last
time I checked (which was about 5 years ago); I've been following the
proceedings in the area of automated translation, and I think they
managed to raise the factor a bit, but no substantial improvements seem
to have occurred.

> However, it would be a great solution to the problem of a
> multi-language image - browse and make changes in whatever language
> you choose! :)

Not Technically Feasible. (Unfortunately.) Human language is just too
irregular to be accessible to volunteer effort; you need real money to
research problems of this complexity.
This doesn't mean that you need real money for implementing the
algorithms once they are known. And I sincerely hope that I'll see that
day!

Regards,
Joachim
-- 
Please don't send unsolicited ads.




