Multilingual Squeak

Ivan Tomek ivan.tomek at acadiau.ca
Thu Mar 25 20:31:15 UTC 1999


I don't think that the problem is as hard if suboptimal translation is acceptable.
Two kinds of translation would be required: comments, and Smalltalk words (selectors, class names, etc.).

Comments: These are 'natural language', but even a rather poor
translation would be better than none. If the original comment is in
Chinese or Portuguese, even a very approximate translation to English
is more useful (for me) than no translation at all.

Smalltalk words: There are not that many of these in the library, and
they are context-free, so they can be translated by look-up in a
dictionary built cooperatively by the user and an automated
language-dictionary lookup. This would not require a large effort on
the user's part.
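
To make the look-up idea concrete, here is a minimal sketch. It is written in Python rather than Smalltalk purely for illustration, and everything in it is an assumption: the names USER_DICTIONARY, LANGUAGE_DICTIONARY, translate_word and translate_identifier are hypothetical, and the German glosses are made-up sample entries. It splits an identifier into words, prefers the user's entry over the general dictionary, and leaves unknown words and the colons of keyword selectors untouched.

```python
import re

# Illustrative entries only; the German glosses are assumptions for this sketch.
LANGUAGE_DICTIONARY = {
    "ordered": "bestellte",   # generic sense: "ordered" as in a purchase
    "first": "erstes",
    "remove": "entferne",
}
USER_DICTIONARY = {
    "ordered": "geordnete",   # user's correction: "ordered" as in sorted
    "collection": "sammlung",
}

# A word is a capitalised or lowercase run, an acronym, or a run of digits.
WORD = re.compile(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+")


def translate_word(word):
    """Translate one word, preferring the user's entry; keep unknown words."""
    key = word.lower()
    translation = USER_DICTIONARY.get(key) or LANGUAGE_DICTIONARY.get(key)
    if translation is None:
        return word
    # Keep the case of the first letter, since it is significant in Smalltalk
    # (class names start with an uppercase letter, selectors with lowercase).
    if word[0].isupper():
        return translation[0].upper() + translation[1:]
    return translation


def translate_identifier(identifier):
    """Translate each word of a class name or selector; colons pass through."""
    return WORD.sub(lambda match: translate_word(match.group()), identifier)


print(translate_identifier("OrderedCollection"))
print(translate_identifier("removeFirst:"))
print(translate_identifier("at:put:"))
```

The user's dictionary is consulted first so that a single correction (here, the intended sense of "ordered") overrides the general dictionary everywhere the word appears.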

Date forwarded: 	25 Mar 1999 19:08:57 -0000
Date sent:      	Thu, 25 Mar 1999 07:21:30 +0100
From:           	Joachim Durchholz <joachim.durchholz at munich.netsurf.de>
To:             	squeak at cs.uiuc.edu
Subject:        	Re: Multilingual Squeak
Forwarded by:   	squeak at cs.uiuc.edu
Send reply to:  	squeak at cs.uiuc.edu

> Russell Allen wrote:
> > 
> > If you have a look at the babelfish translator that altavista uses
> > (http://babelfish.alstavista.com) you will see the limits and
> > possibilities of this approach...
> > 
> > In Squeak the strict "Object message: Object" grammar certainly
> > simplifies the parsing of Squeak sentences. However, a translating
> > browser would still have to cope with translating all of the symbols
> > in the system (class names, messages, temp variables etc), and would
> > be better if it could make at least a credible stab at translating
> > comments by the user.
> 
> Forget this as fast as you can. I have worked with automatic
> translators, and it just won't work; translating natural language is
> actually *easier* than translating program texts (which are, from a
> linguistic perspective, rudimentary natural language).
> 
> Long explanation:
> Parsing is not the real problem in modern automatic translation. The
> main problem is (and has always been from the beginning) disambiguation.
> Automatic translators need the context to determine which of a dozen
> possible meanings is the right one. Smalltalk messages provide less of
> that context, so the problems are worse.
> It *might* be feasible to translate the comments on a semi-automatic
> basis. They are near to optimal for such a task: They (usually) use a
> limited word set (cutting down on dictionary size), and they (usually)
> have a limited universe of discourse (cutting down on the number of
> ambiguous meanings to consider).
> Automated translation technology isn't ready for prime time yet. The EU
> as well as the UN spend millions to get a working system, and the best
> that they got is a system that makes a professional translator more
> productive. The gain was a factor of about two last time I checked
> (which was about 5 years ago); I've been monitoring the proceedings in
> the area of automated translation, and I think they managed to up the
> factor a bit, but no substantial improvements seem to have occurred.
> 
> > However, it would be a great solution to the problem of a
> > multi-language image - browse and make changes in whatever language
> > you choose! :)
> 
> Not Technically Feasible. (Unfortunately.) Human language is just too
> irregular to be accessible to volunteer effort; you need real money to
> research problems of this complexity.
> This doesn't mean that you need real money for implementing the
> algorithms once they are known. And I sincerely hope that I'll see that
> day!
> 
> Regards,
> Joachim
> -- 
> Please don't send unsolicited ads.
> 



Ivan Tomek,

Jodrey School of Computer Science
Acadia University
Nova Scotia, Canada

fax: (902) 585-1067
voice: (902) 585-1467


Life would be so much easier if we could just look at the source code.

"Elegance: The Mona Lisa has it, and so does the binary search algorithm. The Golden Gate
      Bridge has it, as do the World Wide Web, Visicalc, Smalltalk and the U.S. Constitution.
                 Public-key cryptography and Michelangelo's Pieta also have it."
                                 - Gary H. Anthes, Computer World

 "Beauty is more important in computing than anywhere else in technology because software is so
                complicated. Beauty is the ultimate defense against complexity." 
                      - David Gelernter, Professor of Computer Science, Yale University.




