Hi guys,
I'm starting to work on the translation support for GSoC. Since I know nothing about how it is implemented and I'm already behind on the timeline, I wanted to ask if you have any pointers on what/where I should look. Thanks in advance!
Richo
Hi
For most strings you can just add #translated after the string:
'Hello Etoy' translated
Karl
On Wed, Apr 28, 2010 at 4:27 AM, Ricardo Moran richi.moran@gmail.com wrote:
[...]
etoys-dev mailing list etoys-dev@squeakland.org http://lists.squeakland.org/mailman/listinfo/etoys-dev
Hi Ricardo. I hadn't noticed that this requirement is covered by the GSoC project. Here are some starting points.
1. Background
In the Etoys image, translation support based on gettext emulation has already been implemented. (You need to know the basic concepts of gettext; see http://www.gnu.org/software/gettext/)
+ POT extraction: see the class comment of GetTextExporter2
+ Translation engine: see #translated and NaturalLanguageTranslator/GetTextTranslator
2. Problem
In the current implementation, all strings belong to the text domain "etoys". As a result, a huge number of strings (>4000) end up in one big PO/POT file, and translators have difficulty translating it. We need to divide it into a few text domains of appropriate size, with inclusion criteria and a packaging scheme.
Actually, the translation framework has a feature to assign class categories to a text domain (review the class side of TextDomainManager). But this has not been used because:
+ We haven't agreed on what division would be good.
+ There is a risk that the framework cannot determine the correct text domain in some cases (see #translated).
+ So we couldn't implement the division within the timeline of the past project.
3. Related problem
SQ-139 in the Etoys bug tracker.
/Korakurider
On Wed, Apr 28, 2010 at 11:27 AM, Ricardo Moran richi.moran@gmail.com wrote:
[...]
Great! Thanks for the explanation and the pointers, I'll look into it. I see that a lot of code is already there, that's a huge relief for me :)
Best regards Richo
On Wed, Apr 28, 2010 at 3:34 AM, Korakurider korakurider@gmail.com wrote:
[...]
Hi, I've been playing with the translation framework and I did the simplest thing I could think of: splitting the translations based on the package of each class. I'm sure you already know all of this, but in the latest etoys-dev image I get the following numbers (package name - number of translations):
Morphic 984
MorphicExtras 397
Etoys 358
Connectors 174
System 130
Tools 105
WS 80
Sound 70
Movies 60
Sugar 56
GStreamer 43
Multilingual 42
VideoForSqueak 41
Nebraska 29
Protocols 26
Flash 20
Network 18
Kernel 17
ST80 17
Graphics 15
Compression 8
Collections 7
Files 6
Monticello 4
BroomMorphs 3
Balloon 3
UserObjects 1
DAVServerDirectory 1
TrueType 1
SMLoader 1
FSM 1
One possible (and easy) division could be something like this:
Morphic 984
MorphicExtras 397
Etoys 358
Connectors 174
System 130
Tools 105
others 570
I know this is not the best criterion, but IMHO it is better than nothing. Maybe we should keep splitting the Morphic package, I don't know. I have some questions though: what would be an appropriate size for each po file, and how many files are desirable?
Cheers Richo
P.S.: I attached the script I used to get the above-mentioned data (I may have made a mistake).
On Wed, Apr 28, 2010 at 9:21 PM, Ricardo Moran richi.moran@gmail.com wrote:
[...]
I think we do want to do something along those lines, yes.
Your expression misses some phrases, in particular the Etoys tile inscriptions:
==============
"Count tile phrases"
domains := Dictionary new.
GetTextExporter2 new appendVocabularies: domains.
phrases := Set new.
domains do: [:translations | phrases addAll: translations keys].
phrases size "==> 748 phrases"
==============
These tile phrases should be a separate pot IMHO, because those are the most important ones to translate.
Other than that, splitting by category seems reasonable. And since development is organized by packages now, maybe we should just have one po file per top-level category? Even if there is only 1 phrase in it?
I'm sure Korakurider has thought about that. Let's hear him :)
Here's my code to count phrases per package:
==============
"Count non-tile phrases per category"
domains := Dictionary new.
GetTextExporter2 new
    appendStringReceivers: #translated into: domains;
    appendStringReceivers: #translatedNoop into: domains.
phrases := Dictionary new.
domains do: [:translations |
    translations keysAndValuesDo: [:phrase :mrefs |
        mrefs do: [:mref |
            category := (SystemOrganization categoryOfElement: mref classSymbol) copyUpTo: $-.
            (phrases at: category ifAbsentPut: [Set new]) add: phrase]]].
categories := Bag new.
phrases keysAndValuesDo: [:cat :strings |
    categories add: cat withOccurrences: strings size].
categories sortedCounts
==============
There are 4251 non-tile phrases (sum of the below). Adding the 748 tile phrases and subtracting some duplicates this is in the ballpark of the 4412 phrases on pootle.
1427 Morphic
751 MorphicExtras
506 Etoys
262 System
252 Connectors
201 Tools
182 Sound
70 Movies
67 ST80
65 Protocols
62 Multilingual
60 Nebraska
55 Sugar
48 VideoForSqueak
38 WS
30 Graphics
28 Network
25 GStreamer
24 Kernel
20 Flash
19 Files
13 Compression
11 Collections
10 FSM
8 Balloon
7 BroomMorphs
5 Monticello
2 SMLoader
1 DAVServerDirectory
1 UserObjects
1 TrueType
If we wanted to split Morphic further, these would be the numbers (but I don't think we should):
569 Morphic-Kernel
251 Morphic-Worlds
161 Morphic-Mentoring
135 Morphic-Basic
100 Morphic-Games
67 Morphic-Widgets
48 Morphic-Experimental
47 Morphic-Windows
19 Morphic-Demo
18 Morphic-Scripting Tiles
13 Morphic-Components
13 Morphic-Support
12 Morphic-Text Support
8 Morphic-TrueType
7 Morphic-Menus
7 Morphic-Pluggable Widgets
6 Morphic-Books
5 Morphic-PDA
1 Morphic-PartsBin
There is a slight problem with extension methods (methods defined in *categories), #translated currently would look for those in the wrong package:
==============
"Count non-tile phrases that are in extension methods"
domains := Dictionary new.
GetTextExporter2 new
    appendStringReceivers: #translated into: domains;
    appendStringReceivers: #translatedNoop into: domains.
extensionMethods := Set new.
MCWorkingCopy allManagers do: [:wc |
    extensionMethods addAll: wc packageInfo extensionMethods].
phrases := Dictionary new.
domains do: [:translations |
    translations keysAndValuesDo: [:phrase :mrefs |
        mrefs do: [:mref |
            (extensionMethods includes: mref) ifTrue: [
                category := (mref category copyUpTo: $-) asLowercase.
                (phrases at: category ifAbsentPut: [Set new]) add: phrase]]]].
categories := Bag new.
phrases keysAndValuesDo: [:cat :strings |
    categories add: cat withOccurrences: strings size].
categories sortedCounts
==============
199 *etoys
83 *morphicextras
18 *connectors
8 *morphic
5 *sound
1 *pango
We need to make #translated deal with this. I can think of a simple but inefficient way to do it - maybe it wouldn't hurt that much?
- Bert -
On 02.05.2010, at 00:16, Bert Freudenberg wrote:
There is a slight problem with extension methods (methods defined in *categories), #translated currently would look for those in the wrong package:
We need to make #translated deal with this. I can think of a simple but inefficient way to do it - maybe it wouldn't hurt that much?
I wonder if method properties could come to the rescue here. We could simply store the translation domain as property of the compiled method itself. We have about 1500 methods that send #translated, so that seems not to be bad at all space-wise, and should perform very well.
Btw, I think in the current implementation there is even a problem without extension methods. It only looks at the class of the sender of #translated. But the method could actually be defined in a superclass, which might be in a different package. That's simple to fix though, instead of "context receiver class" use "context mclass". This looks up the actual class where the method is defined.
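To make the difference concrete, here is a small sketch. It assumes two hypothetical classes: Base, in one package, defines #greeting, and Derived, in another package, inherits it without overriding. The second fragment shows what the sender's context would answer inside #translated; it is not the real #translated implementation.

```smalltalk
"Hypothetical method, defined in Base and inherited by Derived."
Base>>greeting
    ^ 'Hello' translated

"Inside String>>translated, while evaluating 'Derived new greeting':"
| context |
context := thisContext sender.   "the activation of Base>>greeting"
context receiver class.          "Derived, whose package may be the wrong one"
context mclass.                  "Base, the class where the method is defined"
```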
Also, I wonder if we should use PackageInfo instead of lists of categories in TextDomainManager. The extension method categories often differ in upper/lower case from the class categories. In fact, it is possible to define packages that do not rely on category names at all.
And Ted: was Text>>translated only meant for translating Quickguides? Can we remove that now, since we are not using the gettext translations for the guides?
In any case, the properties idea is growing on me. The only problem seems to be to keep the property up-to-date. Could be part of the release process, though something automatic would be nicer for development. But the runtime benefits might be worth it.
- Bert -
2010/5/2 Bert Freudenberg bert@freudenbergs.de
I wonder if method properties could come to the rescue here. We could simply store the translation domain as property of the compiled method itself. We have about 1500 methods that send #translated, so that seems not to be bad at all space-wise, and should perform very well.
What are method properties? I don't understand how this will work.
Hilaire
On 03.05.2010, at 12:58, Hilaire Fernandes hilaire.fernandes@edu.ge.ch wrote:
What are method properties? I don't understand how this will work.
Hilaire
I'm just talking about an implementation detail. You do not have to worry about it, just use #translated in your code as usual.
"Method properties" are additional state you can attach to a compiled method. Similar to pragmas, but invisible. They are not created by tags in the source code while compiling, but are attached later.
I'm just proposing to use them to cache the translation domain for a method. Figuring this out properly at runtime is expensive (the code needs to work its way from the compiled method to the package it belongs to).
With this I think we could even get rid of the class registration? The translation domain would just be the package name of the method that sent #translated. How does that sound?
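A minimal sketch of such a cache, under stated assumptions: the property key #textDomain and the selectors #cachedDomainOfMethod: and #computeDomainOfMethod: are hypothetical names, and the image's CompiledMethod is assumed to understand #propertyValueAt:ifAbsent: and #propertyValueAt:put:, as current Squeak does.

```smalltalk
"Sketch only. #computeDomainOfMethod: stands for the slow lookup that
 works its way from the compiled method to its package."
TextDomainManager class>>cachedDomainOfMethod: aCompiledMethod
    ^ aCompiledMethod
        propertyValueAt: #textDomain
        ifAbsent: [| domain |
            domain := self computeDomainOfMethod: aCompiledMethod.
            "Remember the answer on the method itself for the next send."
            aCompiledMethod propertyValueAt: #textDomain put: domain.
            domain]
```

With something like this, #translated would ask for the domain of "thisContext sender method" and pay for the package lookup only the first time a given method is seen.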
- Bert -
Bert Freudenberg wrote:
I'm just talking about an implementation detail. You do not have to worry about it, just use #translated in your code as usual.
But I am interested in the implementation details. I would also like to see it ported to Pharo.
"Method properties" are additional state you can attach to a compiled method. Similar to pragmas, but invisible. They are not created by tags in the source code while compiling, but are attached later.
OK, thanks for explaining.
I'm just proposing to use them to cache the translation domain for a method.
OK, what exactly is a translation domain? Is it the text domain?
Figuring this out properly at runtime is expensive (the code needs to work its way from the compiled method to the package it belongs to).
Right. Is it really expensive? I can't evaluate.
With this I think we could even get rid of the class registration? The translation domain would just be the package name of the method that sent #translated. How does that sound?
On Tue, May 4, 2010 at 2:26 PM, Hilaire Fernandes hilaire.fernandes@edu.ge.ch wrote:
Is it really expensive? I can't evaluate.
PackageOrganizer default packageOfMethod: aMethodReference
On my machine it took 25 milliseconds on average (tested with all the methods that contain phrases to translate).
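For reference, a measurement along these lines could look like the sketch below. It reuses the phrase collection from Bert's counting script earlier in the thread and is not necessarily the script behind the 25 ms figure; it also assumes at least one method reference is found.

```smalltalk
| domains mrefs ms |
domains := Dictionary new.
GetTextExporter2 new
    appendStringReceivers: #translated into: domains;
    appendStringReceivers: #translatedNoop into: domains.
"Collect every method reference that carries a phrase to translate."
mrefs := Set new.
domains do: [:translations |
    translations do: [:refs | mrefs addAll: refs]].
"Time the package lookup over all of them."
ms := Time millisecondsToRun: [
    mrefs do: [:mref | PackageOrganizer default packageOfMethod: mref]].
"Average milliseconds per lookup."
(ms / mrefs size) asFloat
```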
On 04.05.2010, at 10:39, Ricardo Moran wrote:
PackageOrganizer default packageOfMethod: aMethodReference
On my machine it took 25 milliseconds on average (tested with all the methods that contain phrases to translate).
But in #translated you do not have the method reference yet.
- Bert -
Oh, yes that's right. In #translated we should do something like this, right?
thisContext sender method methodReference
On Tue, May 4, 2010 at 2:41 PM, Bert Freudenberg bert@freudenbergs.de wrote:
[...]
On 04.05.2010, at 10:48, Ricardo Moran wrote:
Oh, yes that's right. In #translated we should do something like this, right?
thisContext sender method methodReference
Yes. Maybe it is not so expensive nowadays. But look at the #methodReference method, it uses #who, which used to iterate over all classes to find the method ... But it seems Eliot's changes do already cache the class as well as the method selector.
Also, performance testing should be done on the XO-1 which is the slowest machine we need to support.
- Bert -
No, #methodReference is not slow. But the PackageOrganizer is really expensive. I simply added the following line to #translated:
PackageOrganizer default packageOfMethod: thisContext sender method methodReference.
After that, opening the World menu took a few seconds (and opening a viewer is impossible).
Richo
On Tue, May 4, 2010 at 3:21 PM, Bert Freudenberg bert@freudenbergs.de wrote:
[...]
Good test :)
Yes, #methodReference is now thousands of times faster than in Etoys 4.0. Literally.
But iterating over the packages to find the right one is expensive. So we should use a method property as cache, and also pre-fill that cache in the release image (otherwise the first-time opening would be too slow).
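Pre-filling could be sketched roughly as follows. The assumptions are: #textDomain is a hypothetical property key, the package name is used as the domain (the proposal in this thread, not existing behavior), every method found belongs to some package, and CompiledMethod understands #propertyValueAt:put: in this image.

```smalltalk
"Sketch: store the text domain on every method that sends #translated,
 to be run once while preparing the release image."
(SystemNavigation default allCallsOn: #translated) do: [:mref |
    | method package |
    method := mref actualClass compiledMethodAt: mref methodSymbol.
    "This is the expensive lookup; done here so it is not paid at runtime."
    package := PackageOrganizer default packageOfMethod: mref.
    method propertyValueAt: #textDomain put: package packageName]
```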
- Bert -
On 04.05.2010, at 11:30, Ricardo Moran wrote:
[...]
On 04.05.2010, at 10:26, Hilaire Fernandes wrote:
Bert Freudenberg wrote:
I'm just talking about an implementation detail. You do not have to worry about it, just use #translated in your code as usual.
But I am interested in the implementation details. I would also like to see it ported to Pharo.
Ah, okay.
"Method properties" are additional state you can attach to a compiled method. Similar to pragmas, but invisible. They are not created by tags in the source code while compiling, but are attached later.
OK, thanks for explaining.
I'm just proposing to use them to cache the translation domain for a method.
OK, what exactly is a translation domain? Is it the text domain?
Yes - this determines in which mo file this is looked up.
Figuring this out properly at runtime is expensive (the code needs to work its way from the compiled method to the package it belongs to).
Right. Is it really expensive? I can't evaluate.
We should time building a viewer before and after the changes to see how much the changes affect this. I think a viewer is the most expensive thing to construct.
- Bert -
Hi Bert,
On Sun, 2 May 2010, Bert Freudenberg wrote:
These tile phrases should be a separate pot IMHO, because those are the most important ones to translate.
+1 - this would help a lot, especially with the duplicates occurring now.
There are 4251 non-tile phrases (sum of the below). Adding the 748 tile phrases and subtracting some duplicates this is in the ballpark of the 4412 phrases on pootle.
1427 Morphic 751 MorphicExtras 506 Etoys
[...]
For translating, I'd prefer to have all 748 tile phrases in the Etoys pot. Not all of the duplicates in English will necessarily be duplicates in other languages as well. Tiles are primarily for kids, the other phrases rather for average computer users. These two groups might speak different tongues.
Markus
On 02.05.2010, at 22:29, Markus Schlager wrote:
For translating, I'd prefer to have all 748 tile phrases in the Etoys pot. Not all of the duplicates in English will necessarily be duplicates in other languages as well. Tiles are primarily for kids, the other phrases rather for average computer users. These two groups might speak different tongues.
Ah, that brings up an interesting issue. Previously we called the only translation file "etoys.po". After splitting we would have an "Etoys.po" with the 506 phrases from the Etoys category I mentioned above, and a different one with the 748 tile phrases that we might call "Tiles.po". But it would be much more obvious if the essential translations were in a file called "Etoys.po", not "Tiles.po".
The only simple solution I can see is to add the tile phrases to the phrases in the Etoys category. So we would have 748 tiles + 506 other phrases in it. Would that be okay, or do we need to figure out a way to split these?
- Bert -
On Mon, 3 May 2010, Bert Freudenberg wrote:
Ah, that brings up an interesting issue. Previously we called the only translation file "etoys.po". After splitting we would have an "Etoys.po" with the 506 phrases from the Etoys category I mentioned above, and a different one with the 748 tile phrases that we might call "Tiles.po". But it would be much more obvious if the essential translations were in a file called "Etoys.po", not "Tiles.po".
The only simple solution I can see is to add the tile phrases to the phrases in the Etoys category. So we would have 748 tiles + 506 other phrases in it. Would that be okay, or do we need to figure out a way to split these?
At least it would be a lot better than it is now. A way to distinguish the phrases could be the message context. I'm not sure whether Pootle supports this already, though. My preferred contexts: 'tile', 'menu', 'balloon help'.
Markus
Thank you for the correction of the code. I was missing the most important translations!
Correct me if I'm wrong, please. You are proposing:
1) to change the TextDomainManager to link the domain with the package instead of the class category,
2) to store this information in the method properties,
3) and to make #translated aware of all these changes.
Right?
Richo
On Sat, May 1, 2010 at 7:16 PM, Bert Freudenberg bert@freudenbergs.dewrote:
I think we do want to do something along those lines, yes.
Your expression misses some phrases, in particular the Etoys tile inscriptions:
"Count tile phrases" domains := Dictionary new. GetTextExporter2 new appendVocabularies: domains. phrases := Set new. domains do: [:translations | phrases addAll: translations keys]. phrases size "==> 748 phrases"
These tile phrases should be a separate pot IMHO, because those are the most important ones to translate.
Other than that, splitting by category seems reasonable. And since development is organized by packages now, maybe we should just have one po file per top-level category? Even if there is only 1 phrase in it?
I'm sure Korakurider has thought about that. Let's hear him :)
Here's my code to count phrases per package:
============== "Count non-tile phrases per category" domains := Dictionary new. GetTextExporter2 new appendStringReceivers: #translated into: domains; appendStringReceivers: #translatedNoop into: domains. phrases := Dictionary new. domains do: [:translations | translations keysAndValuesDo: [:phrase :mrefs | mrefs do: [:mref | category := (SystemOrganization categoryOfElement: mref classSymbol) copyUpTo: $-. (phrases at: category ifAbsentPut: [Set new]) add: phrase]]]. categories := Bag new. phrases keysAndValuesDo: [:cat :strings | categories add: cat withOccurrences: strings size]. categories sortedCounts ==============
There are 4251 non-tile phrases (sum of the below). Adding the 748 tile phrases and subtracting some duplicates this is in the ballpark of the 4412 phrases on pootle.
1427 Morphic 751 MorphicExtras 506 Etoys 262 System 252 Connectors 201 Tools 182 Sound 70 Movies 67 ST80 65 Protocols 62 Multilingual 60 Nebraska 55 Sugar 48 VideoForSqueak 38 WS 30 Graphics 28 Network 25 GStreamer 24 Kernel 20 Flash 19 Files 13 Compression 11 Collections 10 FSM 8 Balloon 7 BroomMorphs 5 Monticello 2 SMLoader 1 DAVServerDirectory 1 UserObjects 1 TrueType
If we wanted to split Morphic further, these would be the numbers (but I don't think we should):
569 Morphic-Kernel 251 Morphic-Worlds 161 Morphic-Mentoring 135 Morphic-Basic 100 Morphic-Games 67 Morphic-Widgets 48 Morphic-Experimental 47 Morphic-Windows 19 Morphic-Demo 18 Morphic-Scripting Tiles 13 Morphic-Components 13 Morphic-Support 12 Morphic-Text Support 8 Morphic-TrueType 7 Morphic-Menus 7 Morphic-Pluggable Widgets 6 Morphic-Books 5 Morphic-PDA 1 Morphic-PartsBin
There is a slight problem with extension methods (methods defined in *categories), #translated currently would look for those in the wrong package:
============== "Count non-tile phrases that are in extension methods" domains := Dictionary new. GetTextExporter2 new appendStringReceivers: #translated into: domains; appendStringReceivers: #translatedNoop into: domains. extensionMethods := Set new. MCWorkingCopy allManagers do: [:wc | extensionMethods addAll: wc packageInfo extensionMethods]. phrases := Dictionary new. domains do: [:translations | translations keysAndValuesDo: [:phrase :mrefs | mrefs do: [:mref | (extensionMethods includes: mref) ifTrue: [ category := (mref category copyUpTo: $-) asLowercase. (phrases at: category ifAbsentPut: [Set new]) add: phrase]]]]. categories := Bag new. phrases keysAndValuesDo: [:cat :strings | categories add: cat withOccurrences: strings size]. categories sortedCounts ==============
199 *etoys 83 *morphicextras 18 *connectors 8 *morphic 5 *sound 1 *pango
We need to make #translated deal with this. I can think of a simple but inefficient way to do it - maybe it wouldn't hurt that much?
- Bert -
Yes. Someone understands me :)
Thanks for rephrasing.
- Bert -
On 04.05.2010, at 07:11, Ricardo Moran wrote:
Thank you for the correction of the code. I was missing the most important translations!
Correct me if I'm wrong please, you are proposing:
- to change the TextDomainManager to link the domain with the package instead of the class category,
- to store this information in the method properties,
- and to make #translated aware of all these changes.
Right?
Richo
On Sat, May 1, 2010 at 7:16 PM, Bert Freudenberg bert@freudenbergs.de wrote:
I think we do want to do something along those lines, yes.
Your expression misses some phrases, in particular the Etoys tile inscriptions:
==============
"Count tile phrases"
domains := Dictionary new.
GetTextExporter2 new appendVocabularies: domains.
phrases := Set new.
domains do: [:translations | phrases addAll: translations keys].
phrases size "==> 748 phrases"
==============
These tile phrases should be in a separate POT file IMHO, because those are the most important ones to translate.
Other than that, splitting by category seems reasonable. And since development is organized by packages now, maybe we should just have one po file per top-level category? Even if there is only 1 phrase in it?
I'm sure Korakurider has thought about that. Let's hear him :)
Here's my code to count phrases per package:
==============
"Count non-tile phrases per category"
domains := Dictionary new.
GetTextExporter2 new
    appendStringReceivers: #translated into: domains;
    appendStringReceivers: #translatedNoop into: domains.
phrases := Dictionary new.
domains do: [:translations |
    translations keysAndValuesDo: [:phrase :mrefs |
        mrefs do: [:mref |
            category := (SystemOrganization categoryOfElement: mref classSymbol) copyUpTo: $-.
            (phrases at: category ifAbsentPut: [Set new]) add: phrase]]].
categories := Bag new.
phrases keysAndValuesDo: [:cat :strings |
    categories add: cat withOccurrences: strings size].
categories sortedCounts
==============
I'm looking at the packages of all the methods with translations in them, and I found some methods that don't belong to any package:
a MethodReference Flaps class >> newNCPartsBinFlap
a MethodReference Flaps class >> newConnectorsFlap
a MethodReference TestRunner class >> descriptionForPartsBin
a MethodReference Flaps class >> newClassDiagramConnectorsFlap
a MethodReference TestRunner class >> windowColorSpecification
a MethodReference Flaps class >> quadsDefiningFSMConnectorsFlap
a MethodReference Flaps class >> quadsDefiningConnectorsFlap
a MethodReference TestRunner class >> registerInFlapsRegistry
I used the following code (again, I may have made a mistake):
==============
domains := Dictionary new.

"Count tile phrases"
GetTextExporter2 new appendVocabularies: domains.

"Count non-tile phrases"
GetTextExporter2 new
    appendStringReceivers: #translated into: domains;
    appendStringReceivers: #translatedNoop into: domains.

"Collect the references attached to every phrase"
phrases := Set new.
domains do: [:domain |
    domain do: [:translations |
        translations do: [:each | phrases add: each]]].

"Keep only the ones that belong to no package"
phrases select: [:each |
    (PackageOrganizer default packageOfMethod: each ifNone: [nil]) isNil]
==============
The methods from Flaps clearly belong to "Connectors". But what should we do with the others? Move them to the same package as their class? Also, why don't they belong to any package? Isn't that an error?
Richo
Yes, it's an error. I had some code that checked for unpackaged methods but it was faulty and did not find them:
| packaged |
packaged := false.
MCWorkingCopy
    managersForClass: Flaps class
    selector: #newConnectorsFlap
    do: [:wc | packaged := true].
packaged
So that Monticello code is buggy :( Below I have a snippet that does work (but make sure there are no spurious MC packages in the MC browser before running it).
I will move the methods to their proper package.
Also, there are some packages registered that do not correspond to a Monticello package anymore:
PackageOrganizer default packages difference: (MCWorkingCopy allManagers collect: [:ea | ea packageInfo])
I just posted an update to fix that.
- Bert -
==============
"Find all unpackaged classes and methods by category"
| allClasses allMethods packagedClasses packagedMethods unpackagedClasses unpackagedMethods |
(PackageOrganizer default packages
    difference: (MCWorkingCopy allManagers collect: [:wc | wc packageInfo]))
        do: [:pkg | PackageOrganizer default unregisterPackage: pkg].
allClasses := Smalltalk allClasses.
allMethods := SystemNavigation default selectAllMethods: [:m |
    m selector ifNil: [false] ifNotNilDo: [:sel | sel isDoIt not]].
packagedClasses := Set new.
packagedMethods := Set new.
PackageOrganizer default packages do: [:pkg |
    packagedClasses addAll: pkg classes.
    packagedMethods addAll: pkg methods].
unpackagedClasses := Dictionary new.
unpackagedMethods := Dictionary new.
(allClasses difference: packagedClasses) do: [:cls |
    (unpackagedClasses at: cls category asString ifAbsentPut: [OrderedCollection new]) add: cls.
    allMethods := allMethods reject: [:m | cls name = m classSymbol]].
(allMethods difference: packagedMethods) do: [:mref |
    (unpackagedMethods at: mref category asString ifAbsentPut: [OrderedCollection new]) add: mref].
{unpackagedClasses. unpackagedMethods} explore
==============