I think we do want to do something along those lines, yes.
Your expression misses some phrases, in particular the Etoys tile inscriptions:
==============
"Count tile phrases"
domains := Dictionary new.
GetTextExporter2 new appendVocabularies: domains.
phrases := Set new.
domains do: [:translations |
    phrases addAll: translations keys].
phrases size "==> 748 phrases"
==============
These tile phrases should be a separate pot IMHO, because those are the most important ones to translate.
Other than that, splitting by category seems reasonable. And since development is organized by packages now, maybe we should just have one po file per top-level category? Even if there is only 1 phrase in it?
I'm sure Korakurider has thought about that. Let's hear him :)
Here's my code to count phrases per package:
==============
"Count non-tile phrases per category"
domains := Dictionary new.
GetTextExporter2 new
    appendStringReceivers: #translated into: domains;
    appendStringReceivers: #translatedNoop into: domains.
phrases := Dictionary new.
domains do: [:translations |
    translations keysAndValuesDo: [:phrase :mrefs |
        mrefs do: [:mref |
            category := (SystemOrganization categoryOfElement: mref classSymbol) copyUpTo: $-.
            (phrases at: category ifAbsentPut: [Set new]) add: phrase]]].
categories := Bag new.
phrases keysAndValuesDo: [:cat :strings |
    categories add: cat withOccurrences: strings size].
categories sortedCounts
==============
There are 4251 non-tile phrases (sum of the below). Adding the 748 tile phrases (4999 in total) and subtracting some duplicates, this is in the ballpark of the 4412 phrases on Pootle.
1427 Morphic
 751 MorphicExtras
 506 Etoys
 262 System
 252 Connectors
 201 Tools
 182 Sound
  70 Movies
  67 ST80
  65 Protocols
  62 Multilingual
  60 Nebraska
  55 Sugar
  48 VideoForSqueak
  38 WS
  30 Graphics
  28 Network
  25 GStreamer
  24 Kernel
  20 Flash
  19 Files
  13 Compression
  11 Collections
  10 FSM
   8 Balloon
   7 BroomMorphs
   5 Monticello
   2 SMLoader
   1 DAVServerDirectory
   1 UserObjects
   1 TrueType
If we wanted to split Morphic further, these would be the numbers (but I don't think we should):
569 Morphic-Kernel
251 Morphic-Worlds
161 Morphic-Mentoring
135 Morphic-Basic
100 Morphic-Games
 67 Morphic-Widgets
 48 Morphic-Experimental
 47 Morphic-Windows
 19 Morphic-Demo
 18 Morphic-Scripting Tiles
 13 Morphic-Components
 13 Morphic-Support
 12 Morphic-Text Support
  8 Morphic-TrueType
  7 Morphic-Menus
  7 Morphic-Pluggable Widgets
  6 Morphic-Books
  5 Morphic-PDA
  1 Morphic-PartsBin
There is a slight problem with extension methods (methods defined in *categories). #translated currently would look for those in the wrong package. This counts the affected phrases:
==============
"Count non-tile phrases that are in extension methods"
domains := Dictionary new.
GetTextExporter2 new
    appendStringReceivers: #translated into: domains;
    appendStringReceivers: #translatedNoop into: domains.
extensionMethods := Set new.
MCWorkingCopy allManagers do: [:wc |
    extensionMethods addAll: wc packageInfo extensionMethods].
phrases := Dictionary new.
domains do: [:translations |
    translations keysAndValuesDo: [:phrase :mrefs |
        mrefs do: [:mref |
            (extensionMethods includes: mref) ifTrue: [
                category := (mref category copyUpTo: $-) asLowercase.
                (phrases at: category ifAbsentPut: [Set new]) add: phrase]]]].
categories := Bag new.
phrases keysAndValuesDo: [:cat :strings |
    categories add: cat withOccurrences: strings size].
categories sortedCounts
==============
199 *etoys
 83 *morphicextras
 18 *connectors
  8 *morphic
  5 *sound
  1 *pango
We need to make #translated deal with this. I can think of a simple but inefficient way to do it - maybe it wouldn't hurt that much?
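One possible shape for the package lookup, as a minimal sketch and not necessarily the approach hinted at above: take the domain from the method category when it starts with *, and from the class category otherwise. The block name domainFor and the lowercasing of both results are assumptions for illustration only. Finding the calling method and its category at runtime, which is where the cost would be, is left out here.

```smalltalk
"Hypothetical helper, not part of the image: derive a domain name
 from a class category and a method category (both plain strings)."
domainFor := [:classCategory :methodCategory |
    (methodCategory notEmpty and: [methodCategory first = $*])
        ifTrue: ["extension method: the package is named in the method category"
            (methodCategory allButFirst copyUpTo: $-) asLowercase]
        ifFalse: ["ordinary method: use the top-level class category"
            (classCategory copyUpTo: $-) asLowercase]].

domainFor value: 'Morphic-Kernel' value: 'menus'.            "==> 'morphic'"
domainFor value: 'Morphic-Kernel' value: '*etoys-scripting'. "==> 'etoys'"
```

Lowercasing both sides sidesteps the case mismatch between *etoys and Etoys seen in the counts above, at the price of domain names that no longer match the category spelling.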
- Bert -