Hi all, the m17n stuff is finally ready for prime time. I've uploaded the change sets plus an install do-it to:
http://impara.de/drop/m17n/install-m17n.zip
After some discussion, Yoshiki and I decided that the best way to deal with translations is to keep them out of the standard image and provide them as packages on SqueakMap. This way they are easy to update and load. Basically all that needs to be in such a package is the translation file. I've uploaded the latest German translation file (I hope it's the latest) on SqueakMap as an example.
http://map1.squeakfoundation.org/sm/account/package/ecf67f1d-181e-4daa-b5ec-...
The translation files should be named according to the ISO language and country codes, e.g. fr, de, sv, es, es-cat, en-us, en-gb etc.
The translation files can be created with Diego's language editor (a preliminary unpackaged version is here: http://impara.de/drop/m17n/LanguageEditor.st). I've adapted it to work with the refactored locale and translation framework. Strings in the files need to be in UTF-8. For converting files created with older versions of Squeak, see the method snippet below.
And thanks again to Yoshiki for making m17n possible!!!
Enjoy :-)
Michael
----------
convertTranslationToUTF: fileName
    "NaturalLanguageTranslator convertTranslationToUTF: 'de.translation'"

    | stream refStream loadedArray translations untranslated converted |
    stream := FileStream readOnlyFileNamed: fileName.
    [refStream := ReferenceStream on: stream.
    loadedArray := refStream next] ensure: [stream close].

    translations := Dictionary new: loadedArray first basicSize.
    loadedArray first keysAndValuesDo: [:key :value |
        translations at: key put: (value convertFromWithConverter: MacRomanTextConverter new)].
    untranslated := Dictionary new: loadedArray second basicSize.
    loadedArray second keysAndValuesDo: [:key :value |
        converted := key convertToWithConverter: UTF8TextConverter new.
        untranslated at: converted put: converted].

    stream := ReferenceStream fileNamed: fileName , '-utf8'.
    stream nextPut: {translations. untranslated}.
    stream close
Michael,
Thank you for the packaging effort!
Installing into a fresh image asks me for my author initials. I guess this results in different author initials on the methods of AbstractString, depending on who installs it?
-- Yoshiki
Yoshiki Ohshima wrote:
Installing into a fresh image asks me for my author initials. I guess this results in different author initials on the methods of AbstractString, depending on who installs it?
Hmm, I took out the do-it for setting your author initials in the installer as a reminder that in the update stream they should be set to "yo" :-) Doug, can you make sure that happens in the right places? Before the first and after the last set of changes?
Michael
Folks -
It's official now...
Alan's "Turing Award Lecture" will be delivered at OOPSLA (Vancouver) sometime later in the day on Tuesday, October 26th. This is the "real" first day of the Conference (workshops/tutorials are beforehand.)
As a reminder, early registration ends September 16th. http://www.oopsla.org/2004/ShowPage.do?id=Home
Should be a fun time -- it's certainly a fun place.
- Dan
On 29.07.2004 at 23:39, Dan Ingalls wrote:
Folks -
It's official now...
Alan's "Turing Award Lecture" will be delivered at OOPSLA (Vancouver) sometime later in the day on Tuesday, October 26th. This is the "real" first day of the Conference (workshops/tutorials are beforehand.)
As a reminder, early registration ends September 16th. http://www.oopsla.org/2004/ShowPage.do?id=Home
Should be a fun time -- it's certainly a fun place.
Will it be possible to put a video of Alan's talk online? That would be really nice for those who can't be there (like myself).
Marcus
On Jul 29, 2004, at 11:55 AM, Michael Rueger wrote:
Yoshiki Ohshima wrote:
Installing into a fresh image asks me for my author initials. I guess this results in different author initials on the methods of AbstractString, depending on who installs it?
Hmm, I took out the do-it for setting your author initials in the installer as a reminder that in the update stream they should be set to "yo" :-) Doug, can you make sure that happens in the right places? Before the first and after the last set of changes?
Yeah, I'll be sure to do that. That should look like this:
--------------------------------------
originalInitials _ Utilities authorInitialsPerSe.
Utilities setAuthorInitials: 'yo'.

"... file in all the m17n changes here ..."

Utilities setAuthorInitials: originalInitials.
--------------------------------------
- Doug
"Michael Rueger" michael@squeakland.org wrote:
<snip>
Hi all, the m17n stuff is finally ready for prime time.
</snip>
Hi all,
Installation of m17n works, but I have now two questions:
* What is the correct way to install the fonts in the font directory? The comment in StrikeFontSet>>installExternalFontFileName:encoding:encodingName:textStyleName: is clearly not up to date; it refers to encoding classes that are now obsolete.
* Also: In Unicode class>>value: we find: Smalltalk systemLanguage. We do not currently (in Squeak 3.7-m17n #0 on top of Squeak 3.8 alpha #5976) have an implementor of systemLanguage.
Further questions will certainly come when I progress with this stuff.
Greetings, Boris
Boris Gaertner wrote:
"Michael Rueger" michael@squeakland.org wrote:
- Also:
In Unicode class>>value: we find: Smalltalk systemLanguage. We do not currently (in Squeak 3.7-m17n #0 on top of Squeak 3.8 alpha #5976) have an implementor of systemLanguage.
You have the "old" installer. The newest one that I announced last night has that fixed.
Michael
Boris,
- What is the correct way to install the fonts in the font directory?
The comment in StrikeFontSet>>installExternalFontFileName:encoding:encodingName:textStyleName: is clearly not up to date; it refers to encoding classes that are now obsolete.
This needs to be revised. The manual call to installExternalFontFileName6:encoding:encodingName:textStyleName: works for Japanese, but this and the auto-load mechanism I've been thinking about should be combined...
-- Yoshiki
On Wednesday 28 July 2004 10:45 pm, Michael Rueger wrote:
Hi all, the m17n stuff is finally ready for prime time. I've uploaded the change sets plus an install do-it to:
Why not a SAR? It's almost there...
Also, the installer has two filenames that end in spaces. This causes problems because the zip doesn't have those files named that way:
#('1010converterRefacor.cs ' '1011FileStreamPrep.cs ')
Ned Konz wrote:
Why not a SAR? It's almost there...
Because in recent discussions the general consensus was to not use sar's in the update stream IIRC.
Also, the installer has two filenames that end in spaces. This causes problems because the zip doesn't have those files named that way:
#('1010converterRefacor.cs ' '1011FileStreamPrep.cs ')
Hmm, then why did it work on my machine? Windows just drops the trailing spaces?
Michael
On Jul 29, 2004, at 1:36 PM, Michael Rueger wrote:
Ned Konz wrote:
Why not a SAR? It's almost there...
Because in recent discussions the general consensus was to not use sar's in the update stream IIRC.
In this case a SAR would have been okay too, the important thing is that it is an ordered list of *only* .cs/.st files if we want to include them directly in the update stream. I just noticed now that this new .zip file is indeed such a group of files, so I don't have to worry about installing them via SqueakMap or some other hack. (Thanks Michael!)
- Doug
On Wednesday 28 July 2004 10:45 pm, Michael Rueger wrote:
Hi all, the m17n stuff is finally ready for prime time. I've uploaded the change sets plus an install do-it to:
You should try running with the deprecation warnings enabled...
On Jul 29, 2004, at 12:59 PM, Ned Konz wrote:
On Wednesday 28 July 2004 10:45 pm, Michael Rueger wrote:
Hi all, the m17n stuff is finally ready for prime time. I've uploaded the change sets plus an install do-it to:
You should try running with the deprecation warnings enabled...
This brings up an important side point... I haven't yet deleted the methods in 3.8alpha which were deprecated during 3.7alpha/beta. Normally I'd do this right at the beginning of 3.8alpha, but as Ned says at the moment it would probably break some things, and I wanted to let Michael & Yoshiki proceed with the m17n changes first.
I can go ahead and delete these deprecated methods after m17n has settled a bit. Well, as long as we don't deprecate new methods in 3.8alpha just yet. Actually, I guess I will just need to keep track of which methods were deprecated in 3.7, versus the ones newly deprecated in 3.8, that shouldn't be a big deal.
(And yes, I was about to put out a 3.7gamma candidate image, until getting sidetracked a bit with this m17n stuff. Will be arriving shortly...)
- Doug
On 30.07.2004 at 07:27, Doug Way wrote:
You should try running with the deprecation warnings enabled...
This brings up an important side point... I haven't yet deleted the methods in 3.8alpha which were deprecated during 3.7alpha/beta. Normally I'd do this right at the beginning of 3.8alpha, but as Ned says at the moment it would probably break some things, and I wanted to let Michael & Yoshiki proceed with the m17n changes first.
I can go ahead and delete these deprecated methods after m17n has settled a bit. Well, as long as we don't deprecate new methods in 3.8alpha just yet. Actually, I guess I will just need to keep track of which methods were deprecated in 3.7, versus the ones newly deprecated in 3.8, that shouldn't be a big deal.
(And yes, I was about to put out a 3.7gamma candidate image, until getting sidetracked a bit with this m17n stuff. Will be arriving shortly...)
and we should enable deprecation warnings while in alpha/beta...
Marcus
On Wednesday 28 July 2004 10:45 pm, Michael Rueger wrote:
Hi all, the m17n stuff is finally ready for prime time. I've uploaded the change sets plus an install do-it to:
I have a couple of problems with some of these changes:
* They change the default behavior of (non-binary) low-level streams, assuming that they contain text. However, that's not always the case.
As an example, the SqueakMap checkpoints are stored as compressed text. The SqueakMap loader does something like:
    contents := (self directory oldFileNamed: fname) ascii upToEnd unzipped.
    stream := (RWBinaryOrTextStream with: contents) reset.
With these changes, though, oldFileNamed: returns a MultiByteFileStream. Which would be OK if its converter was the Latin1TextConverter (which maps bytes to characters 1:1), but it's not. It is, instead, a UTF8TextConverter.
This causes the compressed data in the gzipped file to be interpreted as UTF-8, which it isn't. And so the load fails.
How can we assume at open time that an arbitrary file does, in fact, contain text? Sure, many do, but not all.
This would seem to be knowledge that only the user of that file would have.
Similarly,
    s _ MultiByteBinaryOrTextStream on: String new.
    s converter => an UTF8TextConverter
Again, the default assumption is that the String will hold text -- even though there's nothing in it yet! It seems to me that the default converter for this stream should be the Latin1TextConverter. If a particular user of a String has a need for or knowledge of a particular encoding, they can change the converter.
If there are cases where we're using files *as text* and this policy doesn't work, then they should be changed to specify their preferred encoding.
However, I don't think it's right to introduce new and incompatible character conversion semantics on the existing file API.
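A minimal sketch of the binary-mode read that would sidestep the problem described above. It assumes the stream answered by FileStream understands #binary and that GZipReadStream can be created on a ByteArray; the file name is made up for illustration, and this is not the actual SqueakMap code.

```smalltalk
"Sketch only: read a gzipped file as raw bytes, then decompress.
 'checkpoint.gz' is a hypothetical file name."
| file bytes text |
file := FileStream readOnlyFileNamed: 'checkpoint.gz'.
"Switch to binary before reading, so no text converter is applied."
bytes := [file binary; upToEnd] ensure: [file close].
"bytes is a ByteArray here. Decompress first, convert to a String last
 (assumes GZipReadStream on: accepts a ByteArray)."
text := (GZipReadStream on: bytes) upToEnd asString.
```

The point of the ordering is that decompression sees exactly the bytes on disk; any character decoding happens only on the decompressed result.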
On Thursday 29 July 2004 11:16 am, Ned Konz wrote:
Again, the default assumption is that the String will hold text -- even though there's nothing in it yet! It seems to me that the default converter for this stream should be the Latin1TextConverter. If a particular user of a String has a need for or knowledge of a particular encoding, they can change the converter.
If there are cases where we're using files *as text* and this policy doesn't work, then they should be changed to specify their preferred encoding.
Going a bit further, we see that MultiByteBinaryOrTextStream is much too familiar with its clients:
open: fileName forWrite: writeMode
    | result |
    result _ super open: fileName forWrite: writeMode.
    result ifNotNil: [
        converter ifNil: [
            self localName = (FileDirectory localNameFor: SmalltalkImage current sourcesName)
                ifTrue: [converter _ MacRomanTextConverter new]
                ifFalse: [converter _ UTF8TextConverter new]].
        self detectLineEndConvention].
    ^result
I *really* don't think that the stream class should be guessing what its uses are. Is there a good reason for doing this?
Hi folks,
I'm trying to learn how to use Monticello but minnow.cc.gatech.edu is down it seems. Is there any offline documentation for this somewhere else?
Steve
On Jul 29, 2004, at 1:14 PM, Steven Riggins wrote:
Hi folks,
I'm trying to learn how to use Monticello but minnow.cc.gatech.edu is down it seems. Is there any offline documentation for this somewhere else?
Yes, at www.wiresong.ca... which is also down. However, the Google cache is your friend:
http://66.102.7.104/search?q=cache:EIvPL_0_MOEJ:minnow.cc.gatech.edu/squeak/3328++site minnow.cc.gatech.edu+squeak+swiki+versioning+with+monticello&hl=en&lr=&ie=UTF-8&strip=1
http://www.google.com/search?hl=en&lr=&ie=UTF-8&safe=off&q=+...: www.wiresong.ca+monticello+wiresong.ca
Quoting Avi Bryant avi@beta4.com:
On Jul 29, 2004, at 1:14 PM, Steven Riggins wrote:
Hi folks,
I'm trying to learn how to use Monticello but minnow.cc.gatech.edu is down it seems. Is there any offline documentation for this somewhere else?
Yes, at www.wiresong.ca... which is also down. However, the Google cache is your friend:
Sorry about that. Unfortunately I won't be able to fix it until Saturday. I'm in a sleepy little surf town at the moment, so I've got access to email and the web, but nothing else.
Colin
Ned,
Going a bit further, we see that MultiByteBinaryOrTextStream is much too familiar with its clients:
open: fileName forWrite: writeMode
    | result |
    result _ super open: fileName forWrite: writeMode.
    result ifNotNil: [
        converter ifNil: [
            self localName = (FileDirectory localNameFor: SmalltalkImage current sourcesName)
                ifTrue: [converter _ MacRomanTextConverter new]
                ifFalse: [converter _ UTF8TextConverter new]].
        self detectLineEndConvention].
    ^result
I *really* don't think that the stream class should be guessing what its uses are. Is there a good reason for doing this?
It is a tentative hack until we update the .sources file. We could move this check to other places if you want.
-- Yoshiki
Ned Konz wrote:
On Wednesday 28 July 2004 10:45 pm, Michael Rueger wrote:
Hi all, the m17n stuff is finally ready for prime time. I've uploaded the change sets plus an install do-it to:
I have a couple of problems with some of these changes:
- They change the default behavior of (non-binary) low-level streams, assuming that they contain text. However, that's not always the case.
As an example, the SqueakMap checkpoints are stored as compressed text. The SqueakMap loader does something like:
    contents := (self directory oldFileNamed: fname) ascii upToEnd unzipped.
    stream := (RWBinaryOrTextStream with: contents) reset.
With these changes, though, oldFileNamed: returns a MultiByteFileStream. Which would be OK if its converter was the Latin1TextConverter (which maps bytes to characters 1:1), but it's not. It is, instead, a UTF8TextConverter.
Same thing happens in ChangeList when trying to read a gzipped file.
    zipped _ GZipReadStream on: (FileStream readOnlyFileNamed: fullName).
    unzipped _ ReadStream on: zipped contents asString.
    ChangeList browseStream: unzipped
FileStream readOnlyFileNamed: returns a MultiByteFileStream and GZipReadStream fails.
Karl
This causes the compressed data in the gzipped file to be interpreted as UTF-8, which it isn't. And so the load fails.
How can we assume at open time that an arbitrary file does, in fact, contain text? Sure, many do, but not all.
This would seem to be knowledge that only the user of that file would have.
Similarly,
    s _ MultiByteBinaryOrTextStream on: String new.
    s converter => an UTF8TextConverter
Again, the default assumption is that the String will hold text -- even though there's nothing in it yet! It seems to me that the default converter for this stream should be the Latin1TextConverter. If a particular user of a String has a need for or knowledge of a particular encoding, they can change the converter.
If there are cases where we're using files *as text* and this policy doesn't work, then they should be changed to specify their preferred encoding.
However, I don't think it's right to introduce new and incompatible character conversion semantics on the existing file API.
-- Ned Konz http://bike-nomad.com
Hello,
As an example, the SqueakMap checkpoints are stored as compressed text. The SqueakMap loader does something like:
    contents := (self directory oldFileNamed: fname) ascii upToEnd unzipped.
    stream := (RWBinaryOrTextStream with: contents) reset.
With these changes, though, oldFileNamed: returns a MultiByteFileStream. Which would be OK if its converter was the Latin1TextConverter (which maps bytes to characters 1:1), but it's not. It is, instead, a UTF8TextConverter.
Same thing happens in ChangeList when trying to read a gzipped file.
    zipped _ GZipReadStream on: (FileStream readOnlyFileNamed: fullName).
    unzipped _ ReadStream on: zipped contents asString.
    ChangeList browseStream: unzipped
FileStream readOnlyFileNamed: returns a MultiByteFileStream and GZipReadStream fails.
You can always specify your converter. In this case, something like
    contents := (self directory oldFileNamed: fname) ascii upToEnd unzipped.
    stream := (MultiByteBinaryOrTextStream with: contents) reset.
    stream converter: Latin1TextConverter new.
should do it.
This would seem to be knowledge that only the user of that file would have.
And the user can specify it.
Again, the default assumption is that the String will hold text -- even though there's nothing in it yet! It seems to me that the default converter for this stream should be the Latin1TextConverter. If a particular user of a String has a need for or knowledge of a particular encoding, they can change the converter.
No. If the default is Latin1TextConverter, there would be more problems.
However, I don't think it's right to introduce new and incompatible character conversion semantics on the existing file API.
The rule of thumb is that if you open a file, you should think about whether it is text or binary, and if it is text, you should think about how it is interpreted.
-- Yoshiki
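As a hedged illustration of that rule of thumb for the text case: a sketch that states the encoding explicitly instead of relying on whatever the default converter happens to be. It assumes the stream answered by FileStream is a MultiByteFileStream that understands #converter:, as the MultiByteBinaryOrTextStream in the snippet above does; 'notes.txt' is a made-up file name.

```smalltalk
"Sketch only: open a text file and say how its bytes should be interpreted.
 'notes.txt' is a hypothetical file name."
| file text |
file := FileStream readOnlyFileNamed: 'notes.txt'.
text := [
    "Assumption: the stream understands #converter:.
     Latin1TextConverter maps each byte to one Character;
     use UTF8TextConverter for files known to be UTF-8."
    file converter: Latin1TextConverter new.
    file upToEnd] ensure: [file close].
```

Code written this way keeps working regardless of which converter ends up being the default.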
Yoshiki Ohshima Yoshiki.Ohshima@acm.org wrote:
Hello,
As an example, the SqueakMap checkpoints are stored as compressed text. The SqueakMap loader does something like:
    contents := (self directory oldFileNamed: fname) ascii upToEnd unzipped.
    stream := (RWBinaryOrTextStream with: contents) reset.
[SNIP]
You can always specify your converter. In this case, something like
    contents := (self directory oldFileNamed: fname) ascii upToEnd unzipped.
    stream := (MultiByteBinaryOrTextStream with: contents) reset.
    stream converter: Latin1TextConverter new.
should do it.
Eh... first of all - I am not sure why I send #ascii in that SqueakMap code, perhaps it shouldn't be there? I don't remember, I think I copied that from CodeLoader>>installSegment: or something. Since it is an ImageSegment in there - not text - what *should* it say?
I understand this needs to be fixed ASAP, otherwise SM doesn't work.
regards, Göran
Göran,
Eh... first of all - I am not sure why I send #ascii in that SqueakMap code, perhaps it shouldn't be there? I don't remember, I think I copied that from CodeLoader>>installSegment: or something. Since it is an ImageSegment in there - not text - what *should* it say?
I understand this needs to be fixed ASAP, otherwise SM doesn't work.
I believe that downloading and installing most of the packages from SM works, if the package contains only code.
It still isn't great, and of course the problem should be fixed as soon as possible...
-- Yoshiki
Hi Yoshiki!
Yoshiki Ohshima Yoshiki.Ohshima@acm.org wrote:
Göran,
Eh... first of all - I am not sure why I send #ascii in that SqueakMap code, perhaps it shouldn't be there? I don't remember, I think I copied that from CodeLoader>>installSegment: or something. Since it is an ImageSegment in there - not text - what *should* it say?
I understand this needs to be fixed ASAP, otherwise SM doesn't work.
I believe that downloading and installing most of the packages from SM works, if the package contains only code.
It still isn't great, and of course the problem should be fixed as soon as possible...
Well, Ned was a bit unclear I think - the code he is talking about is when SMSqueakMap loads a new snapshot of the map itself. (SMSqueakMap>>reload) Not when you install packages.
So... well, I haven't loaded m17n in any image yet. Does it fail or what?
This code is run whenever you open the SqueakMap2 Package Loader or select "update map from the net" in the menu, or run "SMSqueakMap default loadUpdates" - *if* there is a newer snapshot on the server.
And well, you can of course test it in a workspace with:
SMSqueakMap default reload
-- Yoshiki
regards, Göran
goran.krampe@bluefish.se wrote:
Well, Ned was a bit unclear I think - the code he is talking about is when SMSqueakMap loads a new snapshot of the map itself. (SMSqueakMap>>reload) Not when you install packages.
So... well, I haven't loaded m17n in any image yet. Does it fail or what?
PackageLoader works, I've successfully loaded at least the Games and Connectors package. Connectors has a few conflicts with m17n changes, but the loading worked.
Michael
On Friday 30 July 2004 5:21 am, goran.krampe@bluefish.se wrote:
Eh... first of all - I am not sure why I send #ascii in that SqueakMap code, perhaps it shouldn't be there? I don't remember,
First, #ascii doesn't mean that. What it means is "give me a String" (which is the same thing that #text means). If you want a ByteArray you say #binary.
Unfortunately, #unzipped only works with Strings. But there is an #asUnzippedStream which (now) gives you a MultiByteBinaryOrTextStream (after reading in the entire contents, unfortunately, which is what you were doing anyway).
Ideally, #unzipped should work with ByteArrays and Streams as well (perhaps as #unzippedContents)?
I think I copied that from CodeLoader>>installSegment: or something. Since it is an ImageSegment in there - not text - what *should* it say?
I understand this needs to be fixed ASAP, otherwise SM doesn't work.
I'm working on it.
On Thursday 29 July 2004 2:14 pm, Yoshiki Ohshima wrote:
Again, the default assumption is that the String will hold text -- even though there's nothing in it yet! It seems to me that the default converter for this stream should be the Latin1TextConverter. If a particular user of a String has a need for or knowledge of a particular encoding, they can change the converter.
No. If the default is Latin1TextConverter, there would be more problems.
Like what? If everyone who wants text is specifying the type (like you suggest below) there shouldn't be any problems.
However, I don't think it's right to introduce new and incompatible character conversion semantics on the existing file API.
The rule of thumb is that if you open a file, you should think about whether it is text or binary, and if it is text, you should think about how it is interpreted.
Sure. And the authors of the code that was broken had done that when they wrote it.
Ned,
No. If the default is Latin1TextConverter, there would be more problems.
Like what? If everyone who wants text is specifying the type (like you suggest below) there shouldn't be any problems.
My reasoning is that if a user is opening a file to write text, I would like to provide a file stream that supports all (or most) possible characters by default.
The rule of thumb is that if you open a file, you should think about whether it is text or binary, and if it is text, you should think about how it is interpreted.
Sure. And the authors of the code that was broken had done that when they wrote it.
I had a hard time parsing this sentence... What does this "that" denote? ^^; I'd assume that "that" means "think about how it is interpreted", right? If so, I found that the assumptions made by some of the authors when they wrote it need to be changed.
-- Yoshiki
On Friday 30 July 2004 7:20 am, Yoshiki Ohshima wrote:
The rule of thumb is that if you open a file, you should think about whether it is text or binary, and if it is text, you should think about how it is interpreted.
Sure. And the authors of the code that was broken had done that when they wrote it.
I had a hard time parsing this sentence... What does this "that" denote? ^^; I'd assume that "that" means "think about how it is interpreted", right? If so, I found that the assumptions made by some of the authors when they wrote it need to be changed.
Sorry. I was just saying that we have to be careful not to break the prior assumptions.
All previously-existing code expected that text streams would return Characters that had a 1:1 mapping to the bytes in the file.
Ned,
Sorry. I was just saying that we have to be careful not to break the prior assumptions.
All previously-existing code expected that text streams would return Characters that had a 1:1 mapping to the bytes in the file.
I would say that it should break. Given that backward compatibility comes with a price and not so many applications actually break, I'd rather set UTF-8 as the default converter for file streams and MultiByteBinaryOrTextStream.
We can take reasonable care in the default image when loading packages and applications. The ones that break are the ones that do something "tricky". I'd assume that the authors of such packages are capable of fixing them, while a casual user who just wants to use non-Latin-1 characters may not be able to figure out the text converter concept. I'd want them to write working code, too.
-- Yoshiki
Ned Konz ned@bike-nomad.com wrote:
All previously-existing code expected that text streams would return Characters that had a 1:1 mapping to the bytes in the file.
Actually, there are CrLfFileStream and TextFile, both of which do not necessarily give 1:1 mappings. Those of us that use one of these classes think your discussions about CR versus LF are amusing. Those of you who do not use these classes think our complaints about the incorrect 1:1 assumption are amusing.
Anyway, the point is that those of us who suffered through using CrLfFileStream in a 1:1-assuming world have already managed to get some things in Squeak fixed over time. Also, I can tell you that it's not too bad to just switch over like this. It's easy to fix the individual broken cases; it is just frustrating that the upstream developers frequently don't really care.
So, I vote to dive on into the icy stream. Most things will work, and most things that break will be easy to fix, and in the long run, much less broken code will be written.
Lex
lex@cc.gatech.edu wrote:
Ned Konz ned@bike-nomad.com wrote:
All previously-existing code expected that text streams would return Characters that had a 1:1 mapping to the bytes in the file.
Actually, there are CrLfFileStream and TextFile, both of which do not necessarily give 1:1 mappings. Those of us that use one of these classes think your discussions about CR versus LF are amusing. Those of you who do not use these classes think our complaints about the incorrect 1:1 assumption are amusing.
Anyway, the point is that those of us who suffered through using CrLfFileStream in a 1:1-assuming world have already managed to get some things in Squeak fixed over time. Also, I can tell you that it's not too bad to just switch over like this. It's easy to fix the individual broken cases; it is just frustrating that the upstream developers frequently don't really care.
So, I vote to dive on into the icy stream. Most things will work, and most things that break will be easy to fix, and in the long run, much less broken code will be written.
Lex
I'll second that
On Friday 30 July 2004 10:49 am, lex@cc.gatech.edu wrote:
Ned Konz ned@bike-nomad.com wrote:
All previously-existing code expected that text streams would return Characters that had a 1:1 mapping to the bytes in the file.
Actually, there are CrLfFileStream and TextFile, both of which do not necessarily give 1:1 mappings. Those of us that use one of these classes think your discussions about CR versus LF are amusing.
Those of you who do not use these classes think our complaints about the incorrect 1:1 assumption are amusing.
Anyway, the point is that those of us who suffered through using CrLfFileStream in a 1:1-assuming world have already managed to get some things in Squeak fixed over time.
You may recall that I was one of those who did suffer through using CrLfFileStream for a while and submitted fixes for it. This was in 2.9a, 3.4, and possibly later as well.
One of my sets of fixes, in fact, introduced a distinction between opening text streams and opening raw streams: by adding new stream creation methods that paralleled the existing ones, we could be specific about what we wanted.
Something like (I don't recall the names):
oldFileNamed: 'whatever' => text file, whatever the defaults are (encoding, translation ...)
oldRawFileNamed: 'whatever' => 1 character per byte, no translation
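A sketch of what such a parallel creation method could look like on top of the converter mechanism discussed in this thread. The selector is the half-remembered name from the example above, not a method known to exist in any released image, and the sketch assumes it lives on FileDirectory and that the answered stream understands #converter:.

```smalltalk
oldRawFileNamed: fileName
    "Hypothetical FileDirectory method, for illustration only.
     Answer a stream on an existing file that maps one byte to one
     Character, with no encoding translation."
    | stream |
    stream := self oldFileNamed: fileName.
    "Assumption: Latin1TextConverter is the 1:1 byte-to-Character converter."
    stream ifNotNil: [stream converter: Latin1TextConverter new].
    ^ stream
```

With a pair like this, the caller's intent is visible at the point where the file is opened, instead of being set by a mode change afterwards.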
Also, I can tell you that it's not too bad to just switch over like this. It's easy to fix the individual broken cases; it is just frustrating that the upstream developers frequently don't really care.
Do you know of any CrLfFileStream related fixes that we need to apply?
Ned Konz ned@bike-nomad.com wrote:
One of my sets of fixes, in fact, introduced a distinction between opening text streams and opening raw streams: by adding new stream creation methods that paralleled the existing ones, we could be specific about what we wanted.
I had forgotten who proposed this, but I did like it better than the current situation where you change modes *after* you open it. The current way means that every file class has to support both text and binary modes, even though there are no *real* needs for this.
TextFile has the same idea of choosing modes at open time, though it uses a different sequence of messages.
Also, I can tell you that it's not too bad to just switch over like this. It's easy to fix the individual broken cases; it is just frustrating that the upstream developers frequently don't really care.
Do you know of any CrLfFileStream related fixes that we need to apply?
None that I know of, but I must admit that I gave up the CrLfFileStream battle a few months ago and have been using StandardFileStream. I like CLFS better but it takes too much time to fix the bugs when so few others are using it.
Lex
Hello,
At Fri, 30 Jul 2004 13:49:55 -0400 , lex@cc.gatech.edu wrote:
Ned Konz ned@bike-nomad.com wrote:
All previously-existing code expected that text streams would return Characters that had a 1:1 mapping to the bytes in the file.
Actually, there are CrLfFileStream and TextFile, both of which do not necessarily give 1:1 mappings. Those of us that use one of these classes think your discussions about CR versus LF are amusing. Those of you who do not use these classes think our complaints about the incorrect 1:1 assumption are amusing.
Anyway, the point is that those of us who suffered through using CrLfFileStream in a 1:1-assuming world have already managed to get some things in Squeak fixed over time. Also, I can tell you that it's not too bad to just switch over like this. It's easy to fix the individual broken cases; it is just frustrating that the upstream developers frequently don't really care.
This is a good argument and I agree with it. Proper text/binary use, which most of the Squeak code already does today, is the way to go.
Just FYI, but CrLfFileStream is integrated with MultiByteFileStream now. Sending new to CrLfFileStream results in a MultiByteFileStream with a flag set.
-- Yoshiki
Hello,
Same thing happens in ChangeList when trying to read a gzipped file.
zipped _ GZipReadStream on: (FileStream readOnlyFileNamed: fullName).
unzipped _ ReadStream on: zipped contents asString.
ChangeList browseStream: unzipped
FileStream readOnlyFileNamed: returns a MultiByteFileStream and GZipReadStream fails.
Hmm. I thought I fixed this.
ZipArchiveMember>>contentStream does the right thing. This code above should follow this line of implementation.
-- Yoshiki
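A minimal sketch of one way to avoid the failure described above, assuming the problem is that the MultiByteFileStream is read in text mode and that switching it to binary before handing it to GZipReadStream is enough. This is not the attached fix and not a copy of ZipArchiveMember>>contentStream; fullName is assumed to hold the path of a gzipped change set, and the temporary names are illustrative only.

```smalltalk
| file unzipped |
"Assumption: fullName is the path of a gzipped change set."
file := FileStream readOnlyFileNamed: fullName.
[
	"Read raw bytes so that no text conversion touches the compressed data."
	file binary.
	unzipped := ReadStream on: (GZipReadStream on: file) contents asString
] ensure: [file close].
ChangeList browseStream: unzipped
```

The ensure: block closes the file even if decompression fails, which the original three-line snippet did not do.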
Hello, again,
Hmm. I thought I fixed this.
ZipArchiveMember>>contentStream does the right thing. This code above should follow this line of implementation.
It should look like the attachment.
I noticed that 1016nonASCIIs.cs is in the wrong encoding. I'm not sure where this happened, but it should be reloaded from 0016nonASCIIs.cs on the update server.
-- Yoshiki
On Thursday 29 July 2004 3:03 pm, Yoshiki Ohshima wrote:
Hello, again,
Hmm. I thought I fixed this.
ZipArchiveMember>>contentStream does the right thing. This code above should follow this line of implementation.
It should look like the attachment.
I noticed that 1016nonASCIIs.cs is in the wrong encoding. I'm not sure where this happened, but it should be reloaded from 0016nonASCIIs.cs on the update server.
And the following files contain linefeeds and should be converted:
005-postConv.cs 1016nonASCIIs.cs 209locale-etoy.st
Michael,
And the following files contain linefeeds and should be converted:
Hmm, the remove line feeds function does "interesting" things to these files. I think the lf conversion and utf-8 encoding don't mix well.
The attached changeset fixes it.
It has nothing to do with utf-8. Now, the code shouldn't refer to StandardFileStream directly. Just use FileStream and let it do the right thing.
-- Yoshiki
Unicode class>>parseUnicodeDataFrom: wasn't working because the "upTo: Character cr" call in it returned the entire contents as the first line. In one way or another, it should be re-worked.
-- Yoshiki
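To illustrate the failure mode mentioned above: when a file uses bare linefeeds, upTo: Character cr never finds a separator and answers everything up to the end. The sketch below is only one possible rework, assuming it is acceptable to treat CR and LF alike; the sample string is hypothetical and is not taken from the Unicode data file.

```smalltalk
| sample firstLine lines |
"Hypothetical sample: two records separated by a bare LF."
sample := 'first;record' , (String with: Character lf) , 'second;record'.

"No CR is present, so this answers the whole string, LF included."
firstLine := (ReadStream on: sample) upTo: Character cr.

"Splitting on both CR and LF gives one token per line, whatever the line-end convention."
lines := sample findTokens: (String with: Character cr with: Character lf).
lines size   "2"
```

Note that findTokens: skips empty tokens, so blank lines are dropped; that may or may not matter for a given parser.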
Yoshiki Ohshima wrote:
Hello, again,
Hmm. I thought I fixed this.
ZipArchiveMember>>contentStream does the right thing. This code above should follow this line of implementation.
It should look like the attachment.
I should mention this happened while using BFAV: it downloaded some compressed attachments and they got corrupted. But after replacing the corrupted one with an OK one, and with your fix, I can open the ChangeListBrowser. Thanks. Karl
I noticed that 1016nonASCIIs.cs is in the wrong encoding. I'm not sure where this happened, but it should be reloaded from 0016nonASCIIs.cs on the update server.
-- Yoshiki
Attachment: ChangeListclass-browseCompressedChangesFile.st (application/octet-stream, base64)
On Jul 29, 2004, at 1:45 AM, Michael Rueger wrote:
Hi all, the m17n stuff is finally ready for prime time. I've uploaded the change sets plus a install do-it to:
I had some problems installing this as-is.
If I installed this into a freshly downloaded Squeak3.8a-5976 image (saving a copy under a different name), it would go through what looked like all of the install (I think), but then it wouldn't be able to draw any TrueType fonts. E.g. I couldn't open any new windows because the window titlebars use a TT font.
(This is on OS X 10.3, running Squeak 3.7.2Beta1 VM.)
And if I tried cmd-. I would get the 'System error handling failed' box, with this stack: (typed by hand, since no SqueakDebug.log seems to be generated)
---------------------------------------
Original error: MessageNotUnderstood: UndefinedObject>>y.
Debugger error: MessageNotUnderstood: UndefinedObject>>y:
[] in Debugger class>>openOn:context:label:contents:fullView: {[:ex | self primitiveError: ...]}
BlockContext>>valueWithPossibleArgs:
[] in MethodContext(ContextPart)>>handleSignal: {[(self tempAt: 2) ...]}
BlockContext>>ensure:
MethodContext(ContextPart)>>handleSignal:
MessageNotUnderstood(Exception)>>signal
UndefinedObject(Object)>>doesNotUnderstand: #y
TTCFont>>height
StringMorph>>measureContents
StringMorph>>fitContents
StringMorph>>font:emphasis:
PreDebugWindow(SystemWindow)>>initialize
...
---------------------------------------
However, I changed the 'install' file a bit to print some debugging information to the Transcript, and set and unset the author initials, and for some reason it installs successfully for me now.
Here's my modified 'install2' file which works for me:
---------------------------------------
| dir startTime |
dir := FileDirectory default directoryNamed: 'install-m17n'.
startTime _ DateAndTime current.
Utilities setAuthorInitials: 'yo'.
Transcript open.

#(
	'001-ucsTable.st'
	'002-bootstrap.st'
	'003-m17nBase.cs'
	'004-systemMod.cs'
	'005-postConv.cs'
	'006-incChanges.cs'
	'1001resetTTCFont.cs'
	'1002BVSInstall.cs'
	'1003BVSMInstall.cs'
	'1004BVSfInstall.cs'
	'1005Latin1Vera.cs'
	'1006languageTable.cs'
	'1007mergeJul2.cs'
	'1008deprecationFix.cs'
	'1009moreTranslation.cs'
	'1010converterRefacor.cs'
	'1011FileStreamPrep.cs'
	'1012FileOutEncoding.cs'
	'1013fileOutTest.cs'
	'1014m17nMcz.cs'
	'1015MacConverterFix.cs'
	'1016nonASCIIs.cs'
	'207-System-Localization.st'
	'208-locale-m17n.cs'
	'209locale-etoy.st'
	'210-locale-switch.cs'
	'211-locale-removals.cs'
	'212-localeFixes.cs'
	'3017contentStreamFix.cs'
	'3018changeListEncoding.cs'
	'3019changeListEncodingAgain.cs'
	'3020isoSqueakSenders.cs'
	'3021misc.cs'
	'3022SMLanguageInstaller.st'
	'3023systemLanguage.cs'
) do: [:fileName |
	FileStream fileIn: (dir fullNameFor: fileName).
	Transcript show: 'Filed in ', fileName; cr].

Utilities setAuthorInitials: ''.
Transcript show: 'Install start time: ', startTime printString; cr.
Transcript show: 'Install finish time: ', DateAndTime current printString; cr.
---------------------------------------
- Doug
Hi Doug,
Did you have a transcript open? I had an installation failure due to a transcript output in a "wrong place" (e.g., somewhere halfway through the installation). I think the only solution here is to close all open transcripts before filing it in (we can automate this in the install script) - clearly having text output going on while we're replacing half of the string hierarchy is bound to cause a few problems ;-)
Cheers, - Andreas
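A rough sketch of how the "close all open transcripts" step might be automated at the top of an install script. It assumes that every transcript window is a SystemWindow whose model is the global Transcript, which is an assumption about the image and not something confirmed in this thread; whether closing the windows actually avoids the failure is also not established here.

```smalltalk
"Assumption: transcript windows are SystemWindows whose model is the global Transcript."
(SystemWindow allSubInstances
	select: [:window | window model == Transcript])
	do: [:window | window delete]
```

If used, it would have to run before the first file is filed in, and the script should not write to the Transcript until the install has finished.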
On Jul 30, 2004, at 1:12 AM, Andreas Raab wrote:
Hi Doug,
Did you have a transcript open? I had an installation failure due to a transcript output in a "wrong place" (e.g., somewhere halfway through the installation). I think the only solution here is to close all open transcripts before filing it in (we can automate this in the install script) - clearly having text output going on while we're replacing half of the string hierarchy is bound to cause a few problems ;-)
Yeah, it looks like it is probably transcript-related. The odd thing is, it seemed to work when I had a transcript open, and it didn't work when I did not have a transcript open! (I could test this a bit more, but it takes about 20 minutes to install these updates on my slowish 400MHz G4...)
- Doug
squeak-dev@lists.squeakfoundation.org