[squeak-dev] Survey | Default encoding for new (multi-byte) file streams

Tobias Pape Das.Linux at gmx.de
Mon Apr 11 12:55:52 UTC 2022


Hi


> On 11. Apr 2022, at 14:42, Marcel Taeumel <marcel.taeumel at hpi.de> wrote:
> 
> Hi all --
> 
> What do you think should be the default encoding when opening a new file?
> 
> FileStream readOnlyFileNamed: 'foo' do: [:stream | stream contents].

UTF-8.

Also, let's talk about other streams.
I have been bitten multiple times: when I put some data on the Interwebs™ via WebClient and also tee it to a file, the file is automagically encoded correctly, but the data on the wire is not.
That's surprising.
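
A minimal sketch of the mismatch, not the actual WebClient code path. The file name and sample string are made up for illustration, and the claim that the wire side has to be encoded by hand is an assumption based on the behaviour described above:

```smalltalk
| text wireBytes |
text := 'Grüße'.	"hypothetical sample data with non-ASCII characters"

"File side: the stream's converter encodes on write.
 Setting it explicitly avoids relying on the platform default."
FileStream forceNewFileNamed: 'tee.txt' do: [:stream |
	stream converter: UTF8TextConverter new.
	stream nextPutAll: text].

"Wire side (assumption): nothing converts for you, so encode
 explicitly before handing the data to the network code."
wireBytes := text squeakToUtf8.
```

The same string thus takes two different paths: one through a converter attached to the stream, one with no converter at all unless the caller adds the encoding step.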

Best regards
	-Tobias


> 
> The default is system/platform-specific for now, but effectively UTF-8 on many (if not most?) platforms. The original idea seems to have been that platform files could be opened without hassle. By now, everybody expects UTF-8, especially when preparing files for the Web or other external, OS-independent storage. However, there are platforms (and maybe even working VM versions) where the default may not be UTF-8.
> 
> See Latin1Environment class >> #systemConverterClass to get an idea about what can be special about a platform's converter.
> 
> Note that you can always make it explicit:
> 
> FileStream readOnlyFileNamed: 'foo' do: [:stream |
>    stream converter: UTF8TextConverter new.
>    stream contents].
> 
> (This is not about Squeak's code files .st .cs .changes .sources, which are UTF-8 by default already.)
> 
> Best,
> Marcel
> 



