[Newbies] Help me FileIn

Bert Freudenberg bert at freudenbergs.de
Mon Aug 17 06:59:53 UTC 2009


On 17.08.2009, at 05:34, K. K. Subramaniam wrote:

> On Monday 17 Aug 2009 7:50:35 am Randal L. Schwartz wrote:
>> K> On Monday 17 Aug 2009 12:17:25 am Randal L. Schwartz wrote:
>>>>> http://lists.squeakfoundation.org/pipermail/squeak-dev/2007-May/116683.html
>>>> That explanation is about .sources and .changes, not fileOuts.
>>
>> K> The class fileout (from browser into *.st files) uses the same format
>> K> (sequence of data chunks) as *.sources and *.changes files. Did you have
>> K> some other fileOuts in mind?
>>
>> Are you sure?
> Yes. fileOuts are binary streams, not ASCII (cf. Class>>fileOut and
> FileStream>>writeSourceCodeFrom:.....).

Err, it's still text though.

>> It looks like the fileout I just got is classic ST-80 format, with "!"
>> delimiting Smalltalk code.  There's no "binary" data in here... it's all
>> human-readable text (chunks of smalltalk code).  That's different from the
>> .sources and .changes, because they have some binary data in them (I
>> thought).
> I made the same mistake a few years back. Just because a byteArray contains
> readable text does not mean that it becomes a string. Text is not portable
> across platforms because of different line-ending conventions. If line
> conversions are done on fileOut before filing in then any string literals with
> newlines will get corrupted.


You guys are confusing a couple of issues.

One is that the *.sources and *.changes files are not simply text
files, because the image stores offsets into those files to find the
source code for a specific method. Hence you must not manually alter
these files as you would normally edit source code, because that might
invalidate the offsets. That's what I meant by "a database of text
chunks" in the message linked above.
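
To make the offset problem concrete, here is a minimal sketch in
Python (not Squeak code). The three-method "file", the `offsets` table
and the `chunk_at` helper are all made up for illustration; the real
image keeps its source pointers in its own way. The point is only that
a stored byte position stops being valid once earlier text changes
length.

```python
# Toy model of a .changes-style file: the "image" remembers byte offsets
# into the file rather than the source text itself. All names are hypothetical.
source_file = b"foo ^1!bar ^2!baz ^3!"

def chunk_at(data, offset):
    """Return the chunk that starts at offset, up to the next '!' terminator."""
    end = data.index(b"!", offset)
    return data[offset:end].decode("ascii")

# Offsets the "image" recorded when the three methods were written.
offsets = {"foo": 0, "bar": 7, "baz": 14}

# Looking up a method by its stored offset works on the untouched file.
intact = chunk_at(source_file, offsets["bar"])

# A manual edit that makes the first chunk two bytes longer ...
edited = source_file.replace(b"foo ^1", b"foo ^100")

# ... leaves the stored offsets pointing into the middle of other chunks.
broken = chunk_at(edited, offsets["bar"])
```

After the edit, the lookup for "bar" no longer returns the source of
"bar", although the file still looks like perfectly fine text.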

File-outs (*.st) and change-sets (*.cs) are a different matter. Here,  
no file offsets are stored anywhere so it is perfectly okay to edit  
them manually as text files.
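
As a rough illustration of why hand edits are harmless here, the sketch
below (Python, not Squeak code) reads chunks strictly in sequence, using
"!" as the terminator and "!!" as an escaped "!". `split_chunks` is a
hypothetical helper and a simplification of what the real file-in code
does; it only shows that nothing depends on absolute positions.

```python
def split_chunks(text):
    """Split chunk-format text on '!' terminators; '!!' is an escaped '!'."""
    chunks, current, i = [], [], 0
    while i < len(text):
        if text[i] == "!":
            if text[i + 1:i + 2] == "!":      # doubled bang: a literal '!'
                current.append("!")
                i += 2
                continue
            chunks.append("".join(current).strip())
            current = []
        else:
            current.append(text[i])
        i += 1
    return chunks

original = "Object subclass: #Foo!\nTranscript show: 'hi!!'!\n"

# A manual edit: put a comment chunk in front of everything else.
edited = '"a comment added by hand"!\n' + original

original_chunks = split_chunks(original)
edited_chunks = split_chunks(edited)
```

The chunks that were already in the file come out unchanged after the
edit, because the reader just walks from one "!" to the next.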

The "binary data" I was talking about does not appear in regular
file-outs or change-sets. But the file-in process is so flexible that
it can even be used to read binary data. That's because filing-in
actually executes code found in the file. The first part of the file
can define a reader that reads the later part of the file.

This makes it actually an "object-oriented" file format: the file  
itself is an object that defines how it is to be read. (According to  
Alan this idea goes back to the 1950's B5000 tapes.)
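
The idea of a file whose first part defines the reader for the rest can
be sketched in a few lines of Python. This is only an analogy, not how
Squeak implements it: `file_in` and `read_rest` are names invented for
the example, and the "!!" escaping of the real chunk format is left out.

```python
import io

def file_in(stream):
    """Execute the first '!'-terminated chunk, then let it read the rest."""
    header = bytearray()
    while (byte := stream.read(1)) not in (b"!", b""):
        header += byte
    namespace = {}
    # Filing in means running code that came from the file itself.
    exec(header.decode("ascii"), namespace)
    # The code we just ran is expected to define read_rest (an assumption
    # of this sketch); it decides how the remaining bytes are interpreted.
    return namespace["read_rest"](stream)

# Arbitrary binary payload following the code chunk.
payload = bytes([0, 13, 10, 255])
data = b"def read_rest(s):\n    return list(s.read())\n!" + payload

result = file_in(io.BytesIO(data))
```

Here the generic loader knows nothing about the payload; only the code
at the start of the file does.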

But it also makes it impossible to know in advance what kind of data
might be included when filing in something. So one must not do
automatic transformations of the file contents (line-ending conversion,
for example), which might break that data.
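
A small Python sketch of how such a transformation can break data. The
payload bytes and the `normalize_line_endings` helper are invented for
the example; any tool that rewrites CR LF pairs would have the same
effect.

```python
# Hypothetical payload: arbitrary bytes that happen to contain CR and LF values.
payload = bytes([0x89, 0x0D, 0x0A, 0x00, 0x0A])

def normalize_line_endings(data):
    """A 'helpful' transfer-tool conversion from CR LF to LF."""
    return data.replace(b"\r\n", b"\n")

converted = normalize_line_endings(payload)

# The conversion silently dropped a byte from the payload.
lost_bytes = len(payload) - len(converted)
```

For plain source text the same conversion is harmless, which is why it
matters whether the file can contain anything else.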

However, as I wrote, if the file is indeed named *.st or *.cs, then no  
such arbitrary data is expected. These files contain only Smalltalk  
source code - it's not enforced, but it would be very atypical if they  
contained something else.

- Bert -



