[Newbies] Help me FileIn
Bert Freudenberg
bert at freudenbergs.de
Mon Aug 17 06:59:53 UTC 2009
On 17.08.2009, at 05:34, K. K. Subramaniam wrote:
> On Monday 17 Aug 2009 7:50:35 am Randal L. Schwartz wrote:
>> K> On Monday 17 Aug 2009 12:17:25 am Randal L. Schwartz wrote:
>>>>> http://lists.squeakfoundation.org/pipermail/squeak-dev/2007-May/116683.html
>>>> That explanation is about .sources and .changes, not fileOuts.
>>
>> K> The class fileout (from browser into *.st files) uses the same format
>> K> (sequence of data chunks) as *.sources and *.changes files. Did you have
>> K> some other fileOuts in mind?
>>
>> Are you sure?
> Yes. fileOuts are binary streams, not ASCII (cf. Class>>fileOut and
> FileStream>>writeSourceCodeFrom:.....).
Err, it's still text though.
>> It looks like the fileout I just got is classic ST-80 format, with "!"
>> delimiting Smalltalk code. There's no "binary" data in here... it's all
>> human-readable text (chunks of smalltalk code). That's different from the
>> .sources and .changes, because they have some binary data in them (I
>> thought).
> I made the same mistake a few years back. Just because a byteArray contains
> readable text does not mean that it becomes a string. Text is not portable
> across platforms because of different line-ending conventions. If line
> conversions are done on fileOut before filing in then any string literals
> with newlines will get corrupted.
You guys are confusing a couple of issues.
One is that the *.sources and *.changes files are not simply text
files, because the image stores offsets into those files to find the
source code for a specific method. Hence you must not manually alter
these files as you would normally edit source code, because that might
invalidate the offsets - that's what I meant by "a database of text
chunks" in the message linked above.
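
To make the offset problem concrete, here is a toy model in Python. It is only a sketch: the byte string, the index, and the helper names are all made up for illustration, and a real image stores its source pointers differently. The point is just that an index of byte offsets stops matching once the file is hand-edited.

```python
# Toy "sources file": three method chunks, each terminated by "!\n".
# Hypothetical data and helpers, not the real image format.
sources = b"foo ^1!\nbar ^2!\nbaz ^3!\n"

def build_index(data):
    """Map each method name to the byte offset where its chunk starts."""
    index = {}
    offset = 0
    for chunk in data.split(b"!\n"):
        if chunk:
            name = chunk.split()[0].decode("ascii")
            index[name] = offset
        offset += len(chunk) + 2  # +2 for the "!\n" terminator
    return index

def source_at(data, offset):
    """Read one chunk starting at a stored offset, up to the next '!'."""
    end = data.index(b"!", offset)
    return data[offset:end].decode("ascii")

index = build_index(sources)          # plays the role of the image
before = source_at(sources, index["bar"])

# Hand-edit the file without updating the index: add a comment to "foo".
edited = sources.replace(b"foo ^1", b'foo "edited" ^1')
after = source_at(edited, index["bar"])

print(before)  # bar ^2
print(after)   # ted" ^1  (the stored offset now points into the wrong place)
```
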
File-outs (*.st) and change-sets (*.cs) are a different matter. Here,
no file offsets are stored anywhere so it is perfectly okay to edit
them manually as text files.
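
As a rough illustration of why plain editing is harmless here, the following Python sketch splits file-out text into chunks at "!" and treats a doubled "!!" as a literal "!". The function name and the sample text are made up for this example, and the sketch ignores the special meaning that a real file-in gives to empty chunks and to a leading "!". Nothing outside the text refers to positions within it, so an edited file splits just as well as the original.

```python
def split_chunks(text):
    """Split chunk-format text at '!' terminators; '!!' is a literal '!'.

    Simplified sketch: empty chunks are dropped, although a real file-in
    gives them meaning.
    """
    chunks, current = [], []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "!":
            if i + 1 < len(text) and text[i + 1] == "!":
                current.append("!")   # escaped bang inside the chunk
                i += 2
                continue
            chunk = "".join(current).strip()
            if chunk:
                chunks.append(chunk)
            current = []
        else:
            current.append(ch)
        i += 1
    tail = "".join(current).strip()
    if tail:
        chunks.append(tail)
    return chunks

# Hypothetical, simplified file-out text.
sample = (
    "Object subclass: #Greeter!\n"
    "!Greeter methodsFor: 'greeting'!\n"
    "hello\n"
    "    ^'Hello!!'! !\n"
)
chunks = split_chunks(sample)

# Editing the text by hand and splitting again works the same way.
edited_chunks = split_chunks(sample.replace("Hello", "Hi there"))
```
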
The "binary data" I was talking about does not appear in regular
file-outs or change-sets. But the file-in process is so flexible that
it can even be used to read binary data. That's because filing-in
actually executes code found in the file. The first part of the file
can define a reader that reads the later part of the file.
This actually makes it an "object-oriented" file format: the file
itself is an object that defines how it is to be read. (According to
Alan this idea goes back to the 1950s B5000 tapes.)
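
The idea can be sketched in a few lines of Python. This is an analogy, not how Squeak implements file-in: the "!" separator, the `file_in` function, and the convention that the header defines a function called `read` are all assumptions made for this example. The header is code that gets executed, and whatever reader it defines is handed the rest of the file.

```python
def file_in(data: bytes):
    """Execute the header (source code up to the first b'!'), then let the
    reader it defines consume the remaining bytes.

    Hypothetical convention: the header must define a function `read`.
    """
    header, _, payload = data.partition(b"!")
    namespace = {}
    exec(header.decode("ascii"), namespace)  # runs code found in the file
    return namespace["read"](payload)

# The first part of the "file" defines how the later part is to be read.
header = b"def read(payload):\n    return list(payload)\n"
result = file_in(header + b"!" + bytes([0, 255, 10, 13]))
print(result)  # [0, 255, 10, 13]
```

Because the header is executed, this should only ever be fed files you trust.
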
But it also makes it impossible to know in advance what kind of data
might be included when filing in something. So one must not do
automatic transformations of the file contents, which might break that
data.
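
A small Python sketch of what can go wrong, assuming a made-up five-byte payload and a typical CRLF-to-LF conversion: the conversion cannot tell a line ending from two data bytes that happen to have the same values.

```python
# Hypothetical binary payload that happens to contain the bytes 0x0D 0x0A.
payload = bytes([0x01, 0x0D, 0x0A, 0x02, 0x0A])

def to_unix_line_endings(data: bytes) -> bytes:
    """A typical 'harmless' text transformation: CRLF becomes LF."""
    return data.replace(b"\r\n", b"\n")

converted = to_unix_line_endings(payload)

print(len(payload), len(converted))  # 5 4
print(converted == payload)          # False: one data byte was silently dropped
```
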
However, as I wrote, if the file is indeed named *.st or *.cs, then no
such arbitrary data is expected. These files contain only Smalltalk
source code - it's not enforced, but it would be very atypical if they
contained something else.
- Bert -