[squeak-dev] Why is source code always in files only?

Levente Uzonyi leves at elte.hu
Mon Jan 19 20:51:38 UTC 2015


On Mon, 19 Jan 2015, Tobias Pape wrote:

>
> On 19.01.2015, at 18:34, Chris Muller <asqueaker at gmail.com> wrote:
>
>> On Mon, Jan 19, 2015 at 6:45 AM, Tobias Pape <Das.Linux at gmx.de> wrote:
>> Hi all,
>>
>>
>> We store method source _solely_ in files (.sources/.changes).
>> Why? We have means to attach it to Compiled methods, in fact, more than one:
>>
>>
>> CompiledMethod allInstances size. "57766."
>> CompiledMethod allInstances count: [:m | m properties includesKey: #source].  "0."
>> CompiledMethod allInstances count: [:m | m trailer sourceCode notNil]. "0."
>> CompiledMethod allInstances count: [:m | m trailer hasSourcePointer]. "57700."
>>
>>
>> " also interesting "
>> (CompiledMethod allInstances collect: [:m | m trailer kind] as: Bag) sortedCounts
>>  {57701->#SourcePointer . 65->#NoTrailer . 14->#TempsNamesQCompress . 2->#TempsNamesZip}
>>
>>
>> When doing some analysis on source code, it is a pain to _either_
>> always go to disk for the source _or_ cache the code myself (which may
>> get out of sync soon).
>>
>> If you're sending messages instead of viewing private innards, why is it a pain?
>
> What do you mean?
>
> Calling getSource on a CM goes 300km to disk instead of 1m to memory (metaphorically speaking),
> and when I do analysis on source code I typically do stuff like that a lot.
> And as a developer I really dislike that I have to choose between either
>
> a) bad performance due to excessive IO (yes I want to access the source a lot)
> b) caching things myself when already two ways of storing them are available.

On today's machines you don't have to. Once you read the data from the 
disk, it'll be cached in memory. It would be faster to access the sources 
if they were stored in a trailer, but that would bump the image size by 
about 15 MB (uncompressed), or 9 MB (compressed):

| size compressedSize |
size := compressedSize := 0.
CurrentReadOnlySourceFiles cacheDuring: [
	SystemNavigation default allSelectorsAndMethodsDo: [ :behavior :selector :method |
		| string compressed |
		string := method getSource asString.
		compressed := string squeakToUtf8 zipped.
		size := size + string byteSize + ((string size > 255) asBit + 1 * 4).
		compressedSize := compressedSize + compressed byteSize + ((compressed size > 255) asBit + 1 * 4) ] ].
{ size. compressedSize }.

"==> #(15003880 9057408)"

>
>>
>>   Can't we just save the source code either via trailer or properties
>> on first access?
>>
>> -1.  Why do I want all of those String's in my image?
>
> To do stuff to them.
> Like, analysing how many dots are in them, or how often someone crafts a Symbol.
> Analysis stuff.
> Currently, I have a separate structure that holds onto the code once retrieved
> from disk. But once a method changes (e.g., on recompilation), I first have to detect
> that it happened, and then flush and refill this cache. I find this tiresome.

Do you flush your cache selectively?

Scanning all source code for a given pattern takes less than a second 
(~800 ms) on my machine. What's your performance goal?
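
A minimal sketch of such a scan, reusing the cacheDuring: and 
allSelectorsAndMethodsDo: calls from the snippet above. Counting periods 
is only a stand-in for "a given pattern" (an assumption for illustration), 
and this is not the code behind the ~800 ms figure:

| dots ms |
dots := 0.
"Time the whole scan; cacheDuring: keeps the source files open for the duration."
ms := Time millisecondsToRun: [
	CurrentReadOnlySourceFiles cacheDuring: [
		SystemNavigation default allSelectorsAndMethodsDo: [ :behavior :selector :method |
			"Illustrative pattern: count the periods in each method's source."
			dots := dots + (method getSource asString occurrencesOf: $.) ] ] ].
{ dots. ms }.

The result is the total count and the elapsed milliseconds; both depend 
on the image and the machine.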

Levente

>
> Best
> 	-Tobias
>
>>
>>
>>
>>
>> Best
>>         -Tobias
>
>
>
>
>
> PS: HTML-mails f*ck up quotation levels in replies :(
>    Apple Mail just flattens them when I reply. Does anyone know a workaround?
>