[squeak-dev] Why is source code always in files only?

Tobias Pape Das.Linux at gmx.de
Mon Jan 19 22:17:39 UTC 2015


Hey

On 19.01.2015, at 22:56, Levente Uzonyi <leves at elte.hu> wrote:

> On Mon, 19 Jan 2015, Tobias Pape wrote:
> 
>> 
>> On 19.01.2015, at 21:51, Levente Uzonyi <leves at elte.hu> wrote:
>> 
>>> On Mon, 19 Jan 2015, Tobias Pape wrote:
>>> 
>> […]
>> What I am actually wondering about,
>> there are two completely different ways to _access_ source stored in the image
>> but no way to actually _store_ it there.
> 
> You can use #dropSourcePointer to embed the source of a method in the image. For 15k methods, you'd better swap the methods with custom code that converts them in a single batch.

Ah. This goes into the trailer, then?
I think I actually misread some parts in CompiledMethod>>#sourceCode: and
thought that code was dysfunctional. My bad.
Thanks for resolving this mystery.

The second one still remains:

CompiledMethod>>
getSourceFor: selector in: class
	"Retrieve or reconstruct the source code for this method."
	| trailer source |
	(self properties includesKey: #source) ifTrue:
		[^self properties at: #source].
	trailer := self trailer.
	" ... "

Judging from my image, the #source property is never written, right?



> 
>> 
>>> 
>>>> 
>>>>> 
>>>>> Can't we just save the source code either via trailer or properties
>>>>> on first access?
>>>>> 
>>>>> -1.  Why do I want all of those Strings in my image?
>>>> 
>>>> To do stuff to them.
>>>> Like, analysing how many dots are in them, or how often someone crafts a Symbol.
>>>> Analysis stuff.
>>>> Currently, I have a separate structure that holds onto the code once retrieved
>>>> from disk. But once a method changes (e.g., on recompilation), I first have to detect
>>>> that it happened and then flush and refill this cache. I find this tiresome.
>>> 
>>> Do you flush your cache selectively?
>> 
>> No, I can't for reasons :)
>> 
>>> 
>>> Scanning all source code for a given pattern takes less than a second (~800 ms) on my machine. What's your performance goal?
>> 
>> I have ~15,000 methods that I have to compare line by line against each other.
>> Doing that by going to the filesystem just kills it.
> 
> It's hard to tell much without knowing the exact problem. If you want to take a method and compare it with all previously processed methods line by line, then you can create a dictionary which maps lines to methods (or method-line number pairs).
> 
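
The line-to-method dictionary suggested above could look roughly like the following workspace sketch. The `sources` dictionary and its two toy methods are made up for illustration; in a real image it would be filled from the actual method sources. The sketch also assumes that String>>lines is available in the image.

```smalltalk
"Workspace sketch: index every source line by its text, so that shared lines
 are found by dictionary lookup instead of pairwise comparison.
 `sources` is a hypothetical stand-in mapping selector -> source string."
| sources index shared |
sources := Dictionary new.
sources at: #foo put: 'foo
| a |
a := 1.
^a'.
sources at: #bar put: 'bar
| a |
a := 2.
^a'.

"Map each distinct line to the (selector -> line number) pairs where it occurs."
index := Dictionary new.
sources keysAndValuesDo: [:selector :source |
	source lines doWithIndex: [:line :lineNumber |
		(index at: line ifAbsentPut: [OrderedCollection new])
			add: selector -> lineNumber]].

"Keep only the lines that occur in more than one place."
shared := index select: [:occurrences | occurrences size > 1].
```

With the toy data above, `shared` holds the two lines that both methods have in common ('| a |' and '^a'), each with its two occurrences. Every line is hashed once, so the cost grows with the total number of lines rather than with the number of method pairs.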

Well, what I did in the meantime (besides caching the source)
was to introduce an intermediate object that
a) holds onto the string for a line of source code
b) compares by identity and not its string's content and
c) is interned via a Dictionary on the class side.
(a bit like symbols but I didn't want to misuse them)

That way, I can resort to identity-based duplicate checking :)
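
A minimal sketch of such an interned line object, assuming a hypothetical class named SourceLine. The class name, the category, the class variable Table, and the selectors for:, setString: and string are all illustrative, not the code actually used. Identity comparison needs no extra code, because Object>>= and Object>>hash are already identity-based and the class simply does not override them.

```smalltalk
"Class definition (evaluate in a workspace). All names are hypothetical."
Object subclass: #SourceLine
	instanceVariableNames: 'string'
	classVariableNames: 'Table'
	poolDictionaries: ''
	category: 'SourceAnalysis-Sketch'.

"The methods below are shown in browser notation (Class>>selector)."

SourceLine class>>
for: aString
	"Answer the unique instance for aString, creating it on first use.
	 The lookup compares strings by content; the answer is compared by identity."
	^(Table ifNil: [Table := Dictionary new])
		at: aString
		ifAbsentPut: [self basicNew setString: aString]

SourceLine>>
setString: aString
	"Private. Remember the line's text."
	string := aString

SourceLine>>
string
	"Answer the text of this line."
	^string
```

With these definitions, `(SourceLine for: 'a := 1.') == (SourceLine for: 'a := 1.' copy)` answers true, while lines with different text yield distinct objects. The table holds its entries strongly, so it grows until it is reset.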

Best
	-Tobias

> Levente
> 
>> 
>> 
>> Best
>> 	-Tobias



