[squeak-dev] Why is source code always in files only?

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Mon Jan 19 21:31:01 UTC 2015


Hi Tobias,
are you aware of CurrentReadOnlySourceFiles cacheDuring: [...]?
It works around the readOnlyCopy used for thread safety, which is the
main performance killer...
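
A minimal sketch of how that might look for your kind of analysis (untested
here; it only reuses the selectors from Levente's snippet quoted below, and
dotCount is just an illustrative name):

```smalltalk
"Count the dots in all method sources while the read-only source files
 stay cached for the duration of the block, instead of taking a fresh
 readOnlyCopy for every getSource call."
| dotCount |
dotCount := 0.
CurrentReadOnlySourceFiles cacheDuring: [
	SystemNavigation default allSelectorsAndMethodsDo: [ :behavior :selector :method |
		dotCount := dotCount + (method getSource asString occurrencesOf: $.) ] ].
dotCount
```

The point is only that every source access happens inside the one
cacheDuring: block; whether that is fast enough for your case is something
you would have to measure.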

2015-01-19 22:10 GMT+01:00 Tobias Pape <Das.Linux at gmx.de>:

>
> On 19.01.2015, at 21:51, Levente Uzonyi <leves at elte.hu> wrote:
>
> > On Mon, 19 Jan 2015, Tobias Pape wrote:
> >
> >>
> >> On 19.01.2015, at 18:34, Chris Muller <asqueaker at gmail.com> wrote:
> >>
> >>> On Mon, Jan 19, 2015 at 6:45 AM, Tobias Pape <Das.Linux at gmx.de> wrote:
> >>> Hi all,
> >>>
> >>>
> >>> We store method source _solely_ in files (.sources/.changes).
> >>> Why? We have means to attach it to Compiled methods, in fact, more
> >>> than one:
> >>>
> >>>
> >>> CompiledMethod allInstances size. "57766."
> >>> CompiledMethod allInstances count: [:m | m properties includesKey: #source].  "0."
> >>> CompiledMethod allInstances count: [:m | m trailer sourceCode notNil]. "0."
> >>> CompiledMethod allInstances count: [:m | m trailer hasSourcePointer]. "57700."
> >>>
> >>>
> >>> " also interesting "
> >>> (CompiledMethod allInstances collect: [:m | m trailer kind] as: Bag) sortedCounts
> >>> {57701->#SourcePointer . 65->#NoTrailer . 14->#TempsNamesQCompress . 2->#TempsNamesZip}
> >>>
> >>>
> >>> When doing some analysis on source code, it is a pain to _either_
> >>> always go to disk for the source _or_ cache the code myself (which may
> >>> get out of sync soon).
> >>>
> >>> If you're sending messages instead of viewing private innards, why is
> >>> it a pain?
> >>
> >> What do you mean?
> >>
> >> Calling getSource on a CM goes 300km to disk instead of 1m to memory
> >> (metaphorically speaking),
> >> and when I do analysis on source code I typically do stuff like that a lot.
> >> And as a developer I really dislike that I have to choose between either
> >>
> >> a) bad performance due to excessive IO (yes I want to access the source a lot)
> >> b) caching things myself when already two ways of storing them are available.
> >
> > On today's machines you don't have to. Once you read the data from the
> > disk, it'll be cached in memory. It would be faster to access the sources
> > if they were stored in a trailer, but that would bump the image size by
> > about 15 MB (uncompressed), or 9 MB (compressed):
> >
>
> I understand. But for a development image, I'd take that burden.
>
> > | size compressedSize |
> > size := compressedSize := 0.
> > CurrentReadOnlySourceFiles cacheDuring: [
> >       SystemNavigation default allSelectorsAndMethodsDo: [ :behavior :selector :method |
> >               | string compressed |
> >               string := method getSource asString.
> >               compressed := string squeakToUtf8 zipped.
> >               size := size + string byteSize + ((string size > 255) asBit + 1 * 4).
> >               compressedSize := compressedSize + compressed byteSize + ((compressed size > 255) asBit + 1 * 4) ] ].
> > { size. compressedSize }.
> >
> > "==> #(15003880 9057408)"
>
>
> What I am actually wondering about:
> there are two completely different ways to _access_ source stored in the image
> but no way to actually _store_ it there.
>
> >
> >>
> >>>
> >>>  Can't we just save the source code either via trailer or properties
> >>> on first access?
> >>>
> >>> -1.  Why do I want all of those Strings in my image?
> >>
> >> To do stuff to them.
> >> Like, analysing how many dots are in them, or how often someone crafts a Symbol.
> >> Analysis stuff.
> >> Currently, I have a separate structure that holds onto the code once retrieved
> >> from disk. But once the method changes (e.g., recompilation) I have to first detect
> >> that it happened, and second flush and refill this cache. I find this tiresome.
> >
> > Do you flush your cache selectively?
>
> No, I can't for reasons :)
>
> >
> > Scanning all source code for a given pattern takes less than a second
> > (~800 ms) on my machine. What's your performance goal?
>
> I have ~15.000 Methods that I have to compare line by line against each other.
> Doing that by going to the filesystem just kills it.
>
>
> Best
>         -Tobias
>
>
>