Trailers speedup (Re: [squeak-dev] The Trunk: Kernel-ul.362.mcz)

Levente Uzonyi leves at elte.hu
Sun Jan 3 18:26:41 UTC 2010


On Sun, 3 Jan 2010, Igor Stasenko wrote:

> Levente,
> could you give us a comparison , how much this speeding up the
> source code fetching?
>
> [ Object selectors do: [:each | Object sourceCodeAt: each ] ] timeToRun
>
> Image with no trailers:
> 482  481 478

This must be a really old image (or non-trunk image).

>
> Image with trailers:
> 196  197 206
>
> (i tested against the rather old image, which seems having different
> number of selectors in Object , and
> places where it fetching them, of course).
>
> But it actually shows that your efforts to get speed there is likely
> will be unnoticed, because most of the time
> is consumed by file operations, which working with orders of magnitude
> slower. So, no matter how fast a compiled method trailers will work,
> all such optimizations will be unnoticeable.

Sure. I used the following benchmark:
[
    SystemNavigation default
       allMethodsWithSourceString: '== 0'
       matchCase: true ] timeToRun

I don't have the exact numbers (~12 seconds before my changes and ~9.5 
after), but the speedup was 1.39x. The reason for this was that 3 trailer 
objects were created for one method. Trailer creation took ~30% of the 
total runtime, because of the #asSymbol send.

(Note that file operations are not that slow since the FileStreams are 
read buffered)

>
> And the amount of introduced code bloat , just to make
> self perform: 'foo', x  look like:   self perform: x
> is too much price for it, as to me.
>

See above. I removed the #asSymbol send, which is not cheap. Btw 3 new 
methods (one generated) is not code bloat IMO.

> If you want to get things done right, then instead,
> deal with senders of #getSourceFromFileAt: and #getSourceFromFile
> which accessing a source pointer & breaking trailer encapsulation.
> There is a space for optimization to streamline the source code accessing,
> by putting all behavior inside a CompiledMethodTrailer , which could
> answer a source code
> based on its own data, without the need of having extra things to
> handle this in CompiledMethod.
>
> Also, some code accessing a file index & method's source pointer
> outside a compiled method,
> which leads to creation a fresh CompiledMethodTrailer instance each
> time for such ways of accessing.
> By rewriting that code, you could save a lot more.
>
> I haven't done that,  because at initial stage, i wanted to make sure
> that trailers is 100% compatible with existing code.
> But sure thing, there's a lot space for cleanup. And cleanup means
> removing unnecessary code, rather than adding it :)
>

Cleaning up code and encapsulating source access is a good thing. But I 
think can't save the cost of #asSymbol in another way.


Levente

> 2009/12/31  <commits at source.squeak.org>:
>> Levente Uzonyi uploaded a new version of Kernel to project The Trunk:
>> http://source.squeak.org/trunk/Kernel-ul.362.mcz
>>
>> ==================== Summary ====================
>>
>> Name: Kernel-ul.362
>> Author: ul
>> Time: 31 December 2009, 7:04:23 am
>> UUID: 96615f68-2456-7745-9ecb-335973913252
>> Ancestors: Kernel-ul.361
>>
>> - speed up method trailer creation
>> - speed up source fetching from source files
>>
>> =============== Diff against Kernel-ar.360 ===============
>>
>> Item was added:
>> + ----- Method: CompiledMethod>>getSourceFromFileAt: (in category 'source code management') -----
>> + getSourceFromFileAt: sourcePointer
>> +
>> +       | position index |
>> +       position := SourceFiles filePositionFromSourcePointer: sourcePointer.
>> +       position = 0 ifTrue: [ ^nil ].
>> +       index := SourceFiles fileIndexFromSourcePointer: sourcePointer.
>> +       ^(RemoteString newFileNumber: index position: position)
>> +               text!
>>
>> Item was added:
>> + ----- Method: CompiledMethodTrailer class>>trailerKindDecoders (in category 'generated') -----
>> + trailerKindDecoders
>> +
>> +       ^#(#decodeNoTrailer #decodeClearedTrailer #decodeTempsNamesQCompress #decodeTempsNamesZip #decodeSourceBySelector #decodeSourceByStringIdentifier #decodeEmbeddedSourceQCompress #decodeEmbeddedSourceZip #decodeVarLengthSourcePointer #decodeExtendedKind #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeUndefined #decodeSourcePointer)!
>>
>> Item was changed:
>>  ----- Method: CompiledMethod>>getSourceFor:in: (in category 'source code management') -----
>>  getSourceFor: selector in: class
>>        "Retrieve or reconstruct the source code for this method."
>>        | trailer source |
>>        trailer := self trailer.
>>
>>        trailer tempNames ifNotNil: [:namesString |
>>                "Magic sources -- decompile with temp names"
>>                ^ ((class decompilerClass new withTempNames: namesString)
>>                                decompile: selector in: class method: self)
>>                        decompileString].
>>
>>        trailer sourceCode ifNotNil: [:code | ^ code ].
>>
>>        trailer hasSourcePointer ifFalse: [
>>                "No source pointer -- decompile without temp names"
>>                ^ (class decompilerClass new decompile: selector in: class method: self)
>>                        decompileString].
>>
>>        "Situation normal;  read the sourceCode from the file"
>> +       source := [self getSourceFromFileAt: trailer sourcePointer]
>> -       source := [self getSourceFromFile]
>>                                on: Error
>>                "An error can happen here if, for example, the changes file has been truncated by an aborted download.  The present solution is to ignore the error and fall back on the decompiler.  A more thorough solution should probably trigger a systematic invalidation of all source pointers past the end of the changes file.  Consider that, as time goes on, the changes file will eventually grow large enough to cover the lost code, and then instead of falling into this error case, random source code will get returned."
>>                                do: [ :ex | ex return: nil].
>>
>>        ^source ifNil: [
>>                        "Something really wrong -- decompile blind (no temps)"
>>                         (class decompilerClass new decompile: selector in: class method: self)
>>                                decompileString]!
>>
>> Item was changed:
>>  ----- Method: CompiledMethod>>getSourceFromFile (in category 'source code management') -----
>>  getSourceFromFile
>> +
>> +       ^self getSourceFromFileAt: self sourcePointer!
>> -       "Read the source code from file, determining source file index and
>> -       file position from the last 3 bytes of this method."
>> -       | position |
>> -       (position := self filePosition) = 0 ifTrue: [^ nil].
>> -       ^ (RemoteString newFileNumber: self fileIndex position: position)
>> -                       text!
>>
>> Item was changed:
>>  ----- Method: CompiledMethodTrailer>>method: (in category 'initialize-release') -----
>>  method: aMethod
>>
>> +       | flagByte index |
>> -       | flagByte |
>> -
>>        data := size := nil.
>>        method := aMethod.
>> +       flagByte := method at: method size.
>> -       flagByte := method at: (method size).
>>
>>        "trailer kind encoded in 6 high bits of last byte"
>> +       index := flagByte >> 2 + 1.
>> +       kind := self class trailerKinds at: index.
>> -       kind := self class trailerKinds at: 1+(flagByte>>2).
>>
>>        "decode the trailer bytes"
>> +       self perform: (self class trailerKindDecoders at: index).
>> -       self perform: ('decode' , kind) asSymbol.
>>
>>        "after decoding the trailer, size must be set"
>> +       self assert: size notNil
>> -       self assert: (size notNil).
>>
>>  !
>>
>> Item was changed:
>>  ----- Method: CompiledMethodTrailer class>>trailerKinds (in category 'as yet unclassified') -----
>>  trailerKinds
>> +       " see class comment for description. If you change this method, evaluate this:
>> +       self generateTrailerKindDecoders"
>> +
>> +       ^#(
>> -       " see class comment for description"
>> - ^#(
>>  "000000" #NoTrailer
>>  "000001" #ClearedTrailer
>>  "000010" #TempsNamesQCompress
>>  "000011" #TempsNamesZip
>>  "000100" #SourceBySelector
>>  "000101" #SourceByStringIdentifier
>>  "000110" #EmbeddedSourceQCompress
>>  "000111" #EmbeddedSourceZip
>>  "001000" #VarLengthSourcePointer
>>  "001001" #ExtendedKind
>>  "001010" #Undefined
>>  "001011" #Undefined
>>  "001100" #Undefined
>>  "001101" #Undefined
>>  "001110" #Undefined
>>  "001111" #Undefined
>>  "010000" #Undefined
>>  "010001" #Undefined
>>  "010010" #Undefined
>>  "010011" #Undefined
>>  "010100" #Undefined
>>  "010101" #Undefined
>>  "010110" #Undefined
>>  "010111" #Undefined
>>  "011000" #Undefined
>>  "011001" #Undefined
>>  "011010" #Undefined
>>  "011011" #Undefined
>>  "011100" #Undefined
>>  "011101" #Undefined
>>  "011110" #Undefined
>>  "011111" #Undefined
>>  "100000" #Undefined
>>  "100001" #Undefined
>>  "100010" #Undefined
>>  "100011" #Undefined
>>  "100100" #Undefined
>>  "100101" #Undefined
>>  "100110" #Undefined
>>  "100111" #Undefined
>>  "101000" #Undefined
>>  "101001" #Undefined
>>  "101010" #Undefined
>>  "101011" #Undefined
>>  "101100" #Undefined
>>  "101101" #Undefined
>>  "101110" #Undefined
>>  "101111" #Undefined
>>  "110000" #Undefined
>>  "110001" #Undefined
>>  "110010" #Undefined
>>  "110011" #Undefined
>>  "110100" #Undefined
>>  "110101" #Undefined
>>  "110110" #Undefined
>>  "110111" #Undefined
>>  "111000" #Undefined
>>  "111001" #Undefined
>>  "111010" #Undefined
>>  "111011" #Undefined
>>  "111100" #Undefined
>>  "111101" #Undefined
>>  "111110" #Undefined
>>  "111111" #SourcePointer
>> +       )!
>> - )!
>>
>> Item was changed:
>>  Object subclass: #CompiledMethodTrailer
>>        instanceVariableNames: 'data encodedData kind size method'
>>        classVariableNames: ''
>>        poolDictionaries: ''
>>        category: 'Kernel-Methods'!
>>
>> + !CompiledMethodTrailer commentStamp: 'ul 12/31/2009 19:03' prior: 0!
>> - !CompiledMethodTrailer commentStamp: 'Igor.Stasenko 12/13/2009 12:53' prior: 0!
>>  I am responsible for encoding and decoding various kinds of compiled method trailer data.
>>  I should not expose any binary data outside of myself, so all tools which working with compiled methods
>>  should ask me to encode the meta-data, they want to be added to the compiled method trailer, as well as retrieve it.
>>
>>  To add a new kind of trailer, you should give it a proper name and define it in the #trailerKinds method at my class side.
>> + Then you need to implement a corresponding #encode<your name> and #decode<your name> methods at instance side. Then add any public accessor methods, which will use a newly introduced trailer kind for communicating with outer layer(s). And finally evaluate: self generateTrailerKindDecoders.
>> - Then you need to implement a corresponding #encode<your name> and #decode<your name> methods at instance side. Then add any public accessor methods, which will use a newly introduced trailer kind for communicating with outer layer(s).
>>
>>  An encodeXXX methods should store result (byte array) into encodedData instance variable.
>>
>>  A decodeXXX methods should read the data from compiled method instance, held by 'method' ivar,
>>  and always set 'size' ivar (denoting a total length of trailer in compiled method) and optionally 'data' ivar which should keep a decoded data, ready to be used by outer layer(s) using accessor method(s) you providing.
>>
>>  The kind of compiled method trailer is determined by the last byte of compiled method.
>>
>>  The byte format used is following:
>>        "2rkkkkkkdd"
>>
>>  where 'k' bits stands for 'kind' , allowing totally 64 different kinds of method trailer
>>  and 'd' bits is data.
>>
>>  Following is the list of currently defined trailer kinds:
>>
>>  NoTrailer , k = 000000, dd unused
>>  method has no trailer, and total trailer size bytes is always 1
>>
>>  ClearedTrailer, k = 000001,
>>  method has cleared trailer (it was set to something else, but then cleared)
>>  dd+1  determines the number of bytes for size field, and size is a total length of trailer bytes
>>  So a total length of trailer is: 1 + (dd + 1) + size
>>
>>  TempsNamesQCompress, k = 000010
>>  the trailer contains a list of method temp names,  compressed using qCompress: method.
>>  dd+1  determines the number of bytes for size field, and size is a number of bytes of compressed buffer.
>>  So a total length of trailer is:  1 + (dd + 1) + size
>>
>>  TempsNamesZip, k = 000011
>>  the trailer contains a list of method temp names,  compressed using GZIP compression method.
>>  dd+1  determines the number of bytes for size field, and size is a number of bytes of compressed buffer
>>  So a total length of trailer is: 1 + (dd + 1) + size
>>
>>  SourceBySelector, k = 000100
>>  the trailer indicates , that method source is determined by a class + selector where it is installed to.
>>  Trailer size = 1.
>>
>>  SourceByStringIdentifier, k = 000101
>>  the trailer indicates , that method source is determined by a class + some ByteString identifier.
>>  dd+1  determines the number of bytes for size of ByteString identifier, and size is number of bytes of string.
>>  A total length of trailer is:  1 + (dd + 1) + size
>>
>>  EmbeddedSourceQCompress, k = 000110
>>  the trailer contains an utf-8 encoded method source code, compressed using qCompress method
>>  dd+1  determines the number of bytes for size field, and size is a number of bytes of compressed source code
>>  A total length of trailer is:  1 + (dd + 1) + size
>>
>>  EmbeddedSourceZip, k = 000111
>>  the trailer contains an utf-8 encoded method source code, comressed using GZIP
>>  dd+1  determines the number of bytes for size field, and size is a number of bytes of compressed buffer
>>  A total length of trailer is:  1 + (dd + 1) + size
>>
>>  VarLengthSourcePointer, k = 001000
>>  the trailer is variable-length encoded source pointer.
>>  dd bits is unused.
>>
>>  ExtendedKind, k = 001001
>>  the next byte of trailer (one that prepends the last byte of compiled method)
>>  denotes an extended kind of trailer, allowing to use additional 256 kinds of encoding method's trailer in future.
>>
>>  SourcePointer, k = 111111
>>  the trailer is encoded source pointer. Total trailer size is 4-bytes
>>  (this kind of encoding is backwards compatible with most of existing compiled methods)
>>
>>  !
>>
>> Item was added:
>> + ----- Method: CompiledMethodTrailer class>>generateTrailerKindDecoders (in category 'as yet unclassified') -----
>> + generateTrailerKindDecoders
>> +
>> +       self class
>> +               compile: (String streamContents: [ :stream |
>> +                       stream
>> +                               nextPutAll: 'trailerKindDecoders'; cr;
>> +                               cr;
>> +                               tab; nextPut: $^; print: (
>> +                                       self trailerKinds collect: [ :each |
>> +                                               ('decode', each) asSymbol ]) ])
>> +               classified: 'generated'
>> +               !
>>
>>
>>
>
>
>
> -- 
> Best regards,
> Igor Stasenko AKA sig.
>
>


More information about the Squeak-dev mailing list