[squeak-dev] Do upcoming double byte strings require Spur? (was: The Trunk: Kernel-eem.980.mcz)

David T. Lewis lewis at mail.msen.com
Wed Jan 27 01:38:13 UTC 2016


Thanks Eliot, that makes perfect sense.

Dave


> Hi David,
>
>
>> On Jan 26, 2016, at 3:13 PM, David T. Lewis <lewis at mail.msen.com> wrote:
>>
>> Do the upcoming changes for double-byte strings require the Spur object
>> format, or are they something that would also work on a V3 object
>> memory?
>
> Well, it would be possible to implement them in a different way, by
> adding 16-bit character access primitives, but it can't be done
> "naturally".  The V3 object representation has a 4 bit format field in
> every object (it is also in the class's format field, the instSpec
> subfield).  This is organized as follows:
>
> ObjectMemory methods for object access
> formatOf: oop
> "       0      no fields
>         1      fixed fields only (all containing pointers)
>         2      indexable fields only (all containing pointers)
>         3      both fixed and indexable fields (all containing pointers)
>         4      both fixed and indexable weak fields (all containing
> pointers).
>
>         5      unused (reserved for ephemerons?)
>         6      indexable word fields only (no pointers)
>         7      indexable long (64-bit) fields (only in 64-bit images)
>
>     8-11      indexable byte fields only (no pointers) (low 2 bits are
> low 2 bits of size)
>    12-15     compiled methods:
>                    # of literal oops specified in method header,
>                    followed by indexable bytes (same interpretation of
> low 2 bits as above)
> "
> <inline: true>
> ^((self baseHeader: oop) >> self instFormatFieldLSB) bitAnd: 16rF
>
> The low two bits of formats 8-11 and 12-15 are subtracted from the
> object's slot size to derive the size in bytes.  So there is only one
> unused format, 7, and that isn't enough to express two short formats,
> one with the lsb set to indicate subtracting 2 bytes from the size.
>
> When I designed the Spur format field I made sure it would scale to
> 64-bits and provide 8- 16- 32- and 64-bit indexing.  It is organized
> as follows:
>
> SpurMemoryManager methods for object access
> formatOf: objOop
> "               0 = 0 sized objects (UndefinedObject True False et al)
>                 1 = non-indexable objects with inst vars (Point et al)
>                 2 = indexable objects with no inst vars (Array et al)
>                 3 = indexable objects with inst vars (MethodContext
> AdditionalMethodState et al)
>                 4 = weak indexable objects with inst vars (WeakArray et
> al)
>                 5 = weak non-indexable objects with inst vars
> (ephemerons) (Ephemeron)
>                 6 unused, reserved for exotic pointer objects?
>                 7 Forwarded Object, 1st field is pointer, rest of
> fields are ignored
>                 8 unused, reserved for exotic non-pointer objects?
>                 9 (?) 64-bit indexable
>        10 - 11 32-bit indexable (11 unused in 32 bits)
>        12 - 15 16-bit indexable (14 & 15 unused in 32-bits)
>        16 - 23 byte indexable (20-23 unused in 32-bits)
>        24 - 31 compiled method (28-31 unused in 32-bits)"
>     ^(self longAt: objOop) >> self formatShift bitAnd: self formatMask
>
>
> So this makes it natural to have the at:[put:] primitives in both
> normal (60 & 61, objects for pointer objects or integers for bits
> objects) and string (63 & 64, characters for bits objects) variations.
> And indeed when I finally get round to building new Spur VMs the
> primitives do already access 16-bits, and extending the ClassBuilder
> to allow specifying those classes is straight-forward.
>
>> Thanks,
>> Dave
>
> _,,,^..^,,,_
> Eliot
>
>>
>>
>>> On Mon, Jan 18, 2016 at 03:32:21AM +0000, commits at source.squeak.org
>>> wrote:
>>> Eliot Miranda uploaded a new version of Kernel to project The Trunk:
>>> http://source.squeak.org/trunk/Kernel-eem.980.mcz
>>>
>>> ==================== Summary ====================
>>>
>>> Name: Kernel-eem.980
>>> Author: eem
>>> Time: 18 January 2016, 7:32:08.6746 pm
>>> UUID: 95543c3e-e2d1-4ec8-8ad2-a90fca9a2b06
>>> Ancestors: Kernel-tpr.979
>>>
>>> Lay the ground work for double-byte strings (and possibly a 64-bit
>>> indexable bits type).
>>>
>>> Test shallowCopy's error code and report primitive failure if it has
>>> failed for other than being out of memory (which may simply imply the
>>> need for a GC).
>>>
>>> =============== Diff against Kernel-tpr.979 ===============
>>>
>>> Item was changed:
>>>  ----- Method: Behavior>>format (in category 'accessing') -----
>>>  format
>>>      "Answer an Integer that encodes the kinds and numbers of variables
>>> of
>>> +     instances of the receiver.  The format is (currently) composed of
>>> two fields,
>>> +     a 16-bit instSize, in the least significant bits, specifying the
>>> number of named
>>> +     inst vars, if any, and a 5-bit format field, describing the kind
>>> of class.  c.f. instSpec.
>>> +        (msb)<5 bit format><16 bit #fixed fields>(lsb)"
>>> -    instances of the receiver."
>>>
>>>      ^format!
>>>
>>> Item was changed:
>>>  ----- Method: Behavior>>instSpec (in category 'testing') -----
>>>  instSpec
>>>      "Answer the instance specification part of the format that defines
>>> what kind of object
>>>       an instance of the receiver is.  The formats are
>>>              0    = 0 sized objects (UndefinedObject True False et al)
>>>              1    = non-indexable objects with inst vars (Point et al)
>>>              2    = indexable objects with no inst vars (Array et al)
>>>              3    = indexable objects with inst vars (MethodContext
>>> AdditionalMethodState et al)
>>>              4    = weak indexable objects with inst vars (WeakArray et
>>> al)
>>>              5    = weak non-indexable objects with inst vars
>>> (ephemerons) (Ephemeron)
>>>              6    = unused
>>>              7    = immediates (SmallInteger, Character)
>>>              8    = unused
>>>              9    = 64-bit indexable
>>> +        10-11    = 32-bit indexable (Bitmap)                    (plus
>>> one odd bit, unused in 32-bits)
>>> +        12-15    = 16-bit indexable                            (plus
>>> two odd bits, one unused in 32-bits)
>>> +        16-23    = 8-bit indexable                            (plus
>>> three odd bits, one unused in 32-bits)
>>> +        24-31    = compiled methods (CompiledMethod)    (plus three
>>> odd bits, one unused in 32-bits)
>>> +     Note that in the VM instances also have a 5 bit format field that
>>> relates to their class's format.
>>> +     Formats 11, 13-15, 17-23 & 25-31 are unused in classes but used
>>> in instances to define the
>>> +     number of elements missing up to the slot size.  For example, a
>>> 2-byte ByteString instance
>>> +     has format 18 in 32-bits, since its size is one 32-bit slot - 2
>>> bytes ((18 bitAnd: 3) = 2), and
>>> +     22 in 64 bits, since its size is one 64-bit slot - 6 bytes ((22
>>> bitAnd: 7) = 6)."
>>> -        10-11    = 32-bit indexable (Bitmap)
>>> -        12-15    = 16-bit indexable
>>> -        16-23    = 8-bit indexable
>>> -        24-31    = compiled methods (CompiledMethod)"
>>>      ^(format bitShift: -16) bitAnd: 16r1F!
>>>
>>> Item was changed:
>>>  ----- Method: Behavior>>kindOfSubclass (in category 'testing class
>>> hierarchy') -----
>>>  kindOfSubclass
>>>      "Answer a String that is the keyword that describes the receiver's
>>> kind of subclass,
>>>       either a regular subclass, a variableSubclass, a
>>> variableByteSubclass,
>>>       a variableWordSubclass, a weakSubclass, an ephemeronSubclass or
>>> an immediateSubclass.
>>> +     c.f. typeOfClass & instSpec"
>>> +    ^(#(' subclass: '
>>> +        ' subclass: '
>>> +        ' variableSubclass: '
>>> +        ' variableSubclass: '
>>> +        ' weakSubclass: '
>>> +        ' ephemeronSubclass: '
>>> +        nil
>>> +        ' immediateSubclass: '
>>> +        nil
>>> +        ' variableDoubleWordSubclass: '
>>> +        ' variableWordSubclass: '        nil
>>> +        ' variableDoubleByteSubclass: '    nil nil nil
>>> +        ' variableByteSubclass: '        nil nil nil nil nil nil nil
>>> +        ' variableByteSubclass: '        nil nil nil nil nil nil nil )
>>> +            at: self instSpec + 1) ifNil:
>>> +                [self error: 'invalid class type']!
>>> -     c.f. typeOfClass"
>>> -    ^self isVariable
>>> -        ifTrue:
>>> -            [self isBits
>>> -                ifTrue:
>>> -                    [self isBytes
>>> -                        ifTrue: [' variableByteSubclass: ']
>>> -                        ifFalse: [' variableWordSubclass: ']]
>>> -                ifFalse:
>>> -                    [self isWeak
>>> -                        ifTrue: [' weakSubclass: ']
>>> -                        ifFalse: [' variableSubclass: ']]]
>>> -        ifFalse:
>>> -            [self isImmediateClass
>>> -                ifTrue: [' immediateSubclass: ']
>>> -                ifFalse:
>>> -                    [self isEphemeronClass
>>> -                        ifTrue: [' ephemeronSubclass: ']
>>> -                        ifFalse: [' subclass: ']]]!
>>>
>>> Item was changed:
>>>  ----- Method: Behavior>>typeOfClass (in category 'accessing') -----
>>>  typeOfClass
>>> +    "Answer a symbol uniquely describing the type of the receiver.
>>> c.f. kindOfSubclass, instSpec"
>>> +    ^(#(normal
>>> +        normal
>>> +        variable
>>> +        variable
>>> +        weak
>>> +        ephemeron
>>> +        nil
>>> +        immediate
>>> +        nil
>>> +        longs
>>> +        words                nil
>>> +        shorts                nil nil nil
>>> +        bytes                nil nil nil nil nil nil nil
>>> +        compiledMethod    nil nil nil nil nil nil nil)
>>> +            at: self instSpec + 1) ifNil:
>>> +                [self error: 'invalid class type']!
>>> -    "Answer a symbol uniquely describing the type of the receiver.
>>> c.f. kindOfSubclass"
>>> -    self isBytes ifTrue:
>>> -        [^self instSpec = CompiledMethod instSpec
>>> -            ifTrue: [#compiledMethod] "Very special!!"
>>> -            ifFalse: [#bytes]].
>>> -    (self isWords and: [self isPointers not]) ifTrue:
>>> -        [^self instSpec = SmallInteger instSpec
>>> -            ifTrue: [#immediate] "Very special!!"
>>> -            ifFalse: [#words]].
>>> -    self isWeak ifTrue: [^#weak].
>>> -    self isVariable ifTrue: [^#variable].
>>> -    self isEphemeronClass ifTrue: [^#ephemeron].
>>> -    ^#normal!
>>>
>>> Item was changed:
>>>  ----- Method: ClassBuilder>>computeFormat:instSize:forSuper: (in
>>> category 'class format') -----
>>>  computeFormat: type instSize: newInstSize forSuper: newSuper
>>>      "Compute the new format for making oldClass a subclass of
>>> newSuper.
>>>       Answer the format or nil if there is any problem."
>>> +    | instSize isVar isPointers isWeak bitsUnitSize |
>>> -    | instSize isVar isWords isPointers isWeak |
>>>      type == #compiledMethod ifTrue:
>>>          [newInstSize > 0 ifTrue:
>>>              [self error: 'A compiled method class cannot have named
>>> instance variables'.
>>>              ^nil].
>>>          ^CompiledMethod format].
>>>      instSize := newInstSize + (newSuper ifNil:[0] ifNotNil:[newSuper
>>> instSize]).
>>>      instSize > 65535 ifTrue:
>>>          [self error: 'Class has too many instance variables (',
>>> instSize printString,')'.
>>>          ^nil].
>>> +    type == #normal ifTrue:[isVar := isWeak := false. isPointers :=
>>> true].
>>> +    type == #bytes ifTrue:[isVar := true. bitsUnitSize := 1.
>>> isPointers := isWeak := false].
>>> +    type == #shorts ifTrue:[isVar := true. bitsUnitSize := 2.
>>> isPointers := isWeak := false].
>>> +    type == #words ifTrue:[isVar := true. bitsUnitSize := 4.
>>> isPointers := isWeak := false].
>>> +    type == #longs ifTrue:[isVar := true. bitsUnitSize := 8.
>>> isPointers := isWeak := false].
>>> +    type == #variable ifTrue:[isVar := isPointers := true. isWeak :=
>>> false].
>>> +    type == #weak ifTrue:[isVar := isWeak := isPointers := true].
>>> +    type == #ephemeron ifTrue:[isVar := false. isWeak := isPointers :=
>>> true].
>>> +    type == #immediate ifTrue:[isVar := isWeak := isPointers :=
>>> false].
>>> -    type == #normal ifTrue:[isVar := isWeak := false. isWords :=
>>> isPointers := true].
>>> -    type == #bytes ifTrue:[isVar := true. isWords := isPointers :=
>>> isWeak := false].
>>> -    type == #words ifTrue:[isVar := isWords := true. isPointers :=
>>> isWeak := false].
>>> -    type == #variable ifTrue:[isVar := isPointers := isWords := true.
>>> isWeak := false].
>>> -    type == #weak ifTrue:[isVar := isWeak := isWords := isPointers :=
>>> true].
>>> -    type == #ephemeron ifTrue:[isVar := false. isWeak := isWords :=
>>> isPointers := true].
>>> -    type == #immediate ifTrue:[isVar := isWeak := isPointers := false.
>>> isWords := true].
>>>      (isPointers not and: [instSize > 0]) ifTrue:
>>>          [self error: 'A non-pointer class cannot have named instance
>>> variables'.
>>>          ^nil].
>>> +    ^self format: instSize variable: isVar bitsUnitSize: bitsUnitSize
>>> pointers: isPointers weak: isWeak!
>>> -    ^self format: instSize variable: isVar words: isWords pointers:
>>> isPointers weak: isWeak!
>>>
>>> Item was added:
>>> + ----- Method:
>>> ClassBuilder>>format:variable:bitsUnitSize:pointers:weak: (in category
>>> 'class format') -----
>>> + format: nInstVars variable: isVar bitsUnitSize: bitsUnitSize
>>> pointers: isPointers weak: isWeak
>>> +    "Compute the format for the given instance specfication.
>>> +     Above Cog Spur the class format is
>>> +        <5 bits inst spec><16 bits inst size>
>>> +     where the 5-bit inst spec is
>>> +            0    = 0 sized objects (UndefinedObject True False et al)
>>> +            1    = non-indexable objects with inst vars (Point et al)
>>> +            2    = indexable objects with no inst vars (Array et al)
>>> +            3    = indexable objects with inst vars (MethodContext
>>> AdditionalMethodState et al)
>>> +            4    = weak indexable objects with inst vars (WeakArray et
>>> al)
>>> +            5    = weak non-indexable objects with inst vars
>>> (ephemerons) (Ephemeron)
>>> +            6    = unused
>>> +            7    = immediates (SmallInteger, Character, SmallFloat64)
>>> +            8    = unused
>>> +            9    = 64-bit indexable
>>> +        10-11    = 32-bit indexable (Bitmap, WideString)
>>> +        12-15    = 16-bit indexable
>>> +        16-23    = 8-bit indexable (ByteString)
>>> +        24-31    = compiled methods (CompiledMethod)"
>>> +    | instSpec |
>>> +    instSpec := isWeak
>>> +                    ifTrue:
>>> +                        [isVar
>>> +                            ifTrue: [4]
>>> +                            ifFalse: [5]]
>>> +                    ifFalse:
>>> +                        [isPointers
>>> +                            ifTrue:
>>> +                                [isVar
>>> +                                    ifTrue: [nInstVars > 0 ifTrue: [3]
>>> ifFalse: [2]]
>>> +                                    ifFalse: [nInstVars > 0 ifTrue:
>>> [1] ifFalse: [0]]]
>>> +                            ifFalse:
>>> +                                [isVar
>>> +                                    ifTrue: [bitsUnitSize caseOf: {
>>> +                                            [1] -> [16].
>>> +                                            [2] -> [12].
>>> +                                            [4] -> [10].
>>> +                                            [8] -> [9] }]
>>> +                                    ifFalse: [7]]].
>>> +    ^(instSpec bitShift: 16) + nInstVars!
>>>
>>> Item was changed:
>>>  ----- Method: Object>>shallowCopy (in category 'copying') -----
>>>  shallowCopy
>>>      "Answer a copy of the receiver which shares the receiver's
>>> instance variables."
>>>      | class newObject index |
>>> +    <primitive: 148 error: ec>
>>> +    ec == #'insufficient object memory' ifFalse:
>>> +        [^self primitiveFailed].
>>> -    <primitive: 148>
>>>      class := self class.
>>>      class isVariable
>>>          ifTrue:
>>>              [index := self basicSize.
>>> +             newObject := class basicNew: index.
>>> +             [index > 0] whileTrue:
>>> +                [newObject basicAt: index put: (self basicAt: index).
>>> +                 index := index - 1]]
>>> -            newObject := class basicNew: index.
>>> -            [index > 0]
>>> -                whileTrue:
>>> -                    [newObject basicAt: index put: (self basicAt:
>>> index).
>>> -                    index := index - 1]]
>>>          ifFalse: [newObject := class basicNew].
>>>      index := class instSize.
>>> +    [index > 0] whileTrue:
>>> +        [newObject instVarAt: index put: (self instVarAt: index).
>>> +         index := index - 1].
>>> +    ^newObject!
>>> -    [index > 0]
>>> -        whileTrue:
>>> -            [newObject instVarAt: index put: (self instVarAt: index).
>>> -            index := index - 1].
>>> -    ^ newObject!
>>
>




More information about the Squeak-dev mailing list