[squeak-dev] Do upcoming double byte strings require Spur? (was: The Trunk: Kernel-eem.980.mcz)

Eliot Miranda eliot.miranda at gmail.com
Wed Jan 27 01:28:42 UTC 2016


Hi David,


> On Jan 26, 2016, at 3:13 PM, David T. Lewis <lewis at mail.msen.com> wrote:
>
> Do the upcoming changes for double-byte strings require the Spur object
> format, or are they something that would also work on a V3 object memory?

Well, it would be possible to implement them in a different way, by
adding 16-bit character access primitives, but it can't be done
"naturally".  The V3 object representation has a 4 bit format field in
every object (it is also in the class's format field, the instSpec
subfield).  This is organized as follows:

ObjectMemory methods for object access
formatOf: oop
"       0      no fields
        1      fixed fields only (all containing pointers)
        2      indexable fields only (all containing pointers)
        3      both fixed and indexable fields (all containing pointers)
        4      both fixed and indexable weak fields (all containing pointers).

        5      unused (reserved for ephemerons?)
        6      indexable word fields only (no pointers)
        7      indexable long (64-bit) fields (only in 64-bit images)

    8-11      indexable byte fields only (no pointers) (low 2 bits are
low 2 bits of size)
   12-15     compiled methods:
                   # of literal oops specified in method header,
                   followed by indexable bytes (same interpretation of
low 2 bits as above)
"
<inline: true>
^((self baseHeader: oop) >> self instFormatFieldLSB) bitAnd: 16rF

The low two bits of formats 8-11 and 12-15 are subtracted from the
object's slot size to derive the size in bytes.  So there is only one
unused format, 7, and that isn't enough to express two short formats,
one with the lsb set to indicate subtracting 2 bytes from the size.

When I designed the Spur format field I made sure it would scale to
64-bits and provide 8- 16- 32- and 64-bit indexing.  It is organized
as follows:

SpurMemoryManager methods for object access
formatOf: objOop
"               0 = 0 sized objects (UndefinedObject True False et al)
                1 = non-indexable objects with inst vars (Point et al)
                2 = indexable objects with no inst vars (Array et al)
                3 = indexable objects with inst vars (MethodContext
AdditionalMethodState et al)
                4 = weak indexable objects with inst vars (WeakArray et al)
                5 = weak non-indexable objects with inst vars
(ephemerons) (Ephemeron)
                6 unused, reserved for exotic pointer objects?
                7 Forwarded Object, 1st field is pointer, rest of
fields are ignored
                8 unused, reserved for exotic non-pointer objects?
                9 (?) 64-bit indexable
       10 - 11 32-bit indexable (11 unused in 32 bits)
       12 - 15 16-bit indexable (14 & 15 unused in 32-bits)
       16 - 23 byte indexable (20-23 unused in 32-bits)
       24 - 31 compiled method (28-31 unused in 32-bits)"
    ^(self longAt: objOop) >> self formatShift bitAnd: self formatMask


So this makes it natural to have the at:[put:] primitives in both
normal (60 & 61, objects for pointer objects or integers for bits
objects) and string (63 & 64, characters for bits objects) variations.
And indeed when I finally get round to building new Spur VMs the
primitives do already access 16-bits, and extending the ClassBuilder
to allow specifying those classes is straight-forward.

> Thanks,
> Dave

_,,,^..^,,,_
Eliot

>
>
>> On Mon, Jan 18, 2016 at 03:32:21AM +0000, commits at source.squeak.org wrote:
>> Eliot Miranda uploaded a new version of Kernel to project The Trunk:
>> http://source.squeak.org/trunk/Kernel-eem.980.mcz
>>
>> ==================== Summary ====================
>>
>> Name: Kernel-eem.980
>> Author: eem
>> Time: 18 January 2016, 7:32:08.6746 pm
>> UUID: 95543c3e-e2d1-4ec8-8ad2-a90fca9a2b06
>> Ancestors: Kernel-tpr.979
>>
>> Lay the ground work for double-byte strings (and possibly a 64-bit indexable bits type).
>>
>> Test shallowCopy's error code and report primitive failure if it has failed for other than being out of memory (which may simply imply the need for a GC).
>>
>> =============== Diff against Kernel-tpr.979 ===============
>>
>> Item was changed:
>>  ----- Method: Behavior>>format (in category 'accessing') -----
>>  format
>>      "Answer an Integer that encodes the kinds and numbers of variables of
>> +     instances of the receiver.  The format is (currently) composed of two fields,
>> +     a 16-bit instSize, in the least significant bits, specifying the number of named
>> +     inst vars, if any, and a 5-bit format field, describing the kind of class.  c.f. instSpec.
>> +        (msb)<5 bit format><16 bit #fixed fields>(lsb)"
>> -    instances of the receiver."
>>
>>      ^format!
>>
>> Item was changed:
>>  ----- Method: Behavior>>instSpec (in category 'testing') -----
>>  instSpec
>>      "Answer the instance specification part of the format that defines what kind of object
>>       an instance of the receiver is.  The formats are
>>              0    = 0 sized objects (UndefinedObject True False et al)
>>              1    = non-indexable objects with inst vars (Point et al)
>>              2    = indexable objects with no inst vars (Array et al)
>>              3    = indexable objects with inst vars (MethodContext AdditionalMethodState et al)
>>              4    = weak indexable objects with inst vars (WeakArray et al)
>>              5    = weak non-indexable objects with inst vars (ephemerons) (Ephemeron)
>>              6    = unused
>>              7    = immediates (SmallInteger, Character)
>>              8    = unused
>>              9    = 64-bit indexable
>> +        10-11    = 32-bit indexable (Bitmap)                    (plus one odd bit, unused in 32-bits)
>> +        12-15    = 16-bit indexable                            (plus two odd bits, one unused in 32-bits)
>> +        16-23    = 8-bit indexable                            (plus three odd bits, one unused in 32-bits)
>> +        24-31    = compiled methods (CompiledMethod)    (plus three odd bits, one unused in 32-bits)
>> +     Note that in the VM instances also have a 5 bit format field that relates to their class's format.
>> +     Formats 11, 13-15, 17-23 & 25-31 are unused in classes but used in instances to define the
>> +     number of elements missing up to the slot size.  For example, a 2-byte ByteString instance
>> +     has format 18 in 32-bits, since its size is one 32-bit slot - 2 bytes ((18 bitAnd: 3) = 2), and
>> +     22 in 64 bits, since its size is one 64-bit slot - 6 bytes ((22 bitAnd: 7) = 6)."
>> -        10-11    = 32-bit indexable (Bitmap)
>> -        12-15    = 16-bit indexable
>> -        16-23    = 8-bit indexable
>> -        24-31    = compiled methods (CompiledMethod)"
>>      ^(format bitShift: -16) bitAnd: 16r1F!
>>
>> Item was changed:
>>  ----- Method: Behavior>>kindOfSubclass (in category 'testing class hierarchy') -----
>>  kindOfSubclass
>>      "Answer a String that is the keyword that describes the receiver's kind of subclass,
>>       either a regular subclass, a variableSubclass, a variableByteSubclass,
>>       a variableWordSubclass, a weakSubclass, an ephemeronSubclass or an immediateSubclass.
>> +     c.f. typeOfClass & instSpec"
>> +    ^(#(' subclass: '
>> +        ' subclass: '
>> +        ' variableSubclass: '
>> +        ' variableSubclass: '
>> +        ' weakSubclass: '
>> +        ' ephemeronSubclass: '
>> +        nil
>> +        ' immediateSubclass: '
>> +        nil
>> +        ' variableDoubleWordSubclass: '
>> +        ' variableWordSubclass: '        nil
>> +        ' variableDoubleByteSubclass: '    nil nil nil
>> +        ' variableByteSubclass: '        nil nil nil nil nil nil nil
>> +        ' variableByteSubclass: '        nil nil nil nil nil nil nil )
>> +            at: self instSpec + 1) ifNil:
>> +                [self error: 'invalid class type']!
>> -     c.f. typeOfClass"
>> -    ^self isVariable
>> -        ifTrue:
>> -            [self isBits
>> -                ifTrue:
>> -                    [self isBytes
>> -                        ifTrue: [' variableByteSubclass: ']
>> -                        ifFalse: [' variableWordSubclass: ']]
>> -                ifFalse:
>> -                    [self isWeak
>> -                        ifTrue: [' weakSubclass: ']
>> -                        ifFalse: [' variableSubclass: ']]]
>> -        ifFalse:
>> -            [self isImmediateClass
>> -                ifTrue: [' immediateSubclass: ']
>> -                ifFalse:
>> -                    [self isEphemeronClass
>> -                        ifTrue: [' ephemeronSubclass: ']
>> -                        ifFalse: [' subclass: ']]]!
>>
>> Item was changed:
>>  ----- Method: Behavior>>typeOfClass (in category 'accessing') -----
>>  typeOfClass
>> +    "Answer a symbol uniquely describing the type of the receiver. c.f. kindOfSubclass, instSpec"
>> +    ^(#(normal
>> +        normal
>> +        variable
>> +        variable
>> +        weak
>> +        ephemeron
>> +        nil
>> +        immediate
>> +        nil
>> +        longs
>> +        words                nil
>> +        shorts                nil nil nil
>> +        bytes                nil nil nil nil nil nil nil
>> +        compiledMethod    nil nil nil nil nil nil nil)
>> +            at: self instSpec + 1) ifNil:
>> +                [self error: 'invalid class type']!
>> -    "Answer a symbol uniquely describing the type of the receiver. c.f. kindOfSubclass"
>> -    self isBytes ifTrue:
>> -        [^self instSpec = CompiledMethod instSpec
>> -            ifTrue: [#compiledMethod] "Very special!!"
>> -            ifFalse: [#bytes]].
>> -    (self isWords and: [self isPointers not]) ifTrue:
>> -        [^self instSpec = SmallInteger instSpec
>> -            ifTrue: [#immediate] "Very special!!"
>> -            ifFalse: [#words]].
>> -    self isWeak ifTrue: [^#weak].
>> -    self isVariable ifTrue: [^#variable].
>> -    self isEphemeronClass ifTrue: [^#ephemeron].
>> -    ^#normal!
>>
>> Item was changed:
>>  ----- Method: ClassBuilder>>computeFormat:instSize:forSuper: (in category 'class format') -----
>>  computeFormat: type instSize: newInstSize forSuper: newSuper
>>      "Compute the new format for making oldClass a subclass of newSuper.
>>       Answer the format or nil if there is any problem."
>> +    | instSize isVar isPointers isWeak bitsUnitSize |
>> -    | instSize isVar isWords isPointers isWeak |
>>      type == #compiledMethod ifTrue:
>>          [newInstSize > 0 ifTrue:
>>              [self error: 'A compiled method class cannot have named instance variables'.
>>              ^nil].
>>          ^CompiledMethod format].
>>      instSize := newInstSize + (newSuper ifNil:[0] ifNotNil:[newSuper instSize]).
>>      instSize > 65535 ifTrue:
>>          [self error: 'Class has too many instance variables (', instSize printString,')'.
>>          ^nil].
>> +    type == #normal ifTrue:[isVar := isWeak := false. isPointers := true].
>> +    type == #bytes ifTrue:[isVar := true. bitsUnitSize := 1. isPointers := isWeak := false].
>> +    type == #shorts ifTrue:[isVar := true. bitsUnitSize := 2. isPointers := isWeak := false].
>> +    type == #words ifTrue:[isVar := true. bitsUnitSize := 4. isPointers := isWeak := false].
>> +    type == #longs ifTrue:[isVar := true. bitsUnitSize := 8. isPointers := isWeak := false].
>> +    type == #variable ifTrue:[isVar := isPointers := true. isWeak := false].
>> +    type == #weak ifTrue:[isVar := isWeak := isPointers := true].
>> +    type == #ephemeron ifTrue:[isVar := false. isWeak := isPointers := true].
>> +    type == #immediate ifTrue:[isVar := isWeak := isPointers := false].
>> -    type == #normal ifTrue:[isVar := isWeak := false. isWords := isPointers := true].
>> -    type == #bytes ifTrue:[isVar := true. isWords := isPointers := isWeak := false].
>> -    type == #words ifTrue:[isVar := isWords := true. isPointers := isWeak := false].
>> -    type == #variable ifTrue:[isVar := isPointers := isWords := true. isWeak := false].
>> -    type == #weak ifTrue:[isVar := isWeak := isWords := isPointers := true].
>> -    type == #ephemeron ifTrue:[isVar := false. isWeak := isWords := isPointers := true].
>> -    type == #immediate ifTrue:[isVar := isWeak := isPointers := false. isWords := true].
>>      (isPointers not and: [instSize > 0]) ifTrue:
>>          [self error: 'A non-pointer class cannot have named instance variables'.
>>          ^nil].
>> +    ^self format: instSize variable: isVar bitsUnitSize: bitsUnitSize pointers: isPointers weak: isWeak!
>> -    ^self format: instSize variable: isVar words: isWords pointers: isPointers weak: isWeak!
>>
>> Item was added:
>> + ----- Method: ClassBuilder>>format:variable:bitsUnitSize:pointers:weak: (in category 'class format') -----
>> + format: nInstVars variable: isVar bitsUnitSize: bitsUnitSize pointers: isPointers weak: isWeak
>> +    "Compute the format for the given instance specfication.
>> +     Above Cog Spur the class format is
>> +        <5 bits inst spec><16 bits inst size>
>> +     where the 5-bit inst spec is
>> +            0    = 0 sized objects (UndefinedObject True False et al)
>> +            1    = non-indexable objects with inst vars (Point et al)
>> +            2    = indexable objects with no inst vars (Array et al)
>> +            3    = indexable objects with inst vars (MethodContext AdditionalMethodState et al)
>> +            4    = weak indexable objects with inst vars (WeakArray et al)
>> +            5    = weak non-indexable objects with inst vars (ephemerons) (Ephemeron)
>> +            6    = unused
>> +            7    = immediates (SmallInteger, Character, SmallFloat64)
>> +            8    = unused
>> +            9    = 64-bit indexable
>> +        10-11    = 32-bit indexable (Bitmap, WideString)
>> +        12-15    = 16-bit indexable
>> +        16-23    = 8-bit indexable (ByteString)
>> +        24-31    = compiled methods (CompiledMethod)"
>> +    | instSpec |
>> +    instSpec := isWeak
>> +                    ifTrue:
>> +                        [isVar
>> +                            ifTrue: [4]
>> +                            ifFalse: [5]]
>> +                    ifFalse:
>> +                        [isPointers
>> +                            ifTrue:
>> +                                [isVar
>> +                                    ifTrue: [nInstVars > 0 ifTrue: [3] ifFalse: [2]]
>> +                                    ifFalse: [nInstVars > 0 ifTrue: [1] ifFalse: [0]]]
>> +                            ifFalse:
>> +                                [isVar
>> +                                    ifTrue: [bitsUnitSize caseOf: {
>> +                                            [1] -> [16].
>> +                                            [2] -> [12].
>> +                                            [4] -> [10].
>> +                                            [8] -> [9] }]
>> +                                    ifFalse: [7]]].
>> +    ^(instSpec bitShift: 16) + nInstVars!
>>
>> Item was changed:
>>  ----- Method: Object>>shallowCopy (in category 'copying') -----
>>  shallowCopy
>>      "Answer a copy of the receiver which shares the receiver's instance variables."
>>      | class newObject index |
>> +    <primitive: 148 error: ec>
>> +    ec == #'insufficient object memory' ifFalse:
>> +        [^self primitiveFailed].
>> -    <primitive: 148>
>>      class := self class.
>>      class isVariable
>>          ifTrue:
>>              [index := self basicSize.
>> +             newObject := class basicNew: index.
>> +             [index > 0] whileTrue:
>> +                [newObject basicAt: index put: (self basicAt: index).
>> +                 index := index - 1]]
>> -            newObject := class basicNew: index.
>> -            [index > 0]
>> -                whileTrue:
>> -                    [newObject basicAt: index put: (self basicAt: index).
>> -                    index := index - 1]]
>>          ifFalse: [newObject := class basicNew].
>>      index := class instSize.
>> +    [index > 0] whileTrue:
>> +        [newObject instVarAt: index put: (self instVarAt: index).
>> +         index := index - 1].
>> +    ^newObject!
>> -    [index > 0]
>> -        whileTrue:
>> -            [newObject instVarAt: index put: (self instVarAt: index).
>> -            index := index - 1].
>> -    ^ newObject!
>


More information about the Squeak-dev mailing list