On Mon, Aug 31, 2015 at 11:35 AM, Chris Muller <asqueaker@gmail.com> wrote:

> Sometimes the number of bytes is only known in a variable, so would it
> be possible to do 4 primitives which accept the number of bits (or
> bytes) as an argument?  (uint:at: uint:at:put:) * (big endian, little
> endian)

Of course it's possible, but such an architecture can hardly be quick.  If one needs the flexible primitives then use them, but don't hobble the system by only providing them.  Having a real 64-bit VM means that using two 32-bit accesses is unnecessarily slow.

Which would you rather, and which would you think would be faster (I don't know, but I have my suspicions):

Expand the existing flexible integerAt: prims to integerAt:put:bytes:signed:bigEndian: (yuck), or implement this in terms of a wrapper something like

ByteArray>>integerAt: index bytes: numBytes signed: signed bigEndian: bigEndian

    | value sign |
    ^numBytes >= 4
        ifTrue:
            [numBytes = 8 ifTrue:
                [value := self unsignedLong64At: index.
                 bigEndian ifTrue:
                    [value := self byteReverseEightBytes: value].
                 (signed and: [(sign := value bitShift: -63) ~= 0]) ifTrue: "if the VM is intelligent about left shift of zero then the sign test is unnecessary..."
                    [value := value - ((sign bitAnd: 1) bitShift: 64)].
                 ^value].
             numBytes = 4 ifTrue:
                [value := self unsignedLong32At: index.
                 bigEndian ifTrue:
                    [value := self byteReverseFourBytes: value].
                 (signed and: [(sign := value bitShift: -31) ~= 0]) ifTrue: "if the VM is intelligent about left shift of zero then the sign test is unnecessary..."
                    [value := value - ((sign bitAnd: 1) bitShift: 32)].
                 ^value].
             ^self error: 'size must be a power of two from 1 to 8']
        ifFalse:
...
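
For illustration only, here is a minimal workspace sketch of the two building blocks the wrapper leans on: the 32-bit byte reversal and the branch-free sign fix-up.  The block names reverse4 and toSigned32 are hypothetical, and the sketch assumes nothing beyond standard Integer bit operations (bitAnd:, bitShift:); it is not the actual byteReverseFourBytes: implementation.

```smalltalk
| reverse4 toSigned32 |

"Hypothetical stand-in for byteReverseFourBytes:.
 Swaps the byte order of an unsigned 32-bit value."
reverse4 := [:v |
	((v bitAnd: 16rFF) bitShift: 24)
		+ ((v bitAnd: 16rFF00) bitShift: 8)
		+ ((v bitShift: -8) bitAnd: 16rFF00)
		+ ((v bitShift: -24) bitAnd: 16rFF)].

"Branch-free sign fix-up: subtract 2^32 only when bit 31 is set.
 When the sign bit is 0 the shifted term is 0, so no test is needed."
toSigned32 := [:v |
	v - (((v bitShift: -31) bitAnd: 1) bitShift: 32)].

reverse4 value: 16r11223344.                   "16r44332211"
toSigned32 value: 16rFFFFFFFF.                 "-1"
toSigned32 value: 16r7FFFFFFF.                 "16r7FFFFFFF, unchanged"
toSigned32 value: (reverse4 value: 16r000000FF). "-16777216"
```

The 64-bit case would follow the same pattern with eight bytes and a shift of 63, at the cost of LargeInteger arithmetic on a 32-bit VM.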



On Mon, Aug 31, 2015 at 12:25 PM, Eliot Miranda <eliot.miranda@gmail.com> wrote:
> Hi Chrises,
>
>     my vote would be to write these as 12 numbered primitives, (2,4 & 8
> bytes) * (at: & at:put:) * (big & little endian) because they can be
> performance critical and implementing them like this means the maximum
> efficiency in both 32-bit and 64-bit Spur, plus the possibility of the JIT
> implementing the primitives.
>
> On Sun, Aug 30, 2015 at 10:01 PM, Chris Cunningham <cunningham.cb@gmail.com>
> wrote:
>>
>> Hi Chris,
>>
>> I'm all for having the fastest implementation in the image that works.  If
>> you could make your version handle endianness, then I'm all for including it
>> (at least in the 3 variants that are faster).  My first use for this
>> (interface for KAFKA) apparently requires big-endianness, so I really want
>> that supported.
>>
>> It might be best to keep my naming, though - it follows the name pattern
>> that is already in the class.  Or will yours also support 128?
>>
>> -cbc
>>
>> On Sun, Aug 30, 2015 at 2:38 PM, Chris Muller <asqueaker@gmail.com> wrote:
>>>
>>> Hi Chris, I think these methods belong in the image with the fastest
>>> implementation we can do.
>>>
>>> I implemented 64-bit unsigned access for Ma Serializer back in 2005.
>>> I modeled my implementation after Andreas' original approach which
>>> tries to avoid LI arithmetic.  I was curious whether your
>>> implementations would be faster, because if they are then it could
>>> benefit Magma.  After loading "Ma Serializer" 1.5 (or head) into a
>>> trunk image, I used the following script to take comparison
>>> measurements:
>>>
>>> | smallN largeN maBa cbBa |
>>> smallN := ((2 raisedTo: 13) to: (2 raisedTo: 14)) atRandom.
>>> largeN := ((2 raisedTo: 63) to: (2 raisedTo: 64)) atRandom.
>>> maBa := ByteArray new: 8.
>>> cbBa := ByteArray new: 8.
>>> maBa maUint: 64 at: 0 put: largeN.
>>> cbBa unsignedLong64At: 1 put: largeN bigEndian: false.
>>> self assert: (cbBa maUnsigned64At: 1) = (maBa unsignedLong64At: 1 bigEndian: false).
>>> { 'cbc smallN write' -> [ cbBa unsignedLong64At: 1 put: smallN bigEndian: false ] bench.
>>> 'ma smallN write' -> [ cbBa maUint: 64 at: 0 put: smallN ] bench.
>>> 'cbc smallN access' -> [ cbBa unsignedLong64At: 1 bigEndian: false. ] bench.
>>> 'ma smallN access' -> [ cbBa maUnsigned64At: 1 ] bench.
>>> 'cbc largeN write' -> [ cbBa unsignedLong64At: 1 put: largeN bigEndian: false ] bench.
>>> 'ma largeN write' -> [ cbBa maUint: 64 at: 0 put: largeN ] bench.
>>> 'cbc largeN access' -> [ cbBa unsignedLong64At: 1 bigEndian: false ] bench.
>>> 'ma largeN access' -> [ cbBa maUnsigned64At: 1 ] bench.
>>> }
>>>
>>> Here are the results:
>>>
>>> 'cbc smallN write'->'3,110,000 per second.  322 nanoseconds per run.' .
>>> 'ma smallN write'->'4,770,000 per second.  210 nanoseconds per run.' .
>>> 'cbc smallN access'->'4,300,000 per second.  233 nanoseconds per run.' .
>>> 'ma smallN access'->'16,400,000 per second.  60.9 nanoseconds per run.' .
>>> 'cbc largeN write'->'907,000 per second.  1.1 microseconds per run.' .
>>> 'ma largeN write'->'6,620,000 per second.  151 nanoseconds per run.' .
>>> 'cbc largeN access'->'1,900,000 per second.  527 nanoseconds per run.' .
>>> 'ma largeN access'->'1,020,000 per second.  982 nanoseconds per run.'
>>>
>>> It looks like your 64-bit access is 86% faster for accessing the
>>> high end of the 64-bit range, but slower in the other 3 metrics.
>>> Notably, it was only 14% as fast for writing the high end of the
>>> 64-bit range, and only about 26% as fast for small-number access.
>>>
>>>
>>> On Fri, Aug 28, 2015 at 6:01 PM, Chris Cunningham
>>> <cunningham.cb@gmail.com> wrote:
>>> > Hi.
>>> >
>>> > I've committed a change to the inbox with changes to allow getting/putting
>>> > 64-bit values to ByteArrays (similar to the 32- and 16-bit accessors).  Could
>>> > this be added to trunk?
>>> >
>>> > Also, first time I used the selective commit function - very nice!  The
>>> > changes I didn't want committed didn't, in fact, get committed.  Just the
>>> > desirable bits!
>>> >
>>> > -cbc
>
> --
> _,,,^..^,,,_
> best, Eliot
--
_,,,^..^,,,_
best, Eliot