[Vm-dev] Re: [squeak-dev] ByteArray accessors for 64-bit manipulation

Eliot Miranda eliot.miranda at gmail.com
Mon Aug 31 19:21:19 UTC 2015


On Mon, Aug 31, 2015 at 11:35 AM, Chris Muller <asqueaker at gmail.com> wrote:

>
> Sometimes the number of bytes is only known in a variable, so would it
> be possible to do 4 primitives which accept the number of bits (or
> bytes) as an argument?  (uint:at: uint:at:put:) * (big endian, little
> endian)
>

Of course it's possible, but such an architecture can hardly be quick.  If
one needs the flexible primitives then use them, but don't hobble the
system by only providing them.  Having a real 64-bit VM means that the use
of two 32-bit accesses is unnecessarily slow.
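
To make the cost concrete, here is a rough workspace sketch (an illustration
only, not code from any of the proposed changes) of a little-endian 64-bit
read composed from two 32-bit reads.  It assumes the existing
unsignedLongAt:bigEndian: accessor on ByteArray; the extra shift and OR are
the work a single 64-bit access would avoid.

```smalltalk
| bytes low high value |
"Eight bytes, least significant first."
bytes := #[8 7 6 5 4 3 2 1].
"Two separate 32-bit little-endian reads..."
low := bytes unsignedLongAt: 1 bigEndian: false.    "16r05060708"
high := bytes unsignedLongAt: 5 bigEndian: false.   "16r01020304"
"...stitched together with a shift and an OR."
value := (high bitShift: 32) bitOr: low.            "16r0102030405060708"
```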

Which would you rather, and which would you think would be faster (I don't
know, but I have my suspicions):

Expand the existing flexible integerAt: prims to
integerAt:put:bytes:signed:bigEndian: (yuck), or implement this in terms of
a wrapper something like

ByteArray>>integerAt: index bytes: numBytes signed: signed bigEndian: bigEndian
    "Sketch only: unsignedLong64At:, unsignedLong32At: and the
     byteReverse...Bytes: helpers are assumed, not existing methods."
    | value sign |
    ^numBytes >= 4
        ifTrue:
            [numBytes = 8 ifTrue:
                [value := self unsignedLong64At: index.
                 bigEndian ifTrue:
                    [value := self byteReverseEightBytes: value].
                 "if the VM is intelligent about left shift of zero then the ~= 0 test is unnecessary..."
                 (signed and: [(sign := value bitShift: -63) ~= 0]) ifTrue:
                    [value := value - ((sign bitAnd: 1) bitShift: 64)].
                 ^value].
             numBytes = 4 ifTrue:
                [value := self unsignedLong32At: index.
                 bigEndian ifTrue:
                    [value := self byteReverseFourBytes: value].
                 "if the VM is intelligent about left shift of zero then the ~= 0 test is unnecessary..."
                 (signed and: [(sign := value bitShift: -31) ~= 0]) ifTrue:
                    [value := value - ((sign bitAnd: 1) bitShift: 32)].
                 ^value].
             ^self error: 'numBytes must be a power of two from 1 to 8']
        ifFalse:
...
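
The two arithmetic steps in the wrapper can be tried on their own in a
workspace.  This is a minimal sketch using only bitShift: and bitAnd:; the
shift-and-mask byte reversal is just one possible way to write the
byteReverseFourBytes: helper, which is a hypothetical name in the wrapper
above.

```smalltalk
| value sign reversed |
"Sign extension, as in the 4-byte branch: reinterpret an unsigned
 32-bit value as a signed one."
value := 16rFFFFFFFE.
sign := value bitShift: -31.                        "1 when the top bit is set, else 0"
value := value - ((sign bitAnd: 1) bitShift: 32).   "16rFFFFFFFE - (2 raisedTo: 32) = -2"

"Byte reversal of a 32-bit value with shifts and masks."
reversed := ((16r01020304 bitAnd: 16rFF) bitShift: 24)
    + ((16r01020304 bitAnd: 16rFF00) bitShift: 8)
    + ((16r01020304 bitShift: -8) bitAnd: 16rFF00)
    + ((16r01020304 bitShift: -24) bitAnd: 16rFF).  "16r04030201"
```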


>
> On Mon, Aug 31, 2015 at 12:25 PM, Eliot Miranda <eliot.miranda at gmail.com>
> wrote:
> > Hi Chrises,
> >
> >     my vote would be to write these as 12 numbered primitives, (2, 4 & 8
> > bytes) * (at: & at:put:) * (big & little endian) because they can be
> > performance critical and implementing them like this means the maximum
> > efficiency in both 32-bit and 64-bit Spur, plus the possibility of the
> > JIT implementing the primitives.
> >
> > On Sun, Aug 30, 2015 at 10:01 PM, Chris Cunningham
> > <cunningham.cb at gmail.com> wrote:
> >>
> >> Hi Chris,
> >>
> >> I'm all for having the fastest version that works in the image.  If you
> >> could make your version handle endianness, then I'm all for including
> >> it (at least in the 3 variants that are faster).  My first use for this
> >> (interface for KAFKA) apparently requires big-endianness, so I really
> >> want that supported.
> >>
> >> It might be best to keep my naming, though - it follows the name pattern
> >> that is already in the class.  Or will yours also support 128?
> >>
> >> -cbc
> >>
> >> On Sun, Aug 30, 2015 at 2:38 PM, Chris Muller <asqueaker at gmail.com>
> >> wrote:
> >>>
> >>> Hi Chris, I think these methods belong in the image with the fastest
> >>> implementation we can do.
> >>>
> >>> I implemented 64-bit unsigned access for Ma Serializer back in 2005.
> >>> I modeled my implementation after Andreas' original approach which
> >>> tries to avoid LI arithmetic.  I was curious whether your
> >>> implementations would be faster, because if they are then it could
> >>> benefit Magma.  After loading "Ma Serializer" 1.5 (or head) into a
> >>> trunk image, I used the following script to take comparison
> >>> measurements:
> >>>
> >>> | smallN largeN maBa cbBa |
> >>> smallN := ((2 raisedTo: 13) to: (2 raisedTo: 14)) atRandom.
> >>> largeN := ((2 raisedTo: 63) to: (2 raisedTo: 64)) atRandom.
> >>> maBa := ByteArray new: 8.
> >>> cbBa := ByteArray new: 8.
> >>> maBa maUint: 64 at: 0 put: largeN.
> >>> cbBa unsignedLong64At: 1 put: largeN bigEndian: false.
> >>> self assert: (cbBa maUnsigned64At: 1) = (maBa unsignedLong64At: 1 bigEndian: false).
> >>> { 'cbc smallN write' -> [ cbBa unsignedLong64At: 1 put: smallN bigEndian: false] bench.
> >>> 'ma smallN write' -> [cbBa maUint: 64 at: 0 put: smallN ] bench.
> >>> 'cbc smallN access' -> [ cbBa unsignedLong64At: 1 bigEndian: false. ] bench.
> >>> 'ma smallN access' -> [ cbBa maUnsigned64At: 1] bench.
> >>> 'cbc largeN write' -> [ cbBa unsignedLong64At: 1 put: largeN bigEndian: false] bench.
> >>> 'ma largeN write' -> [cbBa maUint: 64 at: 0 put: largeN ] bench.
> >>> 'cbc largeN access' -> [ cbBa unsignedLong64At: 1 bigEndian: false ] bench.
> >>> 'ma largeN access' -> [ cbBa maUnsigned64At: 1] bench.
> >>>  }
> >>>
> >>> Here are the results:
> >>>
> >>> 'cbc smallN write'->'3,110,000 per second.  322 nanoseconds per run.' .
> >>> 'ma smallN write'->'4,770,000 per second.  210 nanoseconds per run.' .
> >>> 'cbc smallN access'->'4,300,000 per second.  233 nanoseconds per run.' .
> >>> 'ma smallN access'->'16,400,000 per second.  60.9 nanoseconds per run.' .
> >>> 'cbc largeN write'->'907,000 per second.  1.1 microseconds per run.' .
> >>> 'ma largeN write'->'6,620,000 per second.  151 nanoseconds per run.' .
> >>> 'cbc largeN access'->'1,900,000 per second.  527 nanoseconds per run.' .
> >>> 'ma largeN access'->'1,020,000 per second.  982 nanoseconds per run.'
> >>>
> >>> It looks like your 64-bit access is 86% faster for accessing the
> >>> high-end of the 64-bit range, but slower in the other 3 metrics.
> >>> Noticeably, it was only 14% as fast for writing the high-end of the
> >>> 64-bit range, and also considerably slower for small-number access.
> >>>
> >>>
> >>> On Fri, Aug 28, 2015 at 6:01 PM, Chris Cunningham
> >>> <cunningham.cb at gmail.com> wrote:
> >>> > Hi.
> >>> >
> >>> > I've committed a change to the inbox with changes to allow
> >>> > getting/putting 64bit values to ByteArrays (similar to 32 and 16 bit
> >>> > accessors).  Could this be added to trunk?
> >>> >
> >>> > Also, first time I used the selective commit function - very nice!
> >>> > The changes I didn't want committed didn't, in fact, get committed.
> >>> > Just the desirable bits!
> >>> >
> >>> > -cbc



-- 
_,,,^..^,,,_
best, Eliot