[squeak-dev] [Pharo-dev] Integer arithmetic and bit counting performance in Squeak and Pharo

Eliot Miranda eliot.miranda at gmail.com
Wed Feb 15 18:21:57 UTC 2017


Hi Benoit,

   the upshot is that in 32-bits one should probably model a chess board
bit mask with three or four SmallIntegers, for example 16 squares per
integer, but in 64-bits one can use two SmallIntegers, with 32 squares per
integer.  You could construct a family of class that hide the
representational details.  A ChessBoardBitMap abstract class with
subclasses for the 4 32-bit SmallIntegers representation, the 2 64-bit
SmallIntegers representation, and the polymorphic single integer that will
be either a SmallInteger or a LargePositiveInteger, depending on the board
and the word size of the system.  You can then play with the representation
as your program evolves and as e.g. the Sista optimizer evolves, which
should be abel to optimize away some of the overhead of using this "boxed"
ChessBoardBitMap approach.

On Fri, Feb 10, 2017 at 4:21 AM, Benoit St-Jean via Pharo-dev <
pharo-dev at lists.pharo.org> wrote:

>
>
> ---------- Forwarded message ----------
> From: Benoit St-Jean <bstjean at yahoo.com>
> To: "Clément Bera" <bera.clement at gmail.com>, Pharo Development List <
> pharo-dev at lists.pharo.org>
> Cc: The General-purpose Squeak Developers List <
> squeak-dev at lists.squeakfoundation.org>
> Date: Fri, 10 Feb 2017 12:21:33 +0000 (UTC)
> Subject: Re: [Pharo-dev] Integer arithmetic and bit counting performance
> in Squeak and Pharo
> Oh!  Many thanks for all the details!  I thought about special selectors
> and remembered that #bitAnd: and #bitOr: were those special kind of
> "beasts" but I falsely assumed that #bitXor: was also a special selector...
> :(  I should have looked at the code!
>
> So that also explains the "strange" behavior of the 64bit image.  Lots of
> the tests in the 32bit image fell into the LargeInteger "range" and then,
> executing the same code on the 64bit image, those numbers suddenly became
> considered SmallInteger...  That explains it all and why some operations
> "suddenly" became so fast for no apparent reason !!!
>
> Now, just for my personal curiosity, what is going to be the impact of a
> 64bit VM on such code? (Unfortunately, I am on Windows so I will have to
> wait...) ?
>
> P.S. I should have persevered in my google searches as I just came across
> this *excellent* post :  Arithmetic, inlined and special selectors
> <https://clementbera.wordpress.com/2014/08/12/arithmetic-inlined-and-special-selectors/>
>
> :)
>
> Thanks a lot!
>
> Arithmetic, inlined and special selectors
> Today we are going to discuss how message sends are dispatched in the VM
> and/or compiled in the image for arithm...
>
> <https://clementbera.wordpress.com/2014/08/12/arithmetic-inlined-and-special-selectors/>
>
>
> -----------------
> Benoît St-Jean
> Yahoo! Messenger: bstjean
> Twitter: @BenLeChialeux
> Pinterest: benoitstjean
> Instagram: Chef_Benito
> IRC: lamneth
> Blogue: endormitoire.wordpress.com
> "A standpoint is an intellectual horizon of radius zero".  (A. Einstein)
>
>
> ------------------------------
> *From:* Clément Bera <bera.clement at gmail.com>
> *To:* Benoit St-Jean <bstjean at yahoo.com>; Pharo Development List <
> pharo-dev at lists.pharo.org>
> *Cc:* The General-purpose Squeak Developers List <squeak-dev at lists.
> squeakfoundation.org>
> *Sent:* Friday, February 10, 2017 5:14 AM
> *Subject:* Re: [Pharo-dev] Integer arithmetic and bit counting
> performance in Squeak and Pharo
>
> Hi,
>
> This is a lovely post. Thanks for posting about Smalltalk / Pharo / Squeak.
>
>
>    1. Why *exactly* does the override work (in 32bit images)?
>    2. What changed so that things are different in Squeak 5.1 64bit image
>    (overrides partially work)?
>
> If I am correct your questions are exclusively for bitOr: and bitAnd:. I
> don't know what you are aware of and what you're not aware of, hence I am
> going to give you some details on how bitAnd: is executed (bitOr: is done
> exactly the same way), and hopefully there will be something you don't know
> yet that should help you understand what's going on.
>
> *Execution of bitAnd:*
>
> bitAnd: a special selector. If you run Smalltalk specialSelectors you get
> this array:  #(#+ 1 #- 1 #< 1 #> 1 #<= 1 #>= 1 #= 1 #~= 1 #* 1 #/ 1 #\\ 1
> #@ 1 #bitShift: 1 #// 1 #bitAnd: 1 #bitOr: 1 #at: 1 #at:put: 2 #size 0
> #next 0 #nextPut: 1 #atEnd 0 #== 1 #class 0 #blockCopy: 1 #value 0 #value:
> 1 #do: 1 #new 0 #new: 1 #x 0 #y 0) which includes bitAnd: .
>
> Special selectors are encoded in the bytecode differently to save memory,
> so they can be executed differently without inducing overhead. In Cog,
> bitAnd: is executed differently using type-prediction.
>
> In the Stack Interpreter, the execution of bitAnd: is done as follows:
>     If both operands are SmallIntegers,
>         push the bitAnd: value of the 2 SmallIntegers
>     else
>         Try the primitive BitAnd for SmallIntegers (14)
>         on success
>             push the result
>         on failure
>             perform the send.
>
> In the JIT, bitAnd: sends are compiled differently that normal send, the
> compilation logic is as follows:
>     If both operands are SmallInteger literals
>         compute the result during compilation and use this value
>     else
>         If one of the 2 operands is a SmallInteger literal
>             generate 2 paths, a quick inlined path testing at runtime that
> the other operand is a SmallInteger, and if so, Ands the operands and use
> that result, else use the slow path performing a send
>         else
>             generate a normal send
>
> *Primitives*
>
> primitive 14 -> Works for SmallIntegers and LargePositiveInteger fitting
> in an unsigned machine word. (Slower than direct SmallInteger bitAnd
> performed by the special selector)
> primitive 34 -> Works for SmallIntegers and LargePositiveInteger fitting
> in an unsigned int64. (Slower than primitive 14 in 32 bits but covering
> more cases, equivalent in 64 bits)
> primDigitBitAnd (in Integer) -> Works for any integer. (Slower than the
> primitive 14 and 34)
>
> *SmallInteger encoding*
>
> In 32 bits, SmallIntegers are signed 31bits integer.
> In 64 bits, SmallIntegers are signed 61 bits integer.
>
> *Discussion*
>
> Let's make some assertions based on previous information:
>
> 1) If you have the exact same bitAnd: implementation for #bitAnd: and
> #bitAnd2:, #bitAnd2: is slower because it is not a special selector.
>
> 2) Still with the exact same bitAnd: implementation, using the JIT, things
> like that:
> 16r1 bitAnd: 1. "1 bits"
> 16r2 bitAnd: 2. "2 bit"
> 16r4 bitAnd: 4. "3 bit"
> 16r8 bitAnd: 3. "4 bit"
> Are bitAnd: operations between SmallInteger constants and the result is
> unused. Hence this whole portion of code does not generate any native
> instructions.
> On the other hand, things like that:
> 16r1 bitAnd2: 1. "1 bits"
> 16r2 bitAnd2: 2. "2 bit"
> 16r4 bitAnd2: 4. "3 bit"
> 16r8 bitAnd2: 3. "4 bit"
> Are sends, generating multiple native instructions including calls.
>
> 3) Because of the SmallInteger encoding,
> 16r200000000000000 bitAnd: 98490667780264321. "58 bit"
> is compiled by the JIT to nothing in 64 bits but to a send in 32 bits (the
> JIT can't solve LargeInteger arithmetic at compilation time).
>
> 4) primitive 34 provides some speed-up over primitive 14 in 32 bits,
> covering bitAnd: operations from unsigned 33 bits integer to unsigned 64
> bits integer, but makes no difference in 64 bits. In 64bits all the cases
> covered by primitive 34 are covered by primitive 14, so it should even slow
> down things.
>
> *So...*
>
> Does this answer your questions ?
> Don't hesitate to ask further questions.
>
> If you want more details, I wrote something about special selectors a
> while ago: https://clementbera.wordpress.com/2014/08/12/
> arithmetic-inlined-and-special-selectors/ .
>
> Best,
>
> Clement
>
> PS1: If I'm correct, you referenced my blog a couple times, thank you for
> doing this, I really appreciate that.
> PS2: I enjoy chess myself, so if you have an open-source/MIT working
> algorithm at some point please send it to me, I will play against the AI
> and look at the code.
>
>
> On Fri, Feb 10, 2017 at 8:18 AM, Benoit St-Jean via Pharo-dev <
> pharo-dev at lists.pharo.org> wrote:
>
>
>
> ---------- Forwarded message ----------
> From: Benoit St-Jean <bstjean at yahoo.com>
> To: The General-purpose Squeak Developers List <squeak-dev at lists.
> squeakfoundation.org <squeak-dev at lists.squeakfoundation.org>>, "
> pharo-dev at lists.pharo.org" <pharo-dev at lists.pharo.org>
> Cc:
> Date: Fri, 10 Feb 2017 07:18:02 +0000 (UTC)
> Subject: Integer arithmetic and bit counting performance in Squeak and
> Pharo
> For those interested in performance of bit operations and integer
> arithmetic in Squeak and Pharo :
>
> Bit operations in general : https://endormitoire.wordpress
> .com/2017/02/10/bits-and- pieces/
> <https://endormitoire.wordpress.com/2017/02/10/bits-and-pieces/>
>
> Bit counting algorithms : https://endormitoire.wordpress
> .com/2017/02/10/1001001-sos/
> <https://endormitoire.wordpress.com/2017/02/10/1001001-sos/>
>
> Some questions, some results, some answers...
>
> -----------------
> Benoît St-Jean
> Yahoo! Messenger: bstjean
> Twitter: @BenLeChialeux
> Pinterest: benoitstjean
> Instagram: Chef_Benito
> IRC: lamneth
> Blogue: endormitoire.wordpress.com
> "A standpoint is an intellectual horizon of radius zero".  (A. Einstein)
>
>
>
>
>
>


-- 
_,,,^..^,,,_
best, Eliot
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20170215/1bff07ec/attachment.html>


More information about the Squeak-dev mailing list