<p>I came up with a first implementation in InterpreterPrimitives:</p>
<pre><code>primitiveHighBit
        | integerReceiverOop leadingZeroCount highestBitZeroBased indexWasSet integerValue |
        integerReceiverOop := self stackTop.
        self cppIf: #'__GNUC__' defined
                ifTrue:
                        ["Note: in gcc, result is undefined if input is zero (for compatibility with BSR(RBIT) fallback when no CLZ instruction available).
                        but input is never zero because we pass the oop with tag bits set, so we are safe"
                        self cppIf: objectMemory wordSize = 4
                                ifTrue:
                                        [leadingZeroCount := self __builtin_clz: integerReceiverOop.
                                        leadingZeroCount = 0
                                                ifTrue:
                                                        ["highBit is not defined for negative Integer"
                                                        self primitiveFail]
                                                ifFalse: [self pop: 1 thenPushInteger: (leadingZeroCount bitXor: 31)]]
                                ifFalse:
                                        ["Setting all the tag bits is necessary so as to have the right answer for 0 highBit."
                                        leadingZeroCount := self __builtin_clzll: (integerReceiverOop bitOr: (1 << objectMemory numTagBits - 1)).
                                        leadingZeroCount = 0
                                                ifTrue:
                                                        ["highBit is not defined for negative Integer"
                                                        self primitiveFail]
                                                ifFalse: [self pop: 1 thenPushInteger: objectMemory wordSize * 8 - objectMemory numTagBits - leadingZeroCount]]]
                ifFalse: [self cppIf: #'_MSC_VER' defined
                        ifTrue:
                                ["In MSVC, _lzcnt and _lzcnt64 builtins do not fallback to BSR(RBIT) when not supported by CPU
                                Instead of messing with __cpuid() we always use the BSR intrinsic"
                                
                                "Trick: we test the oop sign rather than the integerValue. Assume oop are signed (so far, they are, sqInt are signed)"
                                integerReceiverOop < 0 ifTrue: [self primitiveFail] ifFalse: [               
                                self cppIf: objectMemory wordSize = 4
                                        ifTrue:
                                                ["We do not even test the return value, because integerReceiverOop is never zero"
                                                self _BitScanReverse: highestBitZeroBased address _: integerReceiverOop.
                                                "thanks to the tag bit, the +1 operation for getting 1-based rank is not necessary"
                                                self pop: 1 thenPushInteger: highestBitZeroBased]
                                        ifFalse:
                                                ["In 64 bits, that's a bit more tricky because of the 3 tag bits, so use the un-tagged value"
                                                integerValue := objectMemory integerValueOf: integerReceiverOop.
                                                "initialize with -1 so that adding +1 works when integerValue is zero"
                                                highestBitZeroBased := -1.
                                                indexWasSet := self _BitScanReverse64: highestBitZeroBased address _: integerValue.
                                                "the index is not guaranteed to be left untouched when integerValue is zero, so reset it explicitly"
                                                indexWasSet = 0 ifTrue: [highestBitZeroBased := -1].
                                                self pop: 1 thenPushInteger: highestBitZeroBased + 1]]]
                        ifFalse:
                                ["neither gcc/clang nor MSVC: you have to implement this yourself if your compiler provides useful builtins"
                                self primitiveFail]].
</code></pre>
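<p>To make the tag-bit arithmetic in the gcc branch easier to check, here is a minimal C sketch of the same computation. It assumes a 1-bit SmallInteger tag on 32-bit words and a 3-bit tag (value 1) on 64-bit words; the helper names <code>tag32</code>, <code>tag64</code>, <code>high_bit_from_oop32</code> and <code>high_bit_from_oop64</code> are hypothetical and only serve the illustration. It is not the code the translator generates.</p>

```c
#include <stdint.h>

/* Hypothetical helpers: build a tagged oop from a SmallInteger value.
   Assumption: 1 tag bit on 32-bit words, 3 tag bits (tag value 1) on 64-bit words. */
static uint32_t tag32(int32_t value) { return ((uint32_t)value << 1) | 1u; }
static uint64_t tag64(int64_t value) { return ((uint64_t)value << 3) | 1u; }

/* Answer the 1-based rank of the highest set bit of the untagged value,
   or -1 where the primitive would fail (negative receiver). */
static int high_bit_from_oop32(uint32_t oop)
{
    /* The tag bit guarantees oop != 0, so __builtin_clz is well defined. */
    int leadingZeroCount = __builtin_clz(oop);
    if (leadingZeroCount == 0)
        return -1;                      /* sign bit set: negative integer */
    /* With a single tag bit, 31 - clz is already the 1-based rank. */
    return leadingZeroCount ^ 31;
}

static int high_bit_from_oop64(uint64_t oop)
{
    /* Set all 3 tag bits so that the value 0 answers 0. */
    int leadingZeroCount = __builtin_clzll(oop | 7u);
    if (leadingZeroCount == 0)
        return -1;                      /* sign bit set: negative integer */
    return 64 - 3 - leadingZeroCount;
}
```

<p>For instance, the value 5 is tagged as binary 1011 on 32-bit words, which has 28 leading zeros, and 28 XOR 31 is 3, the expected highBit.</p>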
<p>It works and gives a speed-up of 1.5x to 3x in Spur64, and I expect much more from a JIT version.</p>
<p>For the JIT, I will have to test CPU capabilities via CPUID and switch between the LZCNT and BSR(RBIT) instructions on Intel (or just use BSR, as demonstrated in the MSVC branch above).<br>
It seems there is a CLZ instruction on both ARM and MIPS.<br>
But I first need to extend our Cogit with new instructions. How should the generic and specific selectors be named?</p>
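<p>As a reminder of why either instruction can serve, here is a small portable C model of the two semantics on 32-bit inputs. The function names <code>bsr32_model</code> and <code>lzcnt32_model</code> are hypothetical and the loops only model the results, not the instructions themselves. For a non-zero input, the leading-zero count equals 31 minus the bit-scan-reverse index (equivalently, the index XOR 31); the two only differ on a zero input, which the tag bits rule out in the primitive above.</p>

```c
#include <stdint.h>

/* Model of a bit scan reverse: zero-based index of the highest set bit.
   The caller must pass a non-zero value (the real instruction leaves the
   destination unspecified for zero). */
static int bsr32_model(uint32_t x)
{
    int index = 0;
    while (x >>= 1)
        index++;
    return index;
}

/* Model of a leading-zero count: defined for zero, where it answers 32. */
static int lzcnt32_model(uint32_t x)
{
    int count = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        count++;
    }
    return count;
}
```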
<p>For the primitive, it's better to abandon the monolithic-omniscient style for a more distributed-object-oriented one, or there's no point in writing the VM in Smalltalk...</p>
<p>Specifically, I would like to delegate the object-representation-dependent code to specialized classes...<br>
For example, the above code generates incorrect (but unreachable) code for 32-bit words in a 64-bit VM, <em>and vice versa</em>. This is yet another smell...</p>
<p>I don't quite see which class (ObjectMemory, SpurMemoryManager, ...) or which selector would fit the current VM style...</p>
<p>There will be some duplication of the C-compiler-dependent builtin logic, because we do not reify the C compiler and cannot double dispatch... If there's a way to avoid it, I'm all ears (the JIT is somehow easier!!!).</p>
<p>Any thoughts appreciated.</p>
