[Vm-dev] Re: [Cog] Integer comparison wierdness

Eliot Miranda eliot.miranda at gmail.com
Sun Jul 10 18:53:05 UTC 2011


On Sun, Jul 10, 2011 at 5:06 AM, Levente Uzonyi <leves at elte.hu> wrote:

> I read the JIT code (#genSmallIntegerComparison:) and found that it doesn't
> implement the 4-byte LargePositiveInteger checks. So the code behaves
> differently when it's interpreted and when it's jitted:
>
> (1 to: 2) collect: [ :each | 0 = (LargePositiveInteger new: 4) ].
>
> gives
>
> #(true false).
>
> But this is not a real problem IMHO, because the argument is not
> normalized. How hard would it be to implement the support for
> LargePositiveIntegers in the JIT-ed code?
>

Quite easy.  The failure path would be modified to invoke the interpreter
primitives.  I had not done so on grounds of minimalism: I wanted to get
things working fast for the common case asap.  I have a slight antipathy to
the normal Squeak code, which only copes with integers up to 64 bits.  But
Cog includes the LargeIntegersPlugin by Stephan Rudlof, and this is probably
the best code to fall back on.

Regarding implementation, fast failure is arguably important (since
inequality can be important to establish too).  So, since
LargeNegativeInteger, LargePositiveInteger and Float all have contiguous
compact class indices, the code could be something of the flavour of

    SmallInteger = (assume the receiver is a SmallInteger)
        arg isSmallInteger ifTrue:
            [....inline code for SmallInteger comparison].
        compactClassIndex := arg compactClassIndex.
        (compactClassIndex >= ClassLargeNegativeIntegerCompactIndex
        and: [compactClassIndex <= ClassFloatCompactIndex]) ifTrue:
            [compactClassIndex <= ClassLargePositiveIntegerCompactIndex
                ifTrue: [call corresponding primitive in the LargeIntegersPlugin]
                ifFalse: [inline code for SmallInteger x Float comparison]].
        fail

and of course this pattern could apply to comparison operations and
arithmetic.  For little extra cost the code could also include a test for
the receiver being a SmallInteger, but IMO it's better to have primitives
that are specific to the receiver (for performance), and have the VM
implement separate primitives for SmallInteger, Large{Posi,Nega}tiveInteger
& Float, than to have one size fits all.
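
As a rough workspace sketch of that dispatch order (not Cog's actual JIT
code): the index values 4, 5 and 6 and the blocks indexOf and pathFor are
made up for illustration, and indexOf merely stands in for reading the
compact class index from the object header.  The only property relied on is
that the three indices are contiguous, so a single range check rejects every
other class quickly.

```smalltalk
| largeNegativeIndex largePositiveIndex floatIndex indexOf pathFor |
"Hypothetical index values; only their contiguity matters here."
largeNegativeIndex := 4.
largePositiveIndex := 5.
floatIndex := 6.

"Stand-in for fetching the compact class index from the object header."
indexOf := [:arg |
    arg isFloat
        ifTrue: [floatIndex]
        ifFalse: [arg class == LargePositiveInteger
            ifTrue: [largePositiveIndex]
            ifFalse: [arg class == LargeNegativeInteger
                ifTrue: [largeNegativeIndex]
                ifFalse: [0]]]].

"Answers a symbol naming the path taken when the receiver is a SmallInteger."
pathFor := [:arg | | index |
    arg class == SmallInteger
        ifTrue: [#inlineSmallInteger]
        ifFalse: [
            index := indexOf value: arg.
            "One range check covers both Large*Integer classes and Float."
            (index >= largeNegativeIndex and: [index <= floatIndex])
                ifTrue: [
                    index <= largePositiveIndex
                        ifTrue: [#largeIntegersPlugin]
                        ifFalse: [#inlineFloat]]
                ifFalse: [#fail]]].

pathFor value: 1.                           "#inlineSmallInteger"
pathFor value: (2 raisedTo: 100).           "#largeIntegersPlugin"
pathFor value: (2 raisedTo: 100) negated.   "#largeIntegersPlugin"
pathFor value: 1.5.                         "#inlineFloat"
pathFor value: 'abc'.                       "#fail"
```

Any argument outside the three classes costs at most the SmallInteger tag
test plus the two index comparisons before failing, which is the fast-failure
property argued for above.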

2¢


>
> Levente
>
> P.S.: Sorry for still nagging about this.
>
>
> On Sun, 10 Jul 2011, Levente Uzonyi wrote:
>
>> I fired up QVMProfiler (hacked it a bit to make it work again under
>> Windows). Integer >> #= is invoked during execution, even though it can't be
>> seen from the debugger. Here's the VM report for the old version:
>>
>> gc prior.  clear prior.
>> 2.393 seconds; sampling frequency 998 hz
>> 2389 samples in the VM  (2389 samples in the entire program)  100.0% of total
>>
>> 799 samples in generated vm code 33.44% of entire vm (33.44% of total)
>> 1590 samples in vanilla vm code 66.56% of entire vm (66.56% of total)
>>
>> % of generated vm code (% of total)                   (samples) (cumulative)
>> 29.91%    (10.00%)    SmallInteger>>=                 (239)    (29.91%)
>> 18.40%    ( 6.15%)    Integer>>=                      (147)    (48.31%)
>> 10.64%    ( 3.56%)    UndefinedObject>>DoIt           (85)     (58.95%)
>>  8.76%    ( 2.93%)    Integer>>digitCompare:          (70)     (67.71%)
>>  8.01%    ( 2.68%)    Number>>negative                (64)     (75.72%)
>>  7.63%    ( 2.55%)    cePrimReturnEnterCogCode        (61)     (83.35%)
>>  3.38%    ( 1.13%)    PIC isInteger                   (27)     (86.73%)
>>  3.00%    ( 1.00%)    SmallInteger>><                 (24)     (89.74%)
>>  2.75%    ( 0.92%)    PIC digitCompare:               (22)     (92.49%)
>>  2.63%    ( 0.88%)    PIC negative                    (21)     (95.12%)
>>  2.50%    ( 0.84%)    PIC isNumber                    (20)     (97.62%)
>>  1.75%    ( 0.59%)    PIC negative                    (14)     (99.37%)
>>  0.38%    ( 0.13%)    Integer>>isInteger              (3)      (99.75%)
>>  0.13%    ( 0.04%)    LargePositiveInteger>>negative  (1)      (99.87%)
>>  0.13%    ( 0.04%)    Number>>isNumber                (1)      (100.0%)
>>
>>
>> % of vanilla vm code (% of total)                     (samples) (cumulative)
>> 31.26%    (20.80%)    _classNameOfIs                  (497)    (31.26%)
>> 13.84%    ( 9.21%)    _stSizeOf                       (220)    (45.09%)
>> 11.07%    ( 7.37%)    _lengthOf                       (176)    (56.16%)
>>  9.69%    ( 6.45%)    _isWordsOrBytesNonInt           (154)    (65.85%)
>>  8.18%    ( 5.44%)    _isKindOf                       (130)    (74.03%)
>>  7.04%    ( 4.69%)    _arrayValueOf                   (112)    (81.07%)
>>  6.35%    ( 4.23%)    _primDigitCompare               (101)    (87.42%)
>>  6.23%    ( 4.14%)    _stackValue                     (99)     (93.65%)
>>  3.14%    ( 2.09%)    _popthenPush                    (50)     (96.79%)
>>  1.51%    ( 1.00%)    _failed                         (24)     (98.30%)
>>  1.32%    ( 0.88%)    _success                        (21)     (99.62%)
>>  0.38%    ( 0.25%)    _integerObjectOf                (6)      (100.0%)
>>
>> And here's the same report with the new version of Integer >> #=:
>>
>> gc prior.  clear prior.
>> 0.169 seconds; sampling frequency 994 hz
>> 168 samples in the VM   (168 samples in the entire program)  100.0% of total
>>
>> 168 samples in generated vm code 100.0% of entire vm (100.0% of total)
>> 0 samples in vanilla vm code   0.00% of entire vm (  0.00% of total)
>>
>> % of generated vm code (% of total)                   (samples) (cumulative)
>> 45.24%    (45.24%)    Integer>>=                      (76)     (45.24%)
>> 25.00%    (25.00%)    SmallInteger>>=                 (42)     (70.24%)
>> 18.45%    (18.45%)    PIC isInteger                   (31)     (88.69%)
>>  8.93%    ( 8.93%)    UndefinedObject>>DoIt           (15)     (97.62%)
>>  2.38%    ( 2.38%)    Integer>>isInteger              (4)      (100.0%)
>>
>> The new version is faster, because it avoids Integer >> #digitCompare: as
>> I expected. But why is Integer >> #= invoked at all? Why can't it be seen
>> from the debugger?
>>
>>
>> Levente
>>
>>
>> On Sat, 9 Jul 2011, Levente Uzonyi wrote:
>>
>>> Hi Eliot,
>>>
>>> I found that Cog is really slow when I compare SmallIntegers with 4-byte
>>> LargePositiveIntegers. In theory this kind of comparison should only be
>>> slightly slower than SmallInteger-SmallInteger comparisons, but that's not
>>> the case.
>>>
>>> With Cog (r2434 on Windows) I get:
>>>
>>> evaluator := [ :aBlock |
>>>        (((1 to: 5) collect: [ :run |
>>>                aBlock timeToRun ]) sort copyFrom: 2 to: 4) average asFloat ].
>>> evaluator value: [ 1 to: 1000000 do: [ :i | 0 = 1 ] ]. "3.0"
>>> evaluator value: [ 1 to: 1000000 do: [ :i | 0 = 16r40000000 ] ]. "244.0"
>>>
>>> The same code with the Interpreter VM gives 22.0 and 74.0.
>>>
>>> I tried debugging the 0 = 16r40000000 expression with Cog, but the
>>> debugger doesn't touch any Smalltalk code during the execution of #=.
>>> If I replace the implementation of Integer >> #= with:
>>>
>>> = aNumber
>>>
>>>        aNumber isInteger ifTrue: [
>>>                aNumber class == self class ifFalse: [ ^false ].
>>>                ^(self digitCompare: aNumber) = 0 ].
>>>        aNumber isNumber ifFalse: [ ^false ].
>>>        ^aNumber adaptToInteger: self andCompare: #=
>>>
>>> then the second number decreases to 18.0 with Cog. Changing this method
>>> has no effect on the interpreter VM's performance.
>>>
>>> Do you have an idea what can cause the slowdown, and why the
>>> implementation of Integer >> #= matters for Cog?
>>>
>>> Cheers,
>>> Levente
>>>
>>
-- 
best,
Eliot