[Vm-dev] Re: [Pharo-project] Do we have the new primitive?? [WAS] Re: IdentitySet but using #hash rather than #identityHash ?

Mariano Martinez Peck marianopeck at gmail.com
Thu Apr 26 15:19:33 UTC 2012


>
>>> Hi Eliot/Levente. What is the status of this? Do we have already the new
>>> primitive? If true, how can we adapt LargeIdentitySet to use such new
>>> primitive?
>>>
>>
>> AFAIK the new primitive is not implemented yet. Adding the primitive to
>> the interpreter VM is very easy, but it seems to be a lot more complicated
>> (to me) to add it to Cog, because the receiver can be a MethodContext which
>> needs special handling.
>> I'll rewrite both LargeIdentitySet and LargeIdentityDictionary when the
>> primitive is ready.
>>
>
> Thanks Levente. So we should wait Eliot.
> I will ping again in a couple of weeks/months  ;)
>
>
Ping. So I did it :)
Eliot if you tell us what it is needed maybe I can push Esteban or Igor (or
me?) to do it ;)

Thanks!



>
>
>>
>>
>> Levente
>>
>>
>>
>>> Thanks!
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>      Levente
>>>
>>>
>>>            thanks
>>>
>>>            On Fri, Dec 16, 2011 at 3:28 PM, Levente Uzonyi <
>>> leves at elte.hu> wrote:
>>>
>>>      On Fri, 16 Dec 2011, Henrik Sperre Johansen wrote:
>>>
>>>       On 16.12.2011 03:26, Levente Uzonyi wrote:
>>>
>>>
>>>            How about my numbers? :)
>>>
>>>            "Preallocate objects, so we won't count gc time."
>>>            n := 1000000.
>>>            objects := Array new: n streamContents: [ :stream |
>>>              n timesRepeat: [ stream nextPut: Object new ] ].
>>>
>>>            set := IdentitySet new: n.
>>>            Smalltalk garbageCollect.
>>>            [1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun.
>>> "4949"
>>>
>>>            set := LargeIdentitySet new.
>>>            Smalltalk garbageCollect.
>>>            [1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun.
>>> "331"
>>>
>>>            set := (PluggableSet new: n)
>>>              hashBlock: [ :object | object identityHash * 4096 + object
>>> class
>>>            identityHash * 64 ]; "Change this to #basicIdentityHash in
>>> Pharo"
>>>              equalBlock: [ :a :b | a == b ];
>>>              yourself.
>>>            Smalltalk garbageCollect.
>>>            [1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun.
>>> "5511"
>>>
>>>
>>>            I also have a LargeIdentityDictionary, which is relatively
>>> fast, but not
>>>            as fast as LargeIdentitySet, because (for some unknown
>>> reason) we don't
>>>            have a primitive that could support it. If we had a primitive
>>> like
>>>            primitive 132 which would return the index of the element if
>>> found or 0 if
>>>            not, then we could have a really fast LargeIdentityDictionary.
>>>
>>>
>>>            Levente
>>>
>>>      Hehe yes, if writing a version fully exploiting the limited range,
>>> that's
>>>      probably the approach I would go for as well.
>>> (IAssuming it's the version at http://leves.web.elte.hu/**
>>> squeak/LargeIdentitySet.st<htt**p://leves.web.elte.hu/squeak/**
>>> LargeIdentitySet.st<http://leves.web.elte.hu/squeak/LargeIdentitySet.st>
>>> >
>>> )
>>>
>>> Mariano commented in the version at http://www.squeaksource.com/**
>>> FuelExperiments <http://www.squeaksource.com/**FuelExperiments<http://www.squeaksource.com/FuelExperiments>>
>>> that it's
>>> slow for them, which I guess is due to not adopting #identityHash calls
>>> to
>>> #basicIdentityHash calls for Pharo:
>>> ((0 to: 4095) collect: [:each | each << 22 \\ 4096 ]) asSet size -> 1
>>> So it basically uses 1 bucket instead of 4096... Whoops. :)
>>>
>>> Uploaded a new version to the MC repository which is adapted for Pharo,
>>> on the same machine my numbers were taken from, it does the same test as
>>> I
>>> used above in 871 ms. (181 with preallocation).
>>>
>>>
>>> Cool. One more thing: in Squeak the method using primitive 132 directly
>>> was renamed to #instVarsInclude:, so now #pointsTo: works as expected. If
>>> this was also added to Pharo, then the #pointsTo: sends should be changed
>>> to #instVarsInclude:, otherwise Array can be reported as included even if
>>> it wasn't added.
>>> I'll upload my LargeIdentityDictionary implementation to the same place
>>> this evening, since it's still 2-3 factor faster than other solutionts
>>> and
>>> there seem to be demand for it.
>>>
>>>
>>> Levente
>>>
>>>
>>>      Cheers,
>>>      Henry
>>>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Mariano
>>> http://marianopeck.wordpress.**com <http://marianopeck.wordpress.com>
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Mariano
>>> http://marianopeck.wordpress.**com <http://marianopeck.wordpress.com>
>>>
>>>
>>>
>
>
> --
> Mariano
> http://marianopeck.wordpress.com
>
>


-- 
Mariano
http://marianopeck.wordpress.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20120426/71b4c2e8/attachment.htm


More information about the Vm-dev mailing list