[Vm-dev] Re: Do we have the new primitive?? [WAS] Re: [Pharo-project] IdentitySet but using #hash rather than #identityHash ?

Tue Feb 28 22:52:41 UTC 2012

On Mon, Feb 27, 2012 at 3:16 PM, Levente Uzonyi <leves at elte.hu> wrote:

> On Sat, 25 Feb 2012, Mariano Martinez Peck wrote:
>
>
>>            All I can say is that I am impressed by the numbers; it is
>>            really much faster.
>>            I still don't understand why I sent this email with a subject
>>            saying IdentitySet, because what I really need is a fast/large
>>            IdentityDictionary :(  Anyway, there's a place where we can use
>>            this LargeIdentitySet in Fuel (I think).
>>
>>            So Levente, you say it is not possible to adapt this for a
>>            dictionary? Can we contact Eliot to provide such a primitive?
>>
>>
>> As promised, I uploaded my LargeIdentityDictionary implementation to
>> http://leves.web.elte.hu/squeak/LargeIdentityDictionary.st.
>> The numbers will be a bit worse compared to LargeIdentitySet, because of
>> the lack of the primitive, but it's still 2-3x faster than other solutions
>> (IdentityDictionary, PluggableIdentityDictionary, subclassing, etc). I'm
>> about to propose this primitive with other improvements on the vm-dev
>> list.
>>
>
> My proposals are still on the way. :)
>
>
>
>>
>> Hi Eliot/Levente. What is the status of this? Do we already have the new
>> primitive? If so, how can we adapt LargeIdentitySet to use the new
>> primitive?
>>
>
> AFAIK the new primitive is not implemented yet. Adding the primitive to
> the interpreter VM is very easy, but it seems to be a lot more complicated
> (to me) to add it to Cog, because the receiver can be a MethodContext which
> needs special handling.
> I'll rewrite both LargeIdentitySet and LargeIdentityDictionary when the
> primitive is ready.
>

Thanks Levente. So we should wait for Eliot.
I will ping again in a couple of weeks/months  ;)

>
>
> Levente
>
>
>
>> Thanks!
>>
>>
>>
>>
>>
>>
>>
>>
>>      Levente
>>
>>
>>            thanks
>>
>>            On Fri, Dec 16, 2011 at 3:28 PM, Levente Uzonyi <leves at elte.hu> wrote:
>>
>>      On Fri, 16 Dec 2011, Henrik Sperre Johansen wrote:
>>
>>       On 16.12.2011 03:26, Levente Uzonyi wrote:
>>
>>
>>            How about my numbers? :)
>>
>>            "Preallocate objects, so we won't count gc time."
>>            n := 1000000.
>>            objects := Array new: n streamContents: [ :stream |
>>              n timesRepeat: [ stream nextPut: Object new ] ].
>>
>>            set := IdentitySet new: n.
>>            Smalltalk garbageCollect.
>>            [1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "4949"
>>
>>            set := LargeIdentitySet new.
>>            Smalltalk garbageCollect.
>>            [1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "331"
>>
>>            set := (PluggableSet new: n)
>>              hashBlock: [ :object | object identityHash * 4096 + object class identityHash * 64 ]; "Change this to #basicIdentityHash in Pharo"
>>              equalBlock: [ :a :b | a == b ];
>>              yourself.
>>            Smalltalk garbageCollect.
>>            [1 to: n do: [ :i | set add: (objects at: i) ] ] timeToRun. "5511"
>>
>>
>>            I also have a LargeIdentityDictionary, which is relatively fast,
>>            but not as fast as LargeIdentitySet, because (for some unknown
>>            reason) we don't have a primitive that could support it. If we
>>            had a primitive like primitive 132 which would return the index
>>            of the element if found or 0 if not, then we could have a really
>>            fast LargeIdentityDictionary.
>>
>>
>>            Levente
>>
>>      Hehe yes, if writing a version fully exploiting the limited range,
>>      that's probably the approach I would go for as well.
>>      (Assuming it's the version at
>>      http://leves.web.elte.hu/squeak/LargeIdentitySet.st)
>>
>> Mariano commented in the version at
>> http://www.squeaksource.com/FuelExperiments that it's slow for them,
>> which I guess is due to not adapting #identityHash calls to
>> #basicIdentityHash calls for Pharo:
>> ((0 to: 4095) collect: [:each | each << 22 \\ 4096 ]) asSet size -> 1
>> So it basically uses 1 bucket instead of 4096... Whoops. :)
>>
>> Uploaded a new version to the MC repository which is adapted for Pharo.
>> On the same machine my numbers were taken from, it does the same test as I
>> used above in 871 ms (181 with preallocation).
>>
>>
>> Cool. One more thing: in Squeak the method using primitive 132 directly
>> was renamed to #instVarsInclude:, so now #pointsTo: works as expected. If
>> this was also added to Pharo, then the #pointsTo: sends should be changed
>> to #instVarsInclude:, otherwise Array can be reported as included even if
>> it wasn't added.
>> I'll upload my LargeIdentityDictionary implementation to the same place
>> this evening, since it's still 2-3 times faster than other solutions and
>> there seems to be demand for it.
>>
>>
>> Levente
>>
>>
>>      Cheers,
>>      Henry
>>
>>
>>
>>
>>
>>
>> --
>> Mariano
>> http://marianopeck.wordpress.com
>>
>>
>>
>>
>>

-- 
Mariano
http://marianopeck.wordpress.com
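
A minimal sketch of the bucket collapse mentioned in the quoted thread, for
anyone who wants to try it in a workspace. It assumes, as the quoted
expression does, that the scaled #identityHash is the 12-bit raw hash shifted
left by 22 bits; the exact shift in a given image may differ, and the variable
names are only illustrative.

```smalltalk
| rawHashes shifted unshifted |
"All 4096 possible 12-bit raw identity hashes."
rawHashes := 0 to: 4095.

"Assumption: scaled hash = raw hash << 22, as in the quoted expression.
 Binary selectors evaluate left to right, so this is (each << 22) \\ 4096.
 The low 12 bits of a value shifted left by 22 are always zero."
shifted := (rawHashes collect: [ :each | each << 22 \\ 4096 ]) asSet size.

"Using the raw (unshifted) value keeps each hash in its own bucket."
unshifted := (rawHashes collect: [ :each | each \\ 4096 ]) asSet size.

Transcript
	show: shifted printString; cr;      "1"
	show: unshifted printString; cr.    "4096"
```

With the shifted hash every object lands in the same bucket, which is
consistent with the slowdown reported for the unadapted version in Pharo.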