[Vm-dev] Array new: SmallInteger maxVal
David T. Lewis
lewis at mail.msen.com
Fri Oct 23 01:36:59 UTC 2009
Thanks Henrik,
I took your suggestions and found the following:
- Using your suggested test:
[1 to: 200 do: [:e | 1 to: 25185 do: [:t | Array new: e]]] timeToRun -
[1 to: 200 do: [:e | 1 to: 25185 do: [:t | ]]] timeToRun.
Unfortunately I was not able to get any useful data from a TimeProfileBrowser
on my system (there was no indication that time was being spent in GC though),
but overall time to run showed the updated VM (with allocation checks) giving
a 12% better performance in primitives than the prior VM without checks (!?!).
- Going back to my original test, and looking at it with a TimeProfileBrowser,
I saw about 91-95% of the time was spent in primitives under Collection>>add:
so the time was presumably being spent largely in array allocation. That
presumably included garbage collection, but it was nonetheless primarily
exercising #primitiveNewWithArg.
- Comparing just the time spent in primitives, the time in primitives for
the VM with new object allocation checks was 3.8% better than the VM without
those checks. I would not attribute much precision to this, but it's still
consistent with my original smoke test, which showed the VM with checks
being slightly ( < 1% ) faster than the prior version without the checks.
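
As a rough sanity check on how much run-to-run noise is involved, the
timings from my earlier message (quoted below) can be compared directly.
This is only a sketch: the temporary variable names are made up for
illustration, and it assumes #sum, #max and #min are available on
collections in the image.

```smalltalk
"Sketch only: compare the reported difference with the run-to-run spread.
 Timings (ms) are copied from the earlier message in this thread."
| before after relativeChange beforeSpread afterSpread |
before := #(21393 20582 21511 21101 20761).
after := #(21582 20737 20693 20691 20725).

"Binary operators evaluate left to right; parenthesize to make that explicit."
relativeChange := ((after sum - before sum) / before sum) asFloat.

"Spread of each series relative to its own fastest run."
beforeSpread := ((before max - before min) / before min) asFloat.
afterSpread := ((after max - after min) / after min) asFloat.

Transcript
    show: 'relative change: ', relativeChange printString; cr;
    show: 'spread before: ', beforeSpread printString; cr;
    show: 'spread after: ', afterSpread printString; cr.
```

The spread within each series is a bit over 4%, several times larger than
the difference of under 1% between the two series, so that particular
result is well within the noise.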
I cannot explain why the updates seem to make the VM slightly faster, but
it does seem to be the case on my machine (AMD, 64-bit Linux). My best SWAG
(speculative and probably wrong) is that the variable declaration
updates included in the change set may have had the unintended side effect
of eliminating some inefficiencies somewhere.
I suspect that I am making a mistake somewhere. Really, there's just no
way that the added checks should make things go *faster*. Can anyone
else confirm or deny a performance difference between a VM built with
VMMaker-dtl.143 (including the allocation checks) versus a VM built with
VMMaker-dtl.142 or earlier?
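
For anyone willing to try, something along these lines, evaluated in a
workspace under each VM, should be enough for a comparison. It is only a
sketch based on Henrik's test: the repetition count of 5 and the variable
names are arbitrary choices of mine, not part of any existing test suite.

```smalltalk
"Sketch only: repeat the allocation test several times and subtract the
 empty-loop baseline from each run."
| withAlloc baseline runs |
withAlloc := [1 to: 200 do: [:e | 1 to: 25185 do: [:t | Array new: e]]].
baseline := [1 to: 200 do: [:e | 1 to: 25185 do: [:t | ]]].

runs := (1 to: 5) collect: [:i |
    "Start each run from a freshly collected object memory."
    Smalltalk garbageCollect.
    (Time millisecondsToRun: withAlloc) - (Time millisecondsToRun: baseline)].

"Print the net times in ascending order so fastest and slowest are easy to see."
Transcript show: runs asSortedCollection asArray printString; cr.
```

Reporting all five values rather than a single number makes it easier to
judge whether a difference between the two VMs is larger than the
variation between runs.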
Dave
On Thu, Oct 22, 2009 at 02:47:36PM +0200, Henrik Johansen wrote:
>
> That's more of a GC-test :) (93% GC, 5% OrderedCollection>>add: on my
> machine)
> I found it's usually a good idea to first do a
> TimeProfileBrowser onBlock: testBlock
> just to check the timing is actually spent doing what you want to
> measure a difference in,
> before switching to millisecondsToRun to get the number without tally
> overhead.
>
> Measuring single primitives can be rather hard though, since any
> overhead can be a big part of total runtime...
> Also, do:, timesRepeat: etc. should be avoided for looping when
> measuring performance until the Stack VM is out, since they create
> additional BlockContexts (and thus more time spent in gc) that weren't
> there before closures.
>
> It's also good to avoid computations other than the one you're testing
> in the inner loop, so a better test might be something like:
>
> [1 to: 200 do: [:e | 1 to: 25185 do: [:t | Array new: e]]] timeToRun -
> [1 to: 200 do: [:e | 1 to: 25185 do: [:t | ]]] timeToRun.
> Then open a TimeProfileBrowser on the first block and subtract the GC-
> time listed there.
> (The 25185 was 1000000//27 from your test; I replaced 27 with 200 since
> the ms runtime with 27 was in the double digits...)
>
> If any of my assumptions are incorrect, I'd like to know :)
>
> Cheers,
> Henry
>
> On Oct 22, 2009, at 3:23:15 AM, David T. Lewis wrote:
>
> >Regarding performance associated with the changes, I was not able to measure
> >any loss of performance. Actually, my crude test showed a slight improvement,
> >which I can only attribute to random variation in the results.
> >
> >Here is an example of one of the informal tests that I tried:
> >
> > block := [oc := OrderedCollection new.
> > (1 to: 1000000) do: [:e | oc add: (Array new: (e \\ 27) + 1)]].
> >
> > "Stock VM:"
> > Smalltalk garbageCollect.
> > before := (1 to: 5) collect: [:e | Time millisecondsToRun: block]
> >==> #(21393 20582 21511 21101 20761)
> >
> > "VM with my Array alloc changes:"
> > Smalltalk garbageCollect.
> > after := (1 to: 5) collect: [:e | Time millisecondsToRun: block]
> >==> #(21582 20737 20693 20691 20725)
> >
> > slowdownDueToTheChanges := (after sum - before sum / before sum) asFloat
> >==> -0.008732961233246
> >
> >I got similar results for allocating strings, very slightly faster after
> >the changes. I was happy with "not slower" and left it at that.
> >
> >Can anyone suggest a more suitable benchmark?
> >
> >Also, I'm running on AMD 64 and I was only guessing that integer shift and
> >test sign would be a good approach. It might be awful on some hardware, I
> >don't know.
> >
> >Re: vmParameterAt:put: to modify max allocation request size -- good idea.
> >The changes that I made are strictly intended to protect against a VM crash
> >or object memory corruption, nothing more. But some mechanism to prevent
> >people from making unreasonable memory requests is clearly also needed.
> >
> >Dave
> >
> >