Thanks Henrik,
I took your suggestions and found the following:
- Using your suggested test: [1 to: 200 do: [:e | 1 to: 25185 do: [:t | Array new: e]]] timeToRun - [1 to: 200 do: [:e | 1 to: 25185 do: [:t | ]]] timeToRun. Unfortunately, I was not able to get any useful data from a TimeProfileBrowser on my system (though there was no indication that time was being spent in GC), but the overall time to run showed the updated VM (with allocation checks) performing about 12% better in primitives than the prior VM without the checks (!?!).
- Going back to my original test and looking at it with a TimeProfileBrowser, I saw that about 91-95% of the time was spent in primitives under Collection>>add:, so the time was evidently going largely to array allocation. That included garbage collection, but the test was nonetheless primarily exercising #primitiveNewWithArg.
- Comparing just the time spent in primitives, the VM with the new object allocation checks was 3.8% faster than the VM without them. I would not attribute much precision to this, but it is still consistent with my original smoke test, which showed the VM with the checks being slightly (< 1%) faster than the prior version without them.
I cannot explain why the updates seem to make the VM slightly faster, but that does seem to be the case on my machine (AMD, 64-bit Linux). My best guess (speculative and probably wrong) is that the variable declaration updates included in the change set may have had the unintended side effect of eliminating some inefficiencies somewhere.
I suspect that I am making a mistake somewhere. Really, there's just no way that the added checks should make things go *faster*. Can anyone else confirm or deny a performance difference between a VM built with VMMaker-dtl.143 (including the allocation checks) versus a VM built with VMMaker-dtl.142 or earlier?
Dave
On Thu, Oct 22, 2009 at 02:47:36PM +0200, Henrik Johansen wrote:
That's more of a GC test :) (93% GC, 5% OrderedCollection>>add: on my machine). I've found it's usually a good idea to first do a TimeProfileBrowser onBlock: testBlock, just to check that the time is actually being spent doing what you want to measure, before switching to millisecondsToRun to get the number without the tally overhead.
Measuring single primitives can be rather hard, though, since any overhead can be a big part of the total runtime... Also, do:, timesRepeat:, etc. should be avoided for looping when measuring performance until the Stack VM is out, since they create additional BlockContexts (and thus more time spent in GC) that weren't there before closures.
It's also good to avoid computations other than the one you're testing in the inner loop, so a better test might be something like:
[1 to: 200 do: [:e | 1 to: 25185 do: [:t | Array new: e]]] timeToRun - [1 to: 200 do: [:e | 1 to: 25185 do: [:t | ]]] timeToRun. Then open a TimeProfileBrowser on the first block and subtract the GC time listed there. (The 25185 came from the 1000000 and 27 in your test; I replaced 27 with 200 since the ms runtime with 27 was only in the double digits...)
If any of my assumptions are incorrect, I'd like to know :)
Cheers, Henry
On Oct 22, 2009, at 3:23:15 AM, David T. Lewis wrote:
Regarding performance associated with the changes, I was not able to measure any loss of performance. Actually, my crude test showed a slight improvement, which I can only attribute to random variation in the results.
Here is an example of one of the informal tests that I tried:
block := [oc := OrderedCollection new. (1 to: 1000000) do: [:e | oc add: (Array new: (e \\ 27) + 1)]].
"Stock VM:" Smalltalk garbageCollect. before := (1 to: 5) collect: [:e | Time millisecondsToRun: block] ==> #(21393 20582 21511 21101 20761)
"VM with my Array alloc changes:" Smalltalk garbageCollect. after := (1 to: 5) collect: [:e | Time millisecondsToRun: block] ==> #(21582 20737 20693 20691 20725)
slowdownDueToTheChanges := (after sum - before sum / before sum) asFloat ==> -0.008732961233246
I got similar results for allocating strings, very slightly faster after the changes. I was happy with "not slower" and left it at that.
Can anyone suggest a more suitable benchmark?
Also, I'm running on AMD64, and I was only guessing that an integer shift and a test of the sign bit would be a good approach. It might be awful on some hardware; I don't know.
Re: vmParameterAt:put: to modify the maximum allocation request size -- good idea. The changes that I made are strictly intended to protect against a VM crash or object memory corruption, nothing more. But some mechanism to prevent people from making unreasonable memory requests is clearly also needed.
Dave