[squeak-dev] SqueakCI Benchmarking

Tue Feb 26 21:54:20 UTC 2013

On Tue, Feb 26, 2013 at 2:28 PM, Stefan Marr <smalltalk at stefan-marr.de> wrote:
> Hi Jeff:
>
> On 26 Feb 2013, at 21:58, Jeff Gonis wrote:
>
>> Hi Everyone,
>>
>> So with a lot of help from Frank Shearar and Nicolas Cellier, I have
>> introduced performance benchmarking to the SqueakCI server.  You can
>> see our current performance trends at the following link:
>> http://build.squeak.org/job/SqueakTrunk/performance/
>
>> [..]
>
>> Thank you all for your time, and for any feedback you can provide.
>> Jeff
>
> Looks very interesting.
>
> I do something similar for the RoarVM at http://soft.vub.ac.be/~ppp/codespeed/ (page loading times aren't great...)
>
> That's all based on SMark [1], a benchmarking framework in the SUnit style,
> which also properly warms up the JIT on the CogVMs.
> You might want to look into it; it also has a number of benchmarks from the Benchmarks Game.
>
> Another thing: what's your rationale for choosing 5-10sec runtimes?
>
> I typically try to make the runtime just long enough that imprecise time measurement has no impact.
> Most of the time, the goal is to keep the runtime low and avoid triggering GC, except when GC is supposed to be measured.
>
> Best regards
> Stefan
>
>
> [1] http://smalltalkhub.com/#!/~StefanMarr/SMark
>
> --
> Stefan Marr
> Software Languages Lab
> Vrije Universiteit Brussel
> Pleinlaan 2 / B-1050 Brussels / Belgium
> http://soft.vub.ac.be/~smarr
> Phone: +32 2 629 2974
> Fax:   +32 2 629 3525
>
>

Wow, holy smokes, that benchmarking page is impressive.  What sort of
setup is that running on? Maybe something similar to that for Squeak
could be my long-term goal.

As for my rationale for a 5-10 second running time: I wanted something
that would allow the image and the VM to become much faster without
having to adjust the benchmark too much, giving us plenty of headroom,
as it were.  That way we could have a single unbroken chain of
progress on the graph as things speed up.  I didn't mind triggering
the GC and such, because I felt that improvements to it should be
reflected as part of our speed, and I didn't have a specific GC
benchmark that I could use instead.
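
To make the two positions concrete, here is a minimal sketch in
Python.  It is not the code SqueakCI or SMark runs; the function
names, the 1000-tick floor and the 100x headroom factor are all
assumptions picked for illustration.  It finds the shortest run the
clock can time reliably, then pins a larger fixed iteration count on
top of that.

```python
import time


def timer_resolution(clock=time.perf_counter, samples=50):
    """Estimate the smallest step the clock can report, in seconds."""
    smallest = float("inf")
    for _ in range(samples):
        start = clock()
        now = clock()
        while now == start:  # spin until the clock visibly advances
            now = clock()
        smallest = min(smallest, now - start)
    return smallest


def calibrate_iterations(workload, min_seconds, clock=time.perf_counter):
    """Double the iteration count until one run lasts at least min_seconds."""
    iterations = 1
    while True:
        start = clock()
        for _ in range(iterations):
            workload()
        elapsed = clock() - start
        if elapsed >= min_seconds:
            return iterations, elapsed
        iterations *= 2


def measure(workload, iterations, warmup_runs=2, timed_runs=5,
            clock=time.perf_counter):
    """Do untimed warm-up runs, then return the duration of each timed run."""
    def one_run():
        for _ in range(iterations):
            workload()

    for _ in range(warmup_runs):
        one_run()  # lets caches (or a JIT, on a JIT-ed runtime) settle

    durations = []
    for _ in range(timed_runs):
        start = clock()
        one_run()
        durations.append(clock() - start)
    return durations


if __name__ == "__main__":
    def workload():  # hypothetical stand-in for a real benchmark body
        sum(i * i for i in range(1000))

    # "Just long enough": a run spanning about 1000 clock ticks (assumed floor).
    floor = 1000 * timer_resolution()
    iterations, _ = calibrate_iterations(workload, floor)
    # "Headroom": scale the count up by an arbitrary 100x and pin it.
    pinned = iterations * 100
    print(pinned, min(measure(workload, pinned)))
```

For the graph to stay one unbroken chain, the pinned count would have
to be chosen once and stored with the benchmark, not recalibrated on
every build; otherwise the amount of work per run changes and the
timings stop being comparable.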

I have no idea whether this rationale is completely wrong-headed.  As
I said, I haven't really done this before, so I just went with my gut.
If many people feel I have headed in the wrong direction, it won't be
much effort to change.

Thanks for your feedback, I appreciate it.
Jeff