[Vm-dev] StackVM with latest sources tinyBenchmarks

Wed Feb 20 17:29:17 UTC 2013

On Tue, Feb 19, 2013 at 11:10 PM, Camillo Bruni <camillobruni at gmail.com> wrote:
>
>
> On 2013-02-20, at 01:25, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>
>>
>> On Tue, Feb 19, 2013 at 2:16 PM, Camillo Bruni <camillobruni at gmail.com> wrote:
>>>
>>>>> The most annoying piece is Time Machine and its disk access; I
>>>>> sometimes forget to suspend it, but it was off during the
>>>>> tinyBenchmark.
>>>>
>>>> One simple approach is to run the benchmark three times and to discard
>>>> the best and the worst results.
>>>
>>> that is as good as taking the first one... if you want decent results,
>>> measure >30 times and do the only scientifically correct thing: avg + std deviation?
>>
>> If the benchmark takes very little time to run and you're trying to
>> avoid background effects then your approach won't necessarily work
>> either.
>
> true, but the deviation will most probably give you exactly that feedback.
> if you increase the runs but the quality of the result doesn't improve
> you know that you're dealing with some systematic error source.
>
> This approach is simply more scientific and less home-brewed.

Of course, no argument here.  But what's being discussed is using
tinyBenchmarks as a quick smoke test.  A proper CI system can be set
up for reliable results, but IMO for a quick smoke test doing
three runs manually is fine.  IME, what tends to happen is that the
first run is slow (caches warming up etc.) and the next two runs are
extremely close.
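
To make the two approaches concrete, here is a rough sketch in Python
rather than Smalltalk.  All names (time_runs, middle_of_three,
mean_and_stdev, sample_workload) are hypothetical, and the workload is
only a stand-in for tinyBenchmarks.  It assumes the first run is the
warm-up run and drops it before computing the average and deviation.

```python
import statistics
import time


def time_runs(workload, runs):
    """Return the wall-clock duration in seconds of each call to workload."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        durations.append(time.perf_counter() - start)
    return durations


def middle_of_three(durations):
    """Quick smoke test: discard the best and the worst of three runs."""
    if len(durations) != 3:
        raise ValueError("expected exactly three runs")
    return sorted(durations)[1]


def mean_and_stdev(durations, warmup=1):
    """Mean and sample standard deviation, skipping the warm-up runs.

    Assumption: the first `warmup` runs are slow (cold caches) and are
    not representative, so they are dropped.
    """
    steady = durations[warmup:]
    return statistics.mean(steady), statistics.stdev(steady)


def sample_workload():
    # Hypothetical stand-in for the real benchmark.
    sum(i * i for i in range(100_000))


if __name__ == "__main__":
    quick = time_runs(sample_workload, 3)
    print("middle of three:", middle_of_three(quick))

    # 31 runs: one warm-up run plus 30 measured runs.
    mean, dev = mean_and_stdev(time_runs(sample_workload, 31))
    print("mean:", mean, "std dev:", dev)
```

If the deviation stays large as the number of runs goes up, that points
at a systematic error source such as background disk activity, which
three manual runs would not reveal.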
-- 
best,
Eliot