The most annoying piece is Time Machine and its disk access. I sometimes forget to suspend it, but it was off during the tinyBenchmark.
One simple approach is to run the benchmark three times and to discard the best and the worst results.
That is as good as taking the first one... If you want decent results, measure more than 30 times and do the only scientifically correct thing: average + standard deviation.
Too much work? Use http://www.squeaksource.com/SMark.html
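
For anyone who wants to see what "average + standard deviation" over many runs looks like in practice, here is a minimal sketch. It is in Python rather than Smalltalk, purely for illustration, and it is not how SMark does it. The names `time_runs` and `summarize` and the default of 31 runs are my own assumptions; only the "more than 30 runs, then mean and std deviation" idea comes from the thread.

```python
import statistics
import time


def time_runs(workload, runs=31):
    """Call `workload` `runs` times and return the wall-clock time of each call in seconds.

    `workload` is a hypothetical zero-argument callable standing in for the benchmark.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        workload()
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples):
    """Return (mean, sample standard deviation) of the measurements."""
    return statistics.mean(samples), statistics.stdev(samples)


if __name__ == "__main__":
    # Placeholder workload; replace with the code you actually want to measure.
    samples = time_runs(lambda: sum(range(100_000)))
    mean, stdev = summarize(samples)
    print(f"{len(samples)} runs: mean {mean:.6f} s, std dev {stdev:.6f} s")
```

Reporting the standard deviation next to the mean is what shows whether background activity such as a backup was disturbing the runs: a large spread means the numbers should not be trusted yet.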