[Vm-dev] Call for big benchmarks

John Dougan jdougan at acm.org
Sat Mar 25 09:19:41 UTC 2017


The benchmarks on the fourmilab page were done over a period of time, and
when I last ran the benchmark in Smalltalk in late 2014 there were
discrepancies in the relative times.

https://drive.google.com/drive/folders/0B30Y4WM1G4tuWWEyblh5RUNMOXc?usp=sharing
has a folder with a spreadsheet of the results I got then, along with an
mcz of the benchmark for Squeak 4.
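Comparing relative times across runs is easiest when each run records its own wall-clock time against a common baseline. A minimal harness for that, sketched in Python purely for illustration (the names are hypothetical, not from the mcz), might look like:

```python
import time

def bench(fn, warmup=2, runs=5):
    # Warm-up iterations let JIT-compiling VMs reach steady state
    # before any timing is recorded.
    for _ in range(warmup):
        fn()
    times = []
    for _ in range(runs):
        t0 = time.perf_counter()
        fn()
        times.append(time.perf_counter() - t0)
    return min(times)  # best-of-N reduces scheduler noise

def relative(candidate_secs, baseline_secs):
    # Normalize against a baseline run, as the fourmilab table
    # does with its "C = 1" convention.
    return candidate_secs / baseline_secs
```

Recording per-run dates alongside these numbers would avoid exactly the ambiguity discussed below about when each result was computed.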

Cheers,
 -- John

On Fri, Mar 24, 2017 at 7:37 PM, Eliot Miranda <eliot.miranda at gmail.com>
wrote:

>
>
>
> On Fri, Mar 24, 2017 at 7:36 PM, Eliot Miranda <eliot.miranda at gmail.com>
> wrote:
>
>> Hi John,
>>
>> On Fri, Mar 24, 2017 at 7:29 PM, John Dougan <jdougan at acm.org> wrote:
>>
>>>
>>> I don't know if this qualifies, but I ported John Walker's fbench
>>> floating point accuracy benchmark (https://www.fourmilab.ch/fben
>>> ch/fbench.html) to a variety of Smalltalk platforms. The numerical code
>>> is written in the standard Numerical Recipes style, which isn't very
>>> Smalltalky, but is very common. Probably lots of opportunities for
>>> optimizations. The included code tries to write to stdout as it was
>>> designed to be called from the command line, but that is pretty trivial to
>>> change.
>>>
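For readers unfamiliar with fbench: its inner loop is dominated by transcendental function evaluation, with results checked against known-correct values to detect accuracy loss. A rough illustration of that shape, in Python and emphatically not the actual port, could be:

```python
import math

def trig_workload(iterations=1000):
    # Repeated transcendental evaluation, in the spirit of fbench's
    # ray-trace inner loop (an illustration, not Walker's code).
    acc = 0.0
    angle = 0.1
    for _ in range(iterations):
        acc += math.sin(angle) * math.cos(angle) + math.atan(angle)
        angle = (angle + 0.000001) % 1.0
    return acc

# fbench validates its output to full printed precision; the
# analogous check here compares against a reference value
# computed once, so any accuracy drift would trip the assert.
reference = trig_workload()
assert abs(trig_workload() - reference) < 1e-12
```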
>>
>> I'd love to see this contributed.  How old is that page?
>>
>
> (I mean: when were the results computed? The page says last updated 2016,
> but no dates are given for the individual times; were they all computed at
> the same time, or are some of them historical results?)
>
>
>> I'm curious about these relative results:
>>
>> C 1 GCC 3.2.3 -O3, Linux
>> ...
>> Smalltalk 7.59 GNU Smalltalk 2.3.5, Linux
>>
>> I'd like to see if Spur Cog can beat VW and Gnu St.
>>
>>
>>>
>>> Cheers,
>>>  -  John
>>>
>>> On Fri, Mar 24, 2017 at 1:10 AM, Tim Felgentreff <
>>> timfelgentreff at gmail.com> wrote:
>>>
>>>>
>>>> Hi Eliot,
>>>>
>>>> the question for me is, how indicative is this workload of real world
>>>> performance? Creating compiled methods may not be something that is highly
>>>> optimized, simply because it doesn't need to be in real applications. One
>>>> would have to be careful about what is being measured, or if the benchmark
>>>> is just measuring how fast we can blow out the caches...
>>>>
>>>> If we're just talking about running parsing and optimizing something,
>>>> then maybe some real world applications are using that, but even then some
>>>> JSON or HTML parsing library that implements e.g. Apache mod_rewrite would
>>>> be more realistic, I think. Dynamically parsing and patching HTML and then
>>>> pretty-printing or minimizing it seems a more common problem.
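To give a sense of scale for such a parse-and-rewrite workload, a toy version fits in a few lines with only a standard-library parser; this Python sketch is illustrative only, not a proposed benchmark:

```python
from html.parser import HTMLParser

class Minifier(HTMLParser):
    # Re-emits markup with inter-tag whitespace stripped: a toy
    # stand-in for the parse/patch/pretty-print workload above.
    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        attr_text = "".join(
            f' {k}="{v}"' if v is not None else f" {k}"
            for k, v in attrs
        )
        self.out.append(f"<{tag}{attr_text}>")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.out.append(stripped)

def minify(html):
    m = Minifier()
    m.feed(html)
    return "".join(m.out)
```

Fed a large, fixed document corpus, a loop over such a rewriter would exercise string processing and tree traversal without touching the method caches the way compiling does.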
>>>>
>>>> I know, you're trying to argue that the Opal compiler may show common
>>>> workloads equally well, but we could argue that for some of the Shootout
>>>> benchmarks, too. It's an argument that doesn't seem to convince some people.
>>>>
>>>>
>>>> Eliot Miranda <eliot.miranda at gmail.com> schrieb am Do., 23. März 2017,
>>>> 17:18:
>>>>
>>>>>
>>>>> Hi Tim,
>>>>>
>>>>> On Thu, Mar 23, 2017 at 1:31 AM, Tim Felgentreff <
>>>>> timfelgentreff at gmail.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> Yes, big benchmarks would be nice. Those on speed.squeak.org or in
>>>>> VMMaker are all somewhat small.
>>>>>
>>>>> Note the Ruby community, for example, has benchmarks such as a NES
>>>>> emulator (optcarrot) that can run for a few thousand frames with predefined
>>>>> input as benchmarks. It's definitely possible.
>>>>>
>>>>> Maybe some of the projects from HPI students could be made to work,
>>>>> there was a Chip8 emulator in Squeak, for example, that seems big enough.
>>>>> Or maybe the DCPU emulator at github.com/fniephaus/BroDCPU without a
>>>>> frame limit would work as a decent CPU bound benchmark.
>>>>>
>>>>>
>>>>> I've discussed with Clément doing something like cloning the Opal
>>>>> compiler, or the Squeak compiler, so that it uses a fixed set of classes
>>>>> that won't change over time, excepting the collections, and using as a
>>>>> benchmark this compiler recompiling all its own methods.  This is a nice
>>>>> mix of string processing (in the tokenizer) and symbolic processing (in the
>>>>> building and optimizing of the parse tree).
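The mix described here, a string-chewing tokenizer feeding a tree builder with an optimization pass over the result, can be shown in miniature; this hypothetical Python sketch resembles neither Opal nor the Squeak compiler, it only mirrors the phase structure:

```python
import re

TOKEN = re.compile(r"\d+|[+*()]")

def tokenize(src):
    # String-processing phase: split an arithmetic expression
    # into a flat token list.
    return TOKEN.findall(src)

def parse(tokens):
    # Symbolic phase: recursive-descent parse into nested tuples,
    # with "*" binding tighter than "+".
    def expr(i):
        node, i = term(i)
        while i < len(tokens) and tokens[i] == "+":
            rhs, i = term(i + 1)
            node = ("+", node, rhs)
        return node, i

    def term(i):
        node, i = atom(i)
        while i < len(tokens) and tokens[i] == "*":
            rhs, i = atom(i + 1)
            node = ("*", node, rhs)
        return node, i

    def atom(i):
        if tokens[i] == "(":
            node, i = expr(i + 1)
            return node, i + 1  # skip the closing ")"
        return int(tokens[i]), i + 1

    return expr(0)[0]

def fold(node):
    # Tree-optimizing phase: constant folding over the parse tree.
    if isinstance(node, int):
        return node
    op, a, b = node
    a, b = fold(a), fold(b)
    return a + b if op == "+" else a * b
```

A frozen compiler recompiling its own sources would run both phases at far larger scale, which is what makes it attractive as a stable benchmark.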
>>>>>
>>>>> Cross-dialect could be hard. Pharo and Squeak are fairly easy to do,
>>>>> but with larger programs staying compatible across different dialects is
>>>>> harder.
>>>>>
>>>>>
>>>>> Again, extracting a compiler from its host system would make it
>>>>> possible to maintain a cross-platform version.  It could be left as an
>>>>> exercise to the reader to port it to one's favorite non-Smalltalk dynamic
>>>>> language.
>>>>>
>>>>> tim Rowledge <tim at rowledge.org> schrieb am Mi., 22. März 2017, 21:40:
>>>>>
>>>>>
>>>>>
>>>>> > On 21-03-2017, at 4:53 PM, Javier Pimás <elpochodelagente at gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > Hi everybody! While measuring performance I usually face the problem
>>>>> of assessing performance.
>>>>>
>>>>> Have you tried the benchmarks package - CogBenchmarks - included in
>>>>> the source.squeak.org/VMMaker repository?
>>>>>
>>>>> tim
>>>>> --
>>>>> tim Rowledge; tim at rowledge.org; http://www.rowledge.org/tim
>>>>> Strange OpCodes: BOMB: Burn Out Memory Banks
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> _,,,^..^,,,_
>>>>> best, Eliot
>>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> John Dougan
>>> jdougan at acm.org
>>>
>>>
>>
>>
>> --
>> _,,,^..^,,,_
>> best, Eliot
>>
>
>
>
> --
> _,,,^..^,,,_
> best, Eliot
>
>


-- 
John Dougan
jdougan at acm.org