Adding loop primitives/optimizations (was Making Set/Dictionary etc. loops more robust)

Wed Dec 1 21:22:02 UTC 2004

----- Original Message ----- 
From: "John M McIntosh" <johnmci at smalltalkconsulting.com>
To: "The general-purpose Squeak developers list"
<squeak-dev at lists.squeakfoundation.org>; "Joshua Scholar"
<jscholar at access4less.net>
Sent: Wednesday, December 01, 2004 12:36 PM
Subject: Re: Adding loop primitives/optimizations (was Making
Set/Dictionary etc. loops more robust)

> It's more interesting to look at the bytecodes, not that the
> compiler/VM might do something clever and ignore the flow. Still, some
> languages might send size to the array on each check of the do loop
> just in case things did change. However, here, as you point out,
> size is only sent once.
>
> 5 <70> self
> 6 <C2> send: size
> 7 <6A> popIntoTemp: 2
> 8 <76> pushConstant: 1
> 9 <69> popIntoTemp: 1
> 10 <11> pushTemp: 1
> 11 <12> pushTemp: 2
> 12 <B4> send: <=
> 13 <AC 0C> jumpFalse: 27
> 15 <10> pushTemp: 0
> 16 <70> self
> 17 <11> pushTemp: 1
> 18 <C0> send: at:
> 19 <CA> send: value:
> 20 <87> pop
> 21 <11> pushTemp: 1
> 22 <76> pushConstant: 1
> 23 <B0> send: +
> 24 <69> popIntoTemp: 1
> 25 <A3 EF> jumpTo: 10
> 27 <78> returnSelf
>
> Something Andreas told me years ago: you need to do lots of clever
> things per second, and each one needs to have significant gains in
> order to see any real-world improvement. Even if you improved array
> iteration speed by 50%, would anything show in the macro benchmarks?
> This becomes more of an issue with the highly threaded CPUs of today;
> for example, you could remove an entire Smalltalk bytecode from the
> loop above and actually see no improvement in performance.
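
For readers who don't read bytecode, the quoted listing is roughly the loop
below.  This is only a sketch, written in Python rather than Smalltalk;
do_each is a made-up name, and the comments map each line back to the
bytecode offsets in the listing.  It shows the point being discussed: size
is read once, before the loop starts.

```python
def do_each(collection, block):
    """Rough Python rendering of the quoted bytecode loop (illustrative only)."""
    limit = len(collection)            # 5-7: send size once, store in temp 2
    index = 1                          # 8-9: 1-based index kept in temp 1
    while index <= limit:              # 10-13: compare, jump out when false
        block(collection[index - 1])   # 15-20: at: then value:, result popped
        index += 1                     # 21-24: increment the index
    return collection                  # 27: returnSelf


# Elements appended during the loop are not visited, because the limit
# was captured before the first iteration.
seen = []
data = [1, 2, 3]

def visit(x):
    seen.append(x)
    if x == 1:
        data.append(10)

do_each(data, visit)
```

After this runs, seen holds only the three original elements even though
data has grown to four.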

Well, I'm used to a sort of programming where the speed of loops is
paramount.  In DSP programming, sound and graphics processing, an extra
instruction or two in a loop can mean a massive speed difference to a
program.

Ok, maybe Smalltalk isn't so fast at number crunching anyway, but still the
speed of simple loops can make a massive difference in my experience.
Whenever you have a lot of data to loop over and simple processing to do on
that data, the speed of the loop itself is very important.

I was thinking about iteration and closures the other day.  Smalltalk blocks
are becoming the equivalent of lambdas, but in Scheme the body of a loop
does not have to be a lambda and can use the outer environment, I think.
So when we slow our loops down in order to create a new environment for
each iteration, we're doing something that Scheme and Lisp don't, I think.
I'll have to write some Scheme programs to be sure.
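
To make the distinction concrete, here is a small sketch in Python (chosen
only because its closures make the effect easy to see; the function names
are invented for illustration).  It says nothing about what Scheme or
Squeak actually do.  It only contrasts one environment shared by the whole
loop with a fresh environment per iteration.

```python
def shared_environment(n):
    """All blocks close over the same loop variable."""
    blocks = []
    for i in range(n):
        blocks.append(lambda: i)       # captures the variable i, not its value
    return [b() for b in blocks]


def fresh_environment(n):
    """Each block gets its own binding, at the cost of an extra call per pass."""
    blocks = []
    for i in range(n):
        make_block = lambda captured: (lambda: captured)
        blocks.append(make_block(i))   # new environment holding this pass's value
    return [b() for b in blocks]


shared = shared_environment(3)
fresh = fresh_environment(3)
```

With a shared environment every block reports the last value of the loop
variable; with a fresh one each block keeps the value from its own pass.
The difference only matters when a block outlives the iteration that
created it.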

Anyway, there may be some optimization for iteration there, once we've gone
to block closures.  Perhaps a block environment can have a flag that
records whether it has ever become visible, and the loop code will create
a new environment only if the old one became visible.
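
A toy model of that flag idea, again in Python and with every name (Env,
escaped, run_loop) hypothetical.  It assumes "visible" means that something
outside the current iteration got hold of the environment.  Here the loop
body sets the flag by hand; a real VM would have to detect that itself.

```python
class Env:
    """Stand-in for a per-iteration block environment (hypothetical)."""

    def __init__(self):
        self.value = None
        self.escaped = False   # the "has ever become visible" flag


def run_loop(items, body):
    """Call body once per item, reusing the environment until it escapes.

    Returns how many environments were allocated.
    """
    env = Env()
    allocated = 1
    for item in items:
        if env.escaped:        # old environment is visible elsewhere: replace it
            env = Env()
            allocated += 1
        env.value = item
        body(env)
    return allocated


# A body that only reads the value never forces a new environment.
sums = []
plain = run_loop([1, 2, 3], lambda env: sums.append(env.value))

# A body that keeps the environment forces a fresh one on the next pass.
kept = []

def keep(env):
    env.escaped = True         # assumption: set by hand for this sketch
    kept.append(env)

capturing = run_loop([1, 2, 3], keep)
```

In this model the non-capturing loop allocates a single environment, while
the capturing loop allocates one per item and each kept environment still
holds its own value.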

On a similar note to the original topic (loops that crunch), one of my
long-term goals is to learn Squeak well enough to implement a sublanguage
or tools that make Squeak a suitable environment for sound processing / DSP
development.  I want to compile to Slang, though there may be some problem
with garbage collection and with calls out of Slang code into
bytecode-interpreted code.  Is it possible to unlink/relink Slang-generated
C code at run time?  In other words, can I make Slang compilation
transparent?  Basically I want some tools to convert a more abstract
language into Slang and to hide all of that plug-in linking stuff.

Joshua Scholar