Re: Adding loop primitives/optimizations (was Making Set/Dictionaryetc. loops more robust)

1 Dec 2004


      ----- Original Message ----- 
From: "John M McIntosh" johnmci@smalltalkconsulting.com
To: "The general-purpose Squeak developers list"
squeak-dev@lists.squeakfoundation.org; "Joshua Scholar"
jscholar@access4less.net
Sent: Wednesday, December 01, 2004 12:36 PM
Subject: Re: Adding loop primitives/optimizations (was Making
Set/Dictionaryetc. loops more robust)
...
It's more interesting to look at the bytecodes, though note that the
compiler/VM might do something clever and ignore the flow. Some
languages might send size to the array on each check of the do loop,
just in case things did change. Here, however, as you point out,
size is only sent once:
5 <70> self
6 <C2> send: size
7 <6A> popIntoTemp: 2
8 <76> pushConstant: 1
9 <69> popIntoTemp: 1
10 <11> pushTemp: 1
11 <12> pushTemp: 2
12 <B4> send: <=
13 <AC 0C> jumpFalse: 27
15 <10> pushTemp: 0
16 <70> self
17 <11> pushTemp: 1
18 <C0> send: at:
19 <CA> send: value:
20 <87> pop
21 <11> pushTemp: 1
22 <76> pushConstant: 1
23 <B0> send: +
24 <69> popIntoTemp: 1
25 <A3 EF> jumpTo: 10
27 <78> returnSelf
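As a rough illustration of the listing above, here is a Python analogue (not Smalltalk, and not the VM's actual implementation). CountingArray, do_hoisted and do_rechecked are hypothetical names introduced only for this sketch. The first loop mirrors the bytecodes, where size is sent once and kept in a temp; the second shows the alternative of re-sending size on every check.

```python
class CountingArray:
    """Hypothetical stand-in for a Smalltalk Array that counts `size` sends."""

    def __init__(self, items):
        self._items = list(items)
        self.size_sends = 0

    def size(self):
        self.size_sends += 1
        return len(self._items)

    def at(self, index):
        # Smalltalk collections are 1-based, so shift to Python's 0-based list.
        return self._items[index - 1]


def do_hoisted(array, block):
    """Mirror the bytecodes: send size once and keep the limit in a temp."""
    limit = array.size()          # offsets 5-7: self size -> temp 2
    index = 1                     # offsets 8-9: 1 -> temp 1
    while index <= limit:         # offsets 10-13: compare, jump out if false
        block(array.at(index))    # offsets 15-20: block value: (self at: index)
        index += 1                # offsets 21-24: index + 1 -> temp 1


def do_rechecked(array, block):
    """Alternative: re-send size on every check of the loop condition."""
    index = 1
    while index <= array.size():
        block(array.at(index))
        index += 1
```

For a three-element array, do_hoisted sends size once, while do_rechecked sends it four times (three successful checks plus the final failing one).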
Something Andreas told me years ago: you need to do lots of clever
things per second, and each one needs to have significant gains, in
order to see any real-world improvement. Even if you improved array
iteration speed by 50%, would anything show in the macro benchmarks?
This becomes more of an issue with the highly threaded CPUs of today;
for example, you could remove an entire Smalltalk bytecode from the
loop above and actually see no improvement in performance.
Well, I'm used to a sort of programming where the speed of loops is
paramount.  In DSP programming, sound and graphics processing, an extra
instruction or two in a loop can mean a massive speed difference to a
program.
OK, maybe Smalltalk isn't so fast at number crunching anyway, but in my
experience the speed of simple loops can still make a massive difference.
Whenever you have a lot of data to loop over and simple processing to do on
that data, the speed of the loop itself is very important.
I was thinking about iteration and closures the other day.  Smalltalk blocks
are becoming the equivalent of lambdas, but in Scheme the block inside of a
loop does not have to be a lambda and can use the outer environment, I
think.  So when we slow our loops down in order to create a new
environment for each iteration, we're doing something that Scheme and Lisp
didn't, I think.  I'll have to write some Scheme programs to be sure.
Anyway, there may be some optimization for iteration there once we've gone
to block closures.  Perhaps a block environment can have a flag that
records whether it has ever become visible, and the loop code will create
a new environment only if the old one was visible.
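To make the distinction concrete, here is a small Python sketch (an analogue only; it says nothing about how Squeak or Scheme actually implement this, and all function names are made up for illustration). It contrasts one environment shared by the whole loop with a fresh environment per iteration, and shows that the difference only matters when the block escapes the iteration that created it.

```python
def shared_environment(n):
    """One environment for the whole loop: every stored block sees the same i."""
    blocks = []
    for i in range(n):
        blocks.append(lambda: i)      # captures the variable, not its value
    return [b() for b in blocks]      # evaluated after the loop has finished


def fresh_environment(n):
    """A new environment per iteration: each stored block keeps its own i."""
    def make_block(i):                # each call creates a fresh scope
        return lambda: i
    blocks = [make_block(i) for i in range(n)]
    return [b() for b in blocks]


def shared_but_not_escaping(n):
    """Shared environment again, but the block is used at once and never stored."""
    results = []
    for i in range(n):
        block = lambda: i
        results.append(block())       # evaluated inside the same iteration
    return results
```

With n = 3, shared_environment returns [2, 2, 2] and fresh_environment returns [0, 1, 2]. shared_but_not_escaping also returns [0, 1, 2], which is the case the proposed flag would exploit: if the block never becomes visible outside its iteration, reusing the old environment gives the same answer.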
On a similar note to the original topic (loops that crunch), one of my
long-term goals is to learn Squeak well enough to implement a sublanguage or
tools that make Squeak a suitable environment for sound processing / DSP
development.  I want to compile to Slang, though there may be some problems
with garbage collection and with calls out of Slang code into
bytecode-interpreted code.  Is it possible to unlink/relink Slang-generated
C code at run time?  In other words, can I make Slang compilation
transparent?  Basically I want some tools to convert a more abstract
language into Slang and to hide all of that plug-in linking stuff.
Joshua Scholar