sqGnu.h
Ned Konz
ned at bike-nomad.com
Thu Apr 4 06:56:46 UTC 2002
On Wednesday 03 April 2002 10:05 pm, Ned Konz wrote:
> I don't know that much about the possibilities, but on my machine, if I let
> gcc do aggressive optimization using these flags:
>
> CFLAGS = -I/usr/X11R6/include -O3 -DLSB_FIRST=1 -Wa,-a -fno-gcse
> -fomit-frame-pointer -fschedule-insns -g
>
> then I see that the dispatch is a single (albeit 7 byte long) instruction
> (line 11648):
>
> 5160:gnu-interp.c **** CASE(1)
> 5161:gnu-interp.c **** /* pushReceiverVariableBytecode */
> 5162:gnu-interp.c **** /* begin fetchNextBytecode */
> 5163:gnu-interp.c **** currentBytecode = byteAt(++localIP);
> 5164:gnu-interp.c **** /* begin pushReceiverVariable: */
> 5165:gnu-interp.c **** /* begin internalPush: */
> 5166:gnu-interp.c **** longAtput(localSP += 4, longAt(((((char *)
> receiver)) + 4) + ((1 & 15) << 2)));
> 11634 .stabn 68,0,5166,.LM1552-interpret
> 11635 .LM1552:
> 11636 2b60 A1000000 movl receiver,%eax
> 11636 00
> 11637 .stabn 68,0,5163,.LM1553-interpret
> 11638 .LM1553:
> 11639 2b65 46 incl %esi
> 11640 2b66 0FB62E movzbl (%esi),%ebp
> 11641 .stabn 68,0,5166,.LM1554-interpret
> 11642 .LM1554:
> 11643 2b69 83C704 addl $4,%edi
> 11644 2b6c 8B4008 movl 8(%eax),%eax
> 11645 2b6f 8907 movl %eax,(%edi)
> 5167:gnu-interp.c **** BREAK;
> 11646 .stabn 68,0,5167,.LM1555-interpret
> 11647 .LM1555:
> 11648 2b71 FF24AD40 jmp *jumpTable.586(,%ebp,4)
> 11648 240000
>
> or (stripping out the actual work of the bytecode) just three instructions
> total:
>
> 11639 2b65 46 incl %esi
> 11640 2b66 0FB62E movzbl (%esi),%ebp
> 11648 2b71 FF24AD40 jmp *jumpTable.586(,%ebp,4)
> 11648 240000
That was with gcc 3.0.3. Unfortunately, some of the plugins don't work right
when built with it, so I'm a bit wary of using this compiler.
With gcc 2.95.3, the generated code keeps the current bytecode in a stack
variable rather than a register. That adds 2 instructions per dispatch (a
store to and a fetch from that memory location) and slows the interpreter
from roughly 49,000,000 to roughly 42,000,000 bytecodes/sec. However, things
seem to work correctly.
So we Intel users have a choice of (at least) two options: fast and flaky
(gcc 3.0.3), or slower and less flaky (gcc 2.95.3).
I'm going to fiddle around with 2.95.3 a bit to see if I can get rid of those
two extra instructions.
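
For anyone who hasn't looked at how this kind of dispatch is written, here is
a minimal sketch of the general technique, assuming GCC's "labels as values"
extension (also accepted by Clang). It is not the actual sqGnu.h or
gnu-interp.c code: the three opcodes, the function name run_bytecodes, the
op_N label names, and the 16-slot stack are all made up for illustration.
Only the names CASE, BREAK, jumpTable, currentBytecode, localIP and localSP
are borrowed from the listing above.

```c
#include <assert.h>

/* Hypothetical opcodes, for illustration only:
   0 = push the constant 1, 1 = add the top two entries, 2 = halt. */

/* Each case is a plain label; BREAK is one indirect jump through the table.
   The macro bodies are an assumption, modelled on the listing's CASE/BREAK. */
#define CASE(n) op_##n:
#define BREAK   goto *jumpTable[currentBytecode]

int run_bytecodes(const unsigned char *code)
{
    /* GCC extension: &&label yields the address of a label. */
    static void *jumpTable[] = { &&op_0, &&op_1, &&op_2 };

    const unsigned char *localIP = code;
    int stack[16] = { 0 };          /* stack[0] is a sentinel slot */
    int *localSP = stack;
    unsigned char currentBytecode = *localIP;

    BREAK;                          /* initial dispatch */

    CASE(0)                         /* push constant 1 */
        currentBytecode = *++localIP;   /* fetch next bytecode early */
        assert(localSP < stack + 15);
        *++localSP = 1;
        BREAK;

    CASE(1)                         /* add top two entries */
        currentBytecode = *++localIP;
        assert(localSP - stack >= 2);
        localSP[-1] += localSP[0];
        --localSP;
        BREAK;

    CASE(2)                         /* halt: return top of stack */
        return *localSP;
}
```

Each case fetches the next bytecode before doing its work, as in the
"begin fetchNextBytecode" comment in the listing, so BREAK is just the
indirect jump. Whether currentBytecode ends up in a register or in a stack
slot is up to the compiler, which is the difference between the two gcc
versions described above. The sketch does no range checking on opcodes, so
it must only be fed bytes 0 to 2 and a program ending in 2.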
--
Ned Konz
currently: Stanwood, WA
email: ned at bike-nomad.com
homepage: http://bike-nomad.com