Cost of Squeak
John M McIntosh
johnmci at smalltalkconsulting.com
Tue Nov 16 19:54:01 UTC 2004
I poked at that back in July of 2003; notes attached.
On Nov 16, 2004, at 11:29 AM, tim Rowledge wrote:
> Andreas Raab wrote:
>
>>> Also, I'm guessing C which is auto-generated (from Slang) will be more
>>> verbose than hand-crafted C...
>> Definitely. And don't forget the inliner - it manually inlines
>> functions therefore dramatically increasing the LOC count.
> We really must get around to trying the experiment of using the inline
> function spec for the C compiler to see if it works better. GCC & RISC
> OS CC do it so that covers about all the bases.
>
> tim
>
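A minimal sketch of what that experiment might look like, assuming a C99 compiler (or GCC's inline extension): the accessors stay as separate functions in the generated source and the compiler is asked to inline them. The names here (FetchState, fetch_byte, fetch_next_bytecode, run_fetches) are hypothetical stand-ins, not the actual Slang output.

```c
#include <stdint.h>

/* Hypothetical, cut-down interpreter state; the real VM struct has many more fields. */
typedef struct {
    const uint8_t *localIP;   /* points at the most recently fetched byte */
    int currentBytecode;
} FetchState;

/* Left as functions in the source; `static inline` asks the compiler to inline them. */
static inline int fetch_byte(FetchState *st) {
    return *++st->localIP;    /* advance the pointer, then read the byte */
}

static inline void fetch_next_bytecode(FetchState *st) {
    st->currentBytecode = fetch_byte(st);
}

/* Hypothetical helper for illustration: fetch n bytecodes, return the last one. */
static int run_fetches(const uint8_t *code, int n) {
    FetchState st = { code, 0 };  /* starts on byte 0, so the first fetch reads byte 1 */
    for (int i = 0; i < n; i++)
        fetch_next_bytecode(&st);
    return st.currentBytecode;
}
```

Whether the compiler actually honours the hint depends on the compiler and optimisation level, so the generated assembly would still need checking.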
From a note July 2003:
For the curious, here is where the time goes when running the
macrobenchmarks with a VM built without inlining.
This gives a better breakdown of the 40% or so we see attributed to
interpret() in the inlined version.
14.2% fetchNextBytecode
11.9% interpret (40% of that 11.9% is taken resolving the case statement)
4.4% fetchByte (3.6% of this is via fetchNextBytecode)
3.9% internalActivateNewMethod
3.4% internalFetchContextRegisters
3.4% fetchPointerofObject
2.5% internalStackValue
2.4% internalPush
2.2% startField
1.9% internalExecuteNewMethod
1.8% booleanCheat
1.7% fetchClassOf
1.6% lookupInMethodCacheSelclass
1.5% pushTemporaryVariable
1.3% upward
1.3% headerType
1.3% startObj
1.2% instantiateSmallClasssizeInBytes
1.1% internalStoreContextRegisters
1.1% oopFromChunk
1.1% quickFetchIntegerofObject
1.1% storePointerUncheckedofObjectwithValue
1.1% remapFieldsAndClassOf
1.0% internalPop
1.0% markAndTrace
1.0% internalStackTop
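As a quick check on the interpret line above: 40% of 11.9% works out to roughly 4.8% of total run time spent just resolving the case statement. The helper name below is only for illustration.

```c
/* Share of total run time spent resolving the case statement,
   computed from the two figures quoted in the profile. */
static double case_dispatch_share(void) {
    const double interpret_pct = 11.9;  /* interpret() as a % of total time */
    const double case_fraction = 0.40;  /* portion of interpret() spent on the switch */
    return interpret_pct * case_fraction;
}
```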
For interest's sake, I looked a bit more into this. Just by
post-processing the C code, I noticed that
case 7:
    /* pushReceiverVariableBytecode */
    {
        fetchNextBytecode();
        pushReceiverVariable(7 & 15);
    }
000000D8: 48000001 bl .fetchNextBytecode
000000DC: 38600007 li r3,7
000000E0: 48000001 bl .pushReceiverVariable
which really is...
000000D4: 48000001 bl .fetchNextBytecode
00000000: 7C0802A6 mflr r0 /* fetchNextBytecode */
00000004: 90010008 stw r0,8(SP)
00000008: 9421FFC0 stwu SP,-64(SP)
0000000C: 48000001 bl .fetchByte
00000000: 80620000 lwz r3,foo(RTOC) /* fetchByte */
00000004: 80830000 lwz r4,0(r3)
00000008: 8064011C lwz r3,284(r4)
0000000C: 38630001 addi r3,r3,1
00000010: 9064011C stw r3,284(r4)
00000014: 88630000 lbz r3,0(r3)
00000018: 4E800020 blr
00000010: 80820000 lwz r4,foo(RTOC) /* fetchNextBytecode */
00000014: 80840000 lwz r4,0(r4)
00000018: 90640034 stw r3,52(r4)
0000001C: 80010048 lwz r0,72(SP)
00000020: 38210040 addi SP,SP,64
00000024: 7C0803A6 mtlr r0
00000028: 4E800020 blr
000000D8: 38600007 li r3,7
000000DC: 48000001 bl .pushReceiverVariable
can be altered to this, because that is what fetchNextBytecode
really does...
case 7:
    /* pushReceiverVariableBytecode */
    {
        foo->currentBytecode = byteAt(++foo->localIP);
        pushReceiverVariable(7 & 15);
    }
0000015C: 809F011C lwz r4,284(r31)
00000160: 38600007 li r3,7
00000164: 38840001 addi r4,r4,1
00000168: 909F011C stw r4,284(r31)
0000016C: 88040000 lbz r0,0(r4)
00000170: 901F0034 stw r0,52(r31)
00000174: 48000001 bl .pushReceiverVariable
21 instructions versus 7. It would be a bit more work to turn every
foo->localIP into just localIP, but nothing a bit of string
manipulation in post-processing could not do...
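A rough sketch of that further step, assuming a cut-down interpreter state: the instruction pointer is copied into a C local for the duration of the loop and stored back on exit, so the compiler is free to keep it in a register. Interp, the byteAt macro as defined here, run_via_struct, and run_via_local are hypothetical stand-ins for illustration; the real generated code carries far more state.

```c
#include <stdint.h>

/* Hypothetical, cut-down interpreter state. */
typedef struct {
    const uint8_t *localIP;
    int currentBytecode;
} Interp;

/* Simplified stand-in for the VM's byteAt: read one byte through a pointer. */
#define byteAt(p) (*(p))

/* Manually inlined fetch, going through the struct on every bytecode. */
static int run_via_struct(Interp *foo, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++) {
        foo->currentBytecode = byteAt(++foo->localIP);
        sum += foo->currentBytecode;   /* stand-in for the bytecode's real work */
    }
    return sum;
}

/* Same loop with localIP held in a C local and written back at the end. */
static int run_via_local(Interp *foo, int n) {
    const uint8_t *localIP = foo->localIP;
    int currentBytecode = foo->currentBytecode;
    int sum = 0;
    for (int i = 0; i < n; i++) {
        currentBytecode = byteAt(++localIP);
        sum += currentBytecode;
    }
    foo->localIP = localIP;            /* store the cached values back */
    foo->currentBytecode = currentBytecode;
    return sum;
}
```

Both functions leave the struct in the same state; the second simply avoids a load and store through foo on every fetch, provided nothing else reads foo->localIP while the loop is running.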
--
===========================================================================
John M. McIntosh <johnmci at smalltalkconsulting.com> 1-800-477-2659
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
===========================================================================
More information about the Squeak-dev mailing list