Cost of Squeak

John M McIntosh johnmci at smalltalkconsulting.com
Tue Nov 16 19:54:01 UTC 2004


I poked at that back in July of 2003; notes are included below.

On Nov 16, 2004, at 11:29 AM, tim Rowledge wrote:

> Andreas Raab wrote:
>
>>> Also, I'm guessing C which is auto-generated (from Slang) will be
>>> more verbose than hand-crafted C...
>> Definitely. And don't forget the inliner - it manually inlines  
>> functions therefore dramatically increasing the LOC count.
> We really must get around to trying the experiment of using the inline  
> function spec for the C compiler to see if it works better. GCC & RISC  
> OS CC do it so that covers about all the bases.
>
> tim
>
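
For reference, a minimal sketch of what tim's experiment might look like:
the two small routines declared `static inline` so that the C compiler,
rather than the Slang inliner, decides whether to expand them. The
InterpState struct is a hypothetical, pared-down stand-in for the real
interpreter state, and whether a given compiler actually inlines these
(typically only with optimization enabled) is up to that compiler.

```c
#include <stdint.h>

/* Hypothetical, pared-down interpreter state; the real VM keeps many
   more fields in its state structure. */
typedef struct {
    const uint8_t *localIP;      /* address of the last bytecode fetched */
    int32_t        currentBytecode;
} InterpState;

/* Advance the instruction pointer first, then load the byte it points at. */
static inline int32_t fetchByte(InterpState *s)
{
    return *++s->localIP;
}

/* Fetch the next bytecode into the interpreter state. */
static inline void fetchNextBytecode(InterpState *s)
{
    s->currentBytecode = fetchByte(s);
}
```

Called as fetchNextBytecode(&state) with state.localIP pointing at the
bytecode just executed, this leaves state.currentBytecode holding the
following byte.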

From a note of July 2003:
For the curious, here is where the time goes when running the
macrobenchmarks with a VM built without inlining.
This gives a better breakdown of the 40% or so that we see attributed
to interpret() in the inlined version.

	14.2%	fetchNextBytecode		
	11.9%	interpret	(40% of that 11.9% is taken resolving the case statement)
	4.4%	fetchByte	(3.6% of this is via fetchNextBytecode)
	3.9%	internalActivateNewMethod	
	3.4%	internalFetchContextRegisters	
	3.4%	fetchPointerofObject		
	2.5%	internalStackValue		
	2.4%	internalPush		
	2.2%	startField		
	1.9%	internalExecuteNewMethod	
	1.8%	booleanCheat		
	1.7%	fetchClassOf		
	1.6%	lookupInMethodCacheSelclass	
	1.5%	pushTemporaryVariable		
	1.3%	upward		
	1.3%	headerType		
	1.3%	startObj		
	1.2%	instantiateSmallClasssizeInBytes		
	1.1%	internalStoreContextRegisters		
	1.1%	oopFromChunk		
	1.1%	quickFetchIntegerofObject		
	1.1%	storePointerUncheckedofObjectwithValue	
	1.1%	remapFieldsAndClassOf		
	1.0%	internalPop		
	1.0%	markAndTrace		
	1.0%	internalStackTop		
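
As a rough check on the listing above, here is a small sketch that adds up
the listed entries and works out the case-statement share of total time.
The figures are copied from the listing; the function names are made up
for illustration only.

```c
#include <stddef.h>

/* Percent-of-total figures copied from the profile listing, in order. */
static const double profilePct[] = {
    14.2, 11.9, 4.4, 3.9, 3.4, 3.4, 2.5, 2.4, 2.2, 1.9, 1.8, 1.7, 1.6,
    1.5, 1.3, 1.3, 1.3, 1.2, 1.1, 1.1, 1.1, 1.1, 1.1, 1.0, 1.0, 1.0
};

/* Sum of all listed entries, in percent of total time. */
double listedTotal(void)
{
    double sum = 0.0;
    for (size_t i = 0; i < sizeof profilePct / sizeof profilePct[0]; i++)
        sum += profilePct[i];
    return sum;
}

/* 40% of interpret's 11.9%, expressed in percent of total time. */
double caseDispatchShare(void)
{
    return 0.40 * 11.9;
}
```

The 26 listed routines come to about 70.4% of the total, and resolving
the case statement works out to roughly 4.8% of total time.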

For interest's sake, I looked a bit more into this. Just by
post-processing the generated C code, I noticed that
case 7:
			/* pushReceiverVariableBytecode */
			{
				fetchNextBytecode();
				pushReceiverVariable(7 & 15);
			}

000000D8: 48000001  bl         .fetchNextBytecode
000000DC: 38600007  li         r3,7
000000E0: 48000001  bl         .pushReceiverVariable

which, once the calls are followed, really is...

000000D4: 48000001  bl         .fetchNextBytecode

00000000: 7C0802A6  mflr       r0        /*  fetchNextBytecode */
00000004: 90010008  stw        r0,8(SP)
00000008: 9421FFC0  stwu       SP,-64(SP)
0000000C: 48000001  bl         .fetchByte

00000000: 80620000  lwz        r3,foo(RTOC) /* fetchByte */
00000004: 80830000  lwz        r4,0(r3)
00000008: 8064011C  lwz        r3,284(r4)
0000000C: 38630001  addi       r3,r3,1
00000010: 9064011C  stw        r3,284(r4)
00000014: 88630000  lbz        r3,0(r3)
00000018: 4E800020  blr


00000010: 80820000  lwz        r4,foo(RTOC) /*  fetchNextBytecode */
00000014: 80840000  lwz        r4,0(r4)
00000018: 90640034  stw        r3,52(r4)
0000001C: 80010048  lwz        r0,72(SP)
00000020: 38210040  addi       SP,SP,64
00000024: 7C0803A6  mtlr       r0
00000028: 4E800020  blr


000000D8: 38600007  li         r3,7
000000DC: 48000001  bl         .pushReceiverVariable


can be altered to this, because that is all fetchNextBytecode
really does...

		case 7:
			/* pushReceiverVariableBytecode */
			{
				foo->currentBytecode = byteAt(++foo->localIP);
				pushReceiverVariable(7 & 15);
			}

0000015C: 809F011C  lwz        r4,284(r31)
00000160: 38600007  li         r3,7
00000164: 38840001  addi       r4,r4,1
00000168: 909F011C  stw        r4,284(r31)
0000016C: 88040000  lbz        r0,0(r4)
00000170: 901F0034  stw        r0,52(r31)
00000174: 48000001  bl         .pushReceiverVariable


21 instructions versus 7. It would be a bit more work to turn every
foo->localIP into just localIP, but nothing that a bit of
post-processing string manipulation could not do...
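
A minimal sketch of that kind of post-processing, assuming the generated
source is handled one line at a time. The helper names replaceAll and
inlineFetchNextBytecode are hypothetical, not part of the VM or its build
tools, and a purely textual match like this would need checking against
any other places the call text appears.

```c
#include <string.h>

/* Copy `line` into `out` (capacity outSize), replacing every occurrence
   of `from` with `to`. Returns the number of replacements made, or -1
   if the result does not fit or `from` is empty. */
int replaceAll(const char *line, const char *from, const char *to,
               char *out, size_t outSize)
{
    size_t fromLen = strlen(from), toLen = strlen(to), used = 0;
    const char *hit;
    int count = 0;

    if (fromLen == 0)
        return -1;                     /* an empty pattern never advances */

    while ((hit = strstr(line, from)) != NULL) {
        size_t prefix = (size_t)(hit - line);
        if (used + prefix + toLen >= outSize)
            return -1;
        memcpy(out + used, line, prefix);   /* text before the match */
        used += prefix;
        memcpy(out + used, to, toLen);      /* the replacement */
        used += toLen;
        line = hit + fromLen;               /* continue after the match */
        count++;
    }

    size_t rest = strlen(line);
    if (used + rest >= outSize)
        return -1;
    memcpy(out + used, line, rest + 1);     /* tail plus terminating NUL */
    return count;
}

/* The specific rewrite discussed above: expand the call in place. */
int inlineFetchNextBytecode(const char *line, char *out, size_t outSize)
{
    return replaceAll(line, "fetchNextBytecode();",
                      "foo->currentBytecode = byteAt(++foo->localIP);",
                      out, outSize);
}
```

The same replaceAll helper could be pointed at "foo->localIP" and
"localIP" for the second rewrite, provided a local variable of that name
is declared in the enclosing function.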


--
===========================================================================
John M. McIntosh <johnmci at smalltalkconsulting.com> 1-800-477-2659
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================



