[Vm-dev] VM Maker: VMMaker.oscog-eem.567.mcz
commits at source.squeak.org
Fri Dec 20 23:50:53 UTC 2013
Eliot Miranda uploaded a new version of VMMaker to project VM Maker:
http://source.squeak.org/VMMaker/VMMaker.oscog-eem.567.mcz
==================== Summary ====================
Name: VMMaker.oscog-eem.567
Author: eem
Time: 20 December 2013, 3:47:15.976 pm
UUID: 88799310-3943-4468-b8ee-4c007e7f98e7
Ancestors: VMMaker.oscog-eem.565
Commit the takeaways, which are that
a) 4-byte entry-point alignment is as good as 8-byte alignment on Core i7, and
b) the older entry-point code, which branches backward for immediates,
is significantly faster for non-immediates; since we expect most
SmallInteger code to be performed in-line, it is better to prefer
non-immediate send performance.
N.B. None of this would be an issue with 30-bit immediates.
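The tag collapsing described in the method comment below can be sketched in C. This is an illustrative standalone function, not the VM's own code (which emits the equivalent as machine instructions); the tag values assumed are Spur 32-bit's: 1 and 3 mark SmallIntegers, 2 marks Characters, 0 marks object pointers, with the class index in the low 22 bits of the header word.

```c
#include <stdint.h>

/* Compute the inline cache class tag for an oop, as the generated code does.
   For immediates the 2-bit tag is collapsed: 1 & 3 -> 1 (SmallInteger),
   2 -> 0 (Character). For pointers the tag is the receiver's classIndex,
   taken from the low 22 bits of its header word. Names are illustrative. */
uint32_t inlineCacheClassTag(uint32_t oop, uint32_t headerWord)
{
    uint32_t tag = oop & 0x3;        /* andl $0x3, rDest */
    if (tag != 0)                    /* jnz Limm         */
        return tag & 0x1;            /* andl $1, rDest   */
    return headerWord & 0x3fffff;    /* classIndexMask   */
}
```

Note that the immediate path needs only the low bit of the tag, which is why the generated code can collapse both SmallInteger tags with a single `andl $1`.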
=============== Diff against VMMaker.oscog-eem.565 ===============
Item was changed:
----- Method: CogObjectRepresentationFor32BitSpur>>getInlineCacheClassTagFrom:into: (in category 'compile abstract instructions') -----
getInlineCacheClassTagFrom: sourceReg into: destReg
"Extract the inline cache tag for the object in sourceReg into destReg. The inline
cache tag for a given object is the value loaded in inline caches to distinguish
objects of different classes. In Spur this is either the tags for immediates, (with
1 & 3 collapsed to 1 for SmallIntegers, and 2 collapsed to 0 for Characters), or
the receiver's classIndex. Generate something like this:
+ Limm:
+ andl $0x1, rDest
+ j Lcmp
Lentry:
movl rSource, rDest
andl $0x3, rDest
+ jnz Limm
- jz LnotImm
- andl $1, rDest
- j Lcmp
- LnotImm:
movl 0(%edx), rDest
andl $0x3fffff, rDest
Lcmp:
+ At least on a 2.2GHz Intel Core i7 the following is slightly faster than the above,
+ 136m sends/sec vs 130m sends/sec for nfib in tinyBenchmarks
- At least on a 2.2GHz Intel Core i7 it is slightly faster,
- 136m sends/sec vs 130m sends/sec for nfib in tinyBenchmarks, than
- Limm:
- andl $0x1, rDest
- j Lcmp
Lentry:
movl rSource, rDest
andl $0x3, rDest
+ jz LnotImm
+ andl $1, rDest
+ j Lcmp
+ LnotImm:
- jnz Limm
movl 0(%edx), rDest
andl $0x3fffff, rDest
Lcmp:
+ But we expect most SmallInteger arithmetic to be performed in-line and so prefer the
+ version that is faster for non-immediates (because it branches for immediates only)."
- "
| immLabel jumpNotImm entryLabel jumpCompare |
<var: #immLabel type: #'AbstractInstruction *'>
<var: #jumpNotImm type: #'AbstractInstruction *'>
<var: #entryLabel type: #'AbstractInstruction *'>
<var: #jumpCompare type: #'AbstractInstruction *'>
+ false
- true
ifTrue:
+ [cogit AlignmentNops: BytesPerWord.
- [cogit AlignmentNops: (BytesPerWord max: 8).
entryLabel := cogit Label.
cogit MoveR: sourceReg R: destReg.
cogit AndCq: objectMemory tagMask R: destReg.
jumpNotImm := cogit JumpZero: 0.
cogit AndCq: 1 R: destReg.
jumpCompare := cogit Jump: 0.
"Get least significant half of header word in destReg"
self flag: #endianness.
jumpNotImm jmpTarget:
(cogit MoveMw: 0 r: sourceReg R: destReg).
jumpCompare jmpTarget:
(cogit AndCq: objectMemory classIndexMask R: destReg)]
ifFalse:
[cogit AlignmentNops: BytesPerWord.
immLabel := cogit Label.
cogit AndCq: 1 R: destReg.
jumpCompare := cogit Jump: 0.
cogit AlignmentNops: BytesPerWord.
entryLabel := cogit Label.
cogit MoveR: sourceReg R: destReg.
cogit AndCq: objectMemory tagMask R: destReg.
cogit JumpNonZero: immLabel.
self flag: #endianness.
"Get least significant half of header word in destReg"
cogit MoveMw: 0 r: sourceReg R: destReg.
cogit AndCq: objectMemory classIndexMask R: destReg.
jumpCompare jmpTarget: cogit Label].
^entryLabel!
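Both arms above begin with `cogit AlignmentNops: BytesPerWord`, padding the code stream so the entry label lands on a word boundary (finding a), 4-byte alignment sufficing on Core i7. A minimal sketch of that padding arithmetic, assuming a power-of-two boundary (the helper name is illustrative, not the Cogit's):

```c
#include <stdint.h>

/* Number of nop bytes needed to pad a code address up to the next
   power-of-two boundary; zero if the address is already aligned.
   This is what AlignmentNops: achieves abstractly in the code stream. */
uint32_t alignmentNopBytes(uint32_t address, uint32_t boundary)
{
    return (boundary - (address & (boundary - 1))) & (boundary - 1);
}
```

For example, an entry point at address 0x1006 needs 2 nop bytes to reach a 4-byte boundary, and 2 more would be needed for an 8-byte one; the commit's measurement is that paying for the larger padding buys nothing on this microarchitecture.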