Tim Felgentreff uploaded a new version of Kernel to project The Trunk:
http://source.squeak.org/trunk/Kernel-tfel.1046.mcz
==================== Summary ====================
Name: Kernel-tfel.1046
Author: tfel
Time: 19 October 2016, 5:34:18.991659 pm
UUID: 12ebf345-b919-2845-8147-56a8b2f4cae4
Ancestors: Kernel-eem.1045
Improve performance of large integer fallback code for multiplication and division with small integers, by avoiding recalculating SmallInteger>>digitLength a lot of times. (This yields a 3x speedup in some integer heavy shootout benchmarks like pidigits for RSqueak, where we don't have a LargeIntegers plugin)
=============== Diff against Kernel-eem.1045 ===============
Item was changed:
----- Method: Integer>>digitDiv:neg: (in category 'private') -----
digitDiv: arg neg: ng
"Answer with an array of (quotient, remainder)."
+ | quo rem ql d div dh dnh dl qhi qlo j l hi lo r3 a t divDigitLength remDigitLength |
- | quo rem ql d div dh dnh dl qhi qlo j l hi lo r3 a t |
<primitive: 'primDigitDivNegative' module:'LargeIntegers'>
arg = 0 ifTrue: [^ (ZeroDivide dividend: self) signal].
"TFEI added this line"
l := self digitLength - arg digitLength + 1.
l <= 0 ifTrue: [^ Array with: 0 with: self].
"shortcut against #highBit"
d := 8 - arg lastDigit highBitOfByte.
div := arg digitLshift: d.
+ divDigitLength := div digitLength + 1.
+ div := div growto: divDigitLength.
- div := div growto: div digitLength + 1.
"shifts so high order word is >=128"
rem := self digitLshift: d.
rem digitLength = self digitLength ifTrue: [rem := rem growto: self digitLength + 1].
+ remDigitLength := rem digitLength.
"makes a copy and shifts"
quo := Integer new: l neg: ng.
+ dl := divDigitLength - 1.
- dl := div digitLength - 1.
"Last actual byte of data"
ql := l.
dh := div digitAt: dl.
dnh := dl = 1
ifTrue: [0]
ifFalse: [div digitAt: dl - 1].
1 to: ql do:
[:k |
"maintain quo*arg+rem=self"
"Estimate rem/div by dividing the leading to bytes of rem by dh."
"The estimate is q = qhi*16+qlo, where qhi and qlo are nibbles."
+ j := remDigitLength + 1 - k.
- j := rem digitLength + 1 - k.
"r1 := rem digitAt: j."
(rem digitAt: j)
= dh
ifTrue: [qhi := qlo := 15
"i.e. q=255"]
ifFalse:
["Compute q = (r1,r2)//dh, t = (r1,r2)\\dh.
Note that r1,r2 are bytes, not nibbles.
Be careful not to generate intermediate results exceeding 13
bits."
"r2 := (rem digitAt: j - 1)."
t := ((rem digitAt: j)
bitShift: 4)
+ ((rem digitAt: j - 1)
bitShift: -4).
qhi := t // dh.
t := (t \\ dh bitShift: 4)
+ ((rem digitAt: j - 1)
bitAnd: 15).
qlo := t // dh.
t := t \\ dh.
"Next compute (hi,lo) := q*dnh"
hi := qhi * dnh.
lo := qlo * dnh + ((hi bitAnd: 15)
bitShift: 4).
hi := (hi bitShift: -4)
+ (lo bitShift: -8).
lo := lo bitAnd: 255.
"Correct overestimate of q.
Max of 2 iterations through loop -- see Knuth vol. 2"
r3 := j < 3
ifTrue: [0]
ifFalse: [rem digitAt: j - 2].
[(t < hi
or: [t = hi and: [r3 < lo]])
and:
["i.e. (t,r3) < (hi,lo)"
qlo := qlo - 1.
lo := lo - dnh.
lo < 0
ifTrue:
[hi := hi - 1.
lo := lo + 256].
hi >= dh]]
whileTrue: [hi := hi - dh].
qlo < 0
ifTrue:
[qhi := qhi - 1.
qlo := qlo + 16]].
"Subtract q*div from rem"
l := j - dl.
a := 0.
+ 1 to: divDigitLength do:
- 1 to: div digitLength do:
[:i |
hi := (div digitAt: i)
* qhi.
lo := a + (rem digitAt: l) - ((hi bitAnd: 15)
bitShift: 4) - ((div digitAt: i)
* qlo).
rem digitAt: l put: lo - (lo // 256 * 256).
"sign-tolerant form of (lo bitAnd: 255)"
a := lo // 256 - (hi bitShift: -4).
l := l + 1].
a < 0
ifTrue:
["Add div back into rem, decrease q by 1"
qlo := qlo - 1.
l := j - dl.
a := 0.
+ 1 to: divDigitLength do:
- 1 to: div digitLength do:
[:i |
a := (a bitShift: -8)
+ (rem digitAt: l) + (div digitAt: i).
rem digitAt: l put: (a bitAnd: 255).
l := l + 1]].
+ quo digitAt: ql + 1 - k put: (qhi bitShift: 4)
- quo digitAt: quo digitLength + 1 - k put: (qhi bitShift: 4)
+ qlo].
rem := rem
digitRshift: d
bytes: 0
lookfirst: dl.
^ Array with: quo with: rem!
Item was changed:
----- Method: Integer>>digitMultiply:neg: (in category 'private') -----
digitMultiply: arg neg: ng
+ | selfLen argLen prod prodLen carry digit k ab |
- | prod prodLen carry digit k ab |
<primitive: 'primDigitMultiplyNegative' module:'LargeIntegers'>
+ argLen := arg digitLength.
+ (argLen = 1 and: [(arg digitAt: 1)
- (arg digitLength = 1 and: [(arg digitAt: 1)
= 0])
ifTrue: [^ 0].
+ selfLen := self digitLength.
+ (selfLen = 1 and: [(self digitAt: 1)
- (self digitLength = 1 and: [(self digitAt: 1)
= 0])
ifTrue: [^ 0].
+ prodLen := selfLen + argLen.
- prodLen := self digitLength + arg digitLength.
prod := Integer new: prodLen neg: ng.
"prod starts out all zero"
+ 1 to: selfLen do: [:i | (digit := self digitAt: i) ~= 0
- 1 to: self digitLength do: [:i | (digit := self digitAt: i) ~= 0
ifTrue:
[k := i.
carry := 0.
"Loop invariant: 0<=carry<=0377, k=i+j-1"
+ 1 to: argLen do:
- 1 to: arg digitLength do:
[:j |
ab := (arg digitAt: j)
* digit + carry + (prod digitAt: k).
carry := ab bitShift: -8.
prod digitAt: k put: (ab bitAnd: 255).
k := k + 1].
prod digitAt: k put: carry]].
^ prod normalize!
Eliot Miranda uploaded a new version of Collections to project The Trunk:
http://source.squeak.org/trunk/Collections-eem.717.mcz
==================== Summary ====================
Name: Collections-eem.717
Author: eem
Time: 13 October 2016, 4:14:34.323953 pm
UUID: ff6f9273-5077-47f9-b660-0cdd4b184bb1
Ancestors: Collections-dtl.716
Revise upwards the cross-over in SequenceableCollection>>atAllPut: at which point to move from a simple loop using at:put: to connivance using replaceFrom:to:with:startingAt:. Add a comment that includes the code to actually test this, instead of simply claiming without support.
=============== Diff against Collections-dtl.716 ===============
Item was changed:
----- Method: SequenceableCollection>>atAllPut: (in category 'accessing') -----
atAllPut: anObject
"Put anObject at every one of the receiver's indices."
| size |
+ (size := self size) > 50 "first method faster for larger sizes; see below"
- (size := self size) > 26 "first method faster from 27 accesses and on"
ifTrue: [self from: 1 to: size put: anObject]
+ ifFalse: [1 to: size do: [:index | self at: index put: anObject]]
+
+ "Here's code to test what's a good cross over."
+ "(1 to: 3) collect:
+ [:j|
+ { Array. ByteArray. FloatArray. WordArray } collect:
+ [:class| | a e |
+ a := class new: 250.
+ e := a at: 1.
+ (1 to: a size) detect:
+ [:n| | t1 t2 |
+ t1 := [1 to: 1000 do: [:i| a from: 1 to: n put: e]] timeToRun.
+ t2 := [1 to: 1000 do: [:i| 1 to: n do: [:index | a at: index put: e]]] timeToRun.
+ t1 < t2]]]"
+ "32-bit Spur x86 #(#(69 54 9 63) #(64 52 10 55) #(63 53 9 61))"
+ "64-bit Spur x86-64 #(#(63 50 10 55) #(60 48 10 54) #(63 44 9 50))"!
- ifFalse: [1 to: size do: [:index | self at: index put: anObject]]!