[squeak-dev] The Trunk: Kernel-nice.589.mcz

commits at source.squeak.org commits at source.squeak.org
Thu May 19 00:42:28 UTC 2011


Nicolas Cellier uploaded a new version of Kernel to project The Trunk:
http://source.squeak.org/trunk/Kernel-nice.589.mcz

==================== Summary ====================

Name: Kernel-nice.589
Author: nice
Time: 19 May 2011, 2:41:37.27 am
UUID: 094e80f6-660c-4e82-803e-7f4a2d5d5a5a
Ancestors: Kernel-dtl.588

Optimize the way to count number of printed digits for Integer
- by using iteration rather than recursion,
- by providing a slightly better guess in the decimal case.
Accelerate #raisedTo:modulo: because this operation is critical in cryptography.
Avoid using recursive #raisedToInteger:modulo: for same speed reasons, and deprecate it.
Sorry to sacrifice simplicity, hope you'll understand..

Implementation notes:
#raisedTo:modulo: now uses the new montgomeryTimes primitive if present.
Otherwise, it fallbacks to classical multiplication, remainder sequences.
The Montgomery algorithm avoids division implied by the modulo operation and thus save a few CPU cycles (see http://en.wikipedia.org/wiki/Montgomery_reduction for an introduction).
Both Montgomery and fallback version use a sliding window algorithm.
It consists in exponentiating by bit packets rather than bit by bit and then save some multiplications.
It is a classical CPU vs memory tradeoff. Hope the comments help if you're interested.
Note that raisedTo: could also use a slidingWindow if we want to. Do we?
In such case, we would have 3 most identical methods and should think of a better refactoring.

=============== Diff against Kernel-dtl.588 ===============

Item was changed:
  ----- Method: Integer>>isProbablyPrimeWithK:andQ: (in category 'private') -----
  isProbablyPrimeWithK: k andQ: q 
  	"Algorithm P, probabilistic primality test, from
  	Knuth, Donald E. 'The Art of Computer Programming', Vol 2,
  	Third Edition, section 4.5.4, page 395, P1-P5 refer to Knuth description."
  
  	"P1"
  
  	| x j y |
  	x := (self - 2) atRandom + 1.
  	"P2"
  	j := 0.
+ 	y := x raisedTo: q modulo: self.
- 	y := x raisedToInteger: q modulo: self.
  	"P3"
  	
  	[((j = 0 and: [y = 1]) or: [y = (self - 1)]) ifTrue: [^true].
  	(j > 0 and: [y = 1]) ifTrue: [^false].
  	"P5"
  	j := j + 1.
  	j < k
  		ifTrue: [y := y squared \\ self]
  		ifFalse:[^ false]] repeat!

Item was added:
+ ----- Method: Integer>>montgomeryRaisedTo:times:modulo:mInvModB: (in category 'private') -----
+ montgomeryRaisedTo: n times: y modulo: m mInvModB: mInv
+ 	"Private - do a Montgomery exponentiation of self modulo m.
+ 	The operation is equivalent to (self/y raisedTo: n)*y \\ m,
+ 	with y is (256 raisedTo: m digitLength),
+ 	with (m bitAnd: 255) * mInv \\ 256 = 255."
+ 	
+ 	| pow j k w index oddPowersOfSelf square |
+ 	
+ 	"Precompute powers of self for odd bit patterns xxxx1 up to length w + 1.
+ 	The width w is chosen with respect to the total bit length of n,
+ 	such that each bit pattern will on average be encoutered P times in the whole bit sequence of n.
+ 	This costs (2 raisedTo: w) multiplications, but more will be saved later (see below)."
+ 	k := n highBit.
+ 	w := (k highBit - 1 >> 1 min: 16) max: 1.
+ 	oddPowersOfSelf := Array new: 1 << w.
+ 	oddPowersOfSelf at: 1 put: (pow := self).
+ 	square := self montgomeryTimes: self modulo: m mInvModB: mInv.
+ 	2 to: oddPowersOfSelf size do: [:i | pow := oddPowersOfSelf at: i put: (pow montgomeryTimes: square modulo: m mInvModB: mInv)].
+ 	
+ 	"Now exponentiate by searching precomputed bit patterns with a sliding window"
+ 	pow := y.
+ 	[k > 0]
+ 		whileTrue:
+ 			[pow := pow montgomeryTimes: pow modulo: m mInvModB: mInv.
+ 			"Skip bits set to zero (the sliding window)"
+ 			(n bitAt: k) = 0
+ 				ifFalse:
+ 					["Find longest odd bit pattern up to window length (w + 1)"
+ 					j := k - w max: 1.
+ 					[j < k and: [(n bitAt: j) = 0]] whileTrue: [j := j + 1].
+ 					"We found a bit pattern of length k-j+1;
+ 					perform the square powers for each bit
+ 					(same cost as bitwise algorithm);
+ 					compute the index of this bit pattern in the precomputed powers."
+ 					index := 0.
+ 					[k > j] whileTrue:
+ 						[pow := pow montgomeryTimes: pow modulo: m mInvModB: mInv.
+ 						index := index << 1 + (n bitAt: k).
+ 						k := k - 1].
+ 					"Perform a single multiplication for the whole bit pattern.
+ 					This saves up to (k-j) multiplications versus a naive algorithm operating bit by bit"
+ 					pow := pow montgomeryTimes: (oddPowersOfSelf at: index + 1) modulo: m mInvModB: mInv].
+ 			k := k - 1].
+ 	^pow!

Item was added:
+ ----- Method: Integer>>montgomeryTimes:modulo:mInvModB: (in category 'private') -----
+ montgomeryTimes: a modulo: m mInvModB: mInv
+ 	"Answer the result of a Montgomery multiplication
+ 	self * a * (256 raisedTo: m digitLength) inv \\ m
+ 	NOTE: it is assumed that:
+ 	self digitLength <= m digitLength
+ 	a digitLength <= m digitLength
+ 	mInv * m \\ 256 = (-1 \\ 256) = 255 (this implies m odd)
+ 	
+ 	Answer nil in case of absent plugin or other failure."
+ 	
+ 	<primitive: 'primMontgomeryTimesModulo' module:'LargeIntegers'>
+ 	^nil!

Item was changed:
  ----- Method: Integer>>numberOfDigitsInBase: (in category 'printing') -----
  numberOfDigitsInBase: b 
  	"Return how many digits are necessary to print this number in base b.
  	This does not count any place for minus sign, radix prefix or whatever.
  	Note that this algorithm may cost a few operations on LargeInteger."
  
+ 	| nDigits q total |
- 	| nDigits q |
  	self negative ifTrue: [^self negated numberOfDigitsInBase: b].
  	self < b ifTrue: [^1].
+ 	b isPowerOfTwo ifTrue: [^self highBit + b highBit - 2 quo: b highBit - 1].
- 	b isPowerOfTwo	ifTrue: [^self highBit + b highBit - 2 quo: b highBit - 1].
  	
  	"A conversion from base 2 to base b has to be performed.
  	This algorithm avoids Float computations like (self log: b) floor + 1,
  	1) because they are inexact
  	2) because LargeInteger might overflow
  	3) because this algorithm might be cheaper than conversion"
  
+ 	q := self.
+ 	total := 0.
+ 	["Make an initial nDigits guess that is lower than or equal to required number of digits"
+ 	nDigits := b = 10
+ 		ifTrue: [((q highBit - 1) * 1233 >> 12) + 1. "This is because (2 log)/(10 log)*4096 is slightly greater than 1233"]
+ 		ifFalse: [q highBit quo: b highBit].
+ 	total := total + nDigits.
+ 	
- 	"Make an initial nDigits guess that is lower than or equal to required number of digits"
- 	b = 10
- 		ifTrue: [nDigits := ((self highBit - 1) * 3 quo: 10) + 1. "This is because 1024 is a little more than a kilo"]
- 		ifFalse: [nDigits := self highBit quo: b highBit].
- 
  	"See how many digits remains above these first nDigits guess"
+ 	(q := q quo: (b raisedTo: nDigits)) < b] whileFalse.
- 	q := self quo: (b raisedTo: nDigits).
  	^q = 0
+ 		ifTrue: [total]
+ 		ifFalse: [total + 1]!
- 		ifTrue: [nDigits]
- 		ifFalse: [nDigits + (q numberOfDigitsInBase: b)]!

Item was changed:
  ----- Method: Integer>>raisedTo:modulo: (in category 'mathematical functions') -----
+ raisedTo: n modulo: m
+ 	"Answer the modular exponential.
+ 	Note: this implementation is optimized for case of large integers raised to large powers."
+ 	| a s mInv |
+ 	n = 0 ifTrue: [^1].
+ 	(self >= m or: [self < 0]) ifTrue: [^self \\ m raisedTo: n modulo: m].
+ 	n < 0 ifTrue: [^(self reciprocalModulo: m) raisedTo: n negated modulo: m].
+ 	(n < 4096 or: [m even])
+ 		ifTrue:
+ 			["Overhead of Montgomery method might cost more than naive divisions, use naive"
+ 			^self slidingLeftRightRaisedTo: n modulo: m].
+ 	
+ 	mInv := 256 - ((m bitAnd: 255) reciprocalModulo: 256).
+  
+ 	"Initialize the result to R=256 raisedTo: m digitLength"
+ 	a := (1 bitShift: m digitLength*8) \\ m.
+ 	
+ 	"Montgomerize self (multiply by R)"
+ 	(s := self montgomeryTimes: (a*a \\ m) modulo: m mInvModB: mInv)
+ 		ifNil:
+ 			["No Montgomery primitive available ? fallback to naive divisions"
+ 			^self slidingLeftRightRaisedTo: n modulo: m].
+ 
+ 	"Exponentiate self*R"
+ 	a := s montgomeryRaisedTo: n times: a modulo: m mInvModB: mInv.
+ 
+ 	"Demontgomerize the result (divide by R)"
+ 	^a montgomeryTimes: 1 modulo: m mInvModB: mInv!
- raisedTo: y modulo: n
- 	"Answer the modular exponential. Code by Jesse Welton."
- 	| s t u |
- 	s := 1.
- 	t := self.
- 	u := y.
- 	[u = 0] whileFalse: [
- 		u odd ifTrue: [
- 			s := s * t.
- 			s >= n ifTrue: [s := s \\\ n]].
- 		t := t * t.
- 		t >= n ifTrue: [t := t \\\ n].
- 		u := u bitShift: -1].
- 	^ s
- !

Item was changed:
  ----- Method: Integer>>raisedToInteger:modulo: (in category 'mathematical functions') -----
  raisedToInteger: exp modulo: m
+ 	self deprecated: 'rather use #raisedTo:modulo: for efficiency'.
  	(exp = 0) ifTrue: [^ 1].
  	exp even
  		ifTrue: [^ (self raisedToInteger: (exp // 2) modulo: m) squared \\ m]
  		ifFalse: [^ (self * (self raisedToInteger: (exp - 1) modulo: m)) \\ m].!

Item was added:
+ ----- Method: Integer>>slidingLeftRightRaisedTo:modulo: (in category 'private') -----
+ slidingLeftRightRaisedTo: n modulo: m
+ 	"Private - compute (self raisedTo: n) \\ m,
+ 	Note: this method has to be fast because it is generally used with large integers in cryptography.
+ 	It thus operate on exponent bits from left to right by packets with a sliding window rather than bit by bit (see below)."
+ 	
+ 	| pow j k w index oddPowersOfSelf square |
+ 	
+ 	"Precompute powers of self for odd bit patterns xxxx1 up to length w + 1.
+ 	The width w is chosen with respect to the total bit length of n,
+ 	such that each bit pattern will on average be encoutered P times in the whole bit sequence of n.
+ 	This costs (2 raisedTo: w) multiplications, but more will be saved later (see below)."
+ 	k := n highBit.
+ 	w := (k highBit - 1 >> 1 min: 16) max: 1.
+ 	oddPowersOfSelf := Array new: 1 << w.
+ 	oddPowersOfSelf at: 1 put: (pow := self).
+ 	square := self * self \\\ m.
+ 	2 to: oddPowersOfSelf size do: [:i | pow := oddPowersOfSelf at: i put: pow * square \\\ m].
+ 	
+ 	"Now exponentiate by searching precomputed bit patterns with a sliding window"
+ 	pow := 1.
+ 	[k > 0]
+ 		whileTrue:
+ 			[pow := pow * pow \\\ m.
+ 			"Skip bits set to zero (the sliding window)"
+ 			(n bitAt: k) = 0
+ 				ifFalse:
+ 					["Find longest odd bit pattern up to window length (w + 1)"
+ 					j := k - w max: 1.
+ 					[j < k and: [(n bitAt: j) = 0]] whileTrue: [j := j + 1].
+ 					"We found an odd bit pattern of length k-j+1;
+ 					perform the square powers for each bit
+ 					(same cost as bitwise algorithm);
+ 					compute the index of this bit pattern in the precomputed powers."
+ 					index := 0.
+ 					[k > j] whileTrue:
+ 						[pow := pow * pow \\\ m.
+ 						index := index << 1 + (n bitAt: k).
+ 						k := k - 1].
+ 					"Perform a single multiplication for the whole bit pattern.
+ 					This saves up to (k-j) multiplications versus a naive algorithm operating bit by bit"
+ 					pow := pow * (oddPowersOfSelf at: index + 1) \\\ m].
+ 			k := k - 1].
+ 	^pow normalize!




More information about the Squeak-dev mailing list