From: Dean Swan@MITEL on 11/01/99 01:07 PM
Before anybody gets too far along on this, I'd like to encourage folks to look at a few references for work already done in the area of multiple precision arithmetic:
1) "Multiple-Precision Arithmetic in C", Kaliski, Burton S., Jr., Dr. Dobb's Journal, August 1992 2) "The Z80180 and Big-number Arithmetic", Kaliski, Burton S., Jr., Dr. Dobb's Journal, September 1993 3) "Arbirtrary Precision Floating Point Arithmetic", Moteller, Frederick C., Dr. Dobb's Journal, September 1993 3) "Using the Multiple Precision Library", Rogers, John, Dr. Dobb's Journal, January 1995 4) "Basic Arithmetic with Inifinite Integers", Hamilton, Jeffery W., Dr. Dobb's Journal January 1995 5) Chapter 20, Section 20.6 "Arithmetic at Arbitrary Precision", Numerical Recipes in C, (c) 1988-1992 Cambridge University Press. http://www.ulib.org/webRoot/Books/Numerical_Recipes/bookcpdf/c20-6.pdf
The first two of these references were written by the Cheif Scientist (at the time, I don't know if he still is) at RSA Laboratories, with cryptographic applications in mind, and he has some interesting things to say about performance.
The _Numerical_Recipes_In_C_ (sorry about the _ underscores for those of you reading your mail with Celeste), shows a neat trick to do fast multiple precision multiplies using the FFT. Since Squeak already has a pretty good implementation of the FFT, this may be worth looking at. It may offer a very efficient way to do LargeInteger multiplies, without new primitives.
A web search on 'multiple or arbitrary precision arithmetic' also will yeild lots of references (of varying quality, as usual). Because of it's uses in cryptography, amongst other things, a lot of work has been done in the area of multiple precision arithmetic since the original implementation of the LargeInteger classes in Squeak.
Also, just my opinion, but if people feel compelled to implement primitives for this stuff, I would definitely like to see them implemented in Smalltalk, as plug-ins. I think Squeak already has too many primitives (all in the name of performance, to be sure), and they just move the whole system further away from one of Dan Ingalls' original design premises, that Smalltalk be a system that could be completely understood by a single person. I would love to see some effort put into simplifying Squeak, without eliminating functionality, of course. Especially since my ideal target environment is the Casio E-105. (BTW, I am making some progress on my Cursor/Sensor changes for the E-105. I'll post an update in the near future).
-Dean Swan
dean_swan@mitel.com
Dean_Swan@Mitel.COM wrote:
........
A web search on 'multiple or arbitrary precision arithmetic' also will yeild lots of references (of varying quality, as usual). Because of it's uses in cryptography, amongst other things, a lot of work has been done in the area of multiple precision arithmetic since the original implementation of the LargeInteger classes in Squeak.
It can also be fruitful to search on 'bignum lisp'.
-- Dwight
Dean:
Thanks for getting the references together; I was just about to do it and your list is better anyway. I'd add:
6) Multiple Precision Arithmetic, chapter 4.3 of Knuth's The Art of Computer Programming, Volume 2, Third Edition.
Various ramblings about other posts to this thread:
* I don't find the current implementation of large integers to be very useful; even with occasional use, it's like hitting a brick wall.
* Space/performance tradeoff - I'd like to see what this is before making decisions. I'm interested in both large Squeak systems on high-end desktop machines AND tiny Squeak systems running in my wristwatch or shirt pocket. It might well be that the space hit is not that large and that fast LargeInteger support on a tiny machine might be very useful, especially if the machine lacks good floating-point support.
* Simple implementations: I'd much prefer a very high-performance implementation. I don't want to take a huge hit when an integer exceeds 30 bits. IE, if it's going to be done it would be good if the final version was extremely fast.
* Optimizing special cases: There are lots of applications where one only needs 32 or 64 bits. (See below). These cases probably should be special.
* Other users of integers: For example, the random number generator in Squeak uses floating-point to get around the 30 bit small integer limit. Virtually all well-tested generators are for 32 bit integers and it is a very major bit of work to create a 30 bit generator. Dan and I found a major performance improvement using floating-point numbers as 'integers' when generating random numbers. If large integers handled 32-bit integers well, then the algorithm could be switched back, especially on small machines where floating-point support is poor or missing.
Dave
At 13:07 -0500 11/1/99, Dean_Swan@Mitel.COM wrote:
From: Dean Swan@MITEL on 11/01/99 01:07 PM
Before anybody gets too far along on this, I'd like to encourage folks to look at a few references for work already done in the area of multiple precision arithmetic:
1) "Multiple-Precision Arithmetic in C", Kaliski, Burton S., Jr., Dr.
Dobb's Journal, August 1992 2) "The Z80180 and Big-number Arithmetic", Kaliski, Burton S., Jr., Dr. Dobb's Journal, September 1993 3) "Arbirtrary Precision Floating Point Arithmetic", Moteller, Frederick C., Dr. Dobb's Journal, September 1993 3) "Using the Multiple Precision Library", Rogers, John, Dr. Dobb's Journal, January 1995 4) "Basic Arithmetic with Inifinite Integers", Hamilton, Jeffery W., Dr. Dobb's Journal January 1995 5) Chapter 20, Section 20.6 "Arithmetic at Arbitrary Precision", Numerical Recipes in C, (c) 1988-1992 Cambridge University Press.
http://www.ulib.org/webRoot/Books/Numerical_Recipes/bookcpdf/c20-6.pdf
The first two of these references were written by the Cheif Scientist (at the time, I don't know if he still is) at RSA Laboratories, with cryptographic applications in mind, and he has some interesting things to say about performance.
The _Numerical_Recipes_In_C_ (sorry about the _ underscores for those of you reading your mail with Celeste), shows a neat trick to do fast multiple precision multiplies using the FFT. Since Squeak already has a pretty good implementation of the FFT, this may be worth looking at. It may offer a very efficient way to do LargeInteger multiplies, without new primitives.
A web search on 'multiple or arbitrary precision arithmetic' also will yeild lots of references (of varying quality, as usual). Because of it's uses in cryptography, amongst other things, a lot of work has been done in the area of multiple precision arithmetic since the original implementation of the LargeInteger classes in Squeak.
Also, just my opinion, but if people feel compelled to implement primitives for this stuff, I would definitely like to see them implemented in Smalltalk, as plug-ins. I think Squeak already has too many primitives (all in the name of performance, to be sure), and they just move the whole system further away from one of Dan Ingalls' original design premises, that Smalltalk be a system that could be completely understood by a single person. I would love to see some effort put into simplifying Squeak, without eliminating functionality, of course. Especially since my ideal target environment is the Casio E-105. (BTW, I am making some progress on my Cursor/Sensor changes for the E-105. I'll post an update in the near future).
-Dean Swan dean_swan@mitel.com
_______________________________ David N. Smith IBM T J Watson Research Center Hawthorne, NY _______________________________ Any opinions or recommendations herein are those of the author and not of his employer.
Dave,
just one remark:
- Other users of integers: For example, the random number generator
in Squeak uses floating-point to get around the 30 bit small integer limit. Virtually all well-tested generators are for 32 bit integers and it is a very major bit of work to create a 30 bit generator. Dan and I found a major performance improvement using floating-point numbers as 'integers' when generating random numbers. If large integers handled 32-bit integers well, then the algorithm could be switched back, especially on small machines where floating-point support is poor or missing.
Random generator are very important for such interesting areas like evolutionary algorithms (GA, ES, GP, etc.) and Artificial Life. I think Squeak could be a good platform for such research and _games_ (I like games...)! I was surprised by a random generator with a quiet short period (is this the right word in English? It means repeating seq) in VW which I used for my Diploma thesis...
Stephan
Dear Squeakers,
I'm thanking you _all_ for your very worthable comments.
After my first attempt under the title 'Better integer performance possible?' there wasn't such a great reaction; now I think there _is_ interest in improvements in this area.
Andrew, David (thank you for the hint with the compile times) and Dean have favorized Plugins as implementation choice (for different reasons).
I think this is a possible variant which makes sense, too. But also I think that's not the main point: - it should be possible for a non C-freak just by reading sources in Smalltalk to get a working idea how its possible to make LargeInteger computations in Smalltalk => this is the didactic approach; - its interesting to have LargeInt arithmetics __as fast as possible__ without the coercion to follow the 'didactic' implementation line => cryptographic, maths, random generators, advertising (for people who compare Squeak with other Smalltalks); - then it should be as portable as possible, so don't try near assembler programming.
I cannot see a significant enlargement of the VM by implementing this in _addition_ to the 'didactic' approach at the moment, but there is no pressing to make a decision now: the plugin variant seems to be valuable for development first and if there is a working plugin there should be no problem to put it in the standard VM if there is an increasing demand to do so (num of downloads of plugin, how many programmers suggest to download it to speed up their work, or similar measures).
Some personal remarks: I want to start this around mid of November (then I have some holidays, now I'm working for money and writing these mails ;-)). But after start I have to - compile Squeak first; - understand the plugin mechanism with a simple example, not 'primitive' ;-) example, what I first have written; - read some stuff about large integer computation or not (but probably it makes sense), etc..
My first timeline was to have some results _before_ 2K. Perhaps this is too optimistic thinking, but it's a nice timeline :-)
With best regards,
Stephan
sr (stephan.rudlof@ipk.fhg.de) "Genius doesn't work on an assembly line basis. You can't simply say, 'Today I will be brilliant.'" -- Kirk, "The Ultimate Computer", stardate 4731.3
At 22:44 +0100 11/1/99, Stephan Rudlof wrote:
Random generator are very important for such interesting areas like evolutionary algorithms (GA, ES, GP, etc.) and Artificial Life. I think Squeak could be a good platform for such research and _games_ (I like games...)! I was surprised by a random generator with a quiet short period (is this the right word in English? It means repeating seq) in VW which I used for my Diploma thesis...
Stephan
The last time I looked (several years ago) it still had a horrid generator with a period of 64K. It's easy to fix; there are Park-Miller generators available for most any Smalltalk system.
I fully agree that evolutionary algorithms is another very interesting place that makes heavy use of randoms. Unfortunately, very high performance is needed for serious work; John Kosa (the GP guy) has abandoned LISP and gone to a system based in C and even used multiprocessing to try to get the performance needed for real-world problems. Maybe plugins can help?
Dave
_______________________________ David N. Smith IBM T J Watson Research Center Hawthorne, NY _______________________________ Any opinions or recommendations herein are those of the author and not of his employer.
At 22:44 +0100 11/1/99, Stephan Rudlof wrote:
Random generator are very important for such interesting areas like evolutionary algorithms (GA, ES, GP, etc.) and Artificial Life. I think Squeak could be a good platform for such research and _games_ (I like games...)! I was surprised by a random generator with a quiet short period (is this the right word in English? It means repeating seq) in VW which I used for my Diploma thesis...
The last time I looked (several years ago) it still had a horrid generator with a period of 64K. It's easy to fix; there are Park-Miller generators available for most any Smalltalk system.
I fully agree that evolutionary algorithms is another very interesting place that makes heavy use of randoms. Unfortunately, very high performance is needed for serious work; John Kosa (the GP guy) has abandoned LISP and gone to a system based in C and even used multiprocessing to try to get the performance needed for real-world problems. Maybe plugins can help?
We recently chose tt800 which is a "twisted Generalized Feedback Shift Register" algorithm with a simple C implementation and a periodicity of 2^800. It has bas passed all the DIEHARD tests quite admirably and is useful for most Monte Carlo type simulations.
Here are some of the better PRNG URLs I found in my quest:
G. Marsaglia, DIEHARD Testing Suite: http://stat.fsu.edu/~geo/dihard.html
P. Hellekalek, "Good Random Number Generators Are (Not So) Easy to Find" published in Mathematics and Computers in Simulation 46(5-6) 1998 pp 485-505. http://www.elsevier.nl/cas/tree/store/matcom/sub/1998/46/5-6/1527.pdf
News on Random Number Generators http://random.mat.sbg.ac.at/news/
and finally,
tt800 source code: http://random.mat.sbg.ac.at/ftp/pub/daa/tt800.c
As I am just coming up to speed on Squeak, perhaps someone else would find this to an interesting exercise: porting the C code to Squeak (or making it a part of the VM).
8-)
-- ************************************************* Jeff Szuhay mailto:jeff.szuhay@pstnet.com Lead Macintosh Engineer voice: 412/271-5040 x 227 Psychology Software Tools http://www.pstnet.com/
At 13:25 -0500 11/2/99, Jeff Szuhay wrote:
...SNIP... and finally,
tt800 source code: http://random.mat.sbg.ac.at/ftp/pub/daa/tt800.c
Jeff:
Can you check this link again? I can get to http://random.mat.sbg.ac.at/ftp/pub but the daa directory is missing and nothing else seems to have a tt800.c file.
Dave _______________________________ David N. Smith IBM T J Watson Research Center Hawthorne, NY _______________________________ Any opinions or recommendations herein are those of the author and not of his employer.
At 13:25 -0500 11/2/99, Jeff Szuhay wrote:
...SNIP... and finally,
tt800 source code: http://random.mat.sbg.ac.at/ftp/pub/daa/tt800.c
Jeff:
Can you check this link again? I can get to http://random.mat.sbg.ac.at/ftp/pub but the daa directory is missing and nothing else seems to have a tt800.c file.
Oops...
change .../daa/... to ... /data/... and the page will appear.
8-)
-- ************************************************* Jeff Szuhay mailto:jeff.szuhay@pstnet.com Lead Macintosh Engineer voice: 412/271-5040 x 227 Psychology Software Tools http://www.pstnet.com/
--============_-1270374434==_============ Content-Type: text/plain; charset="us-ascii" ; format="flowed"
At 14:09 -0500 11/2/99, Jeff Szuhay wrote:
At 13:25 -0500 11/2/99, Jeff Szuhay wrote:
...SNIP... and finally,
tt800 source code: http://random.mat.sbg.ac.at/ftp/pub/daa/tt800.c
Jeff:
Can you check this link again? I can get to http://random.mat.sbg.ac.at/ftp/pub but the daa directory is missing and nothing else seems to have a tt800.c file.
Oops...
change .../daa/... to ... /data/... and the page will appear.
OK, I got it and have converted it to Squeak (with some testing assistance from Jeff). The code is attached. It's basically a straight conversion from C. It uses, by its nature, 32-bit long integers and is slow by a factor of about 30 relative to Squeak's Random class.
References and abstracts to the papers are included in the class comment. The original C code (less its tiny main program) is in a method as a comment. There are class methods that test the generator and also assure that it answers the same results as the C version.
This is a good example of why long integers need to be fast, and how easy it is to hit that performance brick wall.
If you are using Random, you can try this by changing the class name to RandomTT800. The #next method answers the next pseudo-random number and #seed: resets the seed. The generator automatically self seeds from the millisecond clock.
Dave --============_-1270374434==_============ Content-Type: text/plain; name="Random-TT800.4Nove333pm.cs" ; x-mac-type="65417070" ; x-mac-creator="43534F6D" Content-Disposition: attachment; filename="Random-TT800.4Nove333pm.cs" Content-Transfer-Encoding: imap_stub
0,800,2,15507,0,
--============_-1270374434==_============ Content-Type: text/plain; charset="us-ascii" ; format="flowed"
_______________________________ David N. Smith IBM T J Watson Research Center Hawthorne, NY _______________________________ Any opinions or recommendations herein are those of the author and not of his employer. --============_-1270374434==_============--
From: Dean Swan@MITEL on 11/01/99 01:07 PM
Before anybody gets too far along on this, I'd like to encourage folks to look at a few references for work already done in the area of multiple precision arithmetic:
1) "Multiple-Precision Arithmetic in C", Kaliski, Burton S., Jr., Dr.
Dobb's Journal, August 1992 2) "The Z80180 and Big-number Arithmetic", Kaliski, Burton S., Jr., Dr. Dobb's Journal, September 1993 3) "Arbirtrary Precision Floating Point Arithmetic", Moteller, Frederick C., Dr. Dobb's Journal, September 1993 3) "Using the Multiple Precision Library", Rogers, John, Dr. Dobb's Journal, January 1995 4) "Basic Arithmetic with Inifinite Integers", Hamilton, Jeffery W., Dr. Dobb's Journal January 1995 5) Chapter 20, Section 20.6 "Arithmetic at Arbitrary Precision", Numerical Recipes in C, (c) 1988-1992 Cambridge University Press.
http://www.ulib.org/webRoot/Books/Numerical_Recipes/bookcpdf/c20-6.pdf
Thanx!
The first two of these references were written by the Cheif Scientist (at the time, I don't know if he still is) at RSA Laboratories, with cryptographic applications in mind, and he has some interesting things to say about performance.
The _Numerical_Recipes_In_C_ (sorry about the _ underscores for those of you reading your mail with Celeste), shows a neat trick to do fast multiple precision multiplies using the FFT. Since Squeak already has a pretty good implementation of the FFT, this may be worth looking at.
I'm not sure, because I have found big differences between the result plots of slow and fast computation of FFT in squeak (recently tested) and one - missing sign - error in the code (didn't have the time to follow this path further). Try it! One additional reason could be different float precision?!
It may offer a very efficient way to do LargeInteger multiplies, without new primitives.
A web search on 'multiple or arbitrary precision arithmetic' also will yeild lots of references (of varying quality, as usual). Because of it's uses in cryptography, amongst other things, a lot of work has been done in the area of multiple precision arithmetic since the original implementation of the LargeInteger classes in Squeak.
Sure, but my first wish was just to make LargeInterger computation say at least twice as fast as before without the claim to become an expert at the edge of the newest methods before...
Also, just my opinion, but if people feel compelled to implement primitives for this stuff, I would definitely like to see them implemented in Smalltalk, as plug-ins. I think Squeak already has too many primitives (all in the name of performance, to be sure), and they just move the whole system further away from one of Dan Ingalls' original design premises, that Smalltalk be a system that could be completely understood by a single person.
That's a very important point! The didactic point of view to give every interested guy the chance to understand _some_ principles of large integer arithmetics. This gives much credit to the dual approach of doing it in Smalltalk _and_ as Plugin.
I would love to see some effort put into simplifying Squeak, without eliminating functionality, of course. Especially since my ideal target environment is the Casio E-105. (BTW, I am making some progress on my Cursor/Sensor changes for the E-105. I'll post an update in the near future).
-Dean Swan dean_swan@mitel.com
squeak-dev@lists.squeakfoundation.org