Out of sync source code

Lex Spoon lex at cc.gatech.edu
Tue Jan 25 17:56:31 UTC 2000


1-8 seconds is perceptible to me.

Many archiving programs already include checksums.  The only problem is
that some programs try to munge files marked as "text"--a
classic example of software second-guessing its users.  So, how about
we just mark the files as something other than text, and download
things in binary mode?  (My ftp program defaults to binary mode, in
fact, and after 4-6 years I have yet to turn on ascii mode even *once*.)
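
To make the text-mode problem concrete, here is a minimal sketch in C.  It
assumes the "munging" is the usual LF to CR LF line-ending translation; the
message does not say which rewriting is meant.  The names byte_sum and
lf_to_crlf are made up for illustration, and byte_sum is only a stand-in
for whatever checksum an archiver really uses.

```c
#include <stddef.h>

/* Stand-in checksum: add every byte and keep the low 16 bits. */
unsigned byte_sum(const unsigned char *b, size_t n)
{
    unsigned long sum = 0;
    while (n-- > 0)
        sum += *b++;
    return (unsigned)(sum & 0xFFFF);
}

/* Hypothetical model of a text-mode transfer: every LF becomes CR LF.
   The caller must supply room for 2*n bytes in out.
   Returns the number of bytes written. */
size_t lf_to_crlf(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t m = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i] == '\n')
            out[m++] = '\r';   /* the extra byte the transfer adds */
        out[m++] = in[i];
    }
    return m;
}
```

Running both byte_sum(in, n) and byte_sum(out, m) over a buffer that contains
at least one newline gives different lengths and different sums, which is
why a checksum only helps if the file travels in binary mode.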


Lex



"Richard A. O'Keefe" <ok at hermes.otago.ac.nz> wrote:
> 	"Dick" wrote:
> 		Please consider the cost of adding a checksum computation for a
> 		multi-megabyte file to the Squeak startup process.
> 	
> I replied.  Since then, I've written my own program, incorporating
> sum, sum -r, and cksum.  This test was done on an 84MHz SPARCstation 5
> running Solaris 2.7; the program was compiled using
> 	cc -dn -native -fast -xO5
> where cc was Workshop Compilers 4.2 C (30 Oct 1996).  It didn't seem to
> matter whether I used -dn or not.
> 
> For this test, I catted together 20 copies of Squeak2.0.image,
> making a 67MB file.  I obtained the following times, where programs in
> ~/bin are mine and the others are system utilities.
> 
>         wc              => 57u+6s seconds (locale-dependent!)
>         ~/bin/wc        => 15u+6s seconds (old-fashioned)
>         sum             => 21u+6s seconds
>         ~/bin/csum      =>  2u+4s seconds
>         sum -r          => 19u+6s seconds
>         ~/bin/csum -r   =>  5u+4s seconds
>         cksum           => 11u+6s seconds
>         ~/bin/csum -c   =>  6u+4s seconds
> 
> As before, the system time is pretty constant from one to another;
> the programs that take 4s seconds read into a 1MB buffer while the
> others use smaller ones (probably 8kB).
> 
> Here are the times on a 268MHz Alpha 21064:
> 
> 	sum -o		=> 7.5u+1.3s seconds
> 	~/bin/csum	=> 1.8u+1.3s seconds
> 	sum -r		=> 8.2u+1.2s seconds
> 	~/bin/csum -r	=> 2.6u+1.3s seconds
> 	cksum		=> 4.9u+1.3s seconds
> 	~/bin/csum -c	=> 3.6u+1.3s seconds
> 
> This was with
> cc -fast -arch host -tune host -non_shared -O4 -om -assume whole_program
> gcc -O6 -static
> which both gave the same results.
> 
> The inner loops of sum, sum -r, and cksum are pretty simple.
> They are
> 
>     sum
> 	do sum += *b++; while (--n != 0);  
>     sum -r
> 	do sum = ((((sum<<16)|sum)>>1) + *b++) & 0xFFFF; while (--n != 0);
>     cksum
> 	do sum = crc_table[*b++ ^ (sum>>24)] ^ (sum<<8); while (--n != 0);
> 
> respectively, in my code.  It's hard to see what could be done to tweak them.
> The speed difference may simply be due to making the compiler work harder.
> With these compiler options the C compiler unrolls loops like these quite
> competently.  
> 
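
For readers who want to try the quoted inner loops, here is a sketch that
wraps each one in a complete C function.  The function names, the use of
uint32_t, and the run-time table builder are my assumptions, not the code
that was timed.  The table uses polynomial 0x04C11DB7, the one POSIX
specifies for cksum.  Finishing steps (folding for sum, the length suffix
and final complement for cksum) are left out, so the return values are raw
running sums and will not match the utilities' printed output.

```c
#include <stddef.h>
#include <stdint.h>

/* Table for the MSB-first CRC with polynomial 0x04C11DB7.
   Built at run time; call crc_table_init() once before crc_update(). */
static uint32_t crc_table[256];

void crc_table_init(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t c = i << 24;
        for (int k = 0; k < 8; k++)
            c = (c & 0x80000000u) ? (c << 1) ^ 0x04C11DB7u : (c << 1);
        crc_table[i] = c;
    }
}

/* "sum": plain byte sum; any final folding is left to the caller. */
uint32_t sum_update(uint32_t sum, const unsigned char *b, size_t n)
{
    while (n-- > 0)
        sum += *b++;
    return sum;
}

/* "sum -r": rotate the 16-bit sum right by one bit, then add the byte. */
uint32_t sum_r_update(uint32_t sum, const unsigned char *b, size_t n)
{
    while (n-- > 0)
        sum = ((((sum << 16) | sum) >> 1) + *b++) & 0xFFFF;
    return sum;
}

/* "cksum": one table lookup per byte; no length suffix, no complement. */
uint32_t crc_update(uint32_t sum, const unsigned char *b, size_t n)
{
    while (n-- > 0)
        sum = crc_table[*b++ ^ (sum >> 24)] ^ (sum << 8);
    return sum;
}
```

Each function takes the running value and returns the new one, so a caller
can feed a file through in chunks (a 1MB buffer, say) and carry the result
from one chunk to the next.
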
> Leave out the I/O cost (fair since the image has to be read anyway),
> and the checksums are calculated at
>     33 MB/s		~/bin/csum
>     13 MB/s		~/bin/csum -r
>     11 MB/s		~/bin/csum -c (like cksum)
> on the slower machine.
> 
> The conclusion that doing a checksum (not necessarily one of these) would
> not perceptibly slow down reading most images stands.  The Squeak 2.7
> image is "5.9MB on disk (6,273,176)" on a Mac, so we could be looking at
> 0.4 seconds on an old machine, 0.1 to 0.2 on a modern one.
> 
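
The arithmetic behind those rates and estimates can be sketched in C as
below.  The function names are made up, and taking 1 MB as 1,000,000 bytes
is my assumption; the message does not say which megabyte is meant.

```c
/* Throughput in MB/s from a file size in megabytes and the user
   (CPU) seconds spent checksumming it. */
double throughput_mb_per_s(double megabytes, double user_seconds)
{
    return megabytes / user_seconds;
}

/* Estimated checksum time in seconds for a file of `bytes` bytes,
   assuming 1 MB = 1,000,000 bytes. */
double checksum_seconds(double bytes, double mb_per_s)
{
    return bytes / (mb_per_s * 1e6);
}
```

With the 67MB test file and user times of 2, 5 and 6 seconds,
throughput_mb_per_s gives about 33, 13 and 11 MB/s.  For the
6,273,176-byte image, checksum_seconds at those same rates gives roughly
0.2 to 0.6 seconds, the same order as the figure quoted for an old machine.
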
> Given the flakiness I've experienced with every version of Squeak,
> it would be nice to know whether the problems are due to Squeak bugs
> or network corruption.  (Example:  take the "Welcome To" window you get
> when you first start, and bring up the left margin menu.  Selecting
> more... crashed in Squeak 2.6 and has no discernible effect in 2.7.  In
> the "Getting Started" window it works.  I demonstrated it to another
> staff member today and ended up looking remarkably silly, with that and
> other problems.)




