Enhancement Request: Adding population count for Integers to the standard image

Thu Dec 9 18:35:00 UTC 2004

Richard,

Aye carumba! I just looked at the implementation of Integer, 
LargePostiveInteger, and LargeNegativeInteger, and am shocked. All this 
time I've been happily going along thinking that large integers are 
implemented as an array of bytes holding a twos complement 
representation (with the high bit of the most significant byte in the 
array indicating negativeness). Most of the things I deal with are all 
unsigned so I paid little attention to negative integers. So... I was 
shocked to discover that SmallIntegers are twos complement while large 
integers are ones complement. I guess it's a complement to whomever 
implemented these classes that it took me so long to get to the point 
where I thought to even wonder about the large integer implementation.

Interestingly, I think your statement that -1 is notionally an infinite 
number of 1 bits, is both true and false. In an abstract sort of way, 
sure, -1 can be represented as an infinite string of 1's. However, 
large negative numbers are represented as one's complement which means 
that -1 would theoretically be defined by an infinite string of 1's 
with a zero appended. But, the are comments all over the place talking 
about the need for normalization, so... you might also say in this case 
-1 is unrepresentable as a LargeNegativeInteger because the normalize 
method will demote it into a SmallInteger.

Huh. You learn something new every day.

But it certainly looks like you've uncovered a bug (or something) in 
the LargePositiveInteger bitAnd: method. The following code 
demonstrates some of my concerns.

| m1 m2 m3 |
m1 := -1 bitShift: 31.
m2 := m1 - 1.
m3 := m1 bitAnd: m2.
m1 inspect.
m2 inspect.
m3 inspect.

What one would expect to happen here given the use of one's complement 
for large negative integers is,
- m1 is set to a LargeNegativeInteger represented by the byte array #( 
0 0 0 128 )
- m2 is set to a LargeNegativeInteger represented by the byte array #( 
1 0 0 128 )
- m3 is set to a LargeNegativeInteger represented by the byte array #( 
0 0 0 128 )

But what happens instead is m3 is, as you have demonstrated, set to a 
SmallInteger representation of 0. This makes me wonder about the 
semantics of logical operations on negative numbers. Is there some 
reason that 16r80000000 bitAnd: 16r80000001 is different than m1 
bitAnd: m2 above?

Am I opening a bit can of worms by asking why large integers aren't 
represented as an array of bytes holding the two's complement?

But with respect to the semantics of popCount on negative numbers, 
thanks for the suggestions. I honestly wasn't thinking about that case.

On Dec 8, 2004, at 8:13 PM, Richard A. O'Keefe wrote:

> "Matthew Hamrick" <Matthew.Hamrick at gibson.com> wrote
> (edited somewhat to take fewer lines and be more ASCII-friendly):
>
> 	popCount
> 	 "Answers the number of bits set in an integer."
> 	 | t1 t2 |
> 	 t1 := self.
> 	 t2 := 0.
> 	 [t1 == 0] whileFalse: [t1 := t1 bitAnd: t1 - 1. t2 := t2 + 1].
> 	 ^ t2
>
> What is this supposed to do when the number is negative?
>
>     -1 bitAnd: -2	=> -2
>     -2 bitAnd: -3	=> -4
>     -4 bitAnd: -5	=> -8
>
> In Squeak 3.6-5429-full.image, (-1 popCount) eventually terminates 
> because
>     m := -1 bitShift: 31.
>     m := m bitAnd: m - 1
> sets m to 0, instead of to the correct value (-1 bitShift: 32).
> First time I've ever had Squeak give me the wrong answer for an
> integer calculation.
>
> Notionally, a Squeak negative integer has infinitely many 1 bits, so I
> suggest
>
>     popCount
>       "self >= 0 => answer the number of 1 bits in the binary 
> representation.
>        self <  0 => answer the number of 0 bits in the binary 
> representation.
>        Two's complement is assumed.
>       "
>       |m n|
>       (m := self) < 0 ifTrue: [m := -1 - self].
>       n := 0.
>       [m == 0] whileFalse: [m := m bitAnd: m-1. n := n+1].
>       ^n
>
> Another alternative would be to make it an error to send #popCount to a
> negative integer.
>
>