Enhancement Request: Adding population count for Integers to
the standard image
Matthew S. Hamrick
matthew.hamrick at gibson.com
Thu Dec 9 19:30:46 UTC 2004
Aye carumba! I just looked at the implementation of Integer,
LargePostiveInteger, and LargeNegativeInteger, and am shocked. All this
time I've been happily going along thinking that large integers are
implemented as an array of bytes holding a twos complement
representation (with the high bit of the most significant byte in the
array indicating negativeness). Most of the things I deal with are all
unsigned so I paid little attention to negative integers. So... I was
shocked to discover that SmallIntegers are twos complement while large
integers are ones complement. I guess it's a complement to whomever
implemented these classes that it took me so long to get to the point
where I thought to even wonder about the large integer implementation.
Interestingly, I think your statement that -1 is notionally an infinite
number of 1 bits, is both true and false. In an abstract sort of way,
sure, -1 can be represented as an infinite string of 1's. However,
large negative numbers are represented as one's complement which means
that -1 would theoretically be defined by an infinite string of 1's
with a zero appended. But, the are comments all over the place talking
about the need for normalization, so... you might also say in this case
-1 is unrepresentable as a LargeNegativeInteger because the normalize
method will demote it into a SmallInteger.
Huh. You learn something new every day.
But it certainly looks like you've uncovered a bug (or something) in
the LargePositiveInteger bitAnd: method. The following code
demonstrates some of my concerns.
| m1 m2 m3 |
m1 := -1 bitShift: 31.
m2 := m1 - 1.
m3 := m1 bitAnd: m2.
What one would expect to happen here given the use of one's complement
for large negative integers is,
- m1 is set to a LargeNegativeInteger represented by the byte array #(
0 0 0 128 )
- m2 is set to a LargeNegativeInteger represented by the byte array #(
1 0 0 128 )
- m3 is set to a LargeNegativeInteger represented by the byte array #(
0 0 0 128 )
But what happens instead is m3 is, as you have demonstrated, set to a
SmallInteger representation of 0. This makes me wonder about the
semantics of logical operations on negative numbers. Is there some
reason that 16r80000000 bitAnd: 16r80000001 is different than m1
bitAnd: m2 above?
Am I opening a bit can of worms by asking why large integers aren't
represented as an array of bytes holding the two's complement?
But with respect to the semantics of popCount on negative numbers,
thanks for the suggestions. I honestly wasn't thinking about that case.
On Dec 8, 2004, at 8:13 PM, Richard A. O'Keefe wrote:
> "Matthew Hamrick" <Matthew.Hamrick at gibson.com> wrote
> (edited somewhat to take fewer lines and be more ASCII-friendly):
> "Answers the number of bits set in an integer."
> | t1 t2 |
> t1 := self.
> t2 := 0.
> [t1 == 0] whileFalse: [t1 := t1 bitAnd: t1 - 1. t2 := t2 + 1].
> ^ t2
> What is this supposed to do when the number is negative?
> -1 bitAnd: -2 => -2
> -2 bitAnd: -3 => -4
> -4 bitAnd: -5 => -8
> In Squeak 3.6-5429-full.image, (-1 popCount) eventually terminates
> m := -1 bitShift: 31.
> m := m bitAnd: m - 1
> sets m to 0, instead of to the correct value (-1 bitShift: 32).
> First time I've ever had Squeak give me the wrong answer for an
> integer calculation.
> Notionally, a Squeak negative integer has infinitely many 1 bits, so I
> "self >= 0 => answer the number of 1 bits in the binary
> self < 0 => answer the number of 0 bits in the binary
> Two's complement is assumed.
> |m n|
> (m := self) < 0 ifTrue: [m := -1 - self].
> n := 0.
> [m == 0] whileFalse: [m := m bitAnd: m-1. n := n+1].
> Another alternative would be to make it an error to send #popCount to a
> negative integer.
More information about the Squeak-dev