2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----
- cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen into:
pByteRes
| z |
- cDigitSub: pByteSmall
len: smallLen
with: pByteLarge
len: largeLen
into: pByteRes
| z limit | <var: #pByteSmall type: 'unsigned char * '> <var: #pByteLarge type: 'unsigned char * '> <var: #pByteRes type: 'unsigned char * '>
z := 0. "Loop invariant is -1<=z<=1"
0 to: smallLen - 1 do:
z := 0.
"Loop invariant is -1<=z<=1"
limit := smallLen - 1.
0 to: limit do: [:i | z := z + (pByteLarge at: i) - (pByteSmall at: i).
pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant
form of (z bitAnd: 255)"
Frankly, having z declared unsigned int and just doing pByteRes at: i put: (z bitAnd: 16rFF) as I suggested would be way way simpler and will ALWAYS work. Why the hell invoke the complications of signed arithmetic when the content pByteRes is unsigned???
Even if z is declared signed, (z bitAnd: 255) would work on a vast majority of compiler/processor because most compiler/processor would use a 2-complement representation. But it's technically implementation-defined.
+ z := z signedBitShift: -8].
i think this one is OK too on a vast majority of machines, but this is as well technically implementation defined. Some weird compiler/processor pair may as well fill left bits with zeroes, or not even use 2-complement...
z here is a carry and should contain either 0, or -1 (that is UINT_MAX) after this operation.
smallLen to: largeLen - 1 do:
pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant form of (z bitAnd: 255)"
z := z // 256].
limit := largeLen - 1.
smallLen to: limit do: [:i | z := z + (pByteLarge at: i) .
pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant
form of (z bitAnd: 255)"
z := z signedBitShift: -8].
pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant form of (z bitAnd: 255)"
!z := z // 256].
Isn't there another funny signed shift in Large Integer division? This would deserve a review too, because division is inlining its own sort of subtraction...
On Tue, Jul 01, 2014 at 09:21:00PM +0200, Nicolas Cellier wrote:
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----
- cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen into:
pByteRes
| z |
- cDigitSub: pByteSmall
len: smallLen
with: pByteLarge
len: largeLen
into: pByteRes
| z limit | <var: #pByteSmall type: 'unsigned char * '> <var: #pByteLarge type: 'unsigned char * '> <var: #pByteRes type: 'unsigned char * '>
z := 0. "Loop invariant is -1<=z<=1"
0 to: smallLen - 1 do:
z := 0.
"Loop invariant is -1<=z<=1"
limit := smallLen - 1.
0 to: limit do: [:i | z := z + (pByteLarge at: i) - (pByteSmall at: i).
pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant
form of (z bitAnd: 255)"
Frankly, having z declared unsigned int and just doing pByteRes at: i put: (z bitAnd: 16rFF) as I suggested would be way way simpler and will ALWAYS work. Why the hell invoke the complications of signed arithmetic when the content pByteRes is unsigned???
Nicolas,
This sounds right to me.
Do you have a final version of #cDigitSub:len:with:len:into: that you can recommend? The version in VMMaker-nice.342 produces failures in the KernelTests-Numbers tests when compiled in 64-bit mode, but I think that you have since suggested some other improvements in this email thread.
Dave
2014-07-02 0:28 GMT+02:00 David T. Lewis lewis@mail.msen.com:
On Tue, Jul 01, 2014 at 09:21:00PM +0200, Nicolas Cellier wrote:
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----
- cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen
into:
pByteRes
| z |
- cDigitSub: pByteSmall
len: smallLen
with: pByteLarge
len: largeLen
into: pByteRes
| z limit | <var: #pByteSmall type: 'unsigned char * '> <var: #pByteLarge type: 'unsigned char * '> <var: #pByteRes type: 'unsigned char * '>
z := 0. "Loop invariant is -1<=z<=1"
0 to: smallLen - 1 do:
z := 0.
"Loop invariant is -1<=z<=1"
limit := smallLen - 1.
0 to: limit do: [:i | z := z + (pByteLarge at: i) - (pByteSmall at: i).
pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant
form of (z bitAnd: 255)"
Frankly, having z declared unsigned int and just doing pByteRes at: i
put:
(z bitAnd: 16rFF) as I suggested would be way way simpler and will ALWAYS work. Why the hell invoke the complications of signed arithmetic when the
content
pByteRes is unsigned???
Nicolas,
This sounds right to me.
Do you have a final version of #cDigitSub:len:with:len:into: that you can recommend? The version in VMMaker-nice.342 produces failures in the KernelTests-Numbers tests when compiled in 64-bit mode, but I think that you have since suggested some other improvements in this email thread.
Dave
Hi David, that's interesting. VMMaker-nice.342 contains all the 32bits LargeInteger modifications, not just the #cDigitSub:len:with:len:into:. I tried to declare everything as <unsigned int> rather than <usqInt> with the assumption that sizeof(unsigned int)*2==sizeof(unsigned long long). But I never tried to compile a 64 bits VM, I think I lost the recipe for brewing such a flavour. I'm on MacOSX, could you provide a link to the shortest way to do it? Once I can check it, I'll publish a corrected version. Thanks
Nicolas
On Wed, Jul 02, 2014 at 07:44:26AM +0200, Nicolas Cellier wrote:
2014-07-02 0:28 GMT+02:00 David T. Lewis lewis@mail.msen.com:
On Tue, Jul 01, 2014 at 09:21:00PM +0200, Nicolas Cellier wrote:
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----
- cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen
into:
pByteRes
| z |
- cDigitSub: pByteSmall
len: smallLen
with: pByteLarge
len: largeLen
into: pByteRes
| z limit | <var: #pByteSmall type: 'unsigned char * '> <var: #pByteLarge type: 'unsigned char * '> <var: #pByteRes type: 'unsigned char * '>
z := 0. "Loop invariant is -1<=z<=1"
0 to: smallLen - 1 do:
z := 0.
"Loop invariant is -1<=z<=1"
limit := smallLen - 1.
0 to: limit do: [:i | z := z + (pByteLarge at: i) - (pByteSmall at: i).
pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant
form of (z bitAnd: 255)"
Frankly, having z declared unsigned int and just doing pByteRes at: i
put:
(z bitAnd: 16rFF) as I suggested would be way way simpler and will ALWAYS work. Why the hell invoke the complications of signed arithmetic when the
content
pByteRes is unsigned???
Nicolas,
This sounds right to me.
Do you have a final version of #cDigitSub:len:with:len:into: that you can recommend? The version in VMMaker-nice.342 produces failures in the KernelTests-Numbers tests when compiled in 64-bit mode, but I think that you have since suggested some other improvements in this email thread.
Dave
Hi David, that's interesting. VMMaker-nice.342 contains all the 32bits LargeInteger modifications, not just the #cDigitSub:len:with:len:into:. I tried to declare everything as <unsigned int> rather than <usqInt> with the assumption that sizeof(unsigned int)*2==sizeof(unsigned long long). But I never tried to compile a 64 bits VM, I think I lost the recipe for brewing such a flavour. I'm on MacOSX, could you provide a link to the shortest way to do it? Once I can check it, I'll publish a corrected version. Thanks
Hi Nicolas,
Starting with an up to date VMMaker (trunk), I loaded #cDigitSub:len:with:len:into: from VMMaker-nice.342, then compiled the VM on 64-bit Linux and ran the unit tests.
I also ran tests on the 64-bit image with 64-bit VM, and this did *not* show any problem.
Sorry I did not look into it further, I will try to do so as soon as I get some time.
Dave
2014-07-02 13:27 GMT+02:00 David T. Lewis lewis@mail.msen.com:
On Wed, Jul 02, 2014 at 07:44:26AM +0200, Nicolas Cellier wrote:
2014-07-02 0:28 GMT+02:00 David T. Lewis lewis@mail.msen.com:
On Tue, Jul 01, 2014 at 09:21:00PM +0200, Nicolas Cellier wrote:
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into:
(in
category 'C core') -----
- cDigitSub: pByteSmall len: smallLen with: pByteLarge len:
largeLen
into:
pByteRes
| z |
- cDigitSub: pByteSmall
len: smallLen
with: pByteLarge
len: largeLen
into: pByteRes
| z limit | <var: #pByteSmall type: 'unsigned char * '> <var: #pByteLarge type: 'unsigned char * '> <var: #pByteRes type: 'unsigned char * '>
z := 0. "Loop invariant is -1<=z<=1"
0 to: smallLen - 1 do:
z := 0.
"Loop invariant is -1<=z<=1"
limit := smallLen - 1.
0 to: limit do: [:i | z := z + (pByteLarge at: i) - (pByteSmall at: i).
pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant
form of (z bitAnd: 255)"
Frankly, having z declared unsigned int and just doing pByteRes at:
i
put:
(z bitAnd: 16rFF) as I suggested would be way way simpler and will
ALWAYS
work. Why the hell invoke the complications of signed arithmetic when the
content
pByteRes is unsigned???
Nicolas,
This sounds right to me.
Do you have a final version of #cDigitSub:len:with:len:into: that you
can
recommend? The version in VMMaker-nice.342 produces failures in the KernelTests-Numbers tests when compiled in 64-bit mode, but I think
that
you have since suggested some other improvements in this email thread.
Dave
Hi David, that's interesting. VMMaker-nice.342 contains all the 32bits LargeInteger modifications, not just the #cDigitSub:len:with:len:into:. I tried to declare everything as <unsigned int> rather than <usqInt> with the assumption that sizeof(unsigned int)*2==sizeof(unsigned long long). But I never tried to compile a 64 bits VM, I think I lost the recipe for brewing such a flavour. I'm on MacOSX, could you provide a link to the shortest way to do it? Once I can check it, I'll publish a corrected version. Thanks
Hi Nicolas,
Starting with an up to date VMMaker (trunk), I loaded #cDigitSub:len:with:len:into: from VMMaker-nice.342, then compiled the VM on 64-bit Linux and ran the unit tests.
Ah, but you can't pick this single method from LargeIntegerPlugin v2.0... It tries to access/process by 32-bits digits instead of 8-bits digits. So it would be necessary to also change the send site for providing the right number of digits... I'll do a rewrite for 8 bits this evening...
Nicolas
I also ran tests on the 64-bit image with 64-bit VM, and this did *not* show
any problem.
Sorry I did not look into it further, I will try to do so as soon as I get some time.
Dave
On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----
- cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen
into: pByteRes
| z |
- cDigitSub: pByteSmall
len: smallLen
with: pByteLarge
len: largeLen
into: pByteRes
| z limit | <var: #pByteSmall type: 'unsigned char * '> <var: #pByteLarge type: 'unsigned char * '> <var: #pByteRes type: 'unsigned char * '>
z := 0. "Loop invariant is -1<=z<=1"
0 to: smallLen - 1 do:
z := 0.
"Loop invariant is -1<=z<=1"
limit := smallLen - 1.
0 to: limit do: [:i | z := z + (pByteLarge at: i) - (pByteSmall at: i).
pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant
form of (z bitAnd: 255)"
Frankly, having z declared unsigned int and just doing pByteRes at: i put: (z bitAnd: 16rFF) as I suggested would be way way simpler and will ALWAYS work. Why the hell invoke the complications of signed arithmetic when the content pByteRes is unsigned???
I'm not maintaining the plugin. But I broke it in fixing the unsigned division anomaly. I just wanted it to work again as quickly as possibly without expending effort. I made the minimum changes I could to keep it working. I'm much happier to have you maintain the plugin. You have the expertise and experience.
Nicolas, my priority is to have Spur working. I don't want to have to expend lots of energy changing plugins to get Spur working. My submitting this fix is not an endorsement of any kind. It's merely expediency.
Even if z is declared signed, (z bitAnd: 255) would work on a vast majority
of compiler/processor because most compiler/processor would use a 2-complement representation. But it's technically implementation-defined.
z := z signedBitShift: -8].
i think this one is OK too on a vast majority of machines, but this is as well technically implementation defined. Some weird compiler/processor pair may as well fill left bits with zeroes, or not even use 2-complement...
z here is a carry and should contain either 0, or -1 (that is UINT_MAX) after this operation.
smallLen to: largeLen - 1 do:
pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant form of (z bitAnd: 255)"
z := z // 256].
limit := largeLen - 1.
smallLen to: limit do: [:i | z := z + (pByteLarge at: i) .
pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant
form of (z bitAnd: 255)"
z := z signedBitShift: -8].
pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant form of (z bitAnd: 255)"
!z := z // 256].
Isn't there another funny signed shift in Large Integer division? This would deserve a review too, because division is inlining its own sort of subtraction...
On Tue, Jul 01, 2014 at 05:36:54PM -0700, Eliot Miranda wrote:
On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----
- cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen
into: pByteRes
| z |
- cDigitSub: pByteSmall
len: smallLen
with: pByteLarge
len: largeLen
into: pByteRes
| z limit | <var: #pByteSmall type: 'unsigned char * '> <var: #pByteLarge type: 'unsigned char * '> <var: #pByteRes type: 'unsigned char * '>
z := 0. "Loop invariant is -1<=z<=1"
0 to: smallLen - 1 do:
z := 0.
"Loop invariant is -1<=z<=1"
limit := smallLen - 1.
0 to: limit do: [:i | z := z + (pByteLarge at: i) - (pByteSmall at: i).
pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant
form of (z bitAnd: 255)"
Frankly, having z declared unsigned int and just doing pByteRes at: i put: (z bitAnd: 16rFF) as I suggested would be way way simpler and will ALWAYS work. Why the hell invoke the complications of signed arithmetic when the content pByteRes is unsigned???
I'm not maintaining the plugin. But I broke it in fixing the unsigned division anomaly. I just wanted it to work again as quickly as possibly without expending effort. I made the minimum changes I could to keep it working. I'm much happier to have you maintain the plugin. You have the expertise and experience.
Nicolas, my priority is to have Spur working. I don't want to have to expend lots of energy changing plugins to get Spur working. My submitting this fix is not an endorsement of any kind. It's merely expediency.
After catching up with the email thread, I am confused as to what problem we are trying to solve.
As near as I can tell, the situation is:
- The original LargeIntegersPlugin>>cDigitSub:len:with:len:into: works with all combinations of 32/64 bit VMs and images.
- Nicolas has proposed a better implementation, along with the recommendation to use unsigned integer C arithmetic unless there is some specific good reason to do otherwise. This seems right in principle, although the implementation in VMMaker-nice.348 is not working for 64-bit VMs, so some issues remain to be resolved.
- Eliot's original question began with this:
> I recently eliminated the optimization in Slang that replaces a > division by a power of two with a shift, because the code cast the argument > to signed, and hence broke unsigned division. That's what used to be > controlled by the UseRightShiftForDivide class var of CCodeGenerator. > > Yesterday I found out that that optimization is the only thing that's > keeping the LargeIntegers plugin afloat.
- At that point we had a problem in the Spur/Cog VMs that led to some patching of the code generation and so forth, along with this email thread.
So now I am confused. Is the problem that:
- The original implementation was broken?
- There is something different in the Spur/Cog environment that exposed problems in the original implementation?
- The original implementation worked, but the change to CCodeGenerator with regard to its use of UseRightShiftForDivide resulted in a problem?
- Something else?
I'm pretty sure I'm missing something here.
Dave
On Tue, Jul 1, 2014 at 7:51 PM, David T. Lewis lewis@mail.msen.com wrote:
On Tue, Jul 01, 2014 at 05:36:54PM -0700, Eliot Miranda wrote:
On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----
- cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen
into: pByteRes
| z |
- cDigitSub: pByteSmall
len: smallLen
with: pByteLarge
len: largeLen
into: pByteRes
| z limit | <var: #pByteSmall type: 'unsigned char * '> <var: #pByteLarge type: 'unsigned char * '> <var: #pByteRes type: 'unsigned char * '>
z := 0. "Loop invariant is -1<=z<=1"
0 to: smallLen - 1 do:
z := 0.
"Loop invariant is -1<=z<=1"
limit := smallLen - 1.
0 to: limit do: [:i | z := z + (pByteLarge at: i) - (pByteSmall at: i).
pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant
form of (z bitAnd: 255)"
Frankly, having z declared unsigned int and just doing pByteRes at: i put: (z bitAnd: 16rFF) as I suggested would be way way simpler and will ALWAYS work. Why the hell invoke the complications of signed arithmetic when the content pByteRes is unsigned???
I'm not maintaining the plugin. But I broke it in fixing the unsigned division anomaly. I just wanted it to work again as quickly as possibly without expending effort. I made the minimum changes I could to keep it working. I'm much happier to have you maintain the plugin. You have the expertise and experience.
Nicolas, my priority is to have Spur working. I don't want to have to expend lots of energy changing plugins to get Spur working. My
submitting
this fix is not an endorsement of any kind. It's merely expediency.
After catching up with the email thread, I am confused as to what problem we are trying to solve.
As near as I can tell, the situation is:
- The original LargeIntegersPlugin>>cDigitSub:len:with:len:into: works
with all combinations of 32/64 bit VMs and images.
- Nicolas has proposed a better implementation, along with the
recommendation to use unsigned integer C arithmetic unless there is some specific good reason to do otherwise. This seems right in principle, although the implementation in VMMaker-nice.348 is not working for 64-bit VMs, so some issues remain to be resolved.
Eliot's original question began with this:
I recently eliminated the optimization in Slang that replaces a division by a power of two with a shift, because the code cast the
argument > to signed, and hence broke unsigned division. That's what used to be > controlled by the UseRightShiftForDivide class var of CCodeGenerator. > > Yesterday I found out that that optimization is the only thing that's > keeping the LargeIntegers plugin afloat.
- At that point we had a problem in the Spur/Cog VMs that led to some
patching of the code generation and so forth, along with this email thread.
So now I am confused. Is the problem that:
- The original implementation was broken?
Arguably yes. It relied on incorrect Slang translation to work, specifically the assumption that N / D where D is a power of two is equivalent to (sqInt)N >> S where S is log2 of D. That's wrong on two counts:
If N is -1 then (sqInt)-1 >> S is -1, whereas -1 / D is zero for D > 1. If N is unsigned (sqInt)N >> S ~= N >> S if N has the top bit set.
- There is something different in the Spur/Cog environment that exposed
problems in the original implementation?
Yes. I hit examples where the division optimization was generating incorrect code, e.g. in SpurMemoryManager>>allocateOldSpaceChunkOfBytes:
initialIndex := chunkBytes / self allocationUnit.
even though chunkBytes is unsigned, Slang was generating (sqInt)chunkBytes
3, which generates garbage when chunkBytes >= 2^31. I decided to rip
out the optimization (it is incorrect in the -1 / D case) rather than hack these. That in turn surfaced the bug in the LargeIntegers plugin.
- The original implementation worked, but the change to CCodeGenerator
with regard to its use of UseRightShiftForDivide resulted in a problem?
Yes.
- Something else?
No.
I'm pretty sure I'm missing something here.
You don't appear to be.
On Tue, Jul 1, 2014 at 8:44 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
On Tue, Jul 1, 2014 at 7:51 PM, David T. Lewis lewis@mail.msen.com wrote:
On Tue, Jul 01, 2014 at 05:36:54PM -0700, Eliot Miranda wrote:
On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----
- cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen
into: pByteRes
| z |
- cDigitSub: pByteSmall
len: smallLen
with: pByteLarge
len: largeLen
into: pByteRes
| z limit | <var: #pByteSmall type: 'unsigned char * '> <var: #pByteLarge type: 'unsigned char * '> <var: #pByteRes type: 'unsigned char * '>
z := 0. "Loop invariant is -1<=z<=1"
0 to: smallLen - 1 do:
z := 0.
"Loop invariant is -1<=z<=1"
limit := smallLen - 1.
0 to: limit do: [:i | z := z + (pByteLarge at: i) - (pByteSmall at: i).
pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant
form of (z bitAnd: 255)"
Frankly, having z declared unsigned int and just doing pByteRes at: i put: (z bitAnd: 16rFF) as I suggested would be way way simpler and
will
ALWAYS work. Why the hell invoke the complications of signed arithmetic when the content pByteRes is unsigned???
I'm not maintaining the plugin. But I broke it in fixing the unsigned division anomaly. I just wanted it to work again as quickly as possibly without expending effort. I made the minimum changes I could to keep it working. I'm much happier to have you maintain the plugin. You have
the
expertise and experience.
Nicolas, my priority is to have Spur working. I don't want to have to expend lots of energy changing plugins to get Spur working. My
submitting
this fix is not an endorsement of any kind. It's merely expediency.
After catching up with the email thread, I am confused as to what problem we are trying to solve.
As near as I can tell, the situation is:
- The original LargeIntegersPlugin>>cDigitSub:len:with:len:into: works
with all combinations of 32/64 bit VMs and images.
- Nicolas has proposed a better implementation, along with the
recommendation to use unsigned integer C arithmetic unless there is some specific good reason to do otherwise. This seems right in principle, although the implementation in VMMaker-nice.348 is not working for 64-bit VMs, so some issues remain to be resolved.
Eliot's original question began with this:
I recently eliminated the optimization in Slang that replaces a division by a power of two with a shift, because the code cast the
argument > to signed, and hence broke unsigned division. That's what used to be > controlled by the UseRightShiftForDivide class var of CCodeGenerator. > > Yesterday I found out that that optimization is the only thing that's > keeping the LargeIntegers plugin afloat.
- At that point we had a problem in the Spur/Cog VMs that led to some
patching of the code generation and so forth, along with this email thread.
So now I am confused. Is the problem that:
- The original implementation was broken?
Arguably yes. It relied on incorrect Slang translation to work, specifically the assumption that N / D where D is a power of two is equivalent to (sqInt)N >> S where S is log2 of D. That's wrong on two counts:
If N is -1 then (sqInt)-1 >> S is -1, whereas -1 / D is zero for D > 1. If N is unsigned (sqInt)N >> S ~= N >> S if N has the top bit set.
- There is something different in the Spur/Cog environment that exposed
problems in the original implementation?
Yes. I hit examples where the division optimization was generating incorrect code, e.g. in SpurMemoryManager>>allocateOldSpaceChunkOfBytes:
initialIndex := chunkBytes / self allocationUnit.
even though chunkBytes is unsigned, Slang was generating (sqInt)chunkBytes
3, which generates garbage when chunkBytes >= 2^31. I decided to rip
out the optimization (it is incorrect in the -1 / D case) rather than hack these. That in turn surfaced the bug in the LargeIntegers plugin.
- The original implementation worked, but the change to CCodeGenerator
with regard to its use of UseRightShiftForDivide resulted in a problem?
Yes.
- Something else?
No.
I'm pretty sure I'm missing something here.
You don't appear to be.
and underlying this is that philosophically I believe that a) having Slang
generate correct code and b) having Slang maintain as close a correspondence between Smalltalk and the resulting C is to be preferred 1) at the risk of breaking plugins (which then have to be fixed), and 2) papering over the cracks by using cCode: [] inSmalltalk: [] everywhere.
ANother way of saying this is that I believe Slang should be approachable by the newbie who enters with rational expectations, not something only old salts who know and work-around the many pitfalls. I got horribly burned by Slang often in the first few years of working on Cog. I likened it to being hit on the head with a long stick by one's sensei, except that no enlightenment would ever ensue. One's head would simply hurt. I've tried to make it better, e.g. by adding some simple type inference so that it does the right thing with unsigned and long long types. Alas, when one does this old workarounds, mistakes, or bugs may break.
Eliot,
Thanks for the explanation, much clearer now. I was getting lost in the email threads.
Dave
On Tue, Jul 01, 2014 at 08:51:26PM -0700, Eliot Miranda wrote:
On Tue, Jul 1, 2014 at 8:44 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
On Tue, Jul 1, 2014 at 7:51 PM, David T. Lewis lewis@mail.msen.com wrote:
On Tue, Jul 01, 2014 at 05:36:54PM -0700, Eliot Miranda wrote:
On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----
- cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen
into: pByteRes
| z |
- cDigitSub: pByteSmall
len: smallLen
with: pByteLarge
len: largeLen
into: pByteRes
| z limit | <var: #pByteSmall type: 'unsigned char * '> <var: #pByteLarge type: 'unsigned char * '> <var: #pByteRes type: 'unsigned char * '>
z := 0. "Loop invariant is -1<=z<=1"
0 to: smallLen - 1 do:
z := 0.
"Loop invariant is -1<=z<=1"
limit := smallLen - 1.
0 to: limit do: [:i | z := z + (pByteLarge at: i) - (pByteSmall at: i).
pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant
form of (z bitAnd: 255)"
Frankly, having z declared unsigned int and just doing pByteRes at: i put: (z bitAnd: 16rFF) as I suggested would be way way simpler and
will
ALWAYS work. Why the hell invoke the complications of signed arithmetic when the content pByteRes is unsigned???
I'm not maintaining the plugin. But I broke it in fixing the unsigned division anomaly. I just wanted it to work again as quickly as possibly without expending effort. I made the minimum changes I could to keep it working. I'm much happier to have you maintain the plugin. You have
the
expertise and experience.
Nicolas, my priority is to have Spur working. I don't want to have to expend lots of energy changing plugins to get Spur working. My
submitting
this fix is not an endorsement of any kind. It's merely expediency.
After catching up with the email thread, I am confused as to what problem we are trying to solve.
As near as I can tell, the situation is:
- The original LargeIntegersPlugin>>cDigitSub:len:with:len:into: works
with all combinations of 32/64 bit VMs and images.
- Nicolas has proposed a better implementation, along with the
recommendation to use unsigned integer C arithmetic unless there is some specific good reason to do otherwise. This seems right in principle, although the implementation in VMMaker-nice.348 is not working for 64-bit VMs, so some issues remain to be resolved.
Eliot's original question began with this:
I recently eliminated the optimization in Slang that replaces a division by a power of two with a shift, because the code cast the
argument > to signed, and hence broke unsigned division. That's what used to be > controlled by the UseRightShiftForDivide class var of CCodeGenerator. > > Yesterday I found out that that optimization is the only thing that's > keeping the LargeIntegers plugin afloat.
- At that point we had a problem in the Spur/Cog VMs that led to some
patching of the code generation and so forth, along with this email thread.
So now I am confused. Is the problem that:
- The original implementation was broken?
Arguably yes. It relied on incorrect Slang translation to work, specifically the assumption that N / D where D is a power of two is equivalent to (sqInt)N >> S where S is log2 of D. That's wrong on two counts:
If N is -1 then (sqInt)-1 >> S is -1, whereas -1 / D is zero for D > 1. If N is unsigned (sqInt)N >> S ~= N >> S if N has the top bit set.
- There is something different in the Spur/Cog environment that exposed
problems in the original implementation?
Yes. I hit examples where the division optimization was generating incorrect code, e.g. in SpurMemoryManager>>allocateOldSpaceChunkOfBytes:
initialIndex := chunkBytes / self allocationUnit.
even though chunkBytes is unsigned, Slang was generating (sqInt)chunkBytes
3, which generates garbage when chunkBytes >= 2^31. I decided to rip
out the optimization (it is incorrect in the -1 / D case) rather than hack these. That in turn surfaced the bug in the LargeIntegers plugin.
- The original implementation worked, but the change to CCodeGenerator
with regard to its use of UseRightShiftForDivide resulted in a problem?
Yes.
- Something else?
No.
I'm pretty sure I'm missing something here.
You don't appear to be.
and underlying this is that philosophically I believe that a) having Slang
generate correct code and b) having Slang maintain as close a correspondence between Smalltalk and the resulting C is to be preferred 1) at the risk of breaking plugins (which then have to be fixed), and 2) papering over the cracks by using cCode: [] inSmalltalk: [] everywhere.
ANother way of saying this is that I believe Slang should be approachable by the newbie who enters with rational expectations, not something only old salts who know and work-around the many pitfalls. I got horribly burned by Slang often in the first few years of working on Cog. I likened it to being hit on the head with a long stick by one's sensei, except that no enlightenment would ever ensue. One's head would simply hurt. I've tried to make it better, e.g. by adding some simple type inference so that it does the right thing with unsigned and long long types. Alas, when one does this old workarounds, mistakes, or bugs may break. -- best, Eliot
2014-07-02 13:29 GMT+02:00 David T. Lewis lewis@mail.msen.com:
Eliot,
Thanks for the explanation, much clearer now. I was getting lost in the email threads.
Dave
Yeah, Eliot did not break anything, the LargeInteger plugin was already broken. Only the incorrect substitution of // by >> did make it work. Eliot applied a minimal patch, but IMO invoking signed integer arithmetic when we do not need to is a recipe for more surprises... This require a deeper change if we do not want to feel the ground falling from under our feet again at the least slang change.
Nicolas
On Tue, Jul 01, 2014 at 08:51:26PM -0700, Eliot Miranda wrote:
On Tue, Jul 1, 2014 at 8:44 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
On Tue, Jul 1, 2014 at 7:51 PM, David T. Lewis lewis@mail.msen.com wrote:
On Tue, Jul 01, 2014 at 05:36:54PM -0700, Eliot Miranda wrote:
On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org: > > Item was changed: > ----- Method:
LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in
> category 'C core') ----- > + cDigitSub: pByteSmall len: smallLen with: pByteLarge len:
largeLen
> into: pByteRes > + | z | > - cDigitSub: pByteSmall > - len: smallLen > - with: pByteLarge > - len: largeLen > - into: pByteRes > - | z limit | > <var: #pByteSmall type: 'unsigned char * '> > <var: #pByteLarge type: 'unsigned char * '> > <var: #pByteRes type: 'unsigned char * '> > > + z := 0. "Loop invariant is -1<=z<=1" > + 0 to: smallLen - 1 do: > - z := 0. > - "Loop invariant is -1<=z<=1" > - limit := smallLen - 1. > - 0 to: limit do: > [:i | > z := z + (pByteLarge at: i) - (pByteSmall at: i). > + pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant
> form of (z bitAnd: 255)" >
Frankly, having z declared unsigned int and just doing pByteRes
at: i
put: (z bitAnd: 16rFF) as I suggested would be way way simpler and
will
ALWAYS work. Why the hell invoke the complications of signed arithmetic when
the
content pByteRes is unsigned???
I'm not maintaining the plugin. But I broke it in fixing the
unsigned
division anomaly. I just wanted it to work again as quickly as
possibly
without expending effort. I made the minimum changes I could to
keep it
working. I'm much happier to have you maintain the plugin. You
have
the
expertise and experience.
Nicolas, my priority is to have Spur working. I don't want to have
to
expend lots of energy changing plugins to get Spur working. My
submitting
this fix is not an endorsement of any kind. It's merely expediency.
After catching up with the email thread, I am confused as to what
problem
we are trying to solve.
As near as I can tell, the situation is:
- The original LargeIntegersPlugin>>cDigitSub:len:with:len:into:
works
with all combinations of 32/64 bit VMs and images.
- Nicolas has proposed a better implementation, along with the
recommendation to use unsigned integer C arithmetic unless there is some specific good reason to do otherwise. This seems right in principle, although the implementation in VMMaker-nice.348 is not working for 64-bit VMs, so some issues
remain
to be resolved.
Eliot's original question began with this:
I recently eliminated the optimization in Slang that replaces a division by a power of two with a shift, because the code cast
the
argument > to signed, and hence broke unsigned division. That's what
used to
be > controlled by the UseRightShiftForDivide class var of CCodeGenerator. > > Yesterday I found out that that optimization is the only thing that's > keeping the LargeIntegers plugin afloat.
- At that point we had a problem in the Spur/Cog VMs that led to some
patching of the code generation and so forth, along with this email thread.
So now I am confused. Is the problem that:
- The original implementation was broken?
Arguably yes. It relied on incorrect Slang translation to work, specifically the assumption that N / D where D is a power of two is equivalent to (sqInt)N >> S where S is log2 of D. That's wrong on two counts:
If N is -1 then (sqInt)-1 >> S is -1, whereas -1 / D is zero for D >
If N is unsigned (sqInt)N >> S ~= N >> S if N has the top bit set.
- There is something different in the Spur/Cog environment that
exposed
problems in the original implementation?
Yes. I hit examples where the division optimization was generating incorrect code, e.g. in
SpurMemoryManager>>allocateOldSpaceChunkOfBytes:
initialIndex := chunkBytes / self allocationUnit.
even though chunkBytes is unsigned, Slang was generating
(sqInt)chunkBytes
3, which generates garbage when chunkBytes >= 2^31. I decided to
rip
out the optimization (it is incorrect in the -1 / D case) rather than
hack
these. That in turn surfaced the bug in the LargeIntegers plugin.
- The original implementation worked, but the change to
CCodeGenerator
with regard to its use of UseRightShiftForDivide resulted in a problem?
Yes.
- Something else?
No.
I'm pretty sure I'm missing something here.
You don't appear to be.
and underlying this is that philosophically I believe that a) having
Slang
generate correct code and b) having Slang maintain as close a correspondence between Smalltalk and the resulting C is to be preferred
at the risk of breaking plugins (which then have to be fixed), and 2) papering over the cracks by using cCode: [] inSmalltalk: [] everywhere.
ANother way of saying this is that I believe Slang should be approachable by the newbie who enters with rational expectations, not something only
old
salts who know and work-around the many pitfalls. I got horribly burned
by
Slang often in the first few years of working on Cog. I likened it to being hit on the head with a long stick by one's sensei, except that no enlightenment would ever ensue. One's head would simply hurt. I've
tried
to make it better, e.g. by adding some simple type inference so that it does the right thing with unsigned and long long types. Alas, when one does this old workarounds, mistakes, or bugs may break. -- best, Eliot
On Wed, Jul 2, 2014 at 4:53 AM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:
2014-07-02 13:29 GMT+02:00 David T. Lewis lewis@mail.msen.com:
Eliot,
Thanks for the explanation, much clearer now. I was getting lost in the email threads.
Dave
Yeah, Eliot did not break anything, the LargeInteger plugin was already broken. Only the incorrect substitution of // by >> did make it work. Eliot applied a minimal patch, but IMO invoking signed integer arithmetic when we do not need to is a recipe for more surprises... This require a deeper change if we do not want to feel the ground falling from under our feet again at the least slang change.
Alas I think it's still very broken. I see a lot of failures and errors in IntegerTest, LargePositiveTest & LargeNegativeTest. I'm going to try your 32-bit plugin and if that works, build and release new VMs asap. But I need to go through the plugins that changed in 3021 (generating without the broken division-by-power-of-two optimization) with a fine toothcomb:
B2DPlugin.c B3DAcceleratorPlugin.c BitBltPlugin.c BochsIA32Plugin.c FFTPlugin.c FilePlugin.c IA32ABI.c JPEGReaderPlugin.c Klatt.c MacMenubarPlugin.c MiscPrimitivePlugin.c QuicktimePlugin.c RePlugin.c ScratchPlugin.c SoundGenerationPlugin.c ARM32FFIPlugin.c IA32FFIPlugin.c ZipPlugin.c
This is work I don't want to do. I either revert back to the broken division optimization, and add cCode:inSmalltalk: hacks in Spur, or have to fix these *%^& plugins. Damned if I do, damned if I don't. See what I mean about being hit on the head.
Nicolas
On Tue, Jul 01, 2014 at 08:51:26PM -0700, Eliot Miranda wrote:
On Tue, Jul 1, 2014 at 8:44 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
On Tue, Jul 1, 2014 at 7:51 PM, David T. Lewis lewis@mail.msen.com wrote:
On Tue, Jul 01, 2014 at 05:36:54PM -0700, Eliot Miranda wrote:
On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:
> > 2014-07-01 4:22 GMT+02:00 commits@source.squeak.org: >> >> Item was changed: >> ----- Method:
LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in
>> category 'C core') ----- >> + cDigitSub: pByteSmall len: smallLen with: pByteLarge len:
largeLen
>> into: pByteRes >> + | z | >> - cDigitSub: pByteSmall >> - len: smallLen >> - with: pByteLarge >> - len: largeLen >> - into: pByteRes >> - | z limit | >> <var: #pByteSmall type: 'unsigned char * '> >> <var: #pByteLarge type: 'unsigned char * '> >> <var: #pByteRes type: 'unsigned char * '> >> >> + z := 0. "Loop invariant is -1<=z<=1" >> + 0 to: smallLen - 1 do: >> - z := 0. >> - "Loop invariant is -1<=z<=1" >> - limit := smallLen - 1. >> - 0 to: limit do: >> [:i | >> z := z + (pByteLarge at: i) - (pByteSmall at:
i).
>> + pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant
>> form of (z bitAnd: 255)" >> > > Frankly, having z declared unsigned int and just doing pByteRes
at: i
> put: (z bitAnd: 16rFF) as I suggested would be way way simpler
and
will
> ALWAYS work. > Why the hell invoke the complications of signed arithmetic when
the
> content pByteRes is unsigned??? >
I'm not maintaining the plugin. But I broke it in fixing the
unsigned
division anomaly. I just wanted it to work again as quickly as
possibly
without expending effort. I made the minimum changes I could to
keep it
working. I'm much happier to have you maintain the plugin. You
have
the
expertise and experience.
Nicolas, my priority is to have Spur working. I don't want to
have to
expend lots of energy changing plugins to get Spur working. My
submitting
this fix is not an endorsement of any kind. It's merely
expediency.
After catching up with the email thread, I am confused as to what
problem
we are trying to solve.
As near as I can tell, the situation is:
- The original LargeIntegersPlugin>>cDigitSub:len:with:len:into:
works
with all combinations of 32/64 bit VMs and images.
- Nicolas has proposed a better implementation, along with the
recommendation to use unsigned integer C arithmetic unless there is some specific good reason to do otherwise. This seems right in principle, although the implementation in VMMaker-nice.348 is not working for 64-bit VMs, so some issues
remain
to be resolved.
Eliot's original question began with this:
I recently eliminated the optimization in Slang that replaces
a
> division by a power of two with a shift, because the code
cast the
argument > to signed, and hence broke unsigned division. That's what
used to
be > controlled by the UseRightShiftForDivide class var of CCodeGenerator. > > Yesterday I found out that that optimization is the only thing that's > keeping the LargeIntegers plugin afloat.
- At that point we had a problem in the Spur/Cog VMs that led to
some
patching of the code generation and so forth, along with this email thread.
So now I am confused. Is the problem that:
- The original implementation was broken?
Arguably yes. It relied on incorrect Slang translation to work, specifically the assumption that N / D where D is a power of two is equivalent to (sqInt)N >> S where S is log2 of D. That's wrong on two counts:
If N is -1 then (sqInt)-1 >> S is -1, whereas -1 / D is zero for D
If N is unsigned (sqInt)N >> S ~= N >> S if N has the top bit set.
- There is something different in the Spur/Cog environment that
exposed
problems in the original implementation?
Yes. I hit examples where the division optimization was generating incorrect code, e.g. in
SpurMemoryManager>>allocateOldSpaceChunkOfBytes:
initialIndex := chunkBytes / self allocationUnit.
even though chunkBytes is unsigned, Slang was generating
(sqInt)chunkBytes
3, which generates garbage when chunkBytes >= 2^31. I decided to
rip
out the optimization (it is incorrect in the -1 / D case) rather than
hack
these. That in turn surfaced the bug in the LargeIntegers plugin.
- The original implementation worked, but the change to
CCodeGenerator
with regard to its use of UseRightShiftForDivide resulted in a
problem?
Yes.
- Something else?
No.
I'm pretty sure I'm missing something here.
You don't appear to be.
and underlying this is that philosophically I believe that a) having
Slang
generate correct code and b) having Slang maintain as close a correspondence between Smalltalk and the resulting C is to be preferred
at the risk of breaking plugins (which then have to be fixed), and 2) papering over the cracks by using cCode: [] inSmalltalk: [] everywhere.
ANother way of saying this is that I believe Slang should be
approachable
by the newbie who enters with rational expectations, not something only
old
salts who know and work-around the many pitfalls. I got horribly
burned by
Slang often in the first few years of working on Cog. I likened it to being hit on the head with a long stick by one's sensei, except that no enlightenment would ever ensue. One's head would simply hurt. I've
tried
to make it better, e.g. by adding some simple type inference so that it does the right thing with unsigned and long long types. Alas, when one does this old workarounds, mistakes, or bugs may break. -- best, Eliot
On Wed, Jul 2, 2014 at 7:54 AM, Eliot Miranda eliot.miranda@gmail.com wrote:
On Wed, Jul 2, 2014 at 4:53 AM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:
2014-07-02 13:29 GMT+02:00 David T. Lewis lewis@mail.msen.com:
Eliot,
Thanks for the explanation, much clearer now. I was getting lost in the email threads.
Dave
Yeah, Eliot did not break anything, the LargeInteger plugin was already broken. Only the incorrect substitution of // by >> did make it work. Eliot applied a minimal patch, but IMO invoking signed integer arithmetic when we do not need to is a recipe for more surprises... This require a deeper change if we do not want to feel the ground falling from under our feet again at the least slang change.
Alas I think it's still very broken. I see a lot of failures and errors in IntegerTest, LargePositiveTest & LargeNegativeTest. I'm going to try your 32-bit plugin and if that works, build and release new VMs asap. But I need to go through the plugins that changed in 3021 (generating without the broken division-by-power-of-two optimization) with a fine toothcomb:
B2DPlugin.c B3DAcceleratorPlugin.c BitBltPlugin.c BochsIA32Plugin.c FFTPlugin.c FilePlugin.c IA32ABI.c JPEGReaderPlugin.c Klatt.c MacMenubarPlugin.c MiscPrimitivePlugin.c QuicktimePlugin.c RePlugin.c ScratchPlugin.c SoundGenerationPlugin.c ARM32FFIPlugin.c IA32FFIPlugin.c ZipPlugin.c
This is work I don't want to do. I either revert back to the broken division optimization, and add cCode:inSmalltalk: hacks in Spur, or have to fix these *%^& plugins. Damned if I do, damned if I don't. See what I mean about being hit on the head.
Not so bad. The list is smaller than I thought:
B2DPlugin.c BitBltPlugin.c FFTPlugin.c JPEGReaderPlugin.c Klatt.c LargeIntegers.c MiscPrimitivePlugin.c RePlugin.c ScratchPlugin.c SoundGenerationPlugin.c ZipPlugin.c
The others simply changed their declaration of positive32BitValueOf
Nicolas
On Tue, Jul 01, 2014 at 08:51:26PM -0700, Eliot Miranda wrote:
On Tue, Jul 1, 2014 at 8:44 PM, Eliot Miranda <eliot.miranda@gmail.com
wrote:
On Tue, Jul 1, 2014 at 7:51 PM, David T. Lewis lewis@mail.msen.com wrote:
On Tue, Jul 01, 2014 at 05:36:54PM -0700, Eliot Miranda wrote: > > On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < > nicolas.cellier.aka.nice@gmail.com> wrote: > > > > > 2014-07-01 4:22 GMT+02:00 commits@source.squeak.org: > >> > >> Item was changed: > >> ----- Method:
LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in
> >> category 'C core') ----- > >> + cDigitSub: pByteSmall len: smallLen with: pByteLarge len:
largeLen
> >> into: pByteRes > >> + | z | > >> - cDigitSub: pByteSmall > >> - len: smallLen > >> - with: pByteLarge > >> - len: largeLen > >> - into: pByteRes > >> - | z limit | > >> <var: #pByteSmall type: 'unsigned char * '> > >> <var: #pByteLarge type: 'unsigned char * '> > >> <var: #pByteRes type: 'unsigned char * '> > >> > >> + z := 0. "Loop invariant is -1<=z<=1" > >> + 0 to: smallLen - 1 do: > >> - z := 0. > >> - "Loop invariant is -1<=z<=1" > >> - limit := smallLen - 1. > >> - 0 to: limit do: > >> [:i | > >> z := z + (pByteLarge at: i) - (pByteSmall at:
i).
> >> + pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant > >> form of (z bitAnd: 255)" > >> > > > > Frankly, having z declared unsigned int and just doing
pByteRes at: i
> > put: (z bitAnd: 16rFF) as I suggested would be way way simpler
and
will > > ALWAYS work. > > Why the hell invoke the complications of signed arithmetic when
the
> > content pByteRes is unsigned??? > > > > I'm not maintaining the plugin. But I broke it in fixing the
unsigned
> division anomaly. I just wanted it to work again as quickly as
possibly
> without expending effort. I made the minimum changes I could to
keep it
> working. I'm much happier to have you maintain the plugin. You
have
the > expertise and experience. > > Nicolas, my priority is to have Spur working. I don't want to
have to
> expend lots of energy changing plugins to get Spur working. My submitting > this fix is not an endorsement of any kind. It's merely
expediency.
>
After catching up with the email thread, I am confused as to what
problem
we are trying to solve.
As near as I can tell, the situation is:
- The original LargeIntegersPlugin>>cDigitSub:len:with:len:into:
works
with all combinations of 32/64 bit VMs and images.
- Nicolas has proposed a better implementation, along with the
recommendation to use unsigned integer C arithmetic unless there is some
specific
good reason to do otherwise. This seems right in principle, although the implementation in VMMaker-nice.348 is not working for 64-bit VMs, so some issues
remain
to be resolved.
Eliot's original question began with this:
I recently eliminated the optimization in Slang that
replaces a
> division by a power of two with a shift, because the code
cast the
argument > to signed, and hence broke unsigned division. That's what
used to
be > controlled by the UseRightShiftForDivide class var of CCodeGenerator. > > Yesterday I found out that that optimization is the only
thing
that's > keeping the LargeIntegers plugin afloat.
- At that point we had a problem in the Spur/Cog VMs that led to
some
patching of the code generation and so forth, along with this email
thread.
So now I am confused. Is the problem that:
- The original implementation was broken?
Arguably yes. It relied on incorrect Slang translation to work, specifically the assumption that N / D where D is a power of two is equivalent to (sqInt)N >> S where S is log2 of D. That's wrong on
two
counts:
If N is -1 then (sqInt)-1 >> S is -1, whereas -1 / D is zero for D
If N is unsigned (sqInt)N >> S ~= N >> S if N has the top bit set.
- There is something different in the Spur/Cog environment that
exposed
problems in the original implementation?
Yes. I hit examples where the division optimization was generating incorrect code, e.g. in
SpurMemoryManager>>allocateOldSpaceChunkOfBytes:
initialIndex := chunkBytes / self allocationUnit.
even though chunkBytes is unsigned, Slang was generating
(sqInt)chunkBytes
> 3, which generates garbage when chunkBytes >= 2^31. I decided to
rip
out the optimization (it is incorrect in the -1 / D case) rather
than hack
these. That in turn surfaced the bug in the LargeIntegers plugin.
- The original implementation worked, but the change to
CCodeGenerator
with regard to its use of UseRightShiftForDivide resulted in a
problem?
Yes.
- Something else?
No.
I'm pretty sure I'm missing something here.
You don't appear to be.
and underlying this is that philosophically I believe that a) having
Slang
generate correct code and b) having Slang maintain as close a correspondence between Smalltalk and the resulting C is to be
preferred 1)
at the risk of breaking plugins (which then have to be fixed), and 2) papering over the cracks by using cCode: [] inSmalltalk: [] everywhere.
ANother way of saying this is that I believe Slang should be
approachable
by the newbie who enters with rational expectations, not something
only old
salts who know and work-around the many pitfalls. I got horribly
burned by
Slang often in the first few years of working on Cog. I likened it to being hit on the head with a long stick by one's sensei, except that no enlightenment would ever ensue. One's head would simply hurt. I've
tried
to make it better, e.g. by adding some simple type inference so that it does the right thing with unsigned and long long types. Alas, when one does this old workarounds, mistakes, or bugs may break. -- best, Eliot
-- best, Eliot
2014-07-02 2:36 GMT+02:00 Eliot Miranda eliot.miranda@gmail.com:
On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----
- cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen
into: pByteRes
| z |
- cDigitSub: pByteSmall
len: smallLen
with: pByteLarge
len: largeLen
into: pByteRes
| z limit | <var: #pByteSmall type: 'unsigned char * '> <var: #pByteLarge type: 'unsigned char * '> <var: #pByteRes type: 'unsigned char * '>
z := 0. "Loop invariant is -1<=z<=1"
0 to: smallLen - 1 do:
z := 0.
"Loop invariant is -1<=z<=1"
limit := smallLen - 1.
0 to: limit do: [:i | z := z + (pByteLarge at: i) - (pByteSmall at: i).
pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant
form of (z bitAnd: 255)"
Frankly, having z declared unsigned int and just doing pByteRes at: i put: (z bitAnd: 16rFF) as I suggested would be way way simpler and will ALWAYS work. Why the hell invoke the complications of signed arithmetic when the content pByteRes is unsigned???
I'm not maintaining the plugin. But I broke it in fixing the unsigned division anomaly. I just wanted it to work again as quickly as possibly without expending effort. I made the minimum changes I could to keep it working. I'm much happier to have you maintain the plugin. You have the expertise and experience.
Nicolas, my priority is to have Spur working. I don't want to have to expend lots of energy changing plugins to get Spur working. My submitting this fix is not an endorsement of any kind. It's merely expediency.
Yes, I understand, the smallest modification that could possibly work. But this sign-tolerant form of (z bitAnd: 255) just make me sick ;) I'll try and fix it once I understand the 64 bits VM problem.
cheers
Even if z is declared signed, (z bitAnd: 255) would work on a vast
majority of compiler/processor because most compiler/processor would use a 2-complement representation. But it's technically implementation-defined.
z := z signedBitShift: -8].
i think this one is OK too on a vast majority of machines, but this is as well technically implementation defined. Some weird compiler/processor pair may as well fill left bits with zeroes, or not even use 2-complement...
z here is a carry and should contain either 0, or -1 (that is UINT_MAX) after this operation.
smallLen to: largeLen - 1 do:
pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant form of (z bitAnd: 255)"
z := z // 256].
limit := largeLen - 1.
smallLen to: limit do: [:i | z := z + (pByteLarge at: i) .
pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant
form of (z bitAnd: 255)"
z := z signedBitShift: -8].
pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant form of (z bitAnd: 255)"
!z := z // 256].
Isn't there another funny signed shift in Large Integer division? This would deserve a review too, because division is inlining its own sort of subtraction...
-- best, Eliot
vm-dev@lists.squeakfoundation.org