Re: [Vm-dev] VM Maker: VMMaker.oscog-eem.790.mcz

List overview All Threads
Download

newer

older

Re: [Vm-dev] About 32bits LargeInt...

Using MACROS from standard headers...

Nicolas Cellier

1 Jul 2014 1 Jul '14

9:21 p.m.

2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:

...

Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----

cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen into:

pByteRes

```
  | z |
```

cDigitSub: pByteSmall
```
          len: smallLen
```
```
          with: pByteLarge
```
```
          len: largeLen
```
```
          into: pByteRes
```

  | z limit |
  <var: #pByteSmall type: 'unsigned char * '>
  <var: #pByteLarge type: 'unsigned char * '>
  <var: #pByteRes type: 'unsigned char * '>

```
  z := 0. "Loop invariant is -1<=z<=1"
```
```
  0 to: smallLen - 1 do:
```

```
  z := 0.
```
```
  "Loop invariant is -1<=z<=1"
```
```
  limit := smallLen - 1.
```

  0 to: limit do:
          [:i |
          z := z + (pByteLarge at: i) - (pByteSmall at: i).

          pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant

form of (z bitAnd: 255)"

Frankly, having z declared unsigned int and just doing pByteRes at: i put: (z bitAnd: 16rFF) as I suggested would be way way simpler and will ALWAYS work. Why the hell invoke the complications of signed arithmetic when the content pByteRes is unsigned???

Even if z is declared signed, (z bitAnd: 255) would work on a vast majority of compiler/processor because most compiler/processor would use a 2-complement representation. But it's technically implementation-defined.

+ z := z signedBitShift: -8].

...

i think this one is OK too on a vast majority of machines, but this is as well technically implementation defined. Some weird compiler/processor pair may as well fill left bits with zeroes, or not even use 2-complement...

z here is a carry and should contain either 0, or -1 (that is UINT_MAX) after this operation.

...

```
  smallLen to: largeLen - 1 do:
```

          pByteRes at: i put: z - (z // 256 * 256).

          "sign-tolerant form of (z bitAnd: 255)"

```
          z := z // 256].
```
```
  limit := largeLen - 1.
```

  smallLen to: limit do:
          [:i |
          z := z + (pByteLarge at: i) .

          pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant

form of (z bitAnd: 255)"

```
          z := z signedBitShift: -8].
```

          pByteRes at: i put: z - (z // 256 * 256).

          "sign-tolerant form of (z bitAnd: 255)"

```
          z := z // 256].
```
!

Isn't there another funny signed shift in Large Integer division? This would deserve a review too, because division is inlining its own sort of subtraction...

Attachments:

attachment.html (text/html — 3.8 KB)

Show replies by date

David T. Lewis

2 Jul 2 Jul

12:28 a.m.

New subject: VM Maker: VMMaker.oscog-eem.790.mcz

On Tue, Jul 01, 2014 at 09:21:00PM +0200, Nicolas Cellier wrote:

...

2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:

...
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----

cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen into:

pByteRes
  | z |
cDigitSub: pByteSmall
          len: smallLen
          with: pByteLarge
          len: largeLen
          into: pByteRes
  | z limit |
  <var: #pByteSmall type: 'unsigned char * '>
  <var: #pByteLarge type: 'unsigned char * '>
  <var: #pByteRes type: 'unsigned char * '>
  z := 0. "Loop invariant is -1<=z<=1"
  0 to: smallLen - 1 do:
  z := 0.
  "Loop invariant is -1<=z<=1"
  limit := smallLen - 1.
  0 to: limit do:
          [:i |
          z := z + (pByteLarge at: i) - (pByteSmall at: i).
          pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant
form of (z bitAnd: 255)"
Frankly, having z declared unsigned int and just doing pByteRes at: i put: (z bitAnd: 16rFF) as I suggested would be way way simpler and will ALWAYS work. Why the hell invoke the complications of signed arithmetic when the content pByteRes is unsigned???

Nicolas,

This sounds right to me.

Do you have a final version of #cDigitSub:len:with:len:into: that you can recommend? The version in VMMaker-nice.342 produces failures in the KernelTests-Numbers tests when compiled in 64-bit mode, but I think that you have since suggested some other improvements in this email thread.

Dave

Nicolas Cellier

7:44 a.m.

New subject: VM Maker: VMMaker.oscog-eem.790.mcz

2014-07-02 0:28 GMT+02:00 David T. Lewis lewis@mail.msen.com:

...

On Tue, Jul 01, 2014 at 09:21:00PM +0200, Nicolas Cellier wrote:

...
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:

...
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----

cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen

into:

...
...
pByteRes
  | z |
cDigitSub: pByteSmall
          len: smallLen
          with: pByteLarge
          len: largeLen
          into: pByteRes
  | z limit |
  <var: #pByteSmall type: 'unsigned char * '>
  <var: #pByteLarge type: 'unsigned char * '>
  <var: #pByteRes type: 'unsigned char * '>
  z := 0. "Loop invariant is -1<=z<=1"
  0 to: smallLen - 1 do:
  z := 0.
  "Loop invariant is -1<=z<=1"
  limit := smallLen - 1.
  0 to: limit do:
          [:i |
          z := z + (pByteLarge at: i) - (pByteSmall at: i).
          pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant

...
...
form of (z bitAnd: 255)"

Frankly, having z declared unsigned int and just doing pByteRes at: i

put:

...
(z bitAnd: 16rFF) as I suggested would be way way simpler and will ALWAYS work. Why the hell invoke the complications of signed arithmetic when the

content

...
pByteRes is unsigned???

Nicolas,

This sounds right to me.

Do you have a final version of #cDigitSub:len:with:len:into: that you can recommend? The version in VMMaker-nice.342 produces failures in the KernelTests-Numbers tests when compiled in 64-bit mode, but I think that you have since suggested some other improvements in this email thread.

Dave

Hi David, that's interesting. VMMaker-nice.342 contains all the 32bits LargeInteger modifications, not just the #cDigitSub:len:with:len:into:. I tried to declare everything as <unsigned int> rather than <usqInt> with the assumption that sizeof(unsigned int)*2==sizeof(unsigned long long). But I never tried to compile a 64 bits VM, I think I lost the recipe for brewing such a flavour. I'm on MacOSX, could you provide a link to the shortest way to do it? Once I can check it, I'll publish a corrected version. Thanks

Nicolas

David T. Lewis

1:27 p.m.

New subject: VM Maker: VMMaker.oscog-eem.790.mcz

On Wed, Jul 02, 2014 at 07:44:26AM +0200, Nicolas Cellier wrote:

...

2014-07-02 0:28 GMT+02:00 David T. Lewis lewis@mail.msen.com:

...
On Tue, Jul 01, 2014 at 09:21:00PM +0200, Nicolas Cellier wrote:

...
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:

...
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----

cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen

into:

...
...
pByteRes
  | z |
cDigitSub: pByteSmall
          len: smallLen
          with: pByteLarge
          len: largeLen
          into: pByteRes
  | z limit |
  <var: #pByteSmall type: 'unsigned char * '>
  <var: #pByteLarge type: 'unsigned char * '>
  <var: #pByteRes type: 'unsigned char * '>
  z := 0. "Loop invariant is -1<=z<=1"
  0 to: smallLen - 1 do:
  z := 0.
  "Loop invariant is -1<=z<=1"
  limit := smallLen - 1.
  0 to: limit do:
          [:i |
          z := z + (pByteLarge at: i) - (pByteSmall at: i).
          pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant

...
...
form of (z bitAnd: 255)"

Frankly, having z declared unsigned int and just doing pByteRes at: i

put:

...
(z bitAnd: 16rFF) as I suggested would be way way simpler and will ALWAYS work. Why the hell invoke the complications of signed arithmetic when the

content

...
pByteRes is unsigned???

Nicolas,

This sounds right to me.

Do you have a final version of #cDigitSub:len:with:len:into: that you can recommend? The version in VMMaker-nice.342 produces failures in the KernelTests-Numbers tests when compiled in 64-bit mode, but I think that you have since suggested some other improvements in this email thread.

Dave
Hi David, that's interesting. VMMaker-nice.342 contains all the 32bits LargeInteger modifications, not just the #cDigitSub:len:with:len:into:. I tried to declare everything as <unsigned int> rather than <usqInt> with the assumption that sizeof(unsigned int)*2==sizeof(unsigned long long). But I never tried to compile a 64 bits VM, I think I lost the recipe for brewing such a flavour. I'm on MacOSX, could you provide a link to the shortest way to do it? Once I can check it, I'll publish a corrected version. Thanks

Hi Nicolas,

Starting with an up to date VMMaker (trunk), I loaded #cDigitSub:len:with:len:into: from VMMaker-nice.342, then compiled the VM on 64-bit Linux and ran the unit tests.

I also ran tests on the 64-bit image with 64-bit VM, and this did *not* show any problem.

Sorry I did not look into it further, I will try to do so as soon as I get some time.

Dave

Nicolas Cellier

1:44 p.m.

New subject: VM Maker: VMMaker.oscog-eem.790.mcz

2014-07-02 13:27 GMT+02:00 David T. Lewis lewis@mail.msen.com:

...

On Wed, Jul 02, 2014 at 07:44:26AM +0200, Nicolas Cellier wrote:

...
2014-07-02 0:28 GMT+02:00 David T. Lewis lewis@mail.msen.com:

...
On Tue, Jul 01, 2014 at 09:21:00PM +0200, Nicolas Cellier wrote:

...
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:

...
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into:

(in

...
...
...
...
category 'C core') -----

cDigitSub: pByteSmall len: smallLen with: pByteLarge len:

largeLen

...
...
into:

...
...
pByteRes
  | z |
cDigitSub: pByteSmall
          len: smallLen
          with: pByteLarge
          len: largeLen
          into: pByteRes
  | z limit |
  <var: #pByteSmall type: 'unsigned char * '>
  <var: #pByteLarge type: 'unsigned char * '>
  <var: #pByteRes type: 'unsigned char * '>
  z := 0. "Loop invariant is -1<=z<=1"
  0 to: smallLen - 1 do:
  z := 0.
  "Loop invariant is -1<=z<=1"
  limit := smallLen - 1.
  0 to: limit do:
          [:i |
          z := z + (pByteLarge at: i) - (pByteSmall at: i).
          pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant

...
...
form of (z bitAnd: 255)"

Frankly, having z declared unsigned int and just doing pByteRes at:
i

...
...
put:

...
(z bitAnd: 16rFF) as I suggested would be way way simpler and will

ALWAYS

...
...
...
work. Why the hell invoke the complications of signed arithmetic when the

content

...
pByteRes is unsigned???

Nicolas,

This sounds right to me.

Do you have a final version of #cDigitSub:len:with:len:into: that you

can

...
...
recommend? The version in VMMaker-nice.342 produces failures in the KernelTests-Numbers tests when compiled in 64-bit mode, but I think

that

...
...
you have since suggested some other improvements in this email thread.

Dave

Hi David, that's interesting. VMMaker-nice.342 contains all the 32bits LargeInteger modifications, not just the #cDigitSub:len:with:len:into:. I tried to declare everything as <unsigned int> rather than <usqInt> with the assumption that sizeof(unsigned int)*2==sizeof(unsigned long long). But I never tried to compile a 64 bits VM, I think I lost the recipe for brewing such a flavour. I'm on MacOSX, could you provide a link to the shortest way to do it? Once I can check it, I'll publish a corrected version. Thanks

Hi Nicolas,

...

Starting with an up to date VMMaker (trunk), I loaded #cDigitSub:len:with:len:into: from VMMaker-nice.342, then compiled the VM on 64-bit Linux and ran the unit tests.

Ah, but you can't pick this single method from LargeIntegerPlugin v2.0... It tries to access/process by 32-bits digits instead of 8-bits digits. So it would be necessary to also change the send site for providing the right number of digits... I'll do a rewrite for 8 bits this evening...

Nicolas

I also ran tests on the 64-bit image with 64-bit VM, and this did *not* show

...

any problem.

Sorry I did not look into it further, I will try to do so as soon as I get some time.

Dave

Eliot Miranda

2:36 a.m.

New subject: VM Maker: VMMaker.oscog-eem.790.mcz

On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:

...

2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:

...
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----

cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen

into: pByteRes
  | z |
cDigitSub: pByteSmall
          len: smallLen
          with: pByteLarge
          len: largeLen
          into: pByteRes
  | z limit |
  <var: #pByteSmall type: 'unsigned char * '>
  <var: #pByteLarge type: 'unsigned char * '>
  <var: #pByteRes type: 'unsigned char * '>
  z := 0. "Loop invariant is -1<=z<=1"
  0 to: smallLen - 1 do:
  z := 0.
  "Loop invariant is -1<=z<=1"
  limit := smallLen - 1.
  0 to: limit do:
          [:i |
          z := z + (pByteLarge at: i) - (pByteSmall at: i).
          pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant
form of (z bitAnd: 255)"
Frankly, having z declared unsigned int and just doing pByteRes at: i put: (z bitAnd: 16rFF) as I suggested would be way way simpler and will ALWAYS work. Why the hell invoke the complications of signed arithmetic when the content pByteRes is unsigned???

I'm not maintaining the plugin. But I broke it in fixing the unsigned division anomaly. I just wanted it to work again as quickly as possibly without expending effort. I made the minimum changes I could to keep it working. I'm much happier to have you maintain the plugin. You have the expertise and experience.

Nicolas, my priority is to have Spur working. I don't want to have to expend lots of energy changing plugins to get Spur working. My submitting this fix is not an endorsement of any kind. It's merely expediency.

Even if z is declared signed, (z bitAnd: 255) would work on a vast majority

...

of compiler/processor because most compiler/processor would use a 2-complement representation. But it's technically implementation-defined.
          z := z signedBitShift: -8].
...
i think this one is OK too on a vast majority of machines, but this is as well technically implementation defined. Some weird compiler/processor pair may as well fill left bits with zeroes, or not even use 2-complement...

z here is a carry and should contain either 0, or -1 (that is UINT_MAX) after this operation.

...
  smallLen to: largeLen - 1 do:
          pByteRes at: i put: z - (z // 256 * 256).
          "sign-tolerant form of (z bitAnd: 255)"
          z := z // 256].
  limit := largeLen - 1.
  smallLen to: limit do:
          [:i |
          z := z + (pByteLarge at: i) .
          pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant
form of (z bitAnd: 255)"
          z := z signedBitShift: -8].
          pByteRes at: i put: z - (z // 256 * 256).
          "sign-tolerant form of (z bitAnd: 255)"
          z := z // 256].
!
Isn't there another funny signed shift in Large Integer division? This would deserve a review too, because division is inlining its own sort of subtraction...

-- best, Eliot

David T. Lewis

4:51 a.m.

New subject: What problem are we trying to solve? (was: VM Maker: VMMaker.oscog-eem.790.mcz)

On Tue, Jul 01, 2014 at 05:36:54PM -0700, Eliot Miranda wrote:

...

On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:

...
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:

...
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----

cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen

into: pByteRes
  | z |
cDigitSub: pByteSmall
          len: smallLen
          with: pByteLarge
          len: largeLen
          into: pByteRes
  | z limit |
  <var: #pByteSmall type: 'unsigned char * '>
  <var: #pByteLarge type: 'unsigned char * '>
  <var: #pByteRes type: 'unsigned char * '>
  z := 0. "Loop invariant is -1<=z<=1"
  0 to: smallLen - 1 do:
  z := 0.
  "Loop invariant is -1<=z<=1"
  limit := smallLen - 1.
  0 to: limit do:
          [:i |
          z := z + (pByteLarge at: i) - (pByteSmall at: i).
          pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant
form of (z bitAnd: 255)"
Frankly, having z declared unsigned int and just doing pByteRes at: i put: (z bitAnd: 16rFF) as I suggested would be way way simpler and will ALWAYS work. Why the hell invoke the complications of signed arithmetic when the content pByteRes is unsigned???
I'm not maintaining the plugin. But I broke it in fixing the unsigned division anomaly. I just wanted it to work again as quickly as possibly without expending effort. I made the minimum changes I could to keep it working. I'm much happier to have you maintain the plugin. You have the expertise and experience.

Nicolas, my priority is to have Spur working. I don't want to have to expend lots of energy changing plugins to get Spur working. My submitting this fix is not an endorsement of any kind. It's merely expediency.

After catching up with the email thread, I am confused as to what problem we are trying to solve.

As near as I can tell, the situation is:

- The original LargeIntegersPlugin>>cDigitSub:len:with:len:into: works with all combinations of 32/64 bit VMs and images.

- Nicolas has proposed a better implementation, along with the recommendation to use unsigned integer C arithmetic unless there is some specific good reason to do otherwise. This seems right in principle, although the implementation in VMMaker-nice.348 is not working for 64-bit VMs, so some issues remain to be resolved.

- Eliot's original question began with this:

> I recently eliminated the optimization in Slang that replaces a > division by a power of two with a shift, because the code cast the argument > to signed, and hence broke unsigned division. That's what used to be > controlled by the UseRightShiftForDivide class var of CCodeGenerator. > > Yesterday I found out that that optimization is the only thing that's > keeping the LargeIntegers plugin afloat.

- At that point we had a problem in the Spur/Cog VMs that led to some patching of the code generation and so forth, along with this email thread.

So now I am confused. Is the problem that:

- The original implementation was broken?

- There is something different in the Spur/Cog environment that exposed problems in the original implementation?

- The original implementation worked, but the change to CCodeGenerator with regard to its use of UseRightShiftForDivide resulted in a problem?

- Something else?

I'm pretty sure I'm missing something here.

Dave

Eliot Miranda

5:44 a.m.

New subject: What problem are we trying to solve? (was: VM Maker: VMMaker.oscog-eem.790.mcz)

On Tue, Jul 1, 2014 at 7:51 PM, David T. Lewis lewis@mail.msen.com wrote:

...

On Tue, Jul 01, 2014 at 05:36:54PM -0700, Eliot Miranda wrote:

...
On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:

...
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:

...
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----

cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen

into: pByteRes
  | z |
cDigitSub: pByteSmall
          len: smallLen
          with: pByteLarge
          len: largeLen
          into: pByteRes
  | z limit |
  <var: #pByteSmall type: 'unsigned char * '>
  <var: #pByteLarge type: 'unsigned char * '>
  <var: #pByteRes type: 'unsigned char * '>
  z := 0. "Loop invariant is -1<=z<=1"
  0 to: smallLen - 1 do:
  z := 0.
  "Loop invariant is -1<=z<=1"
  limit := smallLen - 1.
  0 to: limit do:
          [:i |
          z := z + (pByteLarge at: i) - (pByteSmall at: i).
          pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant

...
...
...
form of (z bitAnd: 255)"

Frankly, having z declared unsigned int and just doing pByteRes at: i put: (z bitAnd: 16rFF) as I suggested would be way way simpler and will ALWAYS work. Why the hell invoke the complications of signed arithmetic when the content pByteRes is unsigned???

I'm not maintaining the plugin. But I broke it in fixing the unsigned division anomaly. I just wanted it to work again as quickly as possibly without expending effort. I made the minimum changes I could to keep it working. I'm much happier to have you maintain the plugin. You have the expertise and experience.

Nicolas, my priority is to have Spur working. I don't want to have to expend lots of energy changing plugins to get Spur working. My

submitting

...
this fix is not an endorsement of any kind. It's merely expediency.

After catching up with the email thread, I am confused as to what problem we are trying to solve.

As near as I can tell, the situation is:

The original LargeIntegersPlugin>>cDigitSub:len:with:len:into: works

with all combinations of 32/64 bit VMs and images.

Nicolas has proposed a better implementation, along with the

recommendation to use unsigned integer C arithmetic unless there is some specific good reason to do otherwise. This seems right in principle, although the implementation in VMMaker-nice.348 is not working for 64-bit VMs, so some issues remain to be resolved.

Eliot's original question began with this:

...
I recently eliminated the optimization in Slang that replaces a division by a power of two with a shift, because the code cast the

argument > to signed, and hence broke unsigned division. That's what used to be > controlled by the UseRightShiftForDivide class var of CCodeGenerator. > > Yesterday I found out that that optimization is the only thing that's > keeping the LargeIntegers plugin afloat.

At that point we had a problem in the Spur/Cog VMs that led to some

patching of the code generation and so forth, along with this email thread.

So now I am confused. Is the problem that:

The original implementation was broken?

Arguably yes. It relied on incorrect Slang translation to work, specifically the assumption that N / D where D is a power of two is equivalent to (sqInt)N >> S where S is log2 of D. That's wrong on two counts:

If N is -1 then (sqInt)-1 >> S is -1, whereas -1 / D is zero for D > 1. If N is unsigned (sqInt)N >> S ~= N >> S if N has the top bit set.

- There is something different in the Spur/Cog environment that exposed

...

problems in the original implementation?

Yes. I hit examples where the division optimization was generating incorrect code, e.g. in SpurMemoryManager>>allocateOldSpaceChunkOfBytes:

initialIndex := chunkBytes / self allocationUnit.

even though chunkBytes is unsigned, Slang was generating (sqInt)chunkBytes

...

...
3, which generates garbage when chunkBytes >= 2^31. I decided to rip

out the optimization (it is incorrect in the -1 / D case) rather than hack these. That in turn surfaced the bug in the LargeIntegers plugin.

- The original implementation worked, but the change to CCodeGenerator

...

with regard to its use of UseRightShiftForDivide resulted in a problem?

Yes.

...

Something else?

No.

I'm pretty sure I'm missing something here.

...

You don't appear to be.

-- best, Eliot

Eliot Miranda

5:51 a.m.

New subject: What problem are we trying to solve? (was: VM Maker: VMMaker.oscog-eem.790.mcz)

On Tue, Jul 1, 2014 at 8:44 PM, Eliot Miranda eliot.miranda@gmail.com wrote:

...

On Tue, Jul 1, 2014 at 7:51 PM, David T. Lewis lewis@mail.msen.com wrote:

...
On Tue, Jul 01, 2014 at 05:36:54PM -0700, Eliot Miranda wrote:

...
On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:

...
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:

...
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----

cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen

into: pByteRes
  | z |
cDigitSub: pByteSmall
          len: smallLen
          with: pByteLarge
          len: largeLen
          into: pByteRes
  | z limit |
  <var: #pByteSmall type: 'unsigned char * '>
  <var: #pByteLarge type: 'unsigned char * '>
  <var: #pByteRes type: 'unsigned char * '>
  z := 0. "Loop invariant is -1<=z<=1"
  0 to: smallLen - 1 do:
  z := 0.
  "Loop invariant is -1<=z<=1"
  limit := smallLen - 1.
  0 to: limit do:
          [:i |
          z := z + (pByteLarge at: i) - (pByteSmall at: i).
          pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant

...
...
...
form of (z bitAnd: 255)"

Frankly, having z declared unsigned int and just doing pByteRes at: i put: (z bitAnd: 16rFF) as I suggested would be way way simpler and

will

...
...
ALWAYS work. Why the hell invoke the complications of signed arithmetic when the content pByteRes is unsigned???

I'm not maintaining the plugin. But I broke it in fixing the unsigned division anomaly. I just wanted it to work again as quickly as possibly without expending effort. I made the minimum changes I could to keep it working. I'm much happier to have you maintain the plugin. You have

the

...
expertise and experience.

Nicolas, my priority is to have Spur working. I don't want to have to expend lots of energy changing plugins to get Spur working. My

submitting

...
this fix is not an endorsement of any kind. It's merely expediency.

After catching up with the email thread, I am confused as to what problem we are trying to solve.

As near as I can tell, the situation is:

The original LargeIntegersPlugin>>cDigitSub:len:with:len:into: works

with all combinations of 32/64 bit VMs and images.

Nicolas has proposed a better implementation, along with the

recommendation to use unsigned integer C arithmetic unless there is some specific good reason to do otherwise. This seems right in principle, although the implementation in VMMaker-nice.348 is not working for 64-bit VMs, so some issues remain to be resolved.

Eliot's original question began with this:

...
I recently eliminated the optimization in Slang that replaces a division by a power of two with a shift, because the code cast the

argument > to signed, and hence broke unsigned division. That's what used to be > controlled by the UseRightShiftForDivide class var of CCodeGenerator. > > Yesterday I found out that that optimization is the only thing that's > keeping the LargeIntegers plugin afloat.

At that point we had a problem in the Spur/Cog VMs that led to some

patching of the code generation and so forth, along with this email thread.

So now I am confused. Is the problem that:

The original implementation was broken?
Arguably yes. It relied on incorrect Slang translation to work, specifically the assumption that N / D where D is a power of two is equivalent to (sqInt)N >> S where S is log2 of D. That's wrong on two counts:

If N is -1 then (sqInt)-1 >> S is -1, whereas -1 / D is zero for D > 1. If N is unsigned (sqInt)N >> S ~= N >> S if N has the top bit set.

There is something different in the Spur/Cog environment that exposed

...
problems in the original implementation?
Yes. I hit examples where the division optimization was generating incorrect code, e.g. in SpurMemoryManager>>allocateOldSpaceChunkOfBytes:

initialIndex := chunkBytes / self allocationUnit.

even though chunkBytes is unsigned, Slang was generating (sqInt)chunkBytes

...
...
3, which generates garbage when chunkBytes >= 2^31. I decided to rip

out the optimization (it is incorrect in the -1 / D case) rather than hack these. That in turn surfaced the bug in the LargeIntegers plugin.

The original implementation worked, but the change to CCodeGenerator

...
with regard to its use of UseRightShiftForDivide resulted in a problem?

Yes.

...

Something else?

No.

I'm pretty sure I'm missing something here.

...
You don't appear to be.

and underlying this is that philosophically I believe that a) having Slang

generate correct code and b) having Slang maintain as close a correspondence between Smalltalk and the resulting C is to be preferred 1) at the risk of breaking plugins (which then have to be fixed), and 2) papering over the cracks by using cCode: [] inSmalltalk: [] everywhere.

ANother way of saying this is that I believe Slang should be approachable by the newbie who enters with rational expectations, not something only old salts who know and work-around the many pitfalls. I got horribly burned by Slang often in the first few years of working on Cog. I likened it to being hit on the head with a long stick by one's sensei, except that no enlightenment would ever ensue. One's head would simply hurt. I've tried to make it better, e.g. by adding some simple type inference so that it does the right thing with unsigned and long long types. Alas, when one does this old workarounds, mistakes, or bugs may break.

-- best, Eliot

David T. Lewis

1:29 p.m.

New subject: What problem are we trying to solve? (was: VM Maker: VMMaker.oscog-eem.790.mcz)

Eliot,

Thanks for the explanation, much clearer now. I was getting lost in the email threads.

Dave

On Tue, Jul 01, 2014 at 08:51:26PM -0700, Eliot Miranda wrote:

...

On Tue, Jul 1, 2014 at 8:44 PM, Eliot Miranda eliot.miranda@gmail.com wrote:

...
On Tue, Jul 1, 2014 at 7:51 PM, David T. Lewis lewis@mail.msen.com wrote:

...
On Tue, Jul 01, 2014 at 05:36:54PM -0700, Eliot Miranda wrote:

...
On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:

...
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:

...
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----

cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen

into: pByteRes
  | z |
cDigitSub: pByteSmall
          len: smallLen
          with: pByteLarge
          len: largeLen
          into: pByteRes
  | z limit |
  <var: #pByteSmall type: 'unsigned char * '>
  <var: #pByteLarge type: 'unsigned char * '>
  <var: #pByteRes type: 'unsigned char * '>
  z := 0. "Loop invariant is -1<=z<=1"
  0 to: smallLen - 1 do:
  z := 0.
  "Loop invariant is -1<=z<=1"
  limit := smallLen - 1.
  0 to: limit do:
          [:i |
          z := z + (pByteLarge at: i) - (pByteSmall at: i).
          pByteRes at: i put: z - (z // 256 * 256).
"sign-tolerant

...
...
...
form of (z bitAnd: 255)"

Frankly, having z declared unsigned int and just doing pByteRes at: i put: (z bitAnd: 16rFF) as I suggested would be way way simpler and

will

...
...
ALWAYS work. Why the hell invoke the complications of signed arithmetic when the content pByteRes is unsigned???

I'm not maintaining the plugin. But I broke it in fixing the unsigned division anomaly. I just wanted it to work again as quickly as possibly without expending effort. I made the minimum changes I could to keep it working. I'm much happier to have you maintain the plugin. You have

the

...
expertise and experience.

Nicolas, my priority is to have Spur working. I don't want to have to expend lots of energy changing plugins to get Spur working. My

submitting

...
this fix is not an endorsement of any kind. It's merely expediency.

After catching up with the email thread, I am confused as to what problem we are trying to solve.

As near as I can tell, the situation is:

The original LargeIntegersPlugin>>cDigitSub:len:with:len:into: works

with all combinations of 32/64 bit VMs and images.

Nicolas has proposed a better implementation, along with the

recommendation to use unsigned integer C arithmetic unless there is some specific good reason to do otherwise. This seems right in principle, although the implementation in VMMaker-nice.348 is not working for 64-bit VMs, so some issues remain to be resolved.

Eliot's original question began with this:

...
I recently eliminated the optimization in Slang that replaces a division by a power of two with a shift, because the code cast the

argument > to signed, and hence broke unsigned division. That's what used to be > controlled by the UseRightShiftForDivide class var of CCodeGenerator. > > Yesterday I found out that that optimization is the only thing that's > keeping the LargeIntegers plugin afloat.

At that point we had a problem in the Spur/Cog VMs that led to some

patching of the code generation and so forth, along with this email thread.

So now I am confused. Is the problem that:

The original implementation was broken?
Arguably yes. It relied on incorrect Slang translation to work, specifically the assumption that N / D where D is a power of two is equivalent to (sqInt)N >> S where S is log2 of D. That's wrong on two counts:

If N is -1 then (sqInt)-1 >> S is -1, whereas -1 / D is zero for D > 1. If N is unsigned (sqInt)N >> S ~= N >> S if N has the top bit set.

There is something different in the Spur/Cog environment that exposed

...
problems in the original implementation?
Yes. I hit examples where the division optimization was generating incorrect code, e.g. in SpurMemoryManager>>allocateOldSpaceChunkOfBytes:

initialIndex := chunkBytes / self allocationUnit.

even though chunkBytes is unsigned, Slang was generating (sqInt)chunkBytes

...
...
3, which generates garbage when chunkBytes >= 2^31. I decided to rip

out the optimization (it is incorrect in the -1 / D case) rather than hack these. That in turn surfaced the bug in the LargeIntegers plugin.

The original implementation worked, but the change to CCodeGenerator

...
with regard to its use of UseRightShiftForDivide resulted in a problem?

Yes.

...

Something else?

No.

I'm pretty sure I'm missing something here.

...
You don't appear to be.

and underlying this is that philosophically I believe that a) having Slang
generate correct code and b) having Slang maintain as close a correspondence between Smalltalk and the resulting C is to be preferred 1) at the risk of breaking plugins (which then have to be fixed), and 2) papering over the cracks by using cCode: [] inSmalltalk: [] everywhere.

ANother way of saying this is that I believe Slang should be approachable by the newbie who enters with rational expectations, not something only old salts who know and work-around the many pitfalls. I got horribly burned by Slang often in the first few years of working on Cog. I likened it to being hit on the head with a long stick by one's sensei, except that no enlightenment would ever ensue. One's head would simply hurt. I've tried to make it better, e.g. by adding some simple type inference so that it does the right thing with unsigned and long long types. Alas, when one does this old workarounds, mistakes, or bugs may break. -- best, Eliot

Nicolas Cellier

1:53 p.m.

New subject: What problem are we trying to solve? (was: VM Maker: VMMaker.oscog-eem.790.mcz)

2014-07-02 13:29 GMT+02:00 David T. Lewis lewis@mail.msen.com:

...

Eliot,

Thanks for the explanation, much clearer now. I was getting lost in the email threads.

Dave

Yeah, Eliot did not break anything, the LargeInteger plugin was already broken. Only the incorrect substitution of // by >> did make it work. Eliot applied a minimal patch, but IMO invoking signed integer arithmetic when we do not need to is a recipe for more surprises... This require a deeper change if we do not want to feel the ground falling from under our feet again at the least slang change.

Nicolas

On Tue, Jul 01, 2014 at 08:51:26PM -0700, Eliot Miranda wrote:

...

...
On Tue, Jul 1, 2014 at 8:44 PM, Eliot Miranda eliot.miranda@gmail.com wrote:

...
On Tue, Jul 1, 2014 at 7:51 PM, David T. Lewis lewis@mail.msen.com wrote:

...
On Tue, Jul 01, 2014 at 05:36:54PM -0700, Eliot Miranda wrote:

...
On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:

...
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org: > > Item was changed: > ----- Method:

LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in

...
...
...
...
...
> category 'C core') ----- > + cDigitSub: pByteSmall len: smallLen with: pByteLarge len:

largeLen

...
...
...
...
...
> into: pByteRes > + | z | > - cDigitSub: pByteSmall > - len: smallLen > - with: pByteLarge > - len: largeLen > - into: pByteRes > - | z limit | > <var: #pByteSmall type: 'unsigned char * '> > <var: #pByteLarge type: 'unsigned char * '> > <var: #pByteRes type: 'unsigned char * '> > > + z := 0. "Loop invariant is -1<=z<=1" > + 0 to: smallLen - 1 do: > - z := 0. > - "Loop invariant is -1<=z<=1" > - limit := smallLen - 1. > - 0 to: limit do: > [:i | > z := z + (pByteLarge at: i) - (pByteSmall at: i). > + pByteRes at: i put: z - (z // 256 * 256).

"sign-tolerant

...
...
> form of (z bitAnd: 255)" >

Frankly, having z declared unsigned int and just doing pByteRes

at: i

...
...
...
...
...
put: (z bitAnd: 16rFF) as I suggested would be way way simpler and

will

...
...
ALWAYS work. Why the hell invoke the complications of signed arithmetic when

the

...
...
...
...
...
content pByteRes is unsigned???

I'm not maintaining the plugin. But I broke it in fixing the

unsigned

...
...
...
...
division anomaly. I just wanted it to work again as quickly as

possibly

...
...
...
...
without expending effort. I made the minimum changes I could to

keep it

...
...
...
...
working. I'm much happier to have you maintain the plugin. You

have

...
...
...
the

...
expertise and experience.

Nicolas, my priority is to have Spur working. I don't want to have

to

...
...
...
...
expend lots of energy changing plugins to get Spur working. My

submitting

...
this fix is not an endorsement of any kind. It's merely expediency.

After catching up with the email thread, I am confused as to what

problem

...
...
...
we are trying to solve.

As near as I can tell, the situation is:

The original LargeIntegersPlugin>>cDigitSub:len:with:len:into:

works

...
...
...
with all combinations of 32/64 bit VMs and images.

Nicolas has proposed a better implementation, along with the

recommendation to use unsigned integer C arithmetic unless there is some specific good reason to do otherwise. This seems right in principle, although the implementation in VMMaker-nice.348 is not working for 64-bit VMs, so some issues

remain

...
...
...
to be resolved.

Eliot's original question began with this:

...
I recently eliminated the optimization in Slang that replaces a division by a power of two with a shift, because the code cast

the

...
...
...
argument > to signed, and hence broke unsigned division. That's what

used to

...
...
...
be > controlled by the UseRightShiftForDivide class var of CCodeGenerator. > > Yesterday I found out that that optimization is the only thing that's > keeping the LargeIntegers plugin afloat.

At that point we had a problem in the Spur/Cog VMs that led to some

patching of the code generation and so forth, along with this email thread.

So now I am confused. Is the problem that:

The original implementation was broken?

Arguably yes. It relied on incorrect Slang translation to work, specifically the assumption that N / D where D is a power of two is equivalent to (sqInt)N >> S where S is log2 of D. That's wrong on two counts:

If N is -1 then (sqInt)-1 >> S is -1, whereas -1 / D is zero for D >

...
...
If N is unsigned (sqInt)N >> S ~= N >> S if N has the top bit set.

There is something different in the Spur/Cog environment that

exposed

...
...
...
problems in the original implementation?
Yes. I hit examples where the division optimization was generating incorrect code, e.g. in
SpurMemoryManager>>allocateOldSpaceChunkOfBytes:

...
...
initialIndex := chunkBytes / self allocationUnit.

even though chunkBytes is unsigned, Slang was generating

(sqInt)chunkBytes

...
...
...
...
3, which generates garbage when chunkBytes >= 2^31. I decided to

rip

...
...
out the optimization (it is incorrect in the -1 / D case) rather than

hack

...
...
these. That in turn surfaced the bug in the LargeIntegers plugin.

The original implementation worked, but the change to

CCodeGenerator

...
...
...
with regard to its use of UseRightShiftForDivide resulted in a problem?

Yes.

...

Something else?

No.

I'm pretty sure I'm missing something here.

...
You don't appear to be.

and underlying this is that philosophically I believe that a) having

Slang

...
generate correct code and b) having Slang maintain as close a correspondence between Smalltalk and the resulting C is to be preferred

...
at the risk of breaking plugins (which then have to be fixed), and 2) papering over the cracks by using cCode: [] inSmalltalk: [] everywhere.

ANother way of saying this is that I believe Slang should be approachable by the newbie who enters with rational expectations, not something only

old

...
salts who know and work-around the many pitfalls. I got horribly burned

by

...
Slang often in the first few years of working on Cog. I likened it to being hit on the head with a long stick by one's sensei, except that no enlightenment would ever ensue. One's head would simply hurt. I've

tried

...
to make it better, e.g. by adding some simple type inference so that it does the right thing with unsigned and long long types. Alas, when one does this old workarounds, mistakes, or bugs may break. -- best, Eliot

Eliot Miranda

4:54 p.m.

New subject: What problem are we trying to solve? (was: VM Maker: VMMaker.oscog-eem.790.mcz)

On Wed, Jul 2, 2014 at 4:53 AM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:

...

2014-07-02 13:29 GMT+02:00 David T. Lewis lewis@mail.msen.com:

...
Eliot,

Thanks for the explanation, much clearer now. I was getting lost in the email threads.

Dave

Yeah, Eliot did not break anything, the LargeInteger plugin was already broken. Only the incorrect substitution of // by >> did make it work. Eliot applied a minimal patch, but IMO invoking signed integer arithmetic when we do not need to is a recipe for more surprises... This require a deeper change if we do not want to feel the ground falling from under our feet again at the least slang change.

Alas I think it's still very broken. I see a lot of failures and errors in IntegerTest, LargePositiveTest & LargeNegativeTest. I'm going to try your 32-bit plugin and if that works, build and release new VMs asap. But I need to go through the plugins that changed in 3021 (generating without the broken division-by-power-of-two optimization) with a fine toothcomb:

B2DPlugin.c B3DAcceleratorPlugin.c BitBltPlugin.c BochsIA32Plugin.c FFTPlugin.c FilePlugin.c IA32ABI.c JPEGReaderPlugin.c Klatt.c MacMenubarPlugin.c MiscPrimitivePlugin.c QuicktimePlugin.c RePlugin.c ScratchPlugin.c SoundGenerationPlugin.c ARM32FFIPlugin.c IA32FFIPlugin.c ZipPlugin.c

This is work I don't want to do. I either revert back to the broken division optimization, and add cCode:inSmalltalk: hacks in Spur, or have to fix these *%^& plugins. Damned if I do, damned if I don't. See what I mean about being hit on the head.

...

Nicolas

On Tue, Jul 01, 2014 at 08:51:26PM -0700, Eliot Miranda wrote:

...
...
On Tue, Jul 1, 2014 at 8:44 PM, Eliot Miranda eliot.miranda@gmail.com wrote:

...
On Tue, Jul 1, 2014 at 7:51 PM, David T. Lewis lewis@mail.msen.com wrote:

...
On Tue, Jul 01, 2014 at 05:36:54PM -0700, Eliot Miranda wrote:

...
On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:

> > 2014-07-01 4:22 GMT+02:00 commits@source.squeak.org: >> >> Item was changed: >> ----- Method:

LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in

...
...
...
...
>> category 'C core') ----- >> + cDigitSub: pByteSmall len: smallLen with: pByteLarge len:

largeLen

...
...
...
...
>> into: pByteRes >> + | z | >> - cDigitSub: pByteSmall >> - len: smallLen >> - with: pByteLarge >> - len: largeLen >> - into: pByteRes >> - | z limit | >> <var: #pByteSmall type: 'unsigned char * '> >> <var: #pByteLarge type: 'unsigned char * '> >> <var: #pByteRes type: 'unsigned char * '> >> >> + z := 0. "Loop invariant is -1<=z<=1" >> + 0 to: smallLen - 1 do: >> - z := 0. >> - "Loop invariant is -1<=z<=1" >> - limit := smallLen - 1. >> - 0 to: limit do: >> [:i | >> z := z + (pByteLarge at: i) - (pByteSmall at:

i).

...
...
...
...
>> + pByteRes at: i put: z - (z // 256 * 256).

"sign-tolerant

...
>> form of (z bitAnd: 255)" >> > > Frankly, having z declared unsigned int and just doing pByteRes

at: i

...
...
...
...
> put: (z bitAnd: 16rFF) as I suggested would be way way simpler

and

...
...
...
will

...
> ALWAYS work. > Why the hell invoke the complications of signed arithmetic when

the

...
...
...
...
> content pByteRes is unsigned??? >

I'm not maintaining the plugin. But I broke it in fixing the

unsigned

...
...
...
...
division anomaly. I just wanted it to work again as quickly as

possibly

...
...
...
...
without expending effort. I made the minimum changes I could to

keep it

...
...
...
...
working. I'm much happier to have you maintain the plugin. You

have

...
...
...
the

...
expertise and experience.

Nicolas, my priority is to have Spur working. I don't want to

have to

...
...
...
...
expend lots of energy changing plugins to get Spur working. My

submitting

...
this fix is not an endorsement of any kind. It's merely

expediency.

...
...
...
...
After catching up with the email thread, I am confused as to what

problem

...
...
...
we are trying to solve.

As near as I can tell, the situation is:

The original LargeIntegersPlugin>>cDigitSub:len:with:len:into:

works

...
...
...
with all combinations of 32/64 bit VMs and images.

Nicolas has proposed a better implementation, along with the

recommendation to use unsigned integer C arithmetic unless there is some specific good reason to do otherwise. This seems right in principle, although the implementation in VMMaker-nice.348 is not working for 64-bit VMs, so some issues

remain

...
...
...
to be resolved.

Eliot's original question began with this:

...
I recently eliminated the optimization in Slang that replaces

a

...
...
...
 > division by a power of two with a shift, because the code
cast the

...
...
...
argument > to signed, and hence broke unsigned division. That's what

used to

...
...
...
be > controlled by the UseRightShiftForDivide class var of CCodeGenerator. > > Yesterday I found out that that optimization is the only thing that's > keeping the LargeIntegers plugin afloat.

At that point we had a problem in the Spur/Cog VMs that led to

some

...
...
...
patching of the code generation and so forth, along with this email thread.

So now I am confused. Is the problem that:

The original implementation was broken?

Arguably yes. It relied on incorrect Slang translation to work, specifically the assumption that N / D where D is a power of two is equivalent to (sqInt)N >> S where S is log2 of D. That's wrong on two counts:

If N is -1 then (sqInt)-1 >> S is -1, whereas -1 / D is zero for D

...
If N is unsigned (sqInt)N >> S ~= N >> S if N has the top bit set.

There is something different in the Spur/Cog environment that

exposed

...
...
...
problems in the original implementation?
Yes. I hit examples where the division optimization was generating incorrect code, e.g. in
SpurMemoryManager>>allocateOldSpaceChunkOfBytes:

...
...
initialIndex := chunkBytes / self allocationUnit.

even though chunkBytes is unsigned, Slang was generating

(sqInt)chunkBytes

...
...
...
...
3, which generates garbage when chunkBytes >= 2^31. I decided to

rip

...
...
out the optimization (it is incorrect in the -1 / D case) rather than

hack

...
...
these. That in turn surfaced the bug in the LargeIntegers plugin.

The original implementation worked, but the change to

CCodeGenerator

...
...
...
with regard to its use of UseRightShiftForDivide resulted in a

problem?

...
...
...
Yes.

...

Something else?

No.

I'm pretty sure I'm missing something here.

...
You don't appear to be.

and underlying this is that philosophically I believe that a) having

Slang

...
generate correct code and b) having Slang maintain as close a correspondence between Smalltalk and the resulting C is to be preferred

...
at the risk of breaking plugins (which then have to be fixed), and 2) papering over the cracks by using cCode: [] inSmalltalk: [] everywhere.

ANother way of saying this is that I believe Slang should be

approachable

...
by the newbie who enters with rational expectations, not something only

old

...
salts who know and work-around the many pitfalls. I got horribly

burned by

...
Slang often in the first few years of working on Cog. I likened it to being hit on the head with a long stick by one's sensei, except that no enlightenment would ever ensue. One's head would simply hurt. I've

tried

...
to make it better, e.g. by adding some simple type inference so that it does the right thing with unsigned and long long types. Alas, when one does this old workarounds, mistakes, or bugs may break. -- best, Eliot

-- best, Eliot

Eliot Miranda

5:32 p.m.

New subject: What problem are we trying to solve? (was: VM Maker: VMMaker.oscog-eem.790.mcz)

On Wed, Jul 2, 2014 at 7:54 AM, Eliot Miranda eliot.miranda@gmail.com wrote:

...

On Wed, Jul 2, 2014 at 4:53 AM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:

...
2014-07-02 13:29 GMT+02:00 David T. Lewis lewis@mail.msen.com:

...
Eliot,

Thanks for the explanation, much clearer now. I was getting lost in the email threads.

Dave

Yeah, Eliot did not break anything, the LargeInteger plugin was already broken. Only the incorrect substitution of // by >> did make it work. Eliot applied a minimal patch, but IMO invoking signed integer arithmetic when we do not need to is a recipe for more surprises... This require a deeper change if we do not want to feel the ground falling from under our feet again at the least slang change.

Alas I think it's still very broken. I see a lot of failures and errors in IntegerTest, LargePositiveTest & LargeNegativeTest. I'm going to try your 32-bit plugin and if that works, build and release new VMs asap. But I need to go through the plugins that changed in 3021 (generating without the broken division-by-power-of-two optimization) with a fine toothcomb:

B2DPlugin.c B3DAcceleratorPlugin.c BitBltPlugin.c BochsIA32Plugin.c FFTPlugin.c FilePlugin.c IA32ABI.c JPEGReaderPlugin.c Klatt.c MacMenubarPlugin.c MiscPrimitivePlugin.c QuicktimePlugin.c RePlugin.c ScratchPlugin.c SoundGenerationPlugin.c ARM32FFIPlugin.c IA32FFIPlugin.c ZipPlugin.c

This is work I don't want to do. I either revert back to the broken division optimization, and add cCode:inSmalltalk: hacks in Spur, or have to fix these *%^& plugins. Damned if I do, damned if I don't. See what I mean about being hit on the head.

Not so bad. The list is smaller than I thought:

B2DPlugin.c BitBltPlugin.c FFTPlugin.c JPEGReaderPlugin.c Klatt.c LargeIntegers.c MiscPrimitivePlugin.c RePlugin.c ScratchPlugin.c SoundGenerationPlugin.c ZipPlugin.c

The others simply changed their declaration of positive32BitValueOf

...

...
Nicolas

On Tue, Jul 01, 2014 at 08:51:26PM -0700, Eliot Miranda wrote:

...
...
On Tue, Jul 1, 2014 at 8:44 PM, Eliot Miranda <eliot.miranda@gmail.com

wrote:

...
On Tue, Jul 1, 2014 at 7:51 PM, David T. Lewis lewis@mail.msen.com wrote:

...
On Tue, Jul 01, 2014 at 05:36:54PM -0700, Eliot Miranda wrote: > > On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < > nicolas.cellier.aka.nice@gmail.com> wrote: > > > > > 2014-07-01 4:22 GMT+02:00 commits@source.squeak.org: > >> > >> Item was changed: > >> ----- Method:

LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in

...
...
...
> >> category 'C core') ----- > >> + cDigitSub: pByteSmall len: smallLen with: pByteLarge len:

largeLen

...
...
...
> >> into: pByteRes > >> + | z | > >> - cDigitSub: pByteSmall > >> - len: smallLen > >> - with: pByteLarge > >> - len: largeLen > >> - into: pByteRes > >> - | z limit | > >> <var: #pByteSmall type: 'unsigned char * '> > >> <var: #pByteLarge type: 'unsigned char * '> > >> <var: #pByteRes type: 'unsigned char * '> > >> > >> + z := 0. "Loop invariant is -1<=z<=1" > >> + 0 to: smallLen - 1 do: > >> - z := 0. > >> - "Loop invariant is -1<=z<=1" > >> - limit := smallLen - 1. > >> - 0 to: limit do: > >> [:i | > >> z := z + (pByteLarge at: i) - (pByteSmall at:

i).

...
...
...
> >> + pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant > >> form of (z bitAnd: 255)" > >> > > > > Frankly, having z declared unsigned int and just doing

pByteRes at: i

...
...
...
> > put: (z bitAnd: 16rFF) as I suggested would be way way simpler

and

...
...
...
will > > ALWAYS work. > > Why the hell invoke the complications of signed arithmetic when

the

...
...
...
> > content pByteRes is unsigned??? > > > > I'm not maintaining the plugin. But I broke it in fixing the

unsigned

...
...
...
> division anomaly. I just wanted it to work again as quickly as

possibly

...
...
...
> without expending effort. I made the minimum changes I could to

keep it

...
...
...
> working. I'm much happier to have you maintain the plugin. You

have

...
...
...
the > expertise and experience. > > Nicolas, my priority is to have Spur working. I don't want to

have to

...
...
...
> expend lots of energy changing plugins to get Spur working. My submitting > this fix is not an endorsement of any kind. It's merely

expediency.

...
...
...
>

After catching up with the email thread, I am confused as to what

problem

...
...
...
we are trying to solve.

As near as I can tell, the situation is:

The original LargeIntegersPlugin>>cDigitSub:len:with:len:into:

works

...
...
...
with all combinations of 32/64 bit VMs and images.

Nicolas has proposed a better implementation, along with the

recommendation to use unsigned integer C arithmetic unless there is some

specific

...
...
...
good reason to do otherwise. This seems right in principle, although the implementation in VMMaker-nice.348 is not working for 64-bit VMs, so some issues

remain

...
...
...
to be resolved.

Eliot's original question began with this:

...
I recently eliminated the optimization in Slang that

replaces a

...
...
...
 > division by a power of two with a shift, because the code
cast the

...
...
...
argument > to signed, and hence broke unsigned division. That's what

used to

...
...
...
be > controlled by the UseRightShiftForDivide class var of CCodeGenerator. > > Yesterday I found out that that optimization is the only

thing

...
...
...
that's > keeping the LargeIntegers plugin afloat.

At that point we had a problem in the Spur/Cog VMs that led to

some

...
...
...
patching of the code generation and so forth, along with this email

thread.

...
...
...
So now I am confused. Is the problem that:

The original implementation was broken?

Arguably yes. It relied on incorrect Slang translation to work, specifically the assumption that N / D where D is a power of two is equivalent to (sqInt)N >> S where S is log2 of D. That's wrong on

two

...
...
counts:

If N is -1 then (sqInt)-1 >> S is -1, whereas -1 / D is zero for D

...
If N is unsigned (sqInt)N >> S ~= N >> S if N has the top bit set.

There is something different in the Spur/Cog environment that

exposed

...
...
...
problems in the original implementation?
Yes. I hit examples where the division optimization was generating incorrect code, e.g. in
SpurMemoryManager>>allocateOldSpaceChunkOfBytes:

...
...
initialIndex := chunkBytes / self allocationUnit.

even though chunkBytes is unsigned, Slang was generating

(sqInt)chunkBytes

...
...
...
> 3, which generates garbage when chunkBytes >= 2^31. I decided to

rip

...
...
out the optimization (it is incorrect in the -1 / D case) rather

than hack

...
...
these. That in turn surfaced the bug in the LargeIntegers plugin.

The original implementation worked, but the change to

CCodeGenerator

...
...
...
with regard to its use of UseRightShiftForDivide resulted in a

problem?

...
...
...
Yes.

...

Something else?

No.

I'm pretty sure I'm missing something here.

...
You don't appear to be.

and underlying this is that philosophically I believe that a) having

Slang

...
generate correct code and b) having Slang maintain as close a correspondence between Smalltalk and the resulting C is to be

preferred 1)

...
at the risk of breaking plugins (which then have to be fixed), and 2) papering over the cracks by using cCode: [] inSmalltalk: [] everywhere.

ANother way of saying this is that I believe Slang should be

approachable

...
by the newbie who enters with rational expectations, not something

only old

...
salts who know and work-around the many pitfalls. I got horribly

burned by

...
Slang often in the first few years of working on Cog. I likened it to being hit on the head with a long stick by one's sensei, except that no enlightenment would ever ensue. One's head would simply hurt. I've

tried

...
to make it better, e.g. by adding some simple type inference so that it does the right thing with unsigned and long long types. Alas, when one does this old workarounds, mistakes, or bugs may break. -- best, Eliot
-- best, Eliot

-- best, Eliot

Nicolas Cellier

7:35 a.m.

New subject: VM Maker: VMMaker.oscog-eem.790.mcz

2014-07-02 2:36 GMT+02:00 Eliot Miranda eliot.miranda@gmail.com:

...

On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com> wrote:

...
2014-07-01 4:22 GMT+02:00 commits@source.squeak.org:

...
Item was changed: ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in category 'C core') -----

cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen

into: pByteRes
  | z |
cDigitSub: pByteSmall
          len: smallLen
          with: pByteLarge
          len: largeLen
          into: pByteRes
  | z limit |
  <var: #pByteSmall type: 'unsigned char * '>
  <var: #pByteLarge type: 'unsigned char * '>
  <var: #pByteRes type: 'unsigned char * '>
  z := 0. "Loop invariant is -1<=z<=1"
  0 to: smallLen - 1 do:
  z := 0.
  "Loop invariant is -1<=z<=1"
  limit := smallLen - 1.
  0 to: limit do:
          [:i |
          z := z + (pByteLarge at: i) - (pByteSmall at: i).
          pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant
form of (z bitAnd: 255)"
Frankly, having z declared unsigned int and just doing pByteRes at: i put: (z bitAnd: 16rFF) as I suggested would be way way simpler and will ALWAYS work. Why the hell invoke the complications of signed arithmetic when the content pByteRes is unsigned???
I'm not maintaining the plugin. But I broke it in fixing the unsigned division anomaly. I just wanted it to work again as quickly as possibly without expending effort. I made the minimum changes I could to keep it working. I'm much happier to have you maintain the plugin. You have the expertise and experience.

Nicolas, my priority is to have Spur working. I don't want to have to expend lots of energy changing plugins to get Spur working. My submitting this fix is not an endorsement of any kind. It's merely expediency.

Yes, I understand, the smallest modification that could possibly work. But this sign-tolerant form of (z bitAnd: 255) just make me sick ;) I'll try and fix it once I understand the 64 bits VM problem.

cheers

...

Even if z is declared signed, (z bitAnd: 255) would work on a vast

...
majority of compiler/processor because most compiler/processor would use a 2-complement representation. But it's technically implementation-defined.
          z := z signedBitShift: -8].
...
i think this one is OK too on a vast majority of machines, but this is as well technically implementation defined. Some weird compiler/processor pair may as well fill left bits with zeroes, or not even use 2-complement...

z here is a carry and should contain either 0, or -1 (that is UINT_MAX) after this operation.

...
  smallLen to: largeLen - 1 do:
          pByteRes at: i put: z - (z // 256 * 256).
          "sign-tolerant form of (z bitAnd: 255)"
          z := z // 256].
  limit := largeLen - 1.
  smallLen to: limit do:
          [:i |
          z := z + (pByteLarge at: i) .
          pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant
form of (z bitAnd: 255)"
          z := z signedBitShift: -8].
          pByteRes at: i put: z - (z // 256 * 256).
          "sign-tolerant form of (z bitAnd: 255)"
          z := z // 256].
!
Isn't there another funny signed shift in Large Integer division? This would deserve a review too, because division is inlining its own sort of subtraction...
-- best, Eliot

3609

Age (days ago)

3610

Last active (days ago)

vm-dev@lists.squeakfoundation.org

13 comments

3 participants

tags (0)

participants (3)

David T. Lewis
Eliot Miranda
Nicolas Cellier