On Tue, Jul 1, 2014 at 7:51 PM, David T. Lewis <lewis@mail.msen.com> wrote:

On Tue, Jul 01, 2014 at 05:36:54PM -0700, Eliot Miranda wrote:
>
> On Tue, Jul 1, 2014 at 12:21 PM, Nicolas Cellier <
> nicolas.cellier.aka.nice@gmail.com> wrote:
>
> >
> > 2014-07-01 4:22 GMT+02:00 <commits@source.squeak.org>:
> >>
> >> Item was changed:
> >>   ----- Method: LargeIntegersPlugin>>cDigitSub:len:with:len:into: (in
> >> category 'C core') -----
> >> + cDigitSub: pByteSmall len: smallLen with: pByteLarge len: largeLen
> >> into: pByteRes
> >> +       | z |
> >> - cDigitSub: pByteSmall
> >> -               len: smallLen
> >> -               with: pByteLarge
> >> -               len: largeLen
> >> -               into: pByteRes
> >> -       | z limit |
> >>         <var: #pByteSmall type: 'unsigned char * '>
> >>         <var: #pByteLarge type: 'unsigned char * '>
> >>         <var: #pByteRes type: 'unsigned char * '>
> >>
> >> +       z := 0. "Loop invariant is -1<=z<=1"
> >> +       0 to: smallLen - 1 do:
> >> -       z := 0.
> >> -       "Loop invariant is -1<=z<=1"
> >> -       limit := smallLen - 1.
> >> -       0 to: limit do:
> >>                 [:i |
> >>                 z := z + (pByteLarge at: i) - (pByteSmall at: i).
> >> +               pByteRes at: i put: z - (z // 256 * 256). "sign-tolerant
> >> form of (z bitAnd: 255)"
> >>
> >
> > Frankly, having z declared unsigned int and just doing  pByteRes at: i
> > put: (z bitAnd: 16rFF) as I suggested would be way way simpler and will
> > ALWAYS work.
> > Why the hell invoke the complications of signed arithmetic when the
> > content pByteRes is unsigned???
> >
>
> I'm not maintaining the plugin.  But I broke it in fixing the unsigned
> division anomaly.  I just wanted it to work again as quickly as possibly
> without expending effort.  I made the minimum changes I could to keep it
> working.  I'm much happier to have you maintain the plugin.  You have the
> expertise and experience.
>
> Nicolas, my priority is to have Spur working.  I don't want to have to
> expend lots of energy changing plugins to get Spur working.  My submitting
> this fix is not an endorsement of any kind.  It's merely expediency.
>

After catching up with the email thread, I am confused as to what problem we
are trying to solve.

As near as I can tell, the situation is:

 - The original LargeIntegersPlugin>>cDigitSub:len:with:len:into: works with all
   combinations of 32/64 bit VMs and images.

 - Nicolas has proposed a better implementation, along with the recommendation
   to use unsigned integer C arithmetic unless there is some specific good reason
   to do otherwise. This seems right in principle, although the implementation in
   VMMaker-nice.348 is not working for 64-bit VMs, so some issues remain to be
   resolved.

 - Eliot's original question began with this:

     > I recently eliminated the optimization in Slang that replaces a
     > division by a power of two with a shift, because the code cast the argument
     > to signed, and hence broke unsigned division.  That's what used to be
     > controlled by the UseRightShiftForDivide class var of CCodeGenerator.
     >
     > Yesterday I found out that that optimization is the only thing that's
     > keeping the LargeIntegers plugin afloat.

 - At that point we had a problem in the Spur/Cog VMs that led to some patching
   of the code generation and so forth, along with this email thread.

So now I am confused. Is the problem that:

  - The original implementation was broken?

Arguably yes.  It relied on incorrect Slang translation to work, specifically the assumption that N / D where D is a power of two is equivalent to (sqInt)N >> S where S is log2 of D.  That's wrong on two counts:

  If N is -1 then (sqInt)-1 >> S is -1, whereas -1 / D is zero for D > 1.
  If N is unsigned (sqInt)N >> S ~= N >> S if N has the top bit set.


  - There is something different in the Spur/Cog environment that exposed
    problems in the original implementation?

Yes.  I hit examples where the division optimization was generating incorrect code, e.g. in SpurMemoryManager>>allocateOldSpaceChunkOfBytes:

initialIndex := chunkBytes / self allocationUnit.

even though chunkBytes is unsigned, Slang was generating (sqInt)chunkBytes >> 3, which generates garbage when chunkBytes >= 2^31.  I decided to rip out the optimization (it is incorrect in the -1 / D case) rather than hack these.  That in turn surfaced the bug in the LargeIntegers plugin.

  - The original implementation worked, but the change to CCodeGenerator with
    regard to its use of UseRightShiftForDivide resulted in a problem?

Yes.
 
  - Something else?

No.

I'm pretty sure I'm missing something here.

You don't appear to be. 
--
best,
Eliot