[Vm-dev] [Pharo-dev] Random corrupted data when copying from very large byte array

Alistair Grant akgrant0710 at gmail.com
Mon Jan 22 09:42:05 UTC 2018


Hi Eliot,

On Sat, Jan 20, 2018 at 09:19:04AM +0100, Alistair Grant wrote:
> Hi Eliot,
> 
> On 19 January 2018 at 23:04, Eliot Miranda <eliot.miranda at gmail.com> wrote:
> > Hi Alistair, Hi Clément,
> >
> > On Fri, Jan 19, 2018 at 12:53 PM, Alistair Grant <akgrant0710 at gmail.com>
> > wrote:
> >>
> >> Hi Clément,
> >>
> >> On 19 January 2018 at 17:21, Alistair Grant <akgrant0710 at gmail.com> wrote:
> >> > Hi Clément,
> >> >
> >> > On 19 January 2018 at 17:04, Clément Bera <bera.clement at gmail.com>
> >> > wrote:
> >> >> Does not seem to be related to prim 105.
> >> >>
> >
> >
> > I suspect that the problem is the same compactor bug I've been trying to
> > reproduce all week, and have just fixed.  Could you try and reproduce with a
> > VM built from the latest commit?
> 
> Happy to, but I'm out all day today, so it will be tomorrow or Monday.
> 
> Cheers,
> Alistair
> (on the run...)


I've tested this with 2 images and 3 VMs in all 6
combinations:

- "Old VM":   commit date: Wed Jan 10 23:39:30 2018 -0800, gcc 4.8.5
- "New VM":   commit date: Sat Jan 20 13:52:26 2018 +0100, gcc 4.8.5
- "GCC 5 VM": commit date: Sat Jan 20 13:52:26 2018 +0100, gcc 5.4.0
- Clean image: commit id: b28d466f 
- Work image:  commit id: eb0a6fb1

The GCC 5 VM is only there because I was playing with it.  The results may
be useful, or completely misleading. :-)

Each time I ran "5 timesRepeat: [ self test4 ]"
with the halts replaced with a count increment.
test4 is the method provided in Cyrille's original message.

Result summary:

- Old VM + Work image:  5, 5, 5, 0, 0
- Old VM + Clean image: 5, 5, 0, 0, 0
- New VM + Work image:  5, 0, 5, 5, 5
- New VM + Clean image: 0, 0, 1, 5, 5
- GCC 5 + Work image:   0, 0, 0, 0, 0
- GCC 5 + Clean image:  0, 0, 0, 0, 0



Old VM:
5.0-201801110739  Thursday 11 January  09:30:12 CET 2018 gcc 4.8.5 [Production Spur VM]
CoInterpreter VMMaker.oscog-eem.2302 uuid: 55ec8f63-cdbe-4e79-8f22-48fdea585b88 Jan 11 2018
StackToRegisterMappingCogit VMMaker.oscog-eem.2302 uuid: 55ec8f63-cdbe-4e79-8f22-48fdea585b88 Jan 11 2018
VM: 201801110739 alistair at alistair-xps13:snap/pharo-snap/pharo-vm/opensmalltalk-vm $ Date: Wed Jan 10 23:39:30 2018 -0800 $
Plugins: 201801110739 alistair at alistair-xps13:snap/pharo-snap/pharo-vm/opensmalltalk-vm $
Linux b07d7880072c 4.13.0-26-generic #29~16.04.2-Ubuntu SMP Tue Jan 9 22:00:44 UTC 2018 i686 i686 i686 GNU/Linux
plugin path: /snap/pharo7/x1/usr/bin/pharo-vm32/ [default: /snap/pharo7/x1/usr/bin/pharo-vm32/]



New VM:
5.0-201801201252  Saturday 20 January  21:24:16 CET 2018 gcc 4.8.5 [Production Spur VM]
CoInterpreter VMMaker.oscog-eem.2320 uuid: e2692e35-5fc8-4623-95d0-b445b3329f75 Jan 20 2018
StackToRegisterMappingCogit VMMaker.oscog-eem.2320 uuid: e2692e35-5fc8-4623-95d0-b445b3329f75 Jan 20 2018
VM: 201801201252 alistair at d62ce50f4930:snap/pharo-snap/pharo-vm/opensmalltalk-vm $ Date: Sat Jan 20 13:52:26 2018 +0100 $
Plugins: 201801201252 alistair at d62ce50f4930:snap/pharo-snap/pharo-vm/opensmalltalk-vm $
Linux 73cbbaa49451 4.13.0-26-generic #29~16.04.2-Ubuntu SMP Tue Jan 9 22:00:44 UTC 2018 i686 i686 i686 GNU/Linux
plugin path: /snap/pharo7/x1/usr/bin/pharo-vm32/ [default: /snap/pharo7/x1/usr/bin/pharo-vm32/]



GCC 5 VM:
5.0-201801201252  Sun Jan 21 14:41:41 UTC 2018 gcc 5.4.0 [Production Spur VM]
CoInterpreter VMMaker.oscog-eem.2320 uuid: e2692e35-5fc8-4623-95d0-b445b3329f75 Jan 21 2018
StackToRegisterMappingCogit VMMaker.oscog-eem.2320 uuid: e2692e35-5fc8-4623-95d0-b445b3329f75 Jan 21 2018
VM: 201801201252 alistair at alistair-xps13:squeak/opensmalltalk-vm $ Date: Sat Jan 20 13:52:26 2018 +0100 $
Plugins: 201801201252 alistair at alistair-xps13:squeak/opensmalltalk-vm $
Linux ec9d95d2105a 4.13.0-26-generic #29~16.04.2-Ubuntu SMP Tue Jan 9 22:00:44 UTC 2018 i686 i686 i686 GNU/Linux
plugin path: /home/alistair/squeak/opensmalltalk-vm/products/phcogspurlinuxht/lib/pharo/5.0-201801201252 [default: /home/alistair/squeak/opensmalltalk-vm/products/phcogspurlinuxht/lib/pharo/5.0-201801201252/]


HTH,
Alistair



> > Some details:
> > The SpurPlanningCompactor works by using the fact that all Spur objects have
> > room for a forwarding pointer.  The compactor makes three passes:
> >
> > - the first pass through memory works out where objects will go, replacing
> > their first fields with where they will go, and saving their first fields in
> > a buffer (savedFirstFieldsSpace).
> > - the second pass scans all pointer objects, replacing their fields with
> > where the objects referenced will go (following the forwarding pointers),
> > and also relocates any pointer fields in savedFirstFieldsSpace
> > - the final pass slides objects down, restoring their relocated first fields
> >
> > The buffer used for savedFirstFieldsSpace determines how many passes are
> > used.  The system will either use eden (which is empty when compaction
> > occurs) or a large free chunk or allocate a new segment, depending on
> > whatever yields the largest space.  So in the right circumstances eden will
> > be used and more than one pass will be required.
> >
> > The bug was that when multiple passes are used the compactor forgot to
> > unmark the corpse left behind when the object was moved.  Instead of the
> > corpse being changed into free space it was retained, but its first field
> > would be that of the forwarding pointer to its new location, not the actual
> > first field.  So on 32-bits a ByteArray that should have been collected
> > would have its first 4 bytes appear to be invalid, and on 64-bits its first
> > 8 bytes.  Because the heap on 64-bits can grow larger it could be that the
> > bug shows itself much less frequently than on 32-bits. When compaction can
> > be completed in a single pass all corpses are correctly collected, so most
> > of the time the bug is hidden.
> >
> > This is the commit:
> > commit 0fe1e1ea108e53501a0e728736048062c83a66ce
> > Author: Eliot Miranda <eliot.miranda at gmail.com>
> > Date:   Fri Jan 19 13:17:57 2018 -0800
> >
> >     CogVM source as per VMMaker.oscog-eem.2320
> >
> >     Spur:
> >     Fix a bad bug in SpurPlanningCompactor.
> >     unmarkObjectsFromFirstFreeObject, used when the compactor requires
> >     more than one pass due to insufficient savedFirstFieldsSpace, expects
> >     the corpse of a moved object to be unmarked, but
> >     copyAndUnmarkObject:to:bytes:firstField: only unmarked the target.
> >     Unmarking the corpse before the copy unmarks both.  This fixes a crash
> >     with ReleaseBuilder class>>saveAsNewRelease when non-use of
> >     cacheDuring: creates lots of files, enough to push the system into the
> >     multi-pass regime.
> >
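
The corpse problem in the quoted explanation above can be illustrated with a toy model. This is only a sketch under loud assumptions, not the VM's code. The heap is a Python list indexed by "address", every object occupies one slot, and objects hold only raw data words, so the pointer-relocating second pass is left out. The names Obj, move, reclaim_corpses and run are hypothetical. It models just three things taken from the explanation: the first field is overwritten with a forwarding pointer, the copy restores the saved first field in the target, and a corpse that stays marked is retained instead of becoming free space.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Obj:
    words: List[int]     # raw data; words[0] plays the role of the first field
    marked: bool = True  # assumption: True = still waiting to be moved


def move(heap: List[Optional[Obj]], src: int, dst: int, unmark_corpse: bool) -> None:
    """Slide the object at src down into the free slot dst (toy model)."""
    corpse = heap[src]
    # Planning: save the real first field, then overwrite it with the
    # forwarding pointer (here simply the target address).
    saved_first_field = corpse.words[0]
    corpse.words[0] = dst
    if unmark_corpse:
        # The fix as described: unmark the corpse before the copy.
        corpse.marked = False
    # Sliding: the copy carries the mark bit along and restores the first field.
    target = Obj([saved_first_field] + corpse.words[1:], marked=corpse.marked)
    # The target is always unmarked, with or without the fix.
    target.marked = False
    heap[dst] = target


def reclaim_corpses(heap: List[Optional[Obj]], start: int) -> None:
    """Between passes (toy version): unmarked objects at or above `start`
    become free space; marked ones are kept as if still awaiting a move."""
    for addr in range(start, len(heap)):
        obj = heap[addr]
        if obj is not None and not obj.marked:
            heap[addr] = None


def run(unmark_corpse: bool) -> List[Optional[Obj]]:
    # Address 0: an object already in place. Address 1: free.
    # Address 2: a marked two-word "byte array" that must slide down.
    heap = [Obj([0x1111], marked=False), None, Obj([0xCAFE, 0xBEEF])]
    move(heap, src=2, dst=1, unmark_corpse=unmark_corpse)
    reclaim_corpses(heap, start=2)
    return heap


if __name__ == "__main__":
    print("with fix:   ", run(True))
    print("without fix:", run(False))
```

In this model run(True) leaves address 2 free, while run(False) keeps a marked object there whose first word is the forwarding address 1 rather than 0xCAFE. That is the toy analogue of a ByteArray whose first 4 (or 8) bytes look invalid.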

