[Vm-dev] [Pharo-dev] Random corrupted data when copying from very large byte array

Eliot Miranda eliot.miranda at gmail.com
Tue Jan 23 00:47:39 UTC 2018


Hi Alistair,

On Mon, Jan 22, 2018 at 1:42 AM, Alistair Grant <akgrant0710 at gmail.com>
wrote:

>
> Hi Eliot,
>
> On Sat, Jan 20, 2018 at 09:19:04AM +0100, Alistair Grant wrote:
> > Hi Eliot,
> >
> > On 19 January 2018 at 23:04, Eliot Miranda <eliot.miranda at gmail.com> wrote:
> > > Hi Alistair, Hi Clément,
> > >
> > > On Fri, Jan 19, 2018 at 12:53 PM, Alistair Grant <akgrant0710 at gmail.com> wrote:
> > >>
> > >> Hi Clément,
> > >>
> > >> On 19 January 2018 at 17:21, Alistair Grant <akgrant0710 at gmail.com> wrote:
> > >> > Hi Clément,
> > >> >
> > >> > On 19 January 2018 at 17:04, Clément Bera <bera.clement at gmail.com> wrote:
> > >> >> Does not seem to be related to prim 105.
> > >> >>
> > >
> > >
> > > I suspect that the problem is the same compactor bug I've been trying to
> > > reproduce all week, and have just fixed.  Could you try and reproduce with a
> > > VM built from the latest commit?
> >
> > Happy to, but I'm out all day today, so it will be tomorrow or Monday.
> >
> > Cheers,
> > Alistair
> > (on the run...)
>
>
> I've tested this with 2 images and 3 VMs in all 6
> combinations:
>
> - "Old VM":   commit date: Wed Jan 10 23:39:30 2018 -0800, gcc 4.8.5
> - "New VM":   commit date: Sat Jan 20 13:52:26 2018 +0100, gcc 4.8.5
> - "GCC 5 VM": commit date: Sat Jan 20 13:52:26 2018 +0100, gcc 5.4.0
> - Clean image: commit id: b28d466f
> - Work image:  commit id: eb0a6fb1
>
> The gcc 5 is only there because I was playing with it.  The results may
> be useful, or completely misleading. :-)
>
> Each time I ran "5 timesRepeat: [ self test4 ]"
> with the halts replaced with a count increment.
> test4 is the method provided in Cyrille's original message.
>
> Result summary:
>
> - Old VM + Work image:  5, 5, 5, 0, 0
> - Old VM + Clean image: 5, 5, 0, 0, 0
> - New VM + Work image:  5, 0, 5, 5, 5
> - New VM + Clean image: 0, 0, 1, 5, 5
> - GCC 5 + Work image:   0, 0, 0, 0, 0
> - GCC 5 + Clean image:  0, 0, 0, 0, 0
>

This is strong evidence that the issue is a compiler bug in gcc 4.8.x.
If exactly the same VM source works with gcc 5 but not with 4.8.x, then
there is a small chance that the VM relies on undefined behavior, but I
doubt it.
Assuming it is a gcc bug, then
- it should be documented in the HowToBuild files for the relevant platforms
- CI builds should start using gcc 5 and dispense with gcc 4.8.x
- since the problem is fixed with gcc 5, there seems little point in trying to
identify which version of gcc introduced the problem and communicating the
problem to the gcc maintainers

What's the status of the bug on Windows and Mac OS X?


> Old VM:
> 5.0-201801110739  Thursday 11 January  09:30:12 CET 2018 gcc 4.8.5
> [Production Spur VM]
> CoInterpreter VMMaker.oscog-eem.2302 uuid: 55ec8f63-cdbe-4e79-8f22-48fdea585b88
> Jan 11 2018
> StackToRegisterMappingCogit VMMaker.oscog-eem.2302 uuid:
> 55ec8f63-cdbe-4e79-8f22-48fdea585b88 Jan 11 2018
> VM: 201801110739 alistair at alistair-xps13:snap/pharo-snap/pharo-vm/opensmalltalk-vm
> $ Date: Wed Jan 10 23:39:30 2018 -0800 $
> Plugins: 201801110739 alistair at alistair-xps13:snap/pharo-snap/pharo-vm/opensmalltalk-vm
> $
> Linux b07d7880072c 4.13.0-26-generic #29~16.04.2-Ubuntu SMP Tue Jan 9
> 22:00:44 UTC 2018 i686 i686 i686 GNU/Linux
> plugin path: /snap/pharo7/x1/usr/bin/pharo-vm32/ [default:
> /snap/pharo7/x1/usr/bin/pharo-vm32/]
>
>
>
> New VM:
> 5.0-201801201252  Saturday 20 January  21:24:16 CET 2018 gcc 4.8.5
> [Production Spur VM]
> CoInterpreter VMMaker.oscog-eem.2320 uuid: e2692e35-5fc8-4623-95d0-b445b3329f75
> Jan 20 2018
> StackToRegisterMappingCogit VMMaker.oscog-eem.2320 uuid:
> e2692e35-5fc8-4623-95d0-b445b3329f75 Jan 20 2018
> VM: 201801201252 alistair at d62ce50f4930:snap/pharo-snap/pharo-vm/opensmalltalk-vm
> $ Date: Sat Jan 20 13:52:26 2018 +0100 $
> Plugins: 201801201252 alistair at d62ce50f4930:snap/pharo-snap/pharo-vm/opensmalltalk-vm
> $
> Linux 73cbbaa49451 4.13.0-26-generic #29~16.04.2-Ubuntu SMP Tue Jan 9
> 22:00:44 UTC 2018 i686 i686 i686 GNU/Linux
> plugin path: /snap/pharo7/x1/usr/bin/pharo-vm32/ [default:
> /snap/pharo7/x1/usr/bin/pharo-vm32/]
>
>
>
> GCC 5 VM:
> 5.0-201801201252  Sun Jan 21 14:41:41 UTC 2018 gcc 5.4.0 [Production Spur
> VM]
> CoInterpreter VMMaker.oscog-eem.2320 uuid: e2692e35-5fc8-4623-95d0-b445b3329f75
> Jan 21 2018
> StackToRegisterMappingCogit VMMaker.oscog-eem.2320 uuid:
> e2692e35-5fc8-4623-95d0-b445b3329f75 Jan 21 2018
> VM: 201801201252 alistair at alistair-xps13:squeak/opensmalltalk-vm $ Date:
> Sat Jan 20 13:52:26 2018 +0100 $
> Plugins: 201801201252 alistair at alistair-xps13:squeak/opensmalltalk-vm $
> Linux ec9d95d2105a 4.13.0-26-generic #29~16.04.2-Ubuntu SMP Tue Jan 9
> 22:00:44 UTC 2018 i686 i686 i686 GNU/Linux
> plugin path: /home/alistair/squeak/opensmalltalk-vm/products/
> phcogspurlinuxht/lib/pharo/5.0-201801201252 [default:
> /home/alistair/squeak/opensmalltalk-vm/products/
> phcogspurlinuxht/lib/pharo/5.0-201801201252/]
>
>
> HTH,
> Alistair
>
>
>
> > > Some details:
> > > The SpurPlanningCompactor works by using the fact that all Spur objects have
> > > room for a forwarding pointer.  The compactor makes three passes:
> > >
> > > - the first pass through memory works out where objects will go, replacing
> > > their first fields with where they will go, and saving their first fields in
> > > a buffer (savedFirstFieldsSpace).
> > > - the second pass scans all pointer objects, replacing their fields with
> > > where the objects referenced will go (following the forwarding pointers),
> > > and also relocates any pointer fields in savedFirstFieldsSpace
> > > - the final pass slides objects down, restoring their relocated first fields
> > >
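The three passes described above can be sketched as a toy model in Python. This is only an illustration under stated assumptions, not the VMMaker source: the heap is a plain list, a "pointer" is a list index, live objects are assumed to reference only live objects, and the names Obj, compact and saved_first_fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    name: str
    marked: bool   # True if the object survived marking (is live)
    fields: list   # each field is a heap index ("pointer") or None


def compact(heap):
    """Toy three-pass planning compaction; returns the compacted heap."""
    saved_first_fields = []  # stands in for savedFirstFieldsSpace

    # Pass 1: plan destinations. Save each live object's first field and
    # overwrite it with the index the object will move to (forwarding pointer).
    dest = 0
    for obj in heap:
        if obj.marked:
            saved_first_fields.append(obj.fields[0])
            obj.fields[0] = dest
            dest += 1

    def forward(ptr):
        # Follow the forwarding pointer stored in the referent's first field.
        return None if ptr is None else heap[ptr].fields[0]

    # Pass 2: update the remaining pointer fields, and relocate the
    # pointers that were saved away in the buffer.
    for obj in heap:
        if obj.marked:
            obj.fields[1:] = [forward(p) for p in obj.fields[1:]]
    saved_first_fields = [forward(p) for p in saved_first_fields]

    # Pass 3: slide live objects down, restoring their relocated first fields.
    new_heap = []
    for obj in heap:
        if obj.marked:
            obj.fields[0] = saved_first_fields[len(new_heap)]
            obj.marked = False
            new_heap.append(obj)
    return new_heap
```

In this sketch the buffer always has room for every first field, so a single round of the three passes is enough; the multi-pass case discussed next arises when it does not.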
> > > The buffer used for savedFirstFieldsSpace determines how many passes are
> > > used.  The system will either use eden (which is empty when compaction
> > > occurs) or a large free chunk or allocate a new segment, depending on
> > > whatever yields the largest space.  So in the right circumstances eden will
> > > be used and more than one pass required.
> > >
> > > The bug was that when multiple passes are used the compactor forgot to
> > > unmark the corpse left behind when the object was moved.  Instead of the
> > > corpse being changed into free space it was retained, but its first field
> > > would be the forwarding pointer to its new location, not the actual
> > > first field.  So on 32-bits a ByteArray that should have been collected
> > > would have its first 4 bytes appear to be invalid, and on 64-bits its first
> > > 8 bytes.  Because the heap on 64-bits can grow larger it could be that the
> > > bug shows itself much less frequently than on 32-bits. When compaction can
> > > be completed in a single pass all corpses are correctly collected, so most
> > > of the time the bug is hidden.
> > >
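The effect of the missing unmark can be illustrated with a small Python model. It is a sketch under assumptions, not the real copyAndUnmarkObject:to:bytes:firstField: code: the heap is a dict keyed by made-up addresses, the first field is 4 bytes as on a 32-bit VM, and copy_and_unmark, demo and the fixed flag are hypothetical names.

```python
import struct


def copy_and_unmark(heap, src, dst, first_field, fixed):
    """Move heap[src] to heap[dst], restoring the saved first 4 bytes."""
    corpse = heap[src]
    if fixed:
        # The fix: unmark the corpse before copying, so both end up unmarked.
        corpse["marked"] = False
    target = {"marked": corpse["marked"], "body": bytearray(corpse["body"])}
    target["body"][:4] = first_field   # only the copy gets its real first field
    target["marked"] = False           # the target was always unmarked
    heap[dst] = target


def demo(fixed):
    real_first_field = b"DATA"
    heap = {8: {"marked": True, "body": bytearray(real_first_field + b"rest")}}
    # The planning pass has overwritten the first field with the
    # forwarding address (0 in this toy heap).
    heap[8]["body"][:4] = struct.pack("<I", 0)
    copy_and_unmark(heap, 8, 0, real_first_field, fixed)
    return heap
```

With fixed=False the corpse at address 8 stays marked, so a later pass would treat it as a live object whose first 4 bytes are the forwarding address rather than data. With fixed=True the corpse is unmarked and can be turned into free space. In both cases the moved copy at address 0 has its real first field back.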
> > > This is the commit:
> > > commit 0fe1e1ea108e53501a0e728736048062c83a66ce
> > > Author: Eliot Miranda <eliot.miranda at gmail.com>
> > > Date:   Fri Jan 19 13:17:57 2018 -0800
> > >
> > >     CogVM source as per VMMaker.oscog-eem.2320
> > >
> > >     Spur:
> > >     Fix a bad bug in SpurPlanningCompactor.  unmarkObjectsFromFirstFreeObject,
> > >     used when the compactor requires more than one pass due to insufficient
> > >     savedFirstFieldsSpace, expects the corpse of a moved object to be unmarked,
> > >     but copyAndUnmarkObject:to:bytes:firstField: only unmarked the target.
> > >     Unmarking the corpse before the copy unmarks both.  This fixes a crash with
> > >     ReleaseBuilder class>>saveAsNewRelease when non-use of cacheDuring: creates
> > >     lots of files, enough to push the system into the multi-pass regime.
> > >
>



-- 
_,,,^..^,,,_
best, Eliot