[Vm-dev] [Pharo-dev] Image crashing on startup, apparently during GC

tesonep at gmail.com tesonep at gmail.com
Sat Mar 31 13:12:49 UTC 2018


Hi all,
Esteban,
   one thing we can do is add a job to the Pharo CI that performs the
bootstrap and its tests using the latest VM.
This will of course not replace manual testing, but it gives us the result
of running a quite complex process on the latest VM.
Moreover, the most recent stability problems were detected when trying to
execute a complete bootstrap.

Today I will make a pull request (never to be integrated) that uses the
latest VM instead of the stable one.
Later we can migrate it to an independent job that is always there to
validate each new latest VM.
Moreover, if the VM is published using some kind of semantic versioning, we
can refer to a specific VM in the bootstrap process.
This will allow us to check new VMs with just a PR (taking advantage of all
the automated process that is now working) and also to reproduce a build
using the exact VM that was used to generate an image. This is a weak point
today: we are all always using Stable, not a defined version.
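
As a rough illustration of the pinning idea above, here is a small Python
sketch of how a bootstrap script might insist on an exact VM version instead
of a floating alias. The MAJOR.MINOR.PATCH scheme, the version strings and
the function names (parse_version, select_pinned_vm, newest_vm) are all
assumptions made up for the example; nothing here reflects how Pharo VMs are
actually named or published.

```python
import re
from typing import List, Tuple

# Hypothetical scheme: VM versions are published as MAJOR.MINOR.PATCH.
_VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)")


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple that compares numerically."""
    match = _VERSION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not an exact version: {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def select_pinned_vm(pinned: str, available: List[str]) -> str:
    """Return the pinned version, refusing aliases such as 'stable'."""
    parse_version(pinned)  # raises ValueError for 'stable', 'latest', ...
    if pinned not in available:
        raise LookupError(f"VM {pinned} is not published")
    return pinned


def newest_vm(available: List[str]) -> str:
    """Pick the highest published version, e.g. for a 'latest VM' CI job."""
    return max(available, key=parse_version)


published = ["1.2.0", "1.10.0", "1.9.3"]  # made-up version list
print(select_pinned_vm("1.9.3", published))
print(newest_vm(published))
```

The regular bootstrap would call something like select_pinned_vm with the
version recorded in the repository, so a build can be reproduced later, while
the separate validation job would use newest_vm. Comparing parsed tuples
rather than raw strings keeps 1.10.0 ordered after 1.9.3.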


Cheers,

On Sat, Mar 31, 2018 at 3:03 PM, Esteban Lorenzano <estebanlm at gmail.com>
wrote:

>
>
>
> On 30 Mar 2018, at 23:56, Nicolas Cellier <nicolas.cellier.aka.nice@gmail.com> wrote:
>
>
>
> 2018-03-24 11:11 GMT+01:00 Esteban Lorenzano <estebanlm at gmail.com>:
>
>>
>> hi,
>>
>> > On 24 Mar 2018, at 09:50, Cyril Ferlicot D. <cyril.ferlicot at gmail.com>
>> > wrote:
>> >
>> > On 23/03/2018 at 21:52, Eliot Miranda wrote:
>> >> Hi Damien,
>> >>
>> >> Indeed the image is corrupt at start-up.  See below.
>> >>
>> >>
>> >> Right.  This VM is prior to the bug fixes in VMMaker.oscog-eem.2320:
>> >>
>> >> Spur:
>> >> Fix a bad bug in SpurPlanningCompactor.
>> >> unmarkObjectsFromFirstFreeObject, used when the compactor requires more
>> >> than one pass due to insufficient savedFirstFieldsSpace, expects the
>> >> corpse of a moved object to be unmarked, but
>> >> copyAndUnmarkObject:to:bytes:firstField: only unmarked the target.
>> >> Unmarking the corpse before the copy unmarks both.  This fixes a crash
>> >> with ReleaseBuilder class>>saveAsNewRelease when non-use of cacheDuring:
>> >> creates lots of files, enough to push the system into the multi-pass
>> >> regime.
>> >>
>> >>
>> >> Pharo urgently needs to upgrade the VM to one more up to date than 2017
>> >> 08 27 (in fact more up-to-date than opensmalltalk/vm commit
>> >> 0fe1e1ea108e53501a0e728736048062c83a66ce, Fri Jan 19 13:17:57 2018
>> >> -0800).  The bug that VMMaker.oscog-eem.2320 fixes can result in image
>> >> corruption in large images, and can occur (as it has here) at start-up,
>> >> causing one's work to be irretrievably lost.
>> >>
>> >
>> > Hi Eliot,
>> >
>> > I think that there are a lot of people who would like to get a newer
>> > stable VM for Pharo 6.1 and 7. The problem is that it is hard to know
>> > which VMs are stable enough to be promoted as stable.
>> >
>> > Some weeks ago Esteban tried to promote a VM as stable and he had to
>> > revert it the same day because a regression occurred in the VM.
>> >
>> > If you're able to tell us which of the VMs available at
>> > http://files.pharo.org/vm/pharo-spur32/ and
>> > http://files.pharo.org/vm/pharo-spur64/ are stable, it would be a great help.
>> >
>> > Even better would be for the Pharo community to have a way to know which
>> > VMs are stable without having to ask you.
>>
>> there is no “stable” branch in Cog, and that’s a problem.
>> “released” versions (the version you can find as stable) are not working
>> for Pharo :(
>>
>> I tried to promote versions from end feb and that crashed.
>>
>> next week I will try again, maybe now they are stable enough… one thing
>> is true: the versions that we consider stable (from oct/17) present
>> problems that are already solved on latest.
>>
>> Esteban
>>
>>
>> >
>> > Have a nice day.
>> >
>> >>
>> >> --
>> >> _,,,^..^,,,_
>> >> best, Eliot
>> > --
>> > Cyril Ferlicot
>> > https://ferlicot.fr
>> >
>>
>>
> Hi,
> Several problems are mixed here, let's try and decouple:
> - 1) there is ongoing development in the core of the VM that may introduce
> some instability
> - 2) there is ongoing development in some plugins as well
> - 3) there are infrastructure problems that prevent artifacts from being
> produced, whatever the intrinsic stability of the VM
>
>
> you are right in all points, but for me this is a problem of process.
>
> - we have no defined milestones, so nobody knows if they can jump in to help.
> - plugin development happens “on its own” and nobody knows what happens,
> why it happens and how it happens.
> - infrastructure is not bad and a lot of effort has been made to make it
> work. But code sources are scattered around the world and the only thing
> that reunites them is the hand of the one who generates the C sources.
>
> IMHO, it is this “disconnection” that causes most of the problems.
>
> cheers,
> Esteban
>
>
> For 1) development happens in VMMaker, and we have to rely on
> experts. Today that is Eliot and Clement.
> We all want a 64-bit VM, improved GC, improved become:, write barrier,
> ephemerons, threaded FFI calls and adaptive optimization.
> Pharo relies on this progress; it is vital.
> IMO, we are reaching a good level of confidence, and I hope to see some
> VMMaker version blessed as stable pretty soon.
>
> Instead of whining, the best we can do to reach this state is to help
> them by providing accurate bug reports and, even better, reproducible cases.
> Thanks to all who are working in this direction.
>
> For 2) we had a few problems, but again this is for improving important
> features (SSL...)
> Much of the development happens in feature branches already.
> But since we are targeting so many platforms, and don't have automated
> tests that scale yet, we still need beta testers.
> We can discuss the introduction of such beta features with respect to
> release cycles; that will be a good thing.
> Ideally we should tend toward continuous integration and have very short
> cycles, but we're not yet there.
>
> For 3) we had a lot of problems, like stale links, invalid credentials,
> changes in the versions of tools at the automated build site, etc...
>
> If we don't build the artifacts, then we can't even have a chance to test
> the stability of 1) and 2)
> We have to understand that 3) is absolutely vital.
>
> May I remind you that for a very long period last year, the builds were
> broken due to lack of work on the Pharo side.
> Fortunately, this has changed in 2018.
> Fabio has been working REALLY hard to improve 3), and without the help of
> Esteban, I don't think he could have reached the holy green build status.
> We will never thank them enough for that. This also shows that cooperation
> may pay.
>
> But this is still very fragile.
> If we want to make progress, we should ask why it is so.
> We could analyze the regressions, and decide if the complexity is
> sustainable, or possibly drop some drag.
> We are chasing many hares by building the VM for Newspeak/Pharo/Squeak
> i386/x86_64/ARM Spur/Stack/V3 Sista/lowcode linux/Macosx/Windows ...
> If it happens that a fix vital for Pharo/Squeak does break Newspeak tests,
> then it slows down the progress...
> Maybe we would want to decouple a bit more the problems there too (they
> may come from some image side weakness).
>
> Over the last two years I've also observed some work done exclusively in
> the Pharo fork of the opensmalltalk VM.
> This was counterproductive. Work must be produced upstream, or it's
> wasted.
>
>
> This happened just once or twice. And it was because people were unaware
> of the “joint” repository, so they continued contributing as before (and
> people were pointed to the right place when we had the opportunity).
> And I disagree that this was counterproductive, because I took the effort
> to merge the changes into osvm. This worked fine until I stopped doing that
> job, but well… just one PR got stalled there for months and Alistair
> integrated it recently.
>
> What *did happen*, and I’m still not ready to let it go, is that a lot of
> the small changes we presented were rejected (or ignored) without further
> consideration. But well, let’s keep it positive and not enter into sterile
> discussions; I just think you are wrong with this argument.
>
> cheers,
> Esteban
>
> I once thought that the Pharo fork could be the place for the pharo team
> to manage official stable versions.
> But I agree that this is too much duplicated work and would be very happy
> to see the work happen upstream too.
>
> If you have constructive ideas that will help decouple all these
> problems, we are all ears.
>
> PS: I had held back this answer to avoid a sterile discussion, but since
> Phil asked...
>
>
>
>


-- 
Pablo Tesone.
tesonep at gmail.com