[Vm-dev] [Pharo-dev] Image crashing on startup, apparently during GC

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Sat Mar 31 15:34:25 UTC 2018


2018-03-31 15:03 GMT+02:00 Esteban Lorenzano <estebanlm at gmail.com>:

>
>
>
> On 30 Mar 2018, at 23:56, Nicolas Cellier <nicolas.cellier.aka.nice@gmail.com> wrote:
>
>
>
> 2018-03-24 11:11 GMT+01:00 Esteban Lorenzano <estebanlm at gmail.com>:
>
>>
>> hi,
>>
>> > On 24 Mar 2018, at 09:50, Cyril Ferlicot D. <cyril.ferlicot at gmail.com> wrote:
>> >
>> > Le 23/03/2018 à 21:52, Eliot Miranda a écrit :
>> >> Hi Damien,
>> >>
>> >> Indeed the image is corrupt at start-up.  See below.
>> >>
>> >>
>> >> Right.  This VM is prior to the bug fixes in VMMaker.oscog-eem.2320:
>> >>
>> >> Spur:
>> >> Fix a bad bug in SpurPlanningCompactor.
>> >>  unmarkObjectsFromFirstFreeObject, used when the compactor requires more
>> >> than one pass due to insufficient savedFirstFieldsSpace, expects the
>> >> corpse of a moved object to be unmarked, but
>> >> copyAndUnmarkObject:to:bytes:firstField: only unmarked the target.
>> >> Unmarking the corpse before the copy unmarks both.  This fixes a crash
>> >> with ReleaseBuilder class>>saveAsNewRelease when non-use of cacheDuring:
>> >> creates lots of files, enough to push the system into the multi-pass
>> >> regime.
>> >>
>> >>
>> >> Pharo urgently needs to upgrade the VM to one more up to date than 2017
>> >> 08 27 (in fact more up-to-date than opensmalltalk/vm commit
>> >> 0fe1e1ea108e53501a0e728736048062c83a66ce, Fri Jan 19 13:17:57 2018
>> >> -0800).  The bug that VMMaker.oscog-eem.2320 fixes can result in image
>> >> corruption in large images, and can occur (as it has here) at start-up,
>> >> causing one's work to be irretrievably lost.
>> >>
>> >
>> > Hi Eliot,
>> >
>> > I think that there are a lot of people who would like to get a newer
>> > stable VM for Pharo 6.1 and 7. The problem is that it is hard to know
>> > which VMs are stable enough to be promoted as stable.
>> >
>> > Some weeks ago Esteban tried to promote a VM as stable and he had to
>> > revert it the same day because a regression occurred in the VM.
>> >
>> > If you're able to tell us which of the VMs present at
>> > http://files.pharo.org/vm/pharo-spur32/ and
>> > http://files.pharo.org/vm/pharo-spur64/ are stable, it would be a great help.
>> >
>> > Even better would be for the pharo community to have a way to know which
>> > vms are stable or not without having to ask you.
>>
>> there is no “stable” branch in Cog, and that’s a problem.
>> “released” versions (the version you can find as stable) are not working
>> for Pharo :(
>>
>> I tried to promote versions from end feb and that crashed.
>>
>> next week I will try again, maybe now they are stable enough… one thing
>> is true: the versions that we consider stable (from oct/17) present
>> problems that are already solved on latest.
>>
>> Esteban
>>
>>
>> >
>> > Have a nice day.
>> >
>> >>
>> >> --
>> >> _,,,^..^,,,_
>> >> best, Eliot
>> > --
>> > Cyril Ferlicot
>> > https://ferlicot.fr
>> >
>>
>>
> Hi,
> Several problems are mixed here, let's try and decouple:
> - 1) there is ongoing development in the core of the VM that may introduce
> some instability
> - 2) there is ongoing development in some plugins too
> - 3) there are infrastructure problems that prevent us from producing
> artifacts, whatever the intrinsic stability of the VM
>
>
> you are right on all points, but for me this is a problem of process.
>
> - we have no defined milestones so nobody knows if they can jump to help.
> - plugin development happens “on its own” and nobody knows what happens,
> why it happens and how it happens.
> - infrastructure is not bad and a lot of effort has been made to make it
> work. But code sources are scattered around the world and the only thing
> that reunites them is the hand of the one who generates the C sources.
>
> IMHO, it is this “disconnection” that causes most of the problems.
>
> cheers,
> Esteban
>
>
Hi Esteban,
I see nothing inevitable here, and GitHub also provides tools for that.
Look, there is the project page on GitHub:
https://github.com/OpenSmalltalk/opensmalltalk-vm/projects/1

Maybe the Pharo team is willing to collaborate and take an active part in
the definition of milestones?


>
> For 1) development happens in VMMaker, and we have to rely on
> experts. Today these are Eliot and Clement.
> We all want a 64-bit VM, improved GC, improved become:, write barrier,
> ephemerons, threaded FFI calls and adaptive optimization.
> Pharo relies on this progress; it is vital.
> IMO, we are reaching a good level of confidence, and I hope to see some
> VMMaker version blessed as stable pretty soon.
>
> Instead of whining, the best we can do for reaching this state is help
> them by providing accurate bug reports and even better reproducible cases.
> Thanks to all who are working in this direction.
>
> For 2) we had a few problems, but again this is for improving important
> features (SSL...)
> Much of the development happens in feature branches already.
> But since we are targeting so many platforms, and don't have automated
> tests that scale yet, we still need beta testers.
> We can discuss the introduction of such beta features wrt release
> cycles, that will be a good thing.
> Ideally we should tend toward continuous integration and have very short
> cycles, but we're not yet there.
>
> For 3) we had a lot of problems, like stale links, invalid credentials,
> evolution of the version of tools at the automated build site, etc...
>
> If we don't build the artifacts, then we can't even have a chance to test
> the stability of 1) and 2)
> We have to understand that 3) is absolutely vital.
>
> May I remind you that for a very long period last year, the builds were
> broken due to lack of work on the Pharo side.
> Fortunately, this has changed in 2018.
> Fabio has been working REALLY hard to improve 3), and without the help of
> Esteban, I don't think he could have reached the holy green build status.
> We will never thank them enough for that. This also shows that cooperation
> may pay.
>
> But this is still very fragile.
> If we want to make progress, we should ask why it is so.
> We could analyze the regressions, and decide if the complexity is
> sustainable, or possibly drop some drag.
> We are chasing many hares by building the VM for Newspeak/Pharo/Squeak
> i386/x86_64/ARM Spur/Stack/V3 Sista/lowcode linux/Macosx/Windows ...
> If it happens that a fix vital for Pharo/Squeak does break Newspeak tests,
> then it slows down the progress...
> Maybe we would want to decouple a bit more the problems there too (they
> may come from some image side weakness).
>
> Over the last two years I've also observed some work done exclusively in
> the Pharo fork of the opensmalltalk VM.
> This was counterproductive. Work must be produced upstream, or it's
> wasted.
>
>
> This happened just once or twice. And it was because people were ignorant
> of “joint” so they continued contributing as before (and people were
> pointed to the right place when we had the opportunity).
> And I disagree this was counterproductive because I took the effort to
> merge the changes into osvm. This worked fine until I stopped doing that
> job, but well… just one PR got stalled there for months and Alistair
> integrated it recently.
>
> What *did happen*, and I’m still not ready to let it go, is that a lot of
> the small changes that we presented were rejected (or ignored) without
> further consideration. But well, let’s keep it positive and not enter into
> sterile discussions, I just think you are wrong with this argument.
>
No, it's important: we (the opensmalltalk-vm team) can't let such bad feeling
and frustration creep in.
Every contribution counts. That does not mean that every PR will be
accepted, but we owe an explanation if not.
Some are accepted instantly, some are accepted after modification requests,
some are rejected (I hope with some rationale).
What is problematic is that some were ignored for too long. I regret
the situation, but there is no deliberate intention to ignore them, just
lack of manpower IMO.
For example, the recent work of Alistair shows that nothing is inevitable
here, it's just that someone has to do the hard work (kudos!).

Also, the question is coupled with stability: having a red status does not
help. For almost every PR that I accepted, I had to dig into the Travis
console reports and compare them to the status of the previous build in order
to know if it was a regression, or just a long-standing failure... This does
not scale!
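
To make the manual comparison above concrete, here is a minimal sketch of what an automated version could look like. It is only an illustration under assumptions: the per-job pass/fail results are supposed to have been extracted already (the dict format and the job names are hypothetical, and nothing here calls the Travis API).

```python
def classify_failures(previous, current):
    """Compare two builds and sort the current failures by kind.

    previous, current: dicts mapping a job name to True (passed)
    or False (failed). This input format is an assumption made
    for the sketch, not something the CI service provides as-is.
    """
    prev_failed = {job for job, ok in previous.items() if not ok}
    curr_failed = {job for job, ok in current.items() if not ok}
    return {
        # Failing now, but not failing before (this includes jobs
        # that did not exist in the previous build).
        "regressions": sorted(curr_failed - prev_failed),
        # Failing in both builds: not caused by the change under review.
        "long_standing": sorted(curr_failed & prev_failed),
        # Failing before, no longer failing now.
        "fixed": sorted(prev_failed - curr_failed),
    }


if __name__ == "__main__":
    # Hypothetical job names, for illustration only.
    before = {"squeak-linux64": True, "newspeak-macos": False}
    after = {"squeak-linux64": False, "newspeak-macos": False}
    print(classify_failures(before, after))
```

With such a split, a reviewer would only need to look at the "regressions" list of a PR build instead of reading the whole console report.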

I'm all for more distributed power, and that should come with
responsibilities, first of all a cooperative "you break it, you fix it"
attitude.

Or maybe you want a clarified decision process?
For now, people who are interested in a PR raise their voice.
I don't know if we need something more formal.
For important design decisions there is the vm-dev mailing list to discuss
them.

cheers
Nicolas


> cheers,
> Esteban
>
> I once thought that the Pharo fork could be the place for the pharo team
> to manage official stable versions.
> But I agree that this is too much duplicated work and would be very happy
> to see the work happen upstream too.
>
> If you have constructive ideas that will help decoupling all these
> problems, we are all ears.
>
> PS: I had not posted this answer, to avoid sterile discussion, but since
> Phil asked...
>
>
>
>