[squeak-dev] error when updating Squeak4.4-12327 to trunk

Nicolas Cellier nicolas.cellier.aka.nice at gmail.com
Mon Mar 11 23:26:24 UTC 2013


2013/3/12 Eliot Miranda <eliot.miranda at gmail.com>:
> On Sun, Mar 10, 2013 at 12:10 PM, Nicolas Cellier
> <nicolas.cellier.aka.nice at gmail.com> wrote:
>> OK, see the VM thread, I now think that the problem does not come from
>> COG, but from ClassBuilder which in some cases fails to clean a cache
>> (primitive 116).
>> The problem does not show up in the interpreter VM thanks to primitive 119
>> (this primitive does not unlink sends in the Cogit).
>
> it does unlink sends, but only for that selector.  But is it really
> the case that it is a missing cache flush or is it a bug in Cog with
> its cache flushing?  I realised the way to test this is to try the
> Stack VM and see if it crashes or not.  I just tried that but now
> neither Cog nor the Stack VM crash although both fail the load with an
> MNU of #do: to UndefinedObject in Environment>>bindingOf:ifAbsent:.
> So how do I get the system back to a state where I can reproduce the
> Cog crash to compare the Stack and Cog VMs with each other?
>
> (Apologies for being unresponsive; I've just moved into a new
> apartment and only got my internet connection yesterday afternoon; at
> least it's fast (for the states) :) ).
>
>

Well, primitive 119 does indeed seem to clean the cache.
I was confused because there are two implementations of primitive 119:

primitiveFlushCacheSelective (Interpreter)
primitiveFlushCacheBySelector (StackInterpreter)

It's really a drag to carry all that dead code around when you want to
analyze things quickly :(

So my first correction (avoid using MethodDictionary new in
ClassBuilder) was probably useless.

What happens is that while the new Parser methods are being recompiled,
the old Parser compiled methods are still in use, and thus get re-added
to the cache.
So my second attempt (flushing the cache again just before the mutation
in ClassBuilder) did the trick.
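
To make the sequence concrete, here is a toy model in Python. It is
only a sketch of the idea, not VM or ClassBuilder code: ToyVM, make_vm,
rebuild_parser and the flush_before_mutation flag are all made up for
illustration, and methods are represented by plain strings.

```python
class ToyVM:
    """Toy model: per-class method dictionaries plus a global lookup cache."""

    def __init__(self):
        self.method_dicts = {}  # class name -> {selector: method}
        self.cache = {}         # (class name, selector) -> method

    def send(self, cls, selector):
        # A cache hit short-circuits the method dictionary lookup.
        key = (cls, selector)
        if key not in self.cache:
            self.cache[key] = self.method_dicts[cls][selector]
        return self.cache[key]

    def flush_cache(self):
        self.cache.clear()


def make_vm():
    vm = ToyVM()
    vm.method_dicts["Parser"] = {"parse": "old parse", "scan": "old scan"}
    return vm


def rebuild_parser(vm, flush_before_mutation):
    """Recompile every Parser method, then swap in the new method dictionary."""
    vm.flush_cache()  # early flush, before recompilation starts
    new_methods = {}
    for selector in list(vm.method_dicts["Parser"]):
        # Compiling uses the *old* Parser, which re-adds its methods to the cache.
        vm.send("Parser", "parse")
        new_methods[selector] = "new " + selector
    if flush_before_mutation:
        vm.flush_cache()  # second flush, just before the mutation
    vm.method_dicts["Parser"] = new_methods  # the mutation
    return vm.send("Parser", "parse")
```

In this model, rebuild_parser(make_vm(), False) still answers the old
method after the mutation, because the cache entry created during
recompilation survives; with True the send goes through the new method
dictionary.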

As for going back in the update process: taking an updated trunk image,
browsing the update configurations, and loading them in reverse order
seems to work.
Or the other way around: from an older 4.4 image, apply all updates up
to nice.221; I think Bert posted a script to automate that.

Nicolas

>> I have attempted a ClassBuilder fix and posted new updates from
>> nice-222 to cwp-227.
>>
>> Can I please ask for our testers' contribution once again?
>>
>> Nicolas
>>
>> 2013/3/8 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
>>> 2013/3/8 Bert Freudenberg <bert at freudenbergs.de>:
>>>>
>>>> On 2013-03-08, at 10:55, Frank Shearar <frank.shearar at gmail.com> wrote:
>>>>
>>>>> On 7 March 2013 23:25, Frank Shearar <frank.shearar at gmail.com> wrote:
>>>>>> On 7 March 2013 23:11, Bert Freudenberg <bert at freudenbergs.de> wrote:
>>>>>>> On 2013-03-07, at 23:42, Frank Shearar <frank.shearar at gmail.com> wrote:
>>>>>>>
>>>>>>>>>> On 6 March 2013 15:59, Ken G. Brown <kbrown at mac.com> wrote:
>>>>>>>>>>> Running on COG 2397, and after updating fresh Squeak4.4-12327 Release to
>>>>>>>>>>> 12332, updating to Trunk  fails at first attempt in the same place, then by
>>>>>>>>>>> abandoning and trying the update again, it apparently completes to 12511.
>>>>>>>>>>>
>>>>>>>>>>>  Ken G. Brown
>>>>>>>>>>>
>>>>>>>>>>>> With COG 2678, pretty well the same. First attempt it timed out during
>>>>>>>>>>>> the same update-nice-223, then trying again from what had already been
>>>>>>>>>>>> loaded, got the following during the same update, during compiling
>>>>>>>>>>>> SMLoader-fbs-78 as before:
>>>>>>>>
>>>>>>>> What I find strange about all this is that we take a 4.4-12327 image
>>>>>>>> and whatever the latest Cog is and update it all the way without any
>>>>>>>> problems quite a few times a day on the CI server.
>>>>>>>>
>>>>>>>> frank
>>>>>>>
>>>>>>> Looks like it's an intermittent problem, unfortunately:
>>>>>>>
>>>>>>> I just updated the new all-in-one-cog to latest trunk, no problem. This is a 4.4-12327 image with Cog VM 2697.
>>>>>>>
>>>>>>> I then tried what Ken described: update the fresh image first from the squeak44 stream, then switch to trunk, then update again.
>>>>>>>
>>>>>>> BOOM. Cog crash. Didn't save the log unfortunately.
>>>>>>>
>>>>>>> Tried again. Update, switch to trunk, update again. No crash. What?!
>>>>>>>
>>>>>>> Once more. Update, switch to trunk, update. Crash! See below.
>>>>>>>
>>>>>>> Tried yet again, with switching to trunk immediately in a fresh image. Crashes, too, same place.
>>>>>>>
>>>>>>> So it does crash, just not always. But it's been more than 50% in my case.
>>>>>>
>>>>>> Ah, interesting. The CI jobs, naturally, don't update from squeak44;
>>>>>> they switch to trunk and update just like that. Which I would have
>>>>>> thought would make no difference...
>>>>>
>>>>> Actually, I lie. Here's an example of the CI jobs hitting the same
>>>>> issue: http://build.squeak.org/job/SqueakTrunk/204/console And further
>>>>> if you look at http://build.squeak.org/job/SqueakTrunk/ and choose to
>>>>> see the failing tests you'll see times (say around build #184) where
>>>>> the test failure count is unusually low. And
>>>>> http://build.squeak.org/job/SqueakTrunk/buildTimeTrend shows grey
>>>>> streaks where builds die.
>>>>
>>>> Curious that it still runs the tests at all if the update failed ...
>>>>
>>>> So Cog crashes, but has someone tried to replicate this on an interpreter?
>>>>
>>>> - Bert -
>>>>
>>>
>>> I think that the problem comes from COG which tries to use an obsolete
>>> method sent AFTER the recompilation of Parser which is not the
>>> expected behavior.
>>> I have triggered such kind of strange behavior that does not happen on
>>> an Interpreter VM, see the thread opened by Jeff Gonis '[Vm-dev] Cog
>>> VM Crash on Windows'
>>> For me, it must be related to a cache that is not cleaned-up, I don't know why.
>>>
>>> Nicolas
>>
>
>
>
> --
> best,
> Eliot
>

