[Vm-dev] VM crash with message 'could not grow remembered set'

Clément Bera bera.clement at gmail.com
Tue Oct 24 06:31:06 UTC 2017


On Mon, Oct 23, 2017 at 7:35 PM, Eliot Miranda <eliot.miranda at gmail.com>
wrote:

>
>
>
> On Fri, Oct 20, 2017 at 5:06 AM, Clément Bera <bera.clement at gmail.com>
> wrote:
>
>>
>> Hi,
>>
>> Thanks for your example; I could reproduce the crash.
>>
>> It seems that when the VM tries to grow the remembered set while there is
>> not enough free space in old space to grow it, it raises that error.
>>
>> In your case, the remembered set grows in the middle of a GC, and during GC
>> there is not enough free space in old space to allocate a larger remembered
>> set. The full GC includes a scavenge; the scavenge tenures objects, which
>> makes the remembered set grow, and as old space has not been reclaimed yet
>> (that happens later in the full GC), there is not enough free space for it.
>> I don't think at this point we can do a scavenge to shrink the remembered
>> set (we're already in the middle of a scavenge, which is part of the
>> full GC). Hence I think the best bet is to allocate a new old space memory
>> segment. Even though that operation can fail, it's still better than
>> crashing. There are other solutions I can think of, but I don't like any of
>> them.
>>
>> In SpurGenerationScavenger>>growRememberedSet, we have:
>>
>>     ...
>>     newObj := manager allocatePinnedSlots: numSlots * 2.
>>     newObj ifNil:
>>         [newObj := manager allocatePinnedSlots: numSlots + 1024.
>>          newObj ifNil:
>>             [self error: 'could not grow remembered set']].
>>     ...
>>
>> If I replace:
>>
>>     self error: 'could not grow remembered set'
>>
>> by:
>>
>>     (manager growOldSpaceByAtLeast: numSlots + 1024) ifNil:
>>         [self error: 'could not grow remembered set'].
>>     newObj := manager allocatePinnedSlots: numSlots + 1024. "cannot fail"
>>
>> Then your example works (in 5min45sec on my machine).
>>
>> I would like to have Eliot's opinion before integrating, as I am not sure
>> that growing old space in the middle of a scavenge performed during a full
>> GC is a good idea. There might be some strange, uncommon interactions with
>> the rest of the GC logic that I don't see right now.
>>
>> Eliot, what do you think?
>>
>
> I need to take a look.  I don't like the values that
> setRememberedSetRedZone computes for fudge with very large remembered set
> sizes.  And I need to look at previous scavenges to see if the remembered
> set is being correctly managed before it overflows.
>

In the end I committed my fix, which solves the Cuis problem. If you find a
better solution, revert it.
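
For readers who want the fallback order from the quoted snippet spelled out, here is a minimal toy model in Python. It is only a sketch: `ToyOldSpace`, `allocate_pinned_slots`, `grow_by_at_least` and `grow_remembered_set` are hypothetical stand-ins for illustration, not the VM's code, and old space is reduced to a single free-slot counter. It also assumes that a newly added segment is immediately usable for the retry, which is what the "cannot fail" comment relies on.

```python
from typing import Optional


class ToyOldSpace:
    """Hypothetical stand-in for old space: only a free-slot counter."""

    def __init__(self, free_slots: int, can_grow: bool = True) -> None:
        self.free_slots = free_slots
        self.can_grow = can_grow  # False models the OS refusing a new segment

    def allocate_pinned_slots(self, num_slots: int) -> Optional[int]:
        """Return the allocated size, or None if free space is insufficient."""
        if num_slots > self.free_slots:
            return None
        self.free_slots -= num_slots
        return num_slots

    def grow_by_at_least(self, num_slots: int) -> Optional[int]:
        """Add a segment of num_slots; None if the segment cannot be obtained."""
        if not self.can_grow:
            return None
        self.free_slots += num_slots
        return num_slots


def grow_remembered_set(space: ToyOldSpace, num_slots: int) -> int:
    """Return the new table capacity, trying the three steps in order."""
    # Step 1: preferred growth, double the table.
    new_size = space.allocate_pinned_slots(num_slots * 2)
    if new_size is None:
        # Step 2: settle for a smaller increment.
        new_size = space.allocate_pinned_slots(num_slots + 1024)
    if new_size is None:
        # Step 3: add a segment to old space, then retry the small increment.
        if space.grow_by_at_least(num_slots + 1024) is None:
            raise MemoryError("could not grow remembered set")
        new_size = space.allocate_pinned_slots(num_slots + 1024)
    return new_size
```

In this model the error is raised only when both allocations fail and old space cannot be grown either, which is the case where the operation "can fail" in the sense discussed above.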


>
>
>>
>>
>> On Thu, Oct 19, 2017 at 9:52 PM, Phil B <pbpublist at gmail.com> wrote:
>>
>>>
>>> Clément,
>>>
>>> I was curious as to whether you or Eliot were able to get anything
>>> useful from this or not.
>>>
>>> Thanks,
>>> Phil
>>>
>>>
>>> On Oct 13, 2017 2:52 AM, "Clément Bera" <bera.clement at gmail.com> wrote:
>>>
>>>
>>>
>>>
>>> On Thu, Oct 12, 2017 at 9:41 PM, Phil B <pbpublist at gmail.com> wrote:
>>>
>>>>
>>>> Clément,
>>>>
>>>> On Oct 11, 2017 4:09 AM, "Clément Bera" <bera.clement at gmail.com> wrote:
>>>>
>>>>
>>>> Hi,
>>>>
>>>> Without a way to reproduce, it is difficult to deal with the problem.
>>>>
>>>> Hopefully, this will allow you to do so:
>>>> https://github.com/pbella/VmIssueCouldNotGrow
>>>>
>>>> This turned out to be tricky to provide a repro case for, since I'm not
>>>> sure exactly what is triggering it, so I reproduced the type of work I'm
>>>> throwing at the VM (it's a bulk parser/loader), where there's lots of
>>>> continuous allocation going on with the occasional saving of a result to
>>>> generate lots of garbage.  This should run in 5-10 minutes depending on the
>>>> speed of your system.
>>>>
>>>> The main caveat is that I'm only able to get this example to reliably
>>>> reproduce with the included VM with the commented VM parameters applied.
>>>> So I'm not sure if this is an issue only with this particular VM/parameter
>>>> combination or if it's just generally a difficult-to-reproduce issue.
>>>>
>>>
>>> Ok.
>>>
>>> Today I am very busy.
>>>
>>> I will try to have a look tomorrow; otherwise Eliot said he could have a
>>> look next week. 5-10 min means that if I want to simulate, I will most
>>> likely need to start the simulation tonight and debug tomorrow morning.
>>>
>>>
>>>>
>>>> Thanks,
>>>> Phil
>>>>
>>>>
>>>
>>>
>>> --
>>> Clément Béra
>>> Pharo consortium engineer
>>> https://clementbera.wordpress.com/
>>> Bâtiment B 40, avenue Halley 59650 Villeneuve d'Ascq
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Clément Béra
>> Pharo consortium engineer
>> https://clementbera.wordpress.com/
>> Bâtiment B 40, avenue Halley 59650 Villeneuve d'Ascq
>>
>>
>
>
> --
> _,,,^..^,,,_
> best, Eliot
>
>


-- 
Clément Béra
Pharo consortium engineer
https://clementbera.wordpress.com/
Bâtiment B 40, avenue Halley 59650 Villeneuve d'Ascq