[Vm-dev] VM crash with message 'could not grow remembered set'

Phil B pbpublist at gmail.com
Thu Oct 12 19:16:53 UTC 2017


Clément,

On Oct 11, 2017 4:09 AM, "Clément Bera" <bera.clement at gmail.com> wrote:


Hi,

Without a way to reproduce, it is difficult to deal with the problem.


I managed to get a reproducible example... Will post details shortly.


I had reports also from the Pharo community with some problems when scaling
up, but it seems most of that noise is gone since Spur has a new compactor.
There are still some issues, such as GC pauses, that we are trying to deal
with. I wrote this post here
<https://clementbera.wordpress.com/2017/03/12/tuning-the-pharo-garbage-collector/>
to help people dealing with larger images (a couple of GB). There are things that
you can change from the image, with the vm parameters, that are recommended
for larger images. For example, in Java, for a heap of a couple of GB the VM
automatically scales the young space size up to 200MB; in our case the default
is 4MB and you need to use VM parameters to raise it.


Funny you mention that post as it might have some bearing on the issue 😀


The thing with an image-side exception is that it will execute additional
code and allocate new objects. To do that, we need first to deal with the
problem. For example we could do a scavenge to try to decrease the number
of RT entries, but maybe the overflow happened in a specific execution
state where a scavenge is not possible. The command line error messages
don't have this kind of problem. We could try to do something along those
lines but it is not that simple.


What I had in mind was something along the lines of an optional check
*before* hitting an absolute resource limit that would raise an exception.

For example, let's say we had:
Smalltalk vmParameterAt: X put: 95. "A VM parameter which doesn't currently
exist, accepting a small int in the range 0-99, where 0 means don't warn
and any other value is the warning threshold as a percentage.
Just to keep things simple, use a single setting as the warning threshold
for all scarce resources that we know the upper bounds for"

So at some point you run some code that crosses the threshold for some
scarce/fixed resource... maybe it's stack pages, maybe semaphores...
whatever.  Then the VM could raise a VMThresholdExceeded exception (or
whatever it makes sense to call it) in the process that triggered it, with a
message string indicating which limit got hit.  This would most likely need
to be a resettable one-shot trigger to be useful (i.e. to ensure that it
doesn't trigger a cascade of exceptions).  That would be a much nicer
troubleshooting starting point than a stack trace at the command line.
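
To make the one-shot behaviour concrete, here is a rough image-side sketch of
the latch logic only. It is not a VM implementation: the threshold setting and
VMThresholdExceeded do not exist today, so the snippet uses the standard
Warning class as a stand-in and takes made-up usage numbers as arguments.

```smalltalk
"Sketch only. The threshold setting and VMThresholdExceeded are hypothetical;
 Warning stands in for the proposed exception class."
| threshold armed check |
threshold := 95.   "warning threshold in percent; 0 means don't warn"
armed := true.     "one-shot latch; must be re-armed explicitly"

"check takes a resource name, its current usage and its known upper bound."
check := [:resourceName :used :limit |
    (armed and: [threshold > 0 and: [used * 100 >= (limit * threshold)]])
        ifTrue: [
            armed := false.   "disarm first so handlers cannot cause a cascade"
            Warning signal: resourceName , ' reached ' ,
                threshold printString , '% of its limit']].

"First crossing: signals once, handled here by logging and resuming."
[check value: 'remembered set' value: 96 value: 100]
    on: Warning
    do: [:ex | Transcript show: ex messageText; cr. ex resume: nil].

"Second crossing: nothing is signalled because the latch is disarmed."
check value: 'remembered set' value: 99 value: 100.

"Reset the trigger once the situation has been dealt with."
armed := true.
```

Evaluated in a workspace, the first check should log a single line to the
Transcript and the second should stay silent until armed is set back to true.
Disarming before signalling is the part that matters: whatever the handler
allocates cannot re-trigger the same warning.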


Regards,


On Tue, Oct 10, 2017 at 9:58 PM, Phil B <pbpublist at gmail.com> wrote:

>
> Clément,
>
> Thanks for the info.  This is a Spur image.  Unfortunately it has some
> sensitive information so I'll have to see if I can reproduce the issue in
> one I can share.
>
> On a related note, I seem to be running into more of these kinds of VM
> issues as I'm attempting to scale up my image sizes (I can only imagine the
> fun I'll be having with multi GB images) and am thinking it would be
> helpful if the VM had the ability (even if it requires some sort of debug
> build) to raise an exception in the image when a fixed resource exceeded X%
> of its maximum value.  Has a capability along those lines been considered?
>
> Thanks,
> Phil
>
> On Oct 10, 2017 1:12 AM, "Clément Bera" <bera.clement at gmail.com> wrote:
>
>
> Hi,
>
> This is a different limit from the out-of-memory error: too many references
> from old objects to young objects. The limit cannot be changed directly
> from the image. However, if using Spur, you can try to change the young
> space size, which also changes the remembered table size and might fix your
> problem. To do so you can do:
> Smalltalk vm parameterAt: 45 put: (Smalltalk vm parameterAt: 44) * 4. And
> then *restart the image.*
> Check here
> <https://clementbera.wordpress.com/2017/03/12/tuning-the-pharo-garbage-collector/> section
> TUNING NEW SPACE SIZE for more info about that.
>
> If you are using Spur, could you send us the image? (If it is 500MB, could
> you put it up for download on Dropbox or something like that?) That way we
> can reproduce and see what is possible. Normally in Spur, if the remembered
> table grows too big, a tenure happens to shrink it, so
> that error should not happen. Eliot is currently moving to another place,
> so he might be busy. If he is available to answer, I guess he will have a
> look; if he is not, I can have a look today or Thursday. However I am not
> interested in fixing pre-Spur VMs.
>
> Regards,
>
>
>
>
> On Mon, Oct 9, 2017 at 11:57 PM, Phil B <pbpublist at gmail.com> wrote:
>
>>
>> Is this effectively an out of memory error or am I hitting some other
>> internal VM limit?  (I.e. can the limit be increased or is it a hard
>> limit?) I'm running into this when using the reference finder tool in a
>> Cuis image.  (It's a moderately large image at ~500 meg)
>>
>>
>
>
>
>


-- 
Clément Béra
Pharo consortium engineer
https://clementbera.wordpress.com/
Bâtiment B 40, avenue Halley 59650 Villeneuve d'Ascq