[Vm-dev] Maximum value of -stackpages VM parameter?

Thu Jun 15 20:33:03 UTC 2017

Hi Phil,

    via vmParameterAt: you'll access

60 number of stack page overflows since startup (read-only; Cog VMs only)

a stack page overflow occurs when a computation that sends deeply enough
fills a page with activations and to continue needs to extend onto a fresh
page.  It is expected that this number be high.

61 number of stack page divorces since startup (read-only; Cog VMs only)

a stack page divorce occurs when either a stack page overflow or a process
switch requires a new page but all pages are in use and so the least
recently used page is "divorced"; its activations are converted into
context objects on the heap, emptying the page and allowing its reuse.
This is the number we'd like to keep low by upping the number of stack
pages, but not so much that we slow down GC.

68 the average number of live stack pages when scanned by GC (at
scavenge/gc/become et al)
69 the maximum number of live stack pages when scanned by GC (at
scavenge/gc/become et al)

These two (sorry, just noticed they're not in the method comment, at least
in Squeak) can be used to monitor how many stack pages are in use as the
system runs.

From these two we can tell whether a large number of pages leads to a high
load on the scavenger scanning stack pages.  If the average is low while
the number of stack pages is high then the application has a pattern that
is insensitive to the number of stack pages and then one can increase the
number of stack pages without seeing much GC overhead.  But I expect this
is unlikely; these two were added to monitor GC performance at Cadence and
indeed we see that increasing the number of stack pages in use also
increases the average number of stack pages in use at GC time.
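
To read these from the image, a minimal workspace sketch follows. It assumes a Cog VM and Squeak's `Smalltalk vmParameterAt:` (in Pharo I believe the equivalent is `Smalltalk vm parameterAt:`); the choice of indices simply follows the parameters described above.

```smalltalk
"Workspace sketch: print the stack page parameters discussed above.
 Assumes a Cog VM and Squeak's Smalltalk vmParameterAt:."
#(42 43 60 61 68 69) do: [:index |
    Transcript
        show: '#', index printString;
        show: ' ';
        show: (Smalltalk vmParameterAt: index) printString;
        cr]
```

This only reads a handful of VM counters, so it should be cheap enough to run whenever you want a rough idea of where things stand.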

That said, in my current VMMaker image, in this session merely used for
browsing, I see

#42 50 number of stack pages available   (default)
#43 0 desired number of stack pages (i.e. select default)

#60 89,370 number of stack page overflows since startup
#61 0 number of stack page divorces since startup

#68 11.35 the average number of live stack pages when scanned by
scavenge/gc/become
#69 16 the maximum number of live stack pages when scanned by
scavenge/gc/become

So in normal development use it looks like stack page use is minimal.
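
With the figures above, that is an average of 11.35 live pages out of 50 (about 23%) and a maximum of 16 out of 50 (32%), with no divorces at all. To check a specific workload rather than a whole session, something like the following sketch could be used; the workload block is a hypothetical placeholder, and the selector is again Squeak's `Smalltalk vmParameterAt:`.

```smalltalk
"Sketch: count stack page divorces caused by one workload, then report
 how many stack pages are live on average at GC time.
 The workload block is a placeholder (hypothetical); substitute your own."
| workload divorcesBefore divorces pages averageLive |
workload := [10000 timesRepeat: [(1 to: 100) inject: 0 into: [:a :b | a + b]]].
divorcesBefore := Smalltalk vmParameterAt: 61.
workload value.
divorces := (Smalltalk vmParameterAt: 61) - divorcesBefore.
pages := Smalltalk vmParameterAt: 42.
averageLive := Smalltalk vmParameterAt: 68.
Transcript
    show: 'divorces during workload: ', divorces printString; cr;
    show: 'average live pages at GC: ', averageLive printString,
          ' of ', pages printString; cr
```

If the divorce count stays at or near zero for your workload, raising the number of stack pages is unlikely to buy anything and will only add to GC scanning cost.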

On Thu, Jun 15, 2017 at 1:10 PM, Phil B <pbpublist at gmail.com> wrote:

>
> Eliot,
>
> Thanks for the tip, I'll give that a shot.  Also, is it possible to check
> the amount of stack usage from the image? (I.e. just to get a rough idea of
> where things stand that's reasonably fast)
>
> Phil
>
>
> On Jun 12, 2017 7:24 PM, "Eliot Miranda" <eliot.miranda at gmail.com> wrote:
>
>
> Hi Phil,
>
> On Jun 12, 2017, at 2:25 PM, Phil B <pbpublist at gmail.com> wrote:
>
> Eliot,
>
> Thanks for the info, that's good to know.  I probably should have been
> explicit in that I am only bumping it up this high to troubleshoot a rather
> annoying startup bug in my code. When it crashes as a result of the stack
> overflow the trace is pretty useless (iirc, about 1/2 a page of INVALID
> REFERENCE), so I'm mostly flying blind.  Bumping up the limit is allowing me
> to get a better view of where things are going wrong and I plan to drop
> back once I've resolved it.
>
>
> A better way to debug this will be to set a breakpoint in the scavenger
> and the GC on every GC.  Stack overflow in a language like Smalltalk where
> activations are objects means that the heap grows as the stack grows.  (The
> stack pages in the stack zone can be seen as an allocation cache for the
> most recent activations, reducing the pressure on the GC).  So if you run
> under gdb (lldb on Mac) and print the stack in each GC you should be able to
> at least see where the infinite recursion is coming from before the system
> runs out of memory:
>
> (gdb) b doScavenge
> breakpoint 1 set at NNNN
> (gdb) commands 1
> call printStackCallStackOf(framePointer)
> end
> (gdb) run myimage.image
>
> You can use
> (gdb) call pushOutputFile("stack.log")
> to get the vm to send subsequent output to a file and
> (gdb) call popOutputFile()
> to close the log.
>
>
> Thanks,
> Phil
>
> On Jun 12, 2017 4:43 PM, "Eliot Miranda" <eliot.miranda at gmail.com> wrote:
>
>
> Hi Phil,
>
>
> > On Jun 12, 2017, at 12:50 PM, Phil B <pbpublist at gmail.com> wrote:
> >
> > In trying to troubleshoot an issue, I needed to bump up the stackpages
> parameter.  On 64-bit Linux, a value of 600 worked but 1000 segfaulted so I
> was just wondering what the limit(s) are for it?
>
> There are no explicit limits.  The segfault you're seeing is a result
> of the stack pages being allocated on the C stack.  When the number is high
> the C stack overflows and boom.
>
> A word to the wise: too high a value and scavenging performance falls
> (stack pages are implicitly roots into new space), and become performance
> falls (all activations in stack space are scanned post become to avoid a
> read barrier on inst var fetch).
>
> The default value was 192, a value chosen to exceed qwaq server process
> usage, but both at Cadence and in Spur profiling we found that was not a
> good value and pulled it back to 64 (IIRC).
>
> I'm curious as to why you are exploring such high values.
>
>
>
>
>
>

-- 
_,,,^..^,,,_
best, Eliot