[Vm-dev] Unix heartbeat thread vs itimer

Fabio Niephaus lists at fniephaus.com
Fri Jan 6 21:47:14 UTC 2017


Hi Eliot,

On Fri, Jan 6, 2017 at 8:23 PM Eliot Miranda <eliot.miranda at gmail.com>
wrote:

>
> Hi Fabio, Hi Guille,
>
> On Fri, Jan 6, 2017 at 9:44 AM, Fabio Niephaus <lists at fniephaus.com>
> wrote:
>
>
> On Fri, Jan 6, 2017 at 6:33 PM Eliot Miranda <eliot.miranda at gmail.com>
> wrote:
>
>
> Hi Guille,
>
> > On Jan 6, 2017, at 6:44 AM, Guillermo Polito <guillermopolito at gmail.com>
> wrote:
> >
> > Hi,
> >
> > I was checking the code in sqUnixHeartbeat.c to see how the heartbeat
> thread/itimer worked. It somehow bothers me that there are different
> compiled artifacts, one per option.
> >
> > What do you think about having a VM that manages that as an argument
> provided when we launch the VM? This would add some flexibility that we
> don't have right now because we make the decision at compile time.
>
> I think it's a fine idea but it isn't really the issue.  The issue is that
> the itimer mechanism is problematic, especially for foreign code, and is
> therefore a stop gap.  The itimer interrupts long-running system calls,
> which means that things like sound libraries break (at Qwaq I had to fix
> ALSA to get it to work with the itimer heartbeat).  Since Pharo is becoming
> more reliant on external code it may impact us more going forward.
>
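To make the interrupted-system-call problem concrete, here is a minimal sketch of the retry loop that foreign code needs if it is to survive a periodic timer signal. It assumes the itimer heartbeat is delivered as an ordinary signal (for example SIGALRM from setitimer) and that the handler is installed without SA_RESTART. The helper name read_retrying is hypothetical and is not taken from the VM sources.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Read up to n bytes from fd, restarting the call whenever a signal
 * (such as a periodic timer signal) interrupts it before any data has
 * been transferred.  Hypothetical helper, not part of the VM. */
ssize_t read_retrying(int fd, void *buf, size_t n)
{
    ssize_t got;
    do {
        got = read(fd, buf, n);
    } while (got == -1 && errno == EINTR);
    return got;
}
```

A library that calls read() directly and treats any -1 as a hard failure will misbehave as soon as the timer fires during a blocking call, which is the kind of breakage described above.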
> The real issue is that linux's requirement that thread priorities be set
> in a per-application file in /etc/security/limits.d (IIRC) is a big problem.  Neither
> Windows nor Mac OS X requires such nonsense, and a threaded heartbeat is
> used on those systems without any issue at all.  Why linux erected this
> mess in the first place is something I don't understand.
>
> I had to implement the itimer heartbeat to get Qwaq forums running on
> Linux running pre 2.6 kernels, but had many other problems to solve as a
> result (ALSA, database connects).
>
> Were it that the vm merely had to detect whether it could use the threaded
> heartbeat then things would be easy.  Instead one can only use the thing if
> one has superuser permissions to install a file in /etc, just to use a
> thread of higher priority than the main one.
>
>
> Thanks for the explanation, Eliot. I had no idea how bad the issues are
> with the itimer, but I'm glad you also see the user-facing issue with the
> heartbeat.
>
>
> An alternative might be to lower the priority of the main thread.  Then
> the file installation would be unnecessary.
>
>
> Could you elaborate a little bit more on this idea? How could this impact
> the vm? What could be the drawbacks here?
>
>
> First of all, for the heartbeat thread to work reliably it must run at
> higher priority than the thread running Smalltalk code.  This is because
> its job is to cause Smalltalk code to break out at regular intervals to
> check for events.  If the Smalltalk code is compute-intensive then it will
> prevent the heartbeat thread from running unless the heartbeat thread is
> running at a higher priority, and so it will be impossible to receive input
> keys, etc. (Note that if event collection was in a separate thread it would
> suffer the same issue; compute intensive code would block the event
> collection thread unless it was running at higher priority).
>
> Right now, Linux restricts creating threads with priority higher than the
> default to those programs that have a /etc/security/limits.d/program.conf
> file that specifies the highest priority thread the program can create.
> And prior to the 2.6.12 kernel only superuser processes could create
> higher-priority threads.  I do know that prior to 2.6.12 one couldn't
> create threads of *lower* priority than the default either (I would have
> used this if I could).
>
> If 2.6.12 allows a program to create threads with lower priorities
> *without* needing a /etc/security/limits.d/program.conf, or more
> conveniently to allow a thread's priority to be lowered, then the idea is:
> 1. at start-up create a heartbeat thread at the normal priority
> 2. lower the priority of the main VM thread below the heartbeat thread.
> Alternatively, one could spawn a new lower-priority thread to run
> Smalltalk code, but this may be much more work.
>
> The draw-back is that running Smalltalk in a thread whose priority is
> lower than the default *might* impact performance with lots of other
> processes running.  This depends on whether the scheduler conflates thread
> priorities with process priorities (which was the default with old linux
> threads, which were akin to processes).
>
> So there are some tests to perform:
>
> a) see if one can lower the priority of a thread without having a
> /etc/security/limits.d/program.conf in place
> b) write a simple performance test (nfib?) in a program that can be run
> either with its thread having normal or lower priority, and run two
> instances of the program at the same time and see if they take
> significantly different times to compute their result
>
> If a) is possible and b) shows no significant difference in the wall-times
> of the two programs then we can modify the linux heartbeat code to *lower*
> the priority of the main Smalltalk thread if it finds it can't create a
> heartbeat thread with higher priority.
>
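Test (a) could start from something like the sketch below. It assumes that "lowering the priority" is done by raising the nice value with setpriority(), which is only one possible interpretation of the proposal. The function name lower_priority_by and the -100 error sentinel are hypothetical. POSIX defines the nice value per process, but on Linux it is in practice a per-thread attribute, so calling this from the Smalltalk thread should leave a previously created heartbeat thread at its original value.

```c
#include <errno.h>
#include <sys/resource.h>

/* Raise the caller's nice value by delta, i.e. lower its priority.
 * Returns the nice value in effect afterwards, or -100 on error.
 * Hypothetical helper; -100 is outside the valid nice range. */
int lower_priority_by(int delta)
{
    int before, after;

    errno = 0;                      /* -1 is a legal nice value */
    before = getpriority(PRIO_PROCESS, 0);
    if (before == -1 && errno != 0)
        return -100;

    /* Raising the nice value needs no special privilege. */
    if (setpriority(PRIO_PROCESS, 0, before + delta) == -1)
        return -100;

    errno = 0;
    after = getpriority(PRIO_PROCESS, 0);
    if (after == -1 && errno != 0)
        return -100;
    return after;
}
```

If this succeeds without any file in /etc/security/limits.d, test (a) passes in the nice-value sense. Whether a nice difference is enough for the heartbeat thread to be scheduled promptly while Smalltalk code is compute-bound is a separate question that test (b) and further measurement would have to answer.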
>
> I hope this answers your questions.
>

Yes, it does. Thanks!


>
>
> As a footnote let me describe why we use a heartbeat at all.  When I
> started working on the VisualWorks VM (HPS) in the '90s it had no heartbeat
> (IIRC, it might have only been the Windows VM that worked like this).
> Instead there was a counter decremented in every frame-building send (i.e.
> in the jitted machine code that activated a Smalltalk send), and when this
> counter went to zero the VM broke out and checked for events.  This counter
> was initialized to 256 (IIRC).  Consequently there was an enormous
> frequency of event checks, until, that is, one did something that reduced
> the frequency of frame-building sends.  One day I was doing something which
> invoked lots of long-running large integer primitives and I noticed that
> when I tried to interrupt the program it took many seconds before the
> system stopped.  What was happening was that the large integer primitives
> were taking so long that the counter took many seconds to count down to 0.
> The system didn't check for events very often.  So the problems with a
> counter are that
> a) a read-modify-write cycle for a counter is in itself very expensive in
> a high-frequency operation like building a frame
> b) in normal operation the counter causes far too many check-for-event
> calls
> c) in abnormal operation the counter causes infrequent check-for-event
> calls
>
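The counter scheme described above can be sketched roughly as follows. This is an illustration only, not the HPS code: all names are hypothetical, and check_for_events here is a stand-in that merely counts how often it is called.

```c
#define SENDS_PER_CHECK 256   /* initial counter value recalled in the text */

static int sends_until_check = SENDS_PER_CHECK;
static unsigned long event_checks = 0;

/* Stand-in for the VM's real event check; it only counts calls. */
static void check_for_events(void)
{
    event_checks++;
}

/* Executed on every frame-building send: a read-modify-write of the
 * counter, plus an event check each time it reaches zero. */
void on_frame_building_send(void)
{
    if (--sends_until_check == 0) {
        sends_until_check = SENDS_PER_CHECK;
        check_for_events();
    }
}

unsigned long event_check_count(void)
{
    return event_checks;
}
```

The check rate is tied to the number of sends rather than to elapsed time, so send-heavy code checks far too often, while a long-running primitive that performs no sends never reaches the check at all.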
> One solution on Unix is an interval timer (which my old BrouHaHa VMs used,
> but it didn't have much of an FFI so the problems it caused weren't
> pressing).
>
> The natural solution is a heartbeat thread, and this is used in a number
> of VMs.  One gets a regular event check frequency at very low cost.  In
> Smalltalk VMs which do context-to-stack mapping it is natural to organize
> the stack as a set of pages and hence to have frame building sends check a
> stack limit (guarding the end of the page).  The heartbeat simply sets the
> stack limit to the highest possible address to cause a stack limit check
> failure on the next send, and the stack check failure code checks if the
> stack limit has been set to the highest address and calls the event check
> instead of handling the stack page overflow.  In the HotSpot Java VM, if
> the platform supports it, a frame building send writes a byte to a guard
> page.  Modern processors have write buffers so the write has very low cost
> (because it is never read) and is effectively free.  So the heartbeat
> changes the guard page's permissions to take away write permission and
> cause an exception.  The exception handler then checks and causes the VM to
> check for events.  For this to work, all of writes, removing and setting
> page write permissions and handling exceptions must be sufficiently cheap.
> Anyone looking for a low-level project for the Cog VM could take a look at
> this mechanism.  I've chosen to stick with the simple stack limit approach.
>
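The stack limit trick can be sketched as below. This is a simplified illustration under stated assumptions, not the Cog implementation: all names are hypothetical, the stack is assumed to grow downwards, and the small race between a heartbeat tick and the restore of the limit is ignored.

```c
#include <stdatomic.h>
#include <stdint.h>

typedef enum { STACK_OK, STACK_PAGE_OVERFLOW, EVENT_CHECK } check_result;

/* Sentinel written by the heartbeat: the highest possible address. */
#define FORCE_CHECK UINTPTR_MAX

static uintptr_t real_limit;           /* true end of the current stack page */
static _Atomic uintptr_t stack_limit;  /* value each send compares against   */

void set_page_limit(uintptr_t limit)
{
    real_limit = limit;
    atomic_store(&stack_limit, limit);
}

/* Called from the heartbeat thread on every tick. */
void heartbeat_tick(void)
{
    atomic_store(&stack_limit, FORCE_CHECK);
}

/* Called on every frame-building send with the new stack pointer. */
check_result check_stack(uintptr_t sp)
{
    uintptr_t limit = atomic_load(&stack_limit);

    if (sp >= limit)
        return STACK_OK;               /* common case: one compare */
    if (limit == FORCE_CHECK) {
        /* The heartbeat forced the failure: restore the real limit
         * and ask the caller to check for events. */
        atomic_store(&stack_limit, real_limit);
        return EVENT_CHECK;
    }
    return STACK_PAGE_OVERFLOW;        /* genuine end of the stack page */
}
```

The send path pays only for the comparison it already needs for page overflow, so the heartbeat adds no per-send cost; the extra work happens only on the failure path.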

And thanks for all this info. I didn't really know much about the itimer vs
heartbeat thread topic TBH. I was just surprised that this is so
complicated, because I thought that event handling would be relatively
easy. However, I still don't completely understand why other applications
(e.g. games) don't have these event problems, but that's something I can
look into myself this weekend :)

Fabio


>
> Fabio
>
>
>
> To summarize, the itimer heartbeat is to be avoided as much as possible.
> It causes hard-to-debug issues with external code, and has to be turned off and
> on around fork.  It's a stop gap.  Having to install a file in /etc just to
> be able to use a thread is insane (and AFAICT unique to linux).  Whatever
> you do in the short term to deal with these problems I'll support, but in
> the long term we simply want a threaded heartbeat without needing to
> install anything.
>
> >
> > The code in sqUnixHeartbeat.c is not a lot nor very complex, it should
> not be difficult to do...
> >
> > Also, what would be the drawbacks besides an increase in the VM size?
>
> I hope I've explained above that I expect the drawbacks will be
> intermittent failures of external code.
>
> >
> > Guille
>
>
> _,,,^..^,,,_
> best, Eliot
>