[Vm-dev] Unix heartbeat thread vs itimer

Denis Kudriashov dionisiydk at gmail.com
Tue Jan 10 20:08:03 UTC 2017


Will an event-driven VM fix this problem completely? Or will a heartbeat
be needed anyway?

2017-01-06 20:23 GMT+01:00 Eliot Miranda <eliot.miranda at gmail.com>:

> Hi Fabio, Hi Guille,
> On Fri, Jan 6, 2017 at 9:44 AM, Fabio Niephaus <lists at fniephaus.com>
> wrote:
>> On Fri, Jan 6, 2017 at 6:33 PM Eliot Miranda <eliot.miranda at gmail.com>
>> wrote:
>>> Hi Guille,
>>> > On Jan 6, 2017, at 6:44 AM, Guillermo Polito <
>>> guillermopolito at gmail.com> wrote:
>>> >
>>> > Hi,
>>> >
>>> > I was checking the code in sqUnixHeartbeat.c to see how the heartbeat
>>> thread/itimer worked. It somehow bothers me that there are different
>>> compiled artifacts, one per option.
>>> >
>>> > What do you think about having a VM that manages that as an argument
>>> provided when we launch the VM? This would add some flexibility that we
>>> don't have right now because we make the decision at compile time.
>>> I think it's a fine idea but it isn't really the issue.  The issue is
>>> that the itimer mechanism is problematic, especially for foreign code, and
>>> is therefore a stop gap.  The itimer interrupts long-running system calls,
>>> which means that things like sound libraries break (at Qwaq I had to fix
>>> ALSA to get it to work with the itimer heartbeat).  Since Pharo is becoming
>>> more reliant on external code it may impact us more going forward.
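
The interruption described above can be reproduced in isolation. This is a minimal sketch, not code from the VM: the handler `on_tick` and the function name `sleep_interrupted_by_itimer` are made up for illustration, and the 10 ms period is an arbitrary choice. It arms an interval timer with `setitimer` and then makes a blocking call, which returns early with `EINTR`.

```c
#define _XOPEN_SOURCE 700  /* expose POSIX/XSI declarations under strict -std=c11 */
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical handler standing in for an itimer-driven heartbeat tick. */
static volatile sig_atomic_t ticks = 0;
static void on_tick(int signo) { (void)signo; ticks++; }

/* Returns 1 if a 1-second nanosleep() was cut short by the interval timer
   (errno == EINTR), 0 if it ran to completion, -1 if setup failed. */
int sleep_interrupted_by_itimer(void)
{
    struct sigaction sa, old_sa;
    struct itimerval on, off;
    struct timespec req = { 1, 0 };
    int rc, interrupted;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old_sa) != 0)
        return -1;

    memset(&on, 0, sizeof on);
    on.it_value.tv_usec = 10000;     /* first tick after 10 ms */
    on.it_interval.tv_usec = 10000;  /* then one tick every 10 ms */
    memset(&off, 0, sizeof off);     /* all zeroes disarms the timer */
    if (setitimer(ITIMER_REAL, &on, NULL) != 0) {
        sigaction(SIGALRM, &old_sa, NULL);
        return -1;
    }

    errno = 0;
    rc = nanosleep(&req, NULL);      /* stands in for a long-running system call */
    interrupted = (rc == -1 && errno == EINTR);

    /* Disarm the timer and restore the previous SIGALRM disposition. */
    setitimer(ITIMER_REAL, &off, NULL);
    sigaction(SIGALRM, &old_sa, NULL);
    return interrupted;
}
```

External libraries that do not retry after `EINTR` see such an early return as a failure, which is presumably the kind of breakage described for ALSA.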
>>> The real issue is that linux's requirement that thread priorities be set
>>> in a per-application file in /etc/security/limits.d (IIRC) is a big problem.  Neither
>>> Windows nor Mac OS X requires such nonsense, and a threaded heartbeat is
>>> used on those systems without any issue at all.  Why linux erected this
>>> mess in the first place is something I don't understand.
>>> I had to implement the itimer heartbeat to get Qwaq forums running on
>>> Linux running pre 2.6 kernels, but had many other problems to solve as a
>>> result (ALSA, database connects).
>>> Were it that the vm merely had to detect whether it could use the
>>> threaded heartbeat then things would be easy.  Instead one can only use the
>>> thing if one has superuser permissions to install a file in /etc, just to
>>> use a thread of higher priority than the main one.
>> Thanks for the explanation, Eliot. I had no idea how bad the issues are
>> with the itimer, but I'm glad you also see the user-facing issue with the
>> heartbeat.
>>> An alternative might be to lower the priority of the main thread.  Then
>>> the file installation would be unnecessary.
>> Could you elaborate a little bit more on this idea? How could this impact
>> the vm? What could be the drawbacks here?
> First of all, for the heartbeat thread to work reliably it must run at
> higher priority than the thread running Smalltalk code.  This is because
> its job is to cause Smalltalk code to break out at regular intervals to
> check for events.  If the Smalltalk code is compute-intensive then it will
> prevent the heartbeat thread from running unless the heartbeat thread is
> running at a higher priority, and so it will be impossible to receive input
> keys, etc. (Note that if event collection was in a separate thread it would
> suffer the same issue; compute intensive code would block the event
> collection thread unless it was running at higher priority).
> Right now, Linux restricts creating threads with priority higher than the
> default to those programs that have a /etc/security/limits.d/program.conf
> file that specifies the highest priority thread the program can create.
> And prior to the 2.6.12 kernel only superuser processes could create
> higher-priority threads.  I do know that prior to 2.6.12 one couldn't
> create threads of *lower* priority than the default either (I would have
> used this if I could).
> If 2.6.12 allows a program to create threads with lower priorities
> *without* needing a /etc/security/limits.d/program.conf, or more
> conveniently to allow a thread's priority to be lowered, then the idea is:
> 1. at start-up create a heartbeat thread at the normal priority
> 2. lower the priority of the main VM thread below the heartbeat thread.
> Alternatively, one could spawn a new lower-priority thread to run
> Smalltalk code, but this may be much more work.
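
Step 2 might look roughly like the following. This is only a sketch under assumptions: it uses the nice value as the "priority", the function name `lower_calling_priority` and the `-100` error code are invented here, and it relies on the Linux/NPTL behaviour that the nice value is per-thread (POSIX specifies it as per-process, so on other systems the same call would lower the whole process). Whether this gives the heartbeat thread the precedence it needs is exactly what would have to be tested.

```c
#define _XOPEN_SOURCE 700  /* expose POSIX/XSI declarations under strict -std=c11 */
#include <errno.h>
#include <sys/resource.h>

/* Raise the caller's nice value by `delta`, i.e. lower its priority.
   Returns the resulting nice value, or -100 on failure (valid nice values
   on Linux lie in -20..19, so -100 cannot be a real result). */
int lower_calling_priority(int delta)
{
    int current, target;

    errno = 0;                       /* getpriority can legitimately return -1 */
    current = getpriority(PRIO_PROCESS, 0);
    if (current == -1 && errno != 0)
        return -100;

    target = current + delta;
    if (target > 19)                 /* assumed upper bound of the nice range */
        target = 19;

    /* who == 0 means "the caller"; on Linux this affects the calling thread. */
    if (setpriority(PRIO_PROCESS, 0, target) != 0)
        return -100;

    errno = 0;
    current = getpriority(PRIO_PROCESS, 0);
    if (current == -1 && errno != 0)
        return -100;
    return current;
}
```

The heartbeat thread would be created first, at the default nice value, and the main VM thread would then call something like `lower_calling_priority(1)` on itself. Raising one's own nice value normally requires no special permission, so no file in /etc should be needed for this part.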
> The draw-back is that running Smalltalk in a thread whose priority is
> lower than the default *might* impact performance with lots of other
> processes running.  This depends on whether the scheduler conflates thread
> priorities with process priorities (which was the default with old linux
> threads, which were akin to processes).
> So there are some tests to perform:
> a) see if one can lower the priority of a thread without having a
> /etc/security/limits.d/program.conf in place
> b) write a simple performance test (nfib?) in a program that can be run
> either with its thread having normal or lower priority, and run two
> instances of the program at the same time and see if they take
> significantly different times to compute their result
> If a) is possible and b) shows no significant difference in the wall-times
> of the two programs then we can modify the linux heartbeat code to *lower*
> the priority of the main Smalltalk thread if it finds it can't create a
> heartbeat thread with higher priority.
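
A possible core for test b) is sketched below. The names `timed_nfib` and `nice_delta` are illustrative, and lowering the priority through `setpriority` is an assumption about how "lower priority" would be implemented, not something the VM does today.

```c
#define _XOPEN_SOURCE 700  /* expose POSIX/XSI declarations under strict -std=c11 */
#include <sys/resource.h>
#include <time.h>

/* nfib: shaped like fib, but the result is the number of calls made. */
unsigned long nfib(unsigned n)
{
    return n < 2 ? 1 : nfib(n - 1) + nfib(n - 2) + 1;
}

/* Compute nfib(n) and return the elapsed wall-clock time in seconds.
   If nice_delta > 0 the caller's nice value is first raised by that amount
   (best effort: a failure to lower the priority is ignored here). */
double timed_nfib(unsigned n, int nice_delta, unsigned long *result)
{
    struct timespec t0, t1;

    if (nice_delta > 0) {
        int current = getpriority(PRIO_PROCESS, 0);
        (void)setpriority(PRIO_PROCESS, 0, current + nice_delta);
    }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    *result = nfib(n);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

A small `main` that takes `n` and `nice_delta` from the command line and prints the returned time would complete the program. Running one instance with `nice_delta` 0 and one with a positive value at the same time, on a machine with other load, and comparing the two wall times would give a first answer to b).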
> I hope this answers your questions.
> As a footnote let me describe why we use a heartbeat at all.  When I
> started working on the VisualWorks VM (HPS) in the '90s it had no heartbeat
> (IIRC, it might have only been the Windows VM that worked like this).
> Instead there was a counter decremented in every frame-building send (i.e.
> in the jitted machine code that activated a Smalltalk send), and when this
> counter went to zero the VM broke out and checked for events.  This counter
> was initialized to 256 (IIRC).  Consequently there was an enormous
> frequency of event checks, until, that is, one did something that reduced
> the frequency of frame-building sends.  One day I was doing something which
> invoked lots of long-running large integer primitives and I noticed that
> when I tried to interrupt the program it took many seconds before the
> system stopped.  What was happening was that the large integer primitives
> were taking so long that the counter took many seconds to count down to 0.
> The system didn't check for events very often.  So the problems with a
> counter are that
> a) a read-modify-write cycle for a counter is in itself very expensive in
> a high-frequency operation like building a frame
> b) in normal operation the counter causes far too many check-for-event
> calls
> c) in abnormal operation the counter causes infrequent check-for-event
> calls
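
The counter scheme can be sketched in a few lines. This is an illustration only, not HPS code: `counted_send`, `check_for_events` and `event_checks_after` are hypothetical names, and only the initial value 256 comes from the description above.

```c
#include <stdint.h>

#define EVENT_CHECK_INTERVAL 256   /* the initial counter value quoted above */

static int32_t send_counter = EVENT_CHECK_INTERVAL;
static unsigned long counter_event_checks = 0;

/* Hypothetical stand-in for the VM's event check. */
static void check_for_events(void) { counter_event_checks++; }

/* Executed on every frame-building send. */
void counted_send(void)
{
    if (--send_counter <= 0) {     /* read-modify-write on every single send */
        send_counter = EVENT_CHECK_INTERVAL;
        check_for_events();
    }
}

/* Reset the state, perform `sends` sends, and report how many event checks ran. */
unsigned long event_checks_after(unsigned long sends)
{
    unsigned long i;

    send_counter = EVENT_CHECK_INTERVAL;
    counter_event_checks = 0;
    for (i = 0; i < sends; i++)
        counted_send();
    return counter_event_checks;
}
```

Here 1000 sends trigger 3 event checks (at sends 256, 512 and 768). The check rate follows the send rate rather than the clock: code that sends constantly checks far too often, while a long-running primitive performs no sends and therefore no checks at all.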
> One solution on Unix is an interval timer (which my old BrouHaHa VMs used,
> but it didn't have much of an FFI so the problems it caused weren't
> pressing).
> The natural solution is a heartbeat thread, and this is used in a number
> of VMs.  One gets a regular event check frequency at very low cost.  In
> Smalltalk VMs which do context-to-stack mapping it is natural to organize
> the stack as a set of pages and hence to have frame building sends check a
> stack limit (guarding the end of the page).  The heartbeat simply sets the
> stack limit to the highest possible address to cause a stack limit check
> failure on the next send, and the stack check failure code checks if the
> stack limit has been set to the highest address and calls the event check
> instead of handling the stack page overflow.  In the HotSpot Java VM, if
> the platform supports it, a frame building send writes a byte to a guard
> page.  Modern processors have write buffers so the write has very low cost
> (because it is never read) and is effectively free.  So the heartbeat
> changes the guard page's permissions to take away write permission and
> cause an exception.  The exception handler then checks and causes the VM to
> check for events.  For this to work, all of writes, removing and setting
> page write permissions and handling exceptions must be sufficiently cheap.
> Anyone looking for a low-level project for the Cog VM could take a look at
> this mechanism.  I've chosen to stick with the simple stack limit approach.
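
The stack limit approach can be modelled without any threads. The sketch below is a simplified illustration, not the Cog implementation: all names (`FORCE_CHECK`, `heartbeat_tick`, `checked_send` and so on) are hypothetical, the stack is assumed to grow downwards, and the event check is reduced to a counter.

```c
#include <stdatomic.h>
#include <stdint.h>

#define FORCE_CHECK UINTPTR_MAX        /* "highest possible address" sentinel */

static _Atomic uintptr_t stack_limit;  /* written by the heartbeat, read on every send */
static uintptr_t real_stack_limit;     /* true end of the current stack page */
unsigned long limit_event_checks = 0;  /* stands in for the real event check */
unsigned long page_overflows = 0;      /* stands in for real overflow handling */

/* Install the limit that guards the end of the current stack page. */
void set_stack_page_limit(uintptr_t limit)
{
    real_stack_limit = limit;
    atomic_store(&stack_limit, limit);
}

/* What the heartbeat thread would do on each beat: one plain store. */
void heartbeat_tick(void)
{
    atomic_store(&stack_limit, FORCE_CHECK);
}

/* Slow path, reached only when the limit check fails. */
static void stack_limit_failure(void)
{
    if (atomic_load(&stack_limit) == FORCE_CHECK) {
        atomic_store(&stack_limit, real_stack_limit);  /* re-arm the real limit */
        limit_event_checks++;                          /* check for events */
    } else {
        page_overflows++;                              /* genuine page overflow */
    }
}

/* Frame-building send: the only per-send cost is one compare against the limit. */
void checked_send(uintptr_t sp)
{
    if (sp < atomic_load(&stack_limit))
        stack_limit_failure();
}
```

With the limit at its real value a send pays only the comparison it needs anyway for page overflow. After `heartbeat_tick()` the next send fails the comparison whatever its stack pointer is, runs the event check once, and restores the real limit.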
> Fabio
>>> To summarize, the itimer heartbeat is to be avoided as much as
>>> possible.  It causes hard-to-debug issues with external code and has to be
>>> turned off and on around fork.  It's a stop gap.  Having to install a file
>>> in /etc just to be able to use a thread is insane (and AFAICT unique to
>>> linux).  Whatever you do in the short term to deal with these problems I'll
>>> support, but in the long term we simply want a threaded heartbeat without
>>> needing to install anything.
>>> >
>>> > The code in sqUnixHeartbeat.c is not a lot nor very complex, it should
>>> not be difficult to do...
>>> >
>>> > Also, what would be the drawbacks besides an increase in the vm size?
>>> I hope I've explained above that I expect the drawbacks will be
>>> intermittent failures of external code.
>>> >
>>> > Guille
> _,,,^..^,,,_
> best, Eliot