[Vm-dev] Heartbeat & Linux

Eliot Miranda eliot.miranda at gmail.com
Mon Apr 22 17:52:25 UTC 2013


On Mon, Apr 22, 2013 at 10:31 AM, Bert Freudenberg <bert at freudenbergs.de> wrote:

>
>
> On 2013-04-22, at 19:16, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>
>
>
> On Mon, Apr 22, 2013 at 5:33 AM, Bert Freudenberg <bert at freudenbergs.de> wrote:
>
>>
>> Still, you make it sound like this inefficiency is actually preventing
>> the Stack VM from running on the Raspberry Pi? That is hard to believe.
>>
>
> No, we just use the interval timer.  But as I say in a
> server/multi-user/ffi context the interval timer is problematic.
>
>
> Okay. So for the typical local single pi user it should be fine :)
>
> Btw, what kind of deadlocks have you seen with SA_RESTART? Normally that
> should avoid the EINTR problems.
>

Just weird bugs that look like linux internal issues.

The most recent one is with PAM, the authentication library, using
pam_start and pam_authenticate with a callback to fetch the password (the
pam_conv structure).  On Linux, if one authenticates successfully then all
is fine.  But if one supplies an invalid password, so that authentication
fails, the ITIMER_REAL interval timer's reload value gets set to zero,
disabling the heartbeat.  If one puts a breakpoint on the setitimer
entry point then it is never hit (I didn't put a breakpoint on the
indirect system call).
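
As a rough illustration of the symptom (not the VM's actual heartbeat
code), the sketch below installs a SIGALRM handler with SA_RESTART, starts
a periodic ITIMER_REAL, and uses getitimer to check whether the reload
value (it_interval) is still non-zero.  The names heartbeat_start,
heartbeat_is_armed and heartbeat_ticks are hypothetical, chosen only for
this example.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Hypothetical tick counter, incremented on every SIGALRM. */
static volatile sig_atomic_t heartbeat_ticks = 0;

static void heartbeat_handler(int signo)
{
    (void)signo;
    heartbeat_ticks++;
}

/* Install a SIGALRM handler with SA_RESTART and start a periodic
 * ITIMER_REAL.  Returns 0 on success, -1 on error. */
static int heartbeat_start(long interval_usec)
{
    struct sigaction sa;
    struct itimerval tv;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = heartbeat_handler;
    sa.sa_flags = SA_RESTART;   /* restart interrupted system calls */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    tv.it_interval.tv_sec = interval_usec / 1000000;
    tv.it_interval.tv_usec = interval_usec % 1000000;
    tv.it_value = tv.it_interval;   /* first expiry after one interval */
    return setitimer(ITIMER_REAL, &tv, NULL);
}

/* Returns 1 if the timer's reload value is non-zero, 0 if it has been
 * zeroed (the timer will not fire periodically), -1 on error. */
static int heartbeat_is_armed(void)
{
    struct itimerval cur;

    if (getitimer(ITIMER_REAL, &cur) != 0)
        return -1;
    return cur.it_interval.tv_sec != 0 || cur.it_interval.tv_usec != 0;
}
```

Under these assumptions, a caller could check heartbeat_is_armed() after
returning from a foreign call such as pam_authenticate and call
heartbeat_start() again if it reports 0.  That would only paper over the
problem; it does not explain why the reload value was cleared.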

The previous one with RHES 4 is that if there are two threads in the VM
then very occasionally in the delivery of the SIGALRM signal the kernel
confuses the thread control blocks of the two threads and one ends up with
both the VM and the second thread running the second thread's code,
freezing the VM.

<rant>Essentially I think Linux has serious quality control issues and the
itimer tickles them.  These things are a right royal pain in the ass to
debug and a pain to work around.  I don't see much difference between
Windows and Linux in terms of quality.  They're both abysmal.  In my
experience Mac OS X, Solaris and HP-UX are of significantly higher
quality.</rant>

>
> - Bert -
>
>
>> - Bert -
>>
>> On 2013-04-19, at 05:24, Casey Ransberger <casey.obrien.r at gmail.com>
>> wrote:
>>
>> I had a feeling the actual problem was probably political. The name is
>> also a bit... well. As many children will prefer to avoid the use of
>> profanity while raising their parents, I wonder if I shouldn't start by
>> forking it and calling it the Big Furry Scheduler or something...
>>
>> Anyhow this sheds lots of light on things. Thanks, Eliot!
>>
>>
>> On Thu, Apr 18, 2013 at 1:29 PM, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>>
>>>
>>> Hi Casey,
>>>
>>> On Thu, Apr 18, 2013 at 5:37 AM, Casey Ransberger <
>>> casey.obrien.r at gmail.com> wrote:
>>>
>>>>
>>>> Totally. I know it isn't going to be a small thing. I'm curious if any
>>>> of the BSD implementations solve the problem adequately for Cog; I might
>>>> have something working to study, build tests around, etc. A control group.
>>>>
>>>
>>> The technical problem was solved for Linux in 2009 by Con Kolivas with
>>> his (excuse the language)
>>> http://en.wikipedia.org/wiki/Brain_Fuck_Scheduler.  The political
>>> problem of getting it adopted throughout the linux community so it can be
>>> relied upon is an entirely different thing.
>>>
>>> Note that once BFS is more widely available I could write a VM that will
>>> test thread priorities on startup and use threads if multiple priorities
>>> are supported, falling back on the ITIMER if not.
>>>
>>>
>>>>
>>>> Either way though, it would be a largish win for the community, given
>>>> the widespread popularity of Linux. I might be able to find some friends
>>>> who might like to attack it with me. Etcetera.
>>>>
>>>> I wonder about why the Linux folks haven't dealt with it already. Maybe
>>>> it hasn't come up. Hopefully it isn't a security thing, that'd be hard to
>>>> make a case about.
>>>>
>>>> Either way, I haven't figured out where to start looking yet.
>>>>
>>>>
>>>> On Thu, Apr 18, 2013 at 5:23 AM, David T. Lewis <lewis at mail.msen.com> wrote:
>>>>
>>>>>
>>>>> I think that Eliot has described the issue in some detail either on
>>>>> this list, or on his blog http://www.mirandabanda.org/cogblog/. Sorry
>>>>> I don't have any links handy, but reading through the blog is time
>>>>> well spent in any case :)
>>>>>
>>>>> I think you are dealing with a fundamental issue in the pthreads
>>>>> implementation(s) on Linux, so don't be surprised if it's not just
>>>>> a simple matter of fixing a bug. The pthreads implementation is more
>>>>> or less grafted onto the original Unix process model, and operating
>>>>> systems tend to vary as to whether they support pthreads fully, or
>>>>> for that matter whether they support a thread model at all.
>>>>>
>>>>> Dave
>>>>>
>>>>> On Thu, Apr 18, 2013 at 05:07:42AM -0700, Casey Ransberger wrote:
>>>>> >
>>>>> > Well I guess maybe I left a part of the message out. Oops. I'm talking
>>>>> > about Cog's heart. It needs user-level thread prioritization and apparently
>>>>> > (GNU/?)Linux lacks this.
>>>>> >
>>>>> > My guess is it's a kernel issue. But that's an uneducated guess.
>>>>> >
>>>>> > It became interesting to me because I'm not super happy with the UI
>>>>> > responsiveness of Squeak trunk under the interpreter and Raspbian on a
>>>>> > 700MHz Pi (out of box experience isn't overclocked, so I'm targeting that.)
>>>>> > My thought was to get the stack VM going, but we still have this issue with
>>>>> > the heartbeat for the stack VM to be able to perform optimally. If I can
>>>>> > fix the heartbeat (read: pthreads) issue, I can benefit GNU/Linux users of
>>>>> > the system across the board, both in terms of the (Intel) JIT, and in terms
>>>>> > of the stack oriented virtual machine which has the awesome Ungar-style
>>>>> > garbage collection, right? I want the efficient GC eventually anyway. The
>>>>> > advantages of stack orientation are somewhat beyond my current
>>>>> > understanding of virtual machines, but I gather that's desirable as well?
>>>>> > </noob>
>>>>> >
>>>>> > I need to do some tests with Cuis, Spoon, Pharo, and Etoys before I'm going
>>>>> > to blame any part of the VM about the UI perf I'm seeing, but a faster VM
>>>>> > is a faster VM anyway. I'd like to make all of the images faster on this
>>>>> > little gadget, because it's cool, it's popular, and it gives us an approach
>>>>> > on more than just the third world (not that the third world isn't
>>>>> > incredibly important, just that if we don't start making better adults
>>>>> > where people with first world problems are too, I'll end up feeling like
>>>>> > Rick Moranis in Spaceballs or I'm going to have to relocate to someplace
>>>>> > where we previously made better adults.)
>>>>> >
>>>>> > http://www.youtube.com/watch?v=sen8Tn8CBA4
>>>>> >
>>>>> > I know this isn't really the place to ask, but I thought maybe someone
>>>>> > might be able to point me at what I'd need to dig into to understand the
>>>>> > problem we have with the popular Linux implementation of pthreads, because
>>>>> > when I GOOG that, I get #1 a tutorial on using pthreads, #2 a Wikipedia
>>>>> > article which speaks abstractly about pthreads, and #3 another tutorial on
>>>>> > using pthreads. I think I'm trying to figure out what to google for so that
>>>>> > I can figure out where the problem lives and then look at it to see whether
>>>>> > or not I can steal its lunch money.
>>>>> >
>>>>> > Even if I google "linux pthreads" I still get a tutorial on using them as
>>>>> > the top hit.
>>>>> >
>>>>> > This is the most directly informative thing I can find:
>>>>> >
>>>>> > http://man7.org/linux/man-pages/man7/pthreads.7.html
>>>>> >
>>>>> > Hopefully this explains anything I might have left out of my first message
>>>>> > on the subject!
>>>>> >
>>>>> > Casey
>>>>> >
>>>>> >
>>>>> > On Wed, Apr 17, 2013 at 5:33 AM, David T. Lewis <lewis at mail.msen.com> wrote:
>>>>> >
>>>>> > >
>>>>> > > On Wed, Apr 17, 2013 at 01:15:45AM -0700, Casey Ransberger wrote:
>>>>> > > >
>>>>> > > > I'm assuming the problem is in the kernel, but making assumptions is
>>>>> > > > usually a bad plan, so I'm asking.
>>>>> > > >
>>>>> > > > When I look, I find a rather confusing jumble: pthreads exists in
>>>>> > > > different implementations under different operating systems.
>>>>> > > >
>>>>> > > > I'd kind of like the heartbeat to work right under Raspbian, because
>>>>> > > > kids are going to use it, and the goal of producing better adults is pretty
>>>>> > > > close to my heart. See also making Squeak faster. (Stack VM.)
>>>>> > > >
>>>>> > > > I want to arm myself with as much knowledge about this as possible,
>>>>> > > > because I'm considering just bloody fixing it. If the (GNU/Linux) community
>>>>> > > > has objections, and I'm able to pull off the fix, I can make a patch
>>>>> > > > available to people who want it.
>>>>> > > >
>>>>> > > > I'm not confident that I can fix the bug with the knowledge I have now,
>>>>> > > > but I'm confident that I can fix it eventually, if someone else doesn't
>>>>> > > > beat me to the punch.
>>>>> > > >
>>>>> > > > Outside of the wiki page, are there any documents/people/mailing lists I
>>>>> > > > should look at to be able to get a good handle on where in the kernel (or
>>>>> > > > elsewhere) I should be looking for the code which isn't meeting our
>>>>> > > > Cog-nitive requirements?
>>>>> > > >
>>>>> > >
>>>>> > > I do not understand what problem or bug you are asking about. Is there
>>>>> > > some issue with Raspbian Linux that is causing a problem?
>>>>> > >
>>>>> > > Dave
>>>>> > >
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Casey Ransberger
>>>>
>>>>
>>>
>>>
>>> --
>>> best,
>>> Eliot
>>>
>>>
>>
>>
>> --
>> Casey Ransberger
>>
>>
>>
>>
>
>
> --
> best,
> Eliot
>
>
> - Bert -
>
>
>
>


-- 
best,
Eliot