[Vm-dev] VM on Solaris (was: Camera sig fault on 64 bits machines.)

Eliot Miranda eliot.miranda at gmail.com
Thu Apr 24 18:32:03 UTC 2014


Hi Andreas,

    I just read the OpenSolaris sigaction manual page and it has the
expected semantics with SA_RESETHAND; i.e. one does /not/ have to do
anything special to avoid having to reset the handler.  So I wonder
whether ioInitHeartbeat is even being called.  You might check.  It ends
with a call of setIntervalTimer(beatMilliseconds) which should set the
heartbeat itimer going.
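
A minimal sketch of such a check, independent of the VM sources: it installs a SIGALRM handler with sigaction without SA_RESETHAND, arms ITIMER_REAL with setitimer, uses getitimer to confirm the timer is armed, and counts how often the handler fires. All function names here (on_beat, install_beat_handler, set_beat_timer, beat_timer_is_armed, count_heartbeats) are made up for illustration and are not taken from sqUnixITimerHeartbeat.c.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical names throughout; this is not the VM's heartbeat code. */
static volatile sig_atomic_t beats = 0;

/* Signal handler: only bumps a counter (async-signal-safe). */
static void on_beat(int sig) { (void)sig; beats++; }

/* Install the SIGALRM handler. SA_RESETHAND is deliberately NOT set,
 * so the handler should stay installed across deliveries. */
static int install_beat_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_beat;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGALRM, &sa, NULL);
}

/* Arm a periodic ITIMER_REAL with the given period; 0 disarms it. */
static int set_beat_timer(int ms)
{
    struct itimerval pulse;
    pulse.it_interval.tv_sec = ms / 1000;
    pulse.it_interval.tv_usec = (ms % 1000) * 1000;
    pulse.it_value = pulse.it_interval;
    return setitimer(ITIMER_REAL, &pulse, NULL);
}

/* Returns 1 if ITIMER_REAL is armed, 0 if not, -1 on error. */
static int beat_timer_is_armed(void)
{
    struct itimerval cur;
    if (getitimer(ITIMER_REAL, &cur) != 0)
        return -1;
    return cur.it_value.tv_sec != 0 || cur.it_value.tv_usec != 0 ||
           cur.it_interval.tv_sec != 0 || cur.it_interval.tv_usec != 0;
}

/* True if time a is strictly earlier than time b. */
static int before(const struct timespec *a, const struct timespec *b)
{
    return a->tv_sec < b->tv_sec ||
          (a->tv_sec == b->tv_sec && a->tv_nsec < b->tv_nsec);
}

/* Wait for `wanted` beats of `beat_ms` each, within a generous time budget.
 * Returns the number of beats seen, or -1 if the timer could not be armed. */
int count_heartbeats(int beat_ms, int wanted)
{
    struct timespec now, deadline, nap = { 0, 1000000L }; /* 1 ms naps */

    assert(beat_ms >= 0 && wanted >= 0);
    beats = 0;
    if (install_beat_handler() != 0 || set_beat_timer(beat_ms) != 0)
        return -1;
    if (beat_timer_is_armed() != 1)
        return -1;

    clock_gettime(CLOCK_MONOTONIC, &deadline);
    deadline.tv_sec += 1 + ((long)beat_ms * wanted) / 1000;
    do {
        nanosleep(&nap, NULL);   /* may return early when a beat arrives */
        clock_gettime(CLOCK_MONOTONIC, &now);
    } while (beats < wanted && before(&now, &deadline));

    set_beat_timer(0);           /* disarm before returning */
    return (int)beats;
}
```

Calling count_heartbeats(20, 5) from a small main (the 20ms period is arbitrary) and getting 5 or more back would suggest the handler stays installed. A result of 0 means no signal arrived within the time budget, and -1 means the timer was never armed. If the handler were reset to SIG_DFL after the first delivery, the second SIGALRM would take the default action and terminate the process.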


On Thu, Apr 24, 2014 at 11:28 AM, Eliot Miranda <eliot.miranda at gmail.com> wrote:

> Hi Andreas,
>
>
> On Thu, Apr 24, 2014 at 9:58 AM, Andreas Wacknitz <a.wacknitz at gmx.de> wrote:
>
>>
>>
>> On 24.04.2014 at 00:14, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>>
>> Hi Andreas,
>>
>>
>> On Wed, Apr 23, 2014 at 9:35 AM, Andreas Wacknitz <a.wacknitz at gmx.de> wrote:
>>
>>>
>>> Thanks again Eliot,
>>>
>>> First, I solved the pthreads problem under OpenSolaris. While Solaris 10
>>> doesn’t need special user privileges for thread control (at least within
>>> the same thread policy I guess),
>>> users under Solaris 11 (and thus OpenSolaris) need the privilege
>>> „proc_priocntl“ to be given by an administrator.
>>> (For those who are interested: usermod -K
>>> defaultpriv=basic,proc_priocntl andreas)
>>>
>>
>> This is a pain :-).  You could either assume that people can always get
>> the necessary permission and go with the threaded heartbeat (my preferred
>> suggestion) or provide two VMs (always tedious).
>>
>> Yes, I am considering going with the threaded heartbeat for OpenSolaris (I will
>> also try to compile everything under Solaris 11.1, but that’s a lower
>> priority for me as I am not really using it).
>> I have not yet decided whether the version without increased priority would
>> be enough. At the moment everything seems to run fine with this version; I
>> can interrupt "[[true] whileTrue] forkAt: Processor userInterruptPriority"
>> by ALT-.
>>
>
> That implies it is working.  But I would definitely make sure the
> heartbeat runs at a higher priority than the main thread.
>
> One thing to check is that delays expire even when the system is fully
> busy, e.g.
>
> | run s |
> run := true.
> s := Semaphore new.
> [| i | i := 0. s wait. [run] whileTrue: [i := i + 1]] forkAt: Processor
> highestPriority - 1.
> [(Delay forSeconds: 1) wait. run := false] forkAt: Processor
> highestPriority.
> s signal
>
> should lock up the system for 1 second.  If the heartbeat is not advancing
> the clock used to check for delays then the system will remain locked.
>
>> But the whole thing isn’t finished yet, as FFI (NativeBoost) doesn’t seem to
>> work (e.g. "UnixEnvironment environ" fails). I don’t know when I will find
>> time to deal with that.
>>
>
> Well, the code is still useful for the Squeak VM, so please commit if and
> when you have the heartbeat working to your satisfaction.
>
>
>>  More below…
>>>
>>> On 22.04.2014 at 22:31, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>>>
>>>
>>>
>>>
>>> On Tue, Apr 22, 2014 at 1:10 PM, Andreas Wacknitz <a.wacknitz at gmx.de> wrote:
>>>
>>>>
>>>>
>>>> On 22.04.2014 at 21:36, Eliot Miranda <eliot.miranda at gmail.com> wrote:
>>>>
>>>> Hi Andreas,
>>>>
>>>>
>>>> On Tue, Apr 22, 2014 at 12:05 PM, Andreas Wacknitz <a.wacknitz at gmx.de> wrote:
>>>>
>>>>>
>>>>> This evening I further dealt with the problems on OpenSolaris
>>>>> (openindiana).
>>>>> I finally got a pthread version running without superuser rights. But
>>>>> I don’t know whether this will really work (ATM it does for me)
>>>>> because I removed the call to pthread_setschedparam in
>>>>> beatStateMachine leaving the heartbeat thread with the same
>>>>> priority as the VM thread.
>>>>
>>>>
>>>> Alas, that will not work :-(.  As soon as the image enters into a hard
>>>> loop (e.g. [true] whileTrue) the heartbeat thread will be blocked and the
>>>> VM will never break out of the loop.
>>>>
>>>>
>>>> How can I check this blockage? I started the VM with --pollpipe 1 and
>>>> then [true] whileTrue in a Workspace. The GUI is blocked but the pipe is
>>>> still rotating.
>>>>
>>>
>>> Can you interrupt with ctrl-period?  If not, then I don't understand how
>>> the pipe is still rotating :-).  If you can, then you're not blocking the
>>> system.  Try e.g. [[true] whileTrue]  forkAt: Processor highestPriority.
>>>
>>> Yes, I can do that in both (with and without higher priority) BUT not
>>> when running this with highestPriority (again in BOTH versions!).
>>>
>>
>> Oops.  That's right.  It should be just "[[true] whileTrue]  forkAt:
>> Processor userPriority + 1".  Obviously one can't interrupt something
>> running higher than userInterrupt priority.  Sorry, I was asleep.
>>
>> See above. OpenSolaris seems to run fine with the two threads having the
>> same priority.
>>
>
> Since I've been here before I know that examples can be constructed in which
> this will not work properly.  My Delay example above should be one of them.
>  All one needs is to arrange that the system is fully busy, shuts out the
> heartbeat thread, but depends on the heartbeat thread to make progress (as
> in the Delay example; the heartbeat advances a low-resolution (~2ms) clock
> that is used to fire delays).
>
>
>>  You are running the JIT right?
>>>
>>> How to tell for sure?
>>>
>>
>> vm -version
>>
>> If it includes a CoInterpreter line you're running the JIT.  e.g.
>> McStalker.macbuild$ oscfvm -version
>> /Users/eliot/Cog/oscogvm/macbuild/Fast.app/Contents/MacOS/Squeak
>> 4.0 4.0.2894 Mac OS X built on Apr 14 2014 17:02:16 Compiler: 4.2.1
>> (Apple Inc. build 5666) (dot 3) [Production VM]
>> CoInterpreter VMMaker.oscog-eem.674 uuid:
>> eefd603d-9638-4ad8-99c0-4ee12e87d49d Apr 14 2014
>> StackToRegisterMappingCogit VMMaker.oscog-eem.674 uuid:
>> eefd603d-9638-4ad8-99c0-4ee12e87d49d Apr 14 2014
>> VM: r2894 http://www.squeakvm.org/svn/squeak/branches/Cog Date:
>> 2014-04-14 15:32:11 -0700
>> Plugins: r2545
>> http://squeakvm.org/svn/squeak/trunk/platforms/Cross/plugins
>>
>> merkur pharo-without-higher-priority $ ./pharo --version
>> 3.9-7 #1 22. April 2014 20:30:39 CEST gcc 4.7.3 [Production VM]
>> NBCoInterpreter NativeBoost-CogPlugin-GuillermoPolito.19 uuid:
>> acc98e51-2fba-4841-a965-2975997bba66 Apr 22 2014
>> NBCogit NativeBoost-CogPlugin-GuillermoPolito.19 uuid:
>> acc98e51-2fba-4841-a965-2975997bba66 Apr 22 2014
>> https://github.com/pharo-project/pharo-vm.git Commit:
>> 9e648898f53aadb692f2dc95f432daedc449d432 Date: 2014-04-09 16:01:20 +0200
>> By: Esteban Lorenzano <estebanlm at gmail.com>
>> SunOS merkur 5.11 illumos-b6240e8 i86pc i386 i86pc
>> plugin path: /home/andreas/bin/pharo-without-higher-priority/ [default:
>> /home/andreas/bin/pharo-without-higher-priority/]
>>
>>
>> What does this tell?
>>
>
> That you're using the JIT (NBCogit & NBCoInterpreter).
>
>
>>  I started the VM with --trace. The last log is
>>> „IRBytecodeGenerator>>from:goto“.
>>> The pipe is still rotating but ALT-. does not break the loop (maybe a
>>> problem of my Pharo image?; I will try later with a Squeak image).
>>>
>>>
>>>  I tried to replace the pthread_setschedparam call with a similar
>>>>> pthread_setschedprio call but
>>>>> with no luck (same problem: failed call with "Not owner"). I don’t
>>>>> know whether this is a general problem with the pthreads implementation
>>>>> on Solaris or just a problem with the gcc version (4.4.4) coming with
>>>>> the openindiana distribution I am using. Maybe this works only
>>>>> with the compilers and libraries that are delivered by Oracle (Solaris
>>>>> 10 ships with gcc 3.4.3; Solaris Studio has its own compilers).
>>>>>
>>>>
>>>> It's to do with pthreads.  Nothing to do with the compiler.  On some
>>>> implementations it requires special permission to create threads with
>>>> different priorities.  That used to be the case on linux and it appears to
>>>> be the case on OpenSolaris.  Hence one is stuck with the itimer heartbeat.
>>>>
>>>> Is there any implementation actually using sqUnixITimerHeartbeat.c?
>>>>
>>>
>>> Yes, but unhappily.  We use it at Cadence because we have customers on
>>> pre 2.6.12 kernels.  We have to e.g. switch off the heartbeat around
>>> certain external calls.
>>>
>>> I am still wondering about where the necessary sleep call will be
>>> generated in this case. I will check your latest VM sources. Maybe PharoVM
>>> is different here…
>>>
>>
>> Where is there a necessary sleep?
>>
>> My understanding is that the interrupt handler for the heartbeat is
>> waiting for SIGALRM. Typically this is emitted by an expiring usleep or
>> nanosleep call. I cannot see one in the code that is active
>> when compiling the pharo-vm with the ITIMER_HEARTBEAT flag set. If the
>> VM_TICKER flag is set, there is a nanosleep call in the corresponding code.
>>
>
> But SIGALRM is also delivered by setitimer, as in
>
> ...
> # define THE_ITIMER ITIMER_REAL
> # define ITIMER_SIGNAL SIGALRM
> ...
> if (setitimer(THE_ITIMER, &pulse, &pulse)) {
>
> in platforms/unix/vm/sqUnixITimerHeartbeat.c.  I'm surprised this doesn't
> work on OpenSolaris.  In fact, I can't believe it doesn't work.  Something
> odd must be going on.
>
>
>> The fact that my ITIMER_HEARTBEAT version runs when external
>> SIGALRMs are triggered confirms my view that a source for this signal is
>> missing.
>>
>
> Well, the source is there (the call to setitimer) but for some reason
> something is going wrong.  I suspect the signal handler fires only once.
>  SysV unixes need a signal handler to be rearmed with a call to signal or
> sigaction in the handler.  This is daft, so by default sigaction avoids
> having to rearm (saving a costly system call on every signal).  The old
> behaviour can be reinstated using SA_RESETHAND.  Perhaps on OpenSolaris one
> has to explicitly use a flag that is the converse of SA_RESETHAND that says
> "don't reset the handler after delivery".
>
> Here's the relevant excerpt from the Mac OS X manual page for sigaction:
> "SA_RESETHAND    If this bit is set, the handler is reset back to SIG_DFL
> at the moment the signal is delivered."
>
> HTH,
> --
> best,
> Eliot
>



-- 
best,
Eliot