Ian: I assume you mean this:
http://www.stanford.edu/~stinson/cs240/cs240_1/WIB.txt
Thanks -- it was an interesting read!
On Wed, Feb 11, 2009 at 6:11 AM, Andreas Raab andreas.raab@gmx.de wrote:
There is obviously a lot I don't understand about interrupt handling on Unix since your description (and the stuff that I found looking for the PC-losering problem) doesn't make much sense to me ;-)
If I understand you correctly, then the program is in a system call, then an interrupt happens and for some inexplicable reason that means the OS has to back out of the system call. Why is that? Wouldn't it be more sensible to just delay delivering the interrupt up to the point where the syscall returns? Yes, it doesn't guarantee real-time response but then there is probably more than one process running at any given time so I wouldn't expect interrupts to be delivered real-time to user land anyway. And I *really* can't fathom the thought that any interrupt that happens for a process within a syscall somehow auto-magically leads to the kernel forgetting the state associated with the call ;-)
Andreas: I'll try to clarify this for you. The situation is this:
a. a user-level process makes a kernel call which might take a while, typically for I/O.
b. an interrupt arrives and is delivered to the user-level process. In the Unix world, this is a software interrupt, and is called a "signal"; hardware interrupts are handled by the kernel and are not visible to user-level processes.
c. when delivering the signal to the user-level process, the kernel needs to make a choice: should it (1) call the signal handler and, when it returns, cancel the I/O operation and return an error (in this case, EINTR); or (2) call the signal handler and, when it returns, restart/resume the I/O operation?
d. when the signal handler returns, should the kernel (3) leave it up to the user-level process to re-instate the handler; or (4) re-instate the handler on the process's behalf?
All Unices with signal handling support options (1) and (3). Modern Unices (since at least 12 years ago) also support options (2) and (4).
From what I can tell, Squeak assumes a bit of both models.
- it does not specify that system calls should be restarted.
- it does not re-instate the handler (i.e. it assumes that the kernel will).
- it does not in every case do the manual check for EINTR and restart, as John mentioned in his post.
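For reference, the manual check in the last point usually looks something like the sketch below. This is only an illustration: read_retrying is a made-up helper name, not something from the Squeak sources, and every blocking call would need the same treatment.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not from the Squeak sources): call read() and
   retry whenever a signal handler interrupted it (EINTR), i.e. whenever
   the kernel chose option (1) above. */
static ssize_t read_retrying(int fd, void *buf, size_t count)
{
    ssize_t n;
    do {
        n = read(fd, buf, count);
    } while (n == -1 && errno == EINTR);
    return n;   /* bytes read, 0 at end of file, or -1 with errno set */
}
```

The loop is simple enough, but it has to wrap every read(), write(), select(), etc., which is exactly the bookkeeping I would like to avoid.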
My approach is to fix the first two, thus avoiding having to fix the third. It should be simple, given that there are only a few calls to signal() in the codebase: sqUnixMain.c, aio.c, UnixOSProcessPlugin.c. It should be possible to replace each signal() call with a sigaction() call that sets SA_RESTART, fixing both the syscall-restarting problem and the handler-reinstating problem. This way we don't have to go through the entire codebase looking for I/O operations, checking for EINTR, restarting, etc. And we don't have to remember to add those checks to any new code we might write in the future.
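Roughly, the replacement could look like the sketch below. The names install_handler, count_signal and signal_count are made up for illustration and are not in the Squeak sources; the actual handlers in those three files would be passed in instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical wrapper (not from the Squeak sources): a stand-in for
   signal(signum, handler). SA_RESTART asks the kernel to restart
   interrupted system calls (option 2), and because SA_RESETHAND is not
   set, the handler stays installed after it runs (option 4). */
static int install_handler(int signum, void (*handler)(int))
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);       /* block no extra signals in the handler */
    sa.sa_flags = SA_RESTART;
    return sigaction(signum, &sa, NULL);    /* 0 on success, -1 on error */
}

/* Example handler, for illustration only: counts deliveries. */
static volatile sig_atomic_t signal_count = 0;

static void count_signal(int signum)
{
    (void)signum;
    signal_count++;
}
```

A call such as signal(SIGUSR1, count_signal) would then become install_handler(SIGUSR1, count_signal). One caveat: even with SA_RESTART, some calls are not restarted on every platform (select() on Linux still returns EINTR, for example), so a few checks may have to stay.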
This whole issue is complicated further in that signal() on certain Unices (including FreeBSD, Linux, and MacOS) will restart syscalls automatically, and on certain others it won't (System V derivatives, including Solaris and -- I think -- HP-UX). And certain Unices will re-instate the handler automatically, and others won't. So the existing code may be broken yet appear to work, if you happen to be on the "right" platform.
Another thing: signals may arrive from external processes (e.g. the kill command) or from Squeak itself. aio.c asks the kernel to notify it when there is I/O available for reading/writing/etc. When there's I/O ready, the kernel sends SIGIO; when SIGIO arrives, Squeak jumps to forceInterruptCheck.
- Andrew