<div dir="ltr"><div>On Sun, Apr 2, 2017 at 8:10 AM, Ben Coman <<a href="mailto:btc@openinworld.com">btc@openinworld.com</a>> wrote:</div><div>></div><div>></div><div>></div><div>> On Sun, Apr 2, 2017 at 3:11 AM, Petr Fischer <<a href="mailto:petr.fischer@me.com">petr.fischer@me.com</a>> wrote:</div><div>>></div><div>>></div><div>>> > On Thu, 30 Mar 2017, Eliot Miranda wrote:</div><div>>> ></div><div>>> > > Once the active process is in a tight loop the delay is effectively</div><div>>> > disabled because the tight loop effectively shuts out the heartbeat thread</div><div>>> > and hence the system never notices that the delay has expired.</div><div>>> ></div><div>>> > I think that won't happen, because the process scheduler (O(1), CFS, BFS) on</div><div>>> > linux is not cooperative. So, the kernel will periodically preempt the main</div><div>>> > thread and run the heartbeat thread no matter what their priorities are. The</div><div>>> > higher priority only provides lower jitter on the heartbeat thread.</div><div>>> ></div><div>>> > Levente</div><div>>></div><div>>> Is there some test case or code, that I can run in Pharo and evaluate if kernel sheduler is working correctly (with heartbeat thread at normal priority).</div><div>>> I need to test it under FreeBSD.</div><div>>></div><div>>> Thanks! pf</div><div>></div><div>></div><div>> Just for starters, what result do you get for my multi-priority fibonacci stress test... </div><div>> <a href="http://forum.world.st/Unix-heartbeat-thread-vs-itimer-tp4928943p4938456.html">http://forum.world.st/Unix-heartbeat-thread-vs-itimer-tp4928943p4938456.html</a> </div><div>></div><div>> cheers -ben</div><div>></div><div><br></div><div>I got curious to read up on the FreeBSD scheduler.</div><div><br></div><div>FreeBSD has the same constraint as Linux such that "Only the super-user may lower priorities."  
</div><div><a href="https://www.freebsd.org/cgi/man.cgi?query=setpriority&sektion=2">https://www.freebsd.org/cgi/man.cgi?query=setpriority&sektion=2</a></div><div><br></div><div><br></div><div>From <a href="https://classes.cs.uoregon.edu/13F/cis607distcomp/PPT/FreeBSDscheduler(McKay).pdf">https://classes.cs.uoregon.edu/13F/cis607distcomp/PPT/FreeBSDscheduler(McKay).pdf</a></div><div>Each CPU has a KSeq: three arrays of run queues indexed by priority<br></div><div>* The Current queue receives interactive, real time and interrupt threads</div><div>* The Next queue receives everything else except idle threads</div><div>* When the Current queue is empty, the two queues swap.</div><div>* The third queue holds idle threads, and is only used when there are no other runnable threads</div><div><br></div><div><br></div><div>From <a href="http://web.cs.ucdavis.edu/~roper/ecs150/ULE.pdf">http://web.cs.ucdavis.edu/~roper/ecs150/ULE.pdf</a></div><div>ULE: A Modern Scheduler For FreeBSD</div><div><br></div><div>A thread is assigned to a queue until it sleeps, or for the duration of a slice. </div><div>The base priority, slice size, and interactivity score are recalculated each time a slice expires. </div><div>The thread is assigned to the Current queue if it is interactive or to the Next queue otherwise. </div><div>Inserting interactive tasks onto the Current queue and giving them a higher priority </div><div>results in a very low latency response.</div><div><br></div><div>In ULE the interactivity of a thread is determined using its voluntary sleep time and run time. </div><div>The voluntary sleep time is recorded by counting the number of ticks that have passed between </div><div>a sleep() and wakeup() or while sleeping on a condition variable. </div><div>The run time is simply the number of ticks while the thread is running.  
</div><div>The scheduler uses the interactivity score to determine whether or not a thread </div><div>should be assigned to the Current queue when it becomes runnable. </div><div><br></div><div>On x86, FreeBSD has a default HZ of 100, </div><div>and a minimum slice value of 10ms and maximum slice value of 140ms. </div><div>Interactive tasks receive the minimum slice value. </div><div>This allows us to more quickly discover that an interactive task is no longer interactive. </div><div><br></div><div><br></div><div>From <a href="http://ptgmedia.pearsoncmg.com/images/9780321968975/samplepages/9780321968975.pdf">http://ptgmedia.pearsoncmg.com/images/9780321968975/samplepages/9780321968975.pdf</a><br></div><div><div>The Design and Implementation of the FreeBSD Operating System</div></div><div><br></div><div>The scheduling policy initially assigns a high execution priority to each thread </div><div>and allows that thread to execute for a fixed time slice. </div><div>Threads that execute for the duration of their slice have their priority lowered, </div><div>whereas threads that give up the CPU (usually because they do I/O) are allowed to remain at their priority. </div><div>Threads that are inactive have their priority raised. </div><div><br></div><div>Some tasks, such as the compilation of a large application, may be done in many </div><div>small steps in which each component is compiled in a separate process. </div><div>No individual step runs long enough to have its priority degraded, </div><div>so the compilation as a whole impacts the interactive programs. </div><div>To detect and avoid this problem, the scheduling priority of a child </div><div>process is propagated back to its parent. When a new child process is started, </div><div>it begins running with its parent’s current priority. 
</div><div>As the program that coordinates the compilation (typically make) starts many compilation steps, </div><div>its priority is dropped because of the CPU-intensive behavior of its children. </div><div>Later compilation steps started by make begin running and stay at a lower priority, </div><div>which allows higher-priority interactive programs to run in preference to them as desired.</div><div><br></div><div>Resuming a thread ...   </div><div>If any threads are placed on the run queue and one of them has a scheduling </div><div>priority higher than that of the currently executing thread, </div><div>it will request that the CPU be rescheduled as soon as possible.</div><div>Real-time and interrupt threads do preempt lower-priority threads. </div><div>The kernel can be configured to preempt timeshare threads executing </div><div>in the kernel with other higher-priority timeshare threads. </div><div>This option is not used by default as the increase in context switches </div><div>adds overhead and does not help make timeshare threads response time more predictable</div><div><br></div><div><br></div><div>From <a href="https://github.com/freebsd/freebsd/blame/master/sys/kern/sched_ule.c">https://github.com/freebsd/freebsd/blame/master/sys/kern/sched_ule.c</a></div><div>and substituting defined constants...</div><div>PRIO_MIN               -20</div><div>PRIO_MAX                20</div><div>SCHED_INTERACT_THRESH   30</div><div>SCHED_INTERACT_HALF     50 = (SCHED_INTERACT_MAX / 2)</div><div>SCHED_INTERACT_MAX     100</div><div>PRI_MIN_TIMESHARE<span class="gmail-Apple-tab-span" style="white-space:pre">      </span>   120</div><div>PRI_MAX_TIMESHARE<span class="gmail-Apple-tab-span" style="white-space:pre">       </span>   223 = (PRI_MIN_IDLE - 1)</div><div>PRI_MIN_IDLE<span class="gmail-Apple-tab-span" style="white-space:pre">               </span>           224</div><div><br></div><div>SCHED_PRI_NRESV           40 = (PRIO_MAX - 
PRIO_MIN)</div><div>PRI_TIMESHARE_RANGE 104 = (PRI_MAX_TIMESHARE - PRI_MIN_TIMESHARE + 1)</div><div>PRI_INTERACT_RANGE<span class="gmail-Apple-tab-span" style="white-space:pre">    </span>    32 = ((PRI_TIMESHARE_RANGE - SCHED_PRI_NRESV) / 2)</div><div><br></div><div>PRI_MIN_INTERACT<span class="gmail-Apple-tab-span" style="white-space:pre">      </span>   120 = (PRI_MIN_TIMESHARE)</div><div>PRI_MAX_INTERACT<span class="gmail-Apple-tab-span" style="white-space:pre">  </span>   151 = (120 + PRI_INTERACT_RANGE - 1)</div><div><br></div><div>PRI_MIN_BATCH<span class="gmail-Apple-tab-span" style="white-space:pre">         </span>   152 = (PRI_MIN_TIMESHARE + PRI_INTERACT_RANGE)</div><div>PRI_MAX_BATCH<span class="gmail-Apple-tab-span" style="white-space:pre">                </span>   223 = (PRI_MAX_TIMESHARE)</div><div>SCHED_PRI_NHALF<span style="white-space:pre">             </span>20 = (SCHED_PRI_NRESV / 2)</div><div>SCHED_PRI_MIN<span class="gmail-Apple-tab-span" style="white-space:pre">            </span>   172 = (PRI_MIN_BATCH + SCHED_PRI_NHALF)</div><div>SCHED_PRI_MAX<span class="gmail-Apple-tab-span" style="white-space:pre">               </span>   203 = (PRI_MAX_BATCH - SCHED_PRI_NHALF)</div><div>SCHED_PRI_RANGE            32 = (SCHED_PRI_MAX - SCHED_PRI_MIN + 1)</div><div><br></div><div>sched_interact                  30 = (SCHED_INTERACT_THRESH)</div><div><br></div><div><br></div><div>sched_interact_score() </div><div>  if (sleep/run)>1,  interact_score = 50 / (sleep/run)    </div><div>  if (sleep/run)=1,  interact_score = 50 </div><div>  if (sleep/run)<1,  interact_score = 50 * (2 - (sleep/run)) </div><div><br></div><div><br></div><div>sched_priority()</div><div>  * If the score is interactive we place the thread in the realtime</div><div>  * queue with a priority that is less than kernel and interrupt</div><div>  * priorities.  
These threads are not subject to nice restrictions.</div><div>  *</div><div>  * Scores greater than this are placed on the normal timeshare queue</div><div>  * where the priority is partially decided by the most recent cpu</div><div>  * utilization and the rest is decided by nice value.</div><div>  *</div><div>  * The nice value of the process has a linear effect on the calculated</div><div>  * score.  Negative nice values make it easier for a thread to be</div><div>  * considered interactive. Default nice is 0.</div><div>  *</div><div>  score = sched_interact_score() + nice; </div><div>  if (score < (30))</div><div>     priority = 120 + score*32/30    // = 120 + (151 - 120 + 1) / 30 * score </div><div>  else</div><div>     priority = 203 + nice           // = 172 + 32 - 1 + nice</div><div><br></div><div><br></div><div>sched_add(struct thread *td, int flags)</div><div>  * Select the target thread queue and add a thread to it.  </div><div>  * Request preemption or IPI a remote processor if required.</div><div>  * Recalculate the priority before we select the target cpu or run-queue.</div><div>  *</div><div>  if (PRI_BASE(td->td_pri_class) == PRI_TIMESHARE)</div><div>     sched_priority(td);</div><div>  ...</div><div><br></div><div>So as long as "sleep/run > 2", it seems the </div><div>FreeBSD heartbeat thread will get an interactive priority bump.</div><div><br></div><div>cheers -ben</div></div>
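<div><br></div><div>P.S. A rough sketch to sanity-check the "sleep/run" reasoning above. It only mirrors the three sched_interact_score() cases and the threshold of 30 as summarised in this mail, using floating point rather than the kernel's integer tick arithmetic. The function names interact_score and is_interactive are made up for illustration and are not FreeBSD APIs; the sleep and run values are hypothetical tick counts.</div>

```python
# Constants as listed in the mail (substituted from sched_ule.c).
SCHED_INTERACT_HALF = 50
SCHED_INTERACT_THRESH = 30


def interact_score(sleep_ticks, run_ticks):
    """Score per the three cases in the mail; lower means more interactive.

    Hypothetical helper: floats are used for clarity, the kernel works
    on integer ticks.
    """
    if sleep_ticks > run_ticks:
        # sleep/run > 1  ->  50 / (sleep/run)
        return SCHED_INTERACT_HALF * run_ticks / sleep_ticks
    if run_ticks > sleep_ticks:
        # sleep/run < 1  ->  50 * (2 - sleep/run)
        return SCHED_INTERACT_HALF * (2 - sleep_ticks / run_ticks)
    # sleep/run = 1 (the mail does not cover the case where both are zero)
    return SCHED_INTERACT_HALF


def is_interactive(sleep_ticks, run_ticks, nice=0):
    """True when score + nice falls below the interactive threshold."""
    return interact_score(sleep_ticks, run_ticks) + nice < SCHED_INTERACT_THRESH


if __name__ == "__main__":
    # Hypothetical (sleep, run) pairs around the threshold.
    for sleep_ticks, run_ticks in [(50, 100), (100, 100), (5, 3), (200, 100)]:
        print(sleep_ticks, run_ticks,
              interact_score(sleep_ticks, run_ticks),
              is_interactive(sleep_ticks, run_ticks))
```

<div>With nice at 0 the score drops below 30 once sleep/run exceeds 5/3, so a sleep/run above 2 clears the threshold with some margin. A positive nice value eats into that margin.</div>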