On Wed, 25 Sep 2019 at 15:32, Nicolas Cellier <nicolas.cellier.aka.nice@gmail.com> wrote:
 
I noticed that many crashes happen in testSendTimeout.
This test sets up a race:
- Smalltalk fills the socket send buffer
- the OS tries to drain it

The other thing I noticed is that [SocketTest suite run] takes 7 to 8s on macOS but only 2 to 3s on Ubuntu and Windows.
So it might be that some tests time out on macOS while they don't on other OSes.

This would explain why many crashes also happen in the JITted Timer loop, and why we cannot observe them on other OSes.

The access to OS resources and the race may also explain some of the randomness of the crashes...

So one idea would be to make the test time out on Linux too, see if we can make it crash, then try using rr.
Maybe it's possible with a huge socket buffer and a smaller image-side buffer (we could reduce the size from 1000 to 1 so as to increase overhead).

How that might be done piqued my interest, so I had a poke around...
https://serverfault.com/questions/799605/set-large-buffer-queue-on-a-network-interface-to-emulate-bufferbloat
https://coderwall.com/p/wuwoja/simulate-network-latency-to-debug-connection-timeouts
https://stackoverflow.com/questions/614795/simulate-delayed-and-dropped-packets-on-linux

But those might not directly affect the send buffer, so maybe an alternative...
https://utcc.utoronto.ca/~cks/space/blog/linux/TCPSendbufferDefaultSize
https://stackoverflow.com/questions/47350028/how-to-tune-linux-network-buffer-size
https://www.cyberciti.biz/faq/linux-tcp-tuning/
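
As a rough illustration of the per-socket route (rather than system-wide tuning), here is a minimal Python sketch, not the Squeak SocketTest itself. It requests a send buffer size with SO_SNDBUF, connects to a loopback peer that never reads, and keeps sending until the send times out. The function name, the default chunk size of 1000 and the 0.2s timeout are my own assumptions for illustration; the kernel may round or clamp the requested buffer size, so the sketch also reports the value the socket actually ended up with.

```python
import socket


def bytes_accepted_before_timeout(sndbuf_bytes, chunk_size=1000, timeout_s=0.2):
    """Fill a TCP send buffer while the peer never reads.

    Returns (bytes accepted by the OS, effective SO_SNDBUF value).
    All names and defaults here are hypothetical, for illustration only.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    sender = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Request a send buffer size before connecting; the kernel may adjust it.
    sender.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf_bytes)
    sender.connect(listener.getsockname())
    peer, _ = listener.accept()  # accepted, but deliberately never read from

    effective = sender.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    sender.settimeout(timeout_s)
    chunk = b"x" * chunk_size
    sent = 0
    try:
        while True:
            # send() may accept fewer bytes than offered; count what it took.
            sent += sender.send(chunk)
    except socket.timeout:
        # Both the peer's receive buffer and our send buffer are now full.
        pass
    finally:
        sender.close()
        peer.close()
        listener.close()
    return sent, effective


if __name__ == "__main__":
    for requested in (4096, 1 << 20):
        sent, effective = bytes_accepted_before_timeout(requested)
        print(f"requested={requested} effective={effective} accepted={sent}")
```

Lowering chunk_size towards 1 would be the analogue of shrinking the image-side buffer to increase per-send overhead, though whether that reproduces the crash is an open question.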

cheers -ben

P.S. Probably not specifically useful, but Oh Wow! just-too-impressive and more-than-you-could-possibly-want-to-know about Linux networking...
https://blog.packagecloud.io/eng/2017/02/06/monitoring-tuning-linux-networking-stack-sending-data