[Vm-dev] COG crashing - more info

Bob Arning arning315 at comcast.net
Thu Jan 19 22:38:24 UTC 2012


Still hoping to debug this in Smalltalk, I'm exploring the bounds a bit.

for
Smalltalk installLowSpaceWatcher; useUpMemory

I tried increasing the lowSpaceThreshold in case that was the reason I 
was not getting a low space log. The nominal size (200k) and 300k both 
lead to vm crashes. Larger numbers cause the vm to spend forever in 
garbage collection (well, 15 minutes at least)

10M hangs at 476M real
1M hangs at 490M real
400K hangs at 491.5M real
300K ==crashes VM at 499M real (approximate)
200K ==crashes VM
--------------------------------------------------------------------------------------
repeating the experiment for
Smalltalk installLowSpaceWatcher; useUpMemoryWithContexts


had similar results although the crash-to-hang threshold was a bit higher
12M hangs at 493M real
1M ==crashesVM around 508M real
600K ==crashesVM around 507M real
400K ==crashesVM around 508M real
200K ==crashes VM


------------------------------------------------------------------------------------
The interesting thing about this test is that it get a low space 
debugger, but not where the code expects it. The error is signalled when 
allocating the String, not the Arrays.

Smalltalk installLowSpaceWatcher; useUpMemoryWithArrays == produces low 
space debugger

------------------------------------------------------------------------------------
So, I made a slightly different version that allocates a slightly 
smaller, but still huge String and then allocates Arrays of a particular 
size. It appears that for smaller array sizes, the vm hangs, while for 
larger sizes, it will signal low space.

Smalltalk installLowSpaceWatcher; useUpMemoryWithArrays: 100 -- hangs at 
509M
Smalltalk installLowSpaceWatcher; useUpMemoryWithArrays: 300 -- hangs at 
509M
Smalltalk installLowSpaceWatcher; useUpMemoryWithArrays: 500 -- hangs at 
515M
Smalltalk installLowSpaceWatcher; useUpMemoryWithArrays: 700 -- low 
space at 509M
Smalltalk installLowSpaceWatcher; useUpMemoryWithArrays: 1000 -- low 
space at 509M

============
useUpMemoryWithArrays: arraySize

     | b bytesLeft lsp req count |

     req _ (bytesLeft _ self bytesLeft) - (lsp _ self lowSpaceThreshold) 
- (20*1024*1024).
     count _ 50000000 // arraySize // 4.
     Transcript
         show: bytesLeft asStringWithCommas,' bytesLeft(before)'; cr;
         show: lsp asStringWithCommas,' lowSpaceThreshold'; cr;
         show: req asStringWithCommas,' string size requested'; cr;
         show: count asStringWithCommas,' Arrays to be requested'; cr;
         show: arraySize asStringWithCommas,' Array size requested'; cr;
         yourself.
     self confirm: 'ready to allocate the string'.
     b := String new: req.
     self confirm: 'we got the string ',b size asStringWithCommas.

     (1 to: count) collect: [:i | Array new: arraySize].

     self confirm: 'oops! no problems detected'.
=============

--------------------------------------------------------------------------------------

On 1/19/12 10:26 AM, Bob Arning wrote:
> A little more info...
>
> Smalltalk installLowSpaceWatcher; useUpMemory == crashes VM
>
> Smalltalk installLowSpaceWatcher; useUpMemoryWithArrays == produces 
> low space debugger
>
> Smalltalk installLowSpaceWatcher; useUpMemoryWithContexts == crashes VM
>
> Smalltalk installLowSpaceWatcher; useUpMemoryWithTinyObjects == 
> produces low space debugger
>
> Cheers,
> Bob
>
> On 1/19/12 8:12 AM, Bob Arning wrote:
>> A little more info on these crashes:
>>
>> I had 4 crashes in the space of about 5 days, then none for 10 days, 
>> then another today. This time I happened to be looking at Squeak and 
>> noticed that it was no longer writing simulation results to the 
>> screen. It was sluggish to come to the front and would not accept 
>> CMD-period. Looking in Activity Monitor, real memory usage had gone 
>> from the normal 210M to 510M and it seemed to spend about 10 minutes 
>> in gargage collection before crashing completely. Any thoughts on how 
>> to debug this?
>>
>> Cheers,
>> Bob
>>
>>
>> ===this is what Activity Monitor saw happening (repeated as often as 
>> I chose to sample it)===
>> Sampling process 306 for 3 seconds with 1 millisecond of run time 
>> between samples
>> Sampling completed, processing symbols...
>> Analysis of sampling Croquet (pid 306) every 1 millisecond
>> Call graph:
>>     2506 Thread_2507
>>       2506 start
>>         2506 main
>>           2506 RunApplicationEventLoopWithSqueak
>>             2506 RunApplicationEventLoop
>>               2506 ToolboxEventDispatcher
>>                 2506 SendEventToEventTarget
>>                   2506 
>> SendEventToEventTargetInternal(OpaqueEventRef*, 
>> OpaqueEventTargetRef*, HandlerCallRec*)
>>                     2506 DispatchEventToHandlers(EventTargetRec*, 
>> OpaqueEventRef*, HandlerCallRec*)
>>                       2506 
>> ToolboxEventDispatcherHandler(OpaqueEventHandlerCallRef*, 
>> OpaqueEventRef*, void*)
>>                         2506 SendEventToEventTargetWithOptions
>>                           2506 
>> SendEventToEventTargetInternal(OpaqueEventRef*, 
>> OpaqueEventTargetRef*, HandlerCallRec*)
>>                             2506 
>> DispatchEventToHandlers(EventTargetRec*, OpaqueEventRef*, 
>> HandlerCallRec*)
>>                               2506 EventLoopEventHandler
>>                                 2506 initStackPagesAndInterpret
>>                                   2506 0x906dd9c
>>                                     2506 ceSendFromInLineCacheMiss
>>                                       2505 executeNewMethod
>>                                         2505 primitiveNew
>>                                           2505 sufficientSpaceAfterGC
>>                                             2497 fullGC
>>                                               1260 markPhase
>>                                                 688 markAndTraceStackPage
>>                                                   688 markAndTrace
>>                                                     688 markAndTrace
>>                                                 566 
>> markAndTraceInterpreterOops
>>                                                   566 
>> markAndTraceStackPage
>>                                                     566 markAndTrace
>>                                                       566 markAndTrace
>>                                                 6 
>> markAndTraceObjectsOrFreeMachineCode
>>                                                   6 
>> markAndTraceOrFreeCogMethodfirstVisit
>>                                                     5 
>> markLiteralsAndUnlinkIfUnmarkedSendpcmethod
>>                                                       3 
>> markAndTraceOrFreeCogMethodfirstVisit
>>                                                         3 
>> markLiteralsAndUnlinkIfUnmarkedSendpcmethod
>>                                                           3 
>> markAndTraceOrFreeCogMethodfirstVisit
>>                                                             2 
>> markLiteralsAndUnlinkIfUnmarkedSendpcmethod
>>                                                               2 
>> markAndTraceOrFreeCogMethodfirstVisit
>>                                                                 2 
>> markLiteralsAndUnlinkIfUnmarkedSendpcmethod
>>                                                                   1 
>> markAndTraceOrFreeCogMethodfirstVisit
>>                                                                     1 
>> markAndTraceOrFreeCogMethodfirstVisit
>>                                                                   1 
>> markLiteralsAndUnlinkIfUnmarkedSendpcmethod
>>                                                             1 
>> startOfMemory
>>                                                               1 
>> startOfMemory
>>                                                       2 markAndTrace
>>                                                         2 markAndTrace
>>                                                     1 
>> markAndTraceOrFreeCogMethodfirstVisit
>>                                               800 incCompBody
>>                                                 800 
>> mapPointersInObjectsFromto
>>                                                   797 
>> mapPointersInObjectsFromto
>>                                                   3 
>> mapObjectReferencesInMachineCode
>>                                                     2 remap
>>                                                       2 remap
>>                                                     1 
>> remapIfObjectRefpchasYoung
>>                                                       1 remap
>>                                                         1 remap
>>                                               432 fullGC
>>                                               5 finalizeReference
>>                                                 5 finalizeReference
>>                                             8 incrementalGC
>>                                               8 markPhase
>>                                                 8 
>> markAndTraceInterpreterOops
>>                                                   7 markAndTraceStackPage
>>                                                     6 markAndTrace
>>                                                       6 markAndTrace
>>                                                     1 
>> markAndTraceStackPage
>>                                                   1 
>> markAndTraceInterpreterOops
>>                                       1 activateCoggedNewMethod
>>                                         1 checkForEventsMayContextSwitch
>>                                           1 ioProcessEvents
>>                                             1 ReceiveNextEvent
>>                                               1 ReceiveNextEventCommon
>>                                                 1 
>> RunCurrentEventLoopInMode
>>                                                   1 CFStringCompare
>>                                                     1 
>> CFStringCompareWithOptions
>>                                                       1 
>> CFStringCompareWithOptionsAndLocale
>>                                                         1 
>> CFUniCharGetBitmapPtrForPlane
>>                                                           1 
>> CFUniCharGetBitmapPtrForPlane
>>     2506 Thread_2603
>>       2506 thread_start
>>         2506 _pthread_start
>>           2506 beatStateMachine
>>             2487 __semwait_signal
>>               2487 __semwait_signal
>>             11 nanosleep$UNIX2003
>>               10 clock_get_time
>>                 10 mach_msg
>>                   10 mach_msg_trap
>>                     10 mach_msg_trap
>>               1 nanosleep$UNIX2003
>>             2 __udivdi3
>>               2 __udivdi3
>>             2 beatStateMachine
>>             2 cerror
>>               2 __error
>>                 2 __error
>>             1 __error
>>               1 __error
>>             1 forceInterruptCheckFromHeartbeat
>>               1 forceInterruptCheckFromHeartbeat
>>
>> Total number in stack (recursive counted multiple, when >=5):
>>         5       markAndTraceOrFreeCogMethodfirstVisit
>>
>> Sort by top of stack, same collapsed (when >= 5):
>>         __semwait_signal        2487
>>         markAndTrace        1262
>>         mapPointersInObjectsFromto        797
>>         fullGC        432
>>         mach_msg_trap        10
>>         finalizeReference        5
>> Sample analysis of process 306 written to file /dev/stdout
>>
>> ===this appeared on the console===
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: out of memory
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: C stack backtrace:
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: Smalltalk stack dump:
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: 0xbffa8308 M Delay 
>> class>handleTimerEvent 157474072: a(n) Delay class
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: 0xbffa8320 M Delay
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]:  class>runTimerEventLoop 
>> 157474072: a(n) Delay class
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]:  0x9851140 s [] in Delay 
>> class>startTimerEventLoop
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]:  0x9851260 s [] in 
>> BlockClosure>newProcess
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: Most recent primitives
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: millisecondClockValue
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: wait
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: basicNew
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: millisecondClockValue
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: primSignal:atMilliseconds:
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: millisecondClockValue
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: wait
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: basicNew
>> ...repeated items snipped...
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: millisecondClockValue
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: wait
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: millisecondClockValue
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: primSignal:atMilliseconds:
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: millisecondClockValue
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: wait
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: millisecondClockValue
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: primSignal:atMilliseconds:
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: millisecondClockValue
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: wait
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: millisecondClockValue
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: primSignal:atMilliseconds:
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: millisecondClockValue
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: wait
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: millisecondClockValue
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: primSignal:atMilliseconds:
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: millisecondClockValue
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: wait
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: millisecondClockValue
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: primSignal:atMilliseconds:
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: millisecondClockValue
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]: wait
>> Jan 19 07:37:15 robert-arnings-computer 
>> [0x0-0x25025].com.squeak.Cog[306]:     (out of memory)
>> Jan 19 07:37:19 robert-arnings-computer com.apple.launchd[102] 
>> ([0x0-0x25025].com.squeak.Cog[306]): Exited abnormally: Abort trap
>>
>> On 1/9/12 4:05 PM, Eliot Miranda wrote:
>>>   
>>>
>>>
>>>
>>>
>>> On Sun, Jan 8, 2012 at 4:20 PM, Bob Arning <arning315 at comcast.net 
>>> <mailto:arning315 at comcast.net>> wrote:
>>>
>>>
>>>     Does it take something special to generate one of those? The
>>>     most recent I have is 18 months old.
>>>
>>>
>>> They're produced either when the VM experiences a fatal error 
>>> (either an internal call of error or catching a fatal signal etc). 
>>>  How up-to-date is your VM?  That you're seeing messages on the 
>>> console and not getting a crash.dmp might indicate your VM is old.
>>>
>>>
>>>     Cheers,
>>>     Bob
>>>
>>>     On 1/8/12 6:34 PM, Eliot Miranda wrote:
>>>>       
>>>>
>>>>
>>>>
>>>>
>>>>     On Sat, Jan 7, 2012 at 6:22 PM, Bob Arning
>>>>     <arning315 at comcast.net <mailto:arning315 at comcast.net>> wrote:
>>>>
>>>>
>>>>         Is that the first file I posted, or yet another file? I
>>>>         looked and did not see anything specifically named "crash.log"
>>>>
>>>>
>>>>     Oops.  It is called crash.dmp and is produced in whatever the
>>>>     image file's directory is (a poor choice in certain
>>>>     circumstances, but it has to go somewhere).
>>>>
>>>>
>>>>         Cheers,
>>>>         Bob
>>>>
>>>>         On 1/7/12 6:35 PM, Eliot Miranda wrote:
>>>>>         BTW, the below is not as useful as the crash.log file.
>>>>>
>>>>
>>>>
>>>>
>>>>
>>>>     -- 
>>>>     best,
>>>>     Eliot
>>>>
>>>
>>>
>>>
>>>
>>> -- 
>>> best,
>>> Eliot
>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.squeakfoundation.org/pipermail/vm-dev/attachments/20120119/b32d5fb9/attachment-0001.htm


More information about the Vm-dev mailing list