How to explore a Project saving problem, in great detail

Brad Fuller brad at sonaural.com
Mon Jul 12 21:49:18 UTC 2004


I second that. Put it on the Swiki! 

> -----Original Message-----
> From: squeak-dev-bounces at lists.squeakfoundation.org 
> [mailto:squeak-dev-bounces at lists.squeakfoundation.org] On 
> Behalf Of Milan Zimmermann
> Sent: Monday, July 12, 2004 1:22 PM
> To: squeak-dev at lists.squeakfoundation.org
> Subject: Re: How to explore a Project saving problem, in great detail
> 
> Ned,
> 
> Wow ... considering I am just starting doing Smalltalk and 
> Squeak (apart from playing with Squeak last fall with my 
> kids) and struggling with basic things such as  .. well 
> pretty much everything :) .. this is a "black belt lesson" 
> here.  I will, nevertheless, try to follow what you did as a 
> learning exercise .. of course can take me days or weeks if I 
> finish at all, but it's gonna be worth it I am sure, when i 
> re-appear on this list a year from now you know i mastered 
> Ned's lesson :D
> 
> Thanks for all the details provided along the way while 
> investigating and helping out with this, really appreciated,
> 
> Milan
> 
> PS: I would think it's worth putting out on Squeak wiki?
> 
> On July 12, 2004 04:11 pm, Ned Konz wrote:
> > On Monday 12 July 2004 6:42 am, Milan Zimmermann wrote:
> > > one note/question - when playing with your solution last night I 
> > > thought the "reason" why the save hang was the 
> "improperly renamed 
> > > class ZTime" - do you think it was that or just the fact 
> I had the 
> > > Changeset browser opened in the project, which made the 
> object graph 
> > > too big and Squeak decided to stop the save?)
> >
> > It's not that the object graph was too big. It's that there was a 
> > pointer outside the graph that couldn't get resolved 
> properly (in this 
> > case, a ReadOnlyVariableBinding of #ZTime to the class 
> Time). This was 
> > caused
> > (indirectly) by your having a ChangeSorter open in the project.
> >
> > > I tried to look for the "ZTime" class in Browser before I 
> closed the 
> > > "Change Sorter" but did not find it, not sure why. In any 
> case, do 
> > > you think I must have renamed it or something else could have 
> > > happened to the image? - Just curious, I am not aware of renaming 
> > > any class, perhaps by accident .. I have been playing 
> with creating 
> > > classes from the workspace, perhaps I messed up at some point.
> >
> > The problem existed in a stock image. Read on...
> >
> > I will explain in detail what I did to explore the problem in hopes 
> > that it may help some of you. This is in the "teaching 
> people to fish" spirit.
> >
> > [For those of you who want to play along at home, you may 
> repeat this 
> > problem using a stock 3.7g image by making a new Morphic project, 
> > opening a ChangeSorter in it, and trying to save the 
> project on disk]
> >
> > - First, I noticed that the project stopped saving at some 
> point and 
> > the world would respond to mouse clicks again (that is, I 
> could raise a menu).
> >
> > - I then returned to the parent project (the one I had 
> asked to save) 
> > and saw an inform dialog that said "extra associations".
> >
> > - I opened a Workspace, typed in the phrase "extra 
> associations", and 
> > hit alt-shift-E (search for methods containing the string 
> in a string literal).
> > Actually, I could have done this easier by shift-clicking on the 
> > string in the informer, and hitting alt-shift-E, but I knew 
> I'd need a 
> > Workspace anyway.
> >
> > - I found ImageSegment>>prepareToBeSaved. Looking down at 
> the bottom 
> > of this method, I saw that it was trying to (but didn't) 
> open an inspector.
> > So...
> >
> > - In my Workspace, I typed in
> > 	ImageSegment allInstances
> > and hit alt-shift-I (explore it). I saw that there was one of them.
> >
> > - I selected this one in the explorer, and typed in
> > 	self prepareToBeSaved
> > in the bottom of the explorer, and hit alt-D (do it).
> >
> > - Again I saw the inform dialog, but this time when I hit the OK 
> > button I saw the inspector pop up. It contained an IdentitySet with 
> > one element (which printed itself as #ZTime).
> >
> > - I typed in to the bottom of the dialog
> > 	self anyOne inspect
> >
> > - This brought up an Inspector on a ReadOnlyVariableBinding 
> whose key 
> > was #ZTime and whose value was Time
> >
> > - selecting the key (#ZTime) in the inspector's upper left 
> hand pane, 
> > I brought up the context menu using the mouse. I chose "objects 
> > pointing to this value"
> >
> > [A much better tool for tracing pointers is Avi Bryant's Pointer 
> > Explorer, available using SqueakMap. I rarely have much 
> success using 
> > the "chase pointers" choice (the Pointer Finder).].
> >
> > - I got an inspector on an array with five slots. The last 
> slot was an 
> > ObjectExplorerWrapper (a side effect of having the other 
> explorer open).
> > But the other four were interesting. They were:
> >
> > 	- a big array with symbols and other things in it: 
> > #(#EventPollFrequency
> > #startUpWithCaption:at:allowKeyboard: #delayedBy: #anchors 
> #panForTrack:
> > nil #'read/write segment' #reciprocalFloorLog: an Object an Object 
> > #radial
> > #packageNamed:ifAbsent: an Object #blockAssociationCheck: 
> an Object an 
> > Object ...
> > 	- a ClassChangeRecord
> > 	- #ZTime->a ClassChangeRecord
> > 	- another big array: #(Array Project nil a 
> Project(NewChanges) false 
> > PasteUpMorph TranscriptStream ChangeSet Form HandMorph a 
> > StringMorph(2714)'a StringMorph(113)''a PasteUpMorph(1263) [world] 
> > that was not counted'' that was not counted' a StringMorph(3918)'a 
> > StringMorph(1420)''a
> > SimpleHierarchicalListMorph(1810) that was not counted'' ...
> >
> > - I guessed that the two big arrays were used by the 
> ImageSegment, so 
> > focused on the ClassChangeRecords.
> >
> > - Now, I knew that ClassChangeRecords lived in ChangeSets, 
> which are 
> > pointed to by ChangeSorters. Having seen a ChangeSorter in your 
> > project, I suspected that was the problem. So I selected the 
> > association (#ZTime->a
> > ClassChangeRecord) and chose from the Inspector context 
> menu "objects 
> > pointing to it". I saw:
> >
> > #(#Time->a ClassChangeRecord #TimeStamp->a 
> ClassChangeRecord #ZTime->a 
> > ClassChangeRecord nil nil nil nil #Date->a ClassChangeRecord nil 
> > #ZDate->a ClassChangeRecord nil #ZTimeStamp->a ClassChangeRecord)
> >
> > - "Ah, I see" I said to myself. Now I recalled that we renamed some 
> > classes in the Great Time And Date Revolution around CS 
> 5660. So I did 
> > "objects pointing to it" another two times until I got to
> > 	a ChangeSet named 5668Chrnl-i-CnvtTDateTTime1
> >
> > - So now my suspicion that the ChangeSorter was involved 
> was proved. I 
> > knew how ChangeSorters and ChangeSets were structured, but 
> following 
> > the pointers further would have led to a ChangeSetCategory 
> and/or the 
> > ChangeSorter.
> >
> > - But the question about why was this one class an unresolved out 
> > pointer still remained.
> >
> > The answer was here in ImageSegment>>prepareToBeSaved:
> >
> > 		(Smalltalk associationAt: assoc key ifAbsent: 
> [3]) == assoc ifTrue: [
> > 			outPointers at: (outIndexes at: assoc) put:
> > 				(DiskProxy global: #Smalltalk 
> selector: #associationAt:
> > 					args: (Array with: assoc key)).
> >
> > In other words, since Smalltalk didn't contain #ZTime (this was a 
> > temporary rename), we didn't replace the association with a 
> DiskProxy.
> >
> > In the Workspace, you can see other potential problems like this by 
> > doing a print-it or inspect-it on:
> >
> > ReadOnlyVariableBinding allInstances
> > 	select: [ :ea | ea value isBehavior and: [ (Smalltalk 
> at: ea key ifAbsent:
> > []) ~~ ea value ]]
> > 	thenCollect: [ :ea | ea key -> ea value ]
> >
> > which returns:
> >
> > #(#CRGestureProcessor->AnObsoleteCRGestureProcessor
> > #PseudoClassOrganizer->AnObsoletePseudoClassOrganizer
> > #CRDictionary->AnObsoleteCRDictionary
> > #ClassOrganizer->AnObsoleteClassOrganizer #ZDate->Date 
> > #CRAddFeatureMorph->AnObsoleteCRAddFeatureMorph
> > #CRLookupItem->AnObsoleteCRLookupItem #ZTime->Time 
> > #IRCProtocolMessage->AnObsoleteIRCProtocolMessage
> > #IRCMorph->AnObsoleteIRCMorph #SMTPSocket->AnObsoleteSMTPSocket
> > #CRDisplayProperties->AnObsoleteCRDisplayProperties)
> >
> > But this begs the question of "why did it appear in the out 
> > pointers?". If you look at ChangeSet>>objectForDataStream: you will 
> > see that ChangeSets themselves don't get stored. So 
> something else must have been responsible.
> > What? Hmm...
> >
> > Let's look at who might be pointing to it. We can skip 
> ChangeRecords, 
> > since
> >
> > they won't get stored. So let's try this:
> > | zt ptrs baddies ptrs2 |
> >
> > Smalltalk garbageCollect.
> > zt := (ReadOnlyVariableBinding allInstances select: [ :ea | 
> ea key = 
> > #ZTime
> > ]) first.
> > ptrs := Smalltalk pointersTo: zt except: { zt. zt key. zt value. }.
> > baddies := OrderedCollection new.
> > ptrs do: [ :p |
> > 	ptrs2 := nil.
> > 	ptrs2 := Smalltalk pointersTo: p except: { ptrs. p. thisContext.
> > thisContext sender. thisContext home. thisContext blockHome. zt. zt 
> > key. zt value. baddies. baddies collector }.
> > 	ptrs2 := ptrs2 reject: [ :ea |
> > 		('*ChangeRecord*' match: ea class name) ].
> > 	ptrs2 isEmpty ifFalse: [baddies addAll: ptrs2 ].
> > ].
> > baddies asArray inspect.
> >
> >
> > Which shows me a single MethodContext, whose receiver is Delay, and 
> > whose source is missing. But looking around the context, I see that 
> > there is a temp of a Semaphore, and on its list is a 
> process with priority 80.
> >
> > Opening a Process browser, I see that that's the timer 
> interrupt watcher.
> > Apparently this didn't get recompiled and restarted, or 
> it's the old 
> > method.
> >
> > By printing the result of
> >
> > 	Delay class compiledMethodAt: #timerInterruptWatcher
> >
> > I see that the method is different (different identityHash).
> >
> > OK... let's try to fix the problem. Looks like I'll have to restart 
> > the timer interrupt watcher, but I don't want to kill off 
> the existing delays.
> >
> > Reading the code of Delay class shows how this might work.
> >
> > Let's try:
> >
> > 	Delay shutDown; startTimerInterruptWatcher; startUp.
> >
> > Repeating the search for baddies above shows that that one 
> reference 
> > has been removed.
> >
> > Trying the project storage (with a change sorter) again now works.
> >
> > Was that fun, or what?
> >
> > Why did I say that the class was "improperly renamed"? 
> Notice that all 
> > of the other associations besides #ZTime and #ZDate refer 
> to classes 
> > whose names start with "AnObsolete" and which will in fact 
> report that 
> > they are obsolete when asked.
> >
> > However, #ZTime and #ZDate actually point to live classes 
> with other names!
> >
> > How did this happen? Well, in the postscript of change set
> >
> > 5668Chrnl-i-CnvtTDateTTime1 there is the following code:
> > | jdn |
> >
> > jdn _ SystemVersion current date asDate julianDayNumber.
> >
> > #(#Date #Time #TimeStamp)
> > 	do: 	[ :s | | zname tname |
> > 		Transcript show: (Smalltalk at: s) allInstances; cr.
> > 		zname  _ ('Z', s) asSymbol.
> > 		tname  _ ('T', s) asSymbol.
> > 		(Smalltalk at: s) rename: zname.
> > 		(Smalltalk at: tname) rename: s.
> > 		(Smalltalk systemNavigation allCallsOn: 
> (Smalltalk associationAt: zname))
> > 			do: [ :ref | ref actualClass 
> ifNotNilDo: [ :ac | ac recompile: ref 
> > methodSymbol ] ].
> > 		Smalltalk forgetClass: (Smalltalk at: zname) 
> logged: false.
> > 		Transcript show: 'replaced ', s; cr ].
> >
> > Transcript show: 'removing subclasses....'.
> > (Magnitude subclasses select: [ :sc | sc name beginsWith: 'Z' ])
> > 	do:  [ :sc | Magnitude  removeSubclass: sc ].
> > Transcript tab; show: 'removed.'; cr.
> >
> >
> > That is (for instance, for #ZTime):
> >
> > * There used to be a subclass of Magnitude called Time.
> > * we defined a new class called TTime
> > * we renamed Time to ZTime
> > * we renamed TTime to Time
> > * we then recompiled all the references to the old versions of Time 
> > (which was now named #ZTime)
> > * we then forgot #ZTime, and tried not to remember that we 
> had done so 
> > (in the change set)
> > * then we tried to remove #ZTime (again) using 
> #removeSubclass: (this 
> > shouldn't have been necessary after #forgetClass:logged:, and is 
> > almost always the wrong thing to do from outside the guts 
> of the system).
> >
> > However, this didn't work entirely right. First, what we 
> *should* have 
> > done instead of #forgetClass:logged: and #removeSubclass: 
> was instead
> >
> > 	(Smalltalk at: zname) removeFromSystem: false.
> >
> >
> > Anyway, this is probably more than you wanted to know.
> >
> > I'll post a change set that fixes this problem, so saving projects 
> > that contain change sorters will work.
> >
> > Have fun,
> 
> 
> 
> 





More information about the Squeak-dev mailing list