Some recent debugging handed me a reminder that we have several varieties of 'inboard source' in the system already, just not presently used and probably not complete.
Take a look at CompiledMethod>>#getSourceFor:in:
- it looks at the method properties, checking for a Dictionary with a #source entry
- it checks the method trailer for tempNames and, if found, decompresses the names string, then decompiles the method and inserts those tempNames
- it checks the trailer for sourceCode that may be contained in one of four different ways, two of them being compressed strings in the trailer and I think the code in CompiledMethodTrailer>>#sourceCode fails to decompress them? The other two rely on methods that no longer exist in the image.
- it checks if the trailer has a source pointer, and if not, decompiles the method with no assistance from any temp names stuff
- if there is a source pointer, that value is used to fetch the source from the file(s), and as a backup for the files not being there it repeats the bare decompile code.
I think we might be able to clean that up a bit.
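The lookup order described above can be modelled roughly as follows. This is a Python sketch for illustration, not Squeak code: the `Method` and `Trailer` classes, the `decompile` placeholder and the `read_file` callback are all hypothetical stand-ins, zlib stands in for whatever compression the trailer really uses, and the four trailer source encodings are collapsed into one.

```python
import zlib
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Trailer:
    """Hypothetical stand-in for the method trailer."""
    temp_names_zipped: Optional[bytes] = None  # compressed temp names
    source_zipped: Optional[bytes] = None      # compressed embedded source
    source_pointer: Optional[int] = None       # position in the source files


@dataclass
class Method:
    """Hypothetical stand-in for a compiled method."""
    properties: dict = field(default_factory=dict)
    trailer: Trailer = field(default_factory=Trailer)


def decompile(method: Method, temp_names: Optional[str] = None) -> str:
    """Placeholder decompiler: only reports whether temp names were supplied."""
    if temp_names:
        return f"<decompiled, temps: {temp_names}>"
    return "<decompiled, no temps>"


def get_source(method: Method, read_file: Callable[[int], Optional[str]]) -> str:
    # 1. A 'source' entry in the method properties wins.
    if "source" in method.properties:
        return method.properties["source"]
    trailer = method.trailer
    # 2. Temp names in the trailer: decompress, then decompile using them.
    if trailer.temp_names_zipped is not None:
        names = zlib.decompress(trailer.temp_names_zipped).decode("utf-8")
        return decompile(method, names)
    # 3. Source embedded (compressed) in the trailer.
    if trailer.source_zipped is not None:
        return zlib.decompress(trailer.source_zipped).decode("utf-8")
    # 4. No source pointer: bare decompile.
    if trailer.source_pointer is None:
        return decompile(method)
    # 5. Source pointer: read from the files, else repeat the bare decompile.
    try:
        source = read_file(trailer.source_pointer)
    except OSError:
        source = None
    return source if source else decompile(method)
```

Written out like this, the bare decompile shows up twice (steps 4 and 5), which is one of the things a cleanup could fold together.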
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful Latin Phrases:- Illiud Latine dici non potest = You can't say that in Latin.
Looking briefly at this issue again, we might want to consider merging the CompiledMethodTrailer & AdditionalMethodState concepts. It seems a touch daft to have two mechanisms overlapping so much.
Of course, pretty much all the capabilities provided by the two could be done more cleanly by splitting methods up similarly to the version we did at Interval in '97/98; have a normal object with ivars for the literals, pragma, source info, bytecodes, etc. Yes, it would cost some memory. It would potentially save some time in GC. It would be simpler to extend things. Yada-yada.
Aside from that pipedream, it would probably be simplest to go for dropping the trailer bytes, and putting the info in the properties object. So that would mean every method would replace the penultimate literal with a pointer to an AdditionalMethodState object, which would contain the usurped selector and some source access object. They also already include a pointer back to the method and then indexed values for pragmas etc. So that's 32 bytes for a bare case, or in a Squeak 6 image about a megabyte & a half, since 1500 methods already have an AdditionalMethodState and we would 'save' the trailer bytes space. I'm not entirely sure why the back-pointer to the method is required after looking at the usage of it. Being able to dump that would save another ~half-MB.
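The space estimate is simple arithmetic, sketched here in Python. Only the 32 bytes per bare AdditionalMethodState and the 1500 methods that already have one come from the message above; the total method count passed in and the 8-byte back-pointer slot are assumptions for illustration, and the trailer bytes that would be saved are ignored.

```python
AMS_BYTES = 32          # bare AdditionalMethodState size, as stated above
BACK_POINTER_BYTES = 8  # assumption: one 64-bit slot per back-pointer


def added_bytes(total_methods: int, methods_with_ams: int,
                per_object: int = AMS_BYTES) -> int:
    """Bytes added by giving every method that lacks one a bare state object."""
    return (total_methods - methods_with_ams) * per_object


def back_pointer_savings(total_methods: int,
                         slot_bytes: int = BACK_POINTER_BYTES) -> int:
    """Bytes saved if no state object carried a back-pointer to its method."""
    return total_methods * slot_bytes
```

With a hypothetical image of 50,000 methods, `added_bytes(50_000, 1_500)` comes to 1,552,000 bytes and `back_pointer_savings(50_000)` to 400,000 bytes, the same order as the figures quoted above.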
What would this buy us? Simpler is almost always easier to understand and maintain, which is good. Storing the source would become a bit simpler since we could make appropriate classes for remote, in-image, in-image compressed, in-database, ask-another-image, etc.
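A family of source access objects of that kind might look something like this. It is a Python sketch with hypothetical class names, not a proposal for the actual Squeak classes; zlib stands in for whichever compressor would really be used, and only three of the listed variants are shown.

```python
import zlib
from abc import ABC, abstractmethod


class SourceAccess(ABC):
    """Hypothetical common protocol: every variant answers its source text."""

    @abstractmethod
    def source(self) -> str:
        ...


class InImageSource(SourceAccess):
    """Source kept in the image as plain text."""

    def __init__(self, text: str):
        self._text = text

    def source(self) -> str:
        return self._text


class CompressedInImageSource(SourceAccess):
    """Source kept in the image, compressed when it is stored."""

    def __init__(self, text: str):
        self._zipped = zlib.compress(text.encode("utf-8"))

    def source(self) -> str:
        return zlib.decompress(self._zipped).decode("utf-8")


class RemoteFileSource(SourceAccess):
    """Source fetched from a file by byte offset and length."""

    def __init__(self, path: str, offset: int, length: int):
        self._path = path
        self._offset = offset
        self._length = length

    def source(self) -> str:
        with open(self._path, "rb") as f:
            f.seek(self._offset)
            return f.read(self._length).decode("utf-8")
```

The point of the common `source()` protocol is that tools would ask the method's source object for text without caring where it lives.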
A quick test of the size the system sources would be if kept in-image-compressed suggests ~11MB, which is rather better than the 52MB of the plain text file. 2MB of that is the 360 methods with more than 1024 bytes of zipped source, things like the sound of coffee cups clinking & car motors, plus some font stuff.
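A measurement along those lines could be sketched as below, in Python rather than in the image. The `measure` function and `Totals` record are hypothetical names, zlib is an assumed stand-in for the compressor actually used, and the 1024-byte threshold is the one mentioned above.

```python
import zlib
from typing import Iterable, NamedTuple


class Totals(NamedTuple):
    plain_bytes: int         # total size of the uncompressed sources
    zipped_bytes: int        # total size when each source is zipped on its own
    large_count: int         # sources whose zipped form exceeds the threshold
    large_zipped_bytes: int  # zipped bytes taken up by those large sources


def measure(sources: Iterable[str], threshold: int = 1024) -> Totals:
    """Compress each method source individually and total up the sizes."""
    plain = zipped = large_count = large_bytes = 0
    for text in sources:
        raw = text.encode("utf-8")
        packed = zlib.compress(raw)
        plain += len(raw)
        zipped += len(packed)
        if len(packed) > threshold:
            large_count += 1
            large_bytes += len(packed)
    return Totals(plain, zipped, large_count, large_bytes)
```

Compressing per method, rather than the sources file as a whole, matches the idea of each method carrying its own zipped source; very short methods gain little or nothing from it.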
A trivial hack to keep the sources in AdditionalMethodStates for every method, zipping as added, shows it can function.
On 2022-12-04, at 6:56 PM, tim Rowledge tim@rowledge.org wrote:
Some recent debugging handed me a reminder that we have several varieties of 'inboard source' in the system already, just not presently used and probably not complete.
Take a look at CompiledMethod>>#getSourceFor:in:
- it looks at the method properties, checking for a Dictionary with a #source entry
- it checks the method trailer for tempNames, and if found, decompresses the names string and then decompiles the method and inserts those tempNames
- it checks the trailer for sourceCode that may be contained in one of four different ways, two of them being compressed strings in the trailer and I think the code in CompiledMethodTrailer>>#sourceCode fails to decompress them? The other two methods rely on methods that no longer exist in the image.
- it checks if the trailer has a source pointer, and if not, decompiles the method with no assistance from any temp names stuff
- if there is a source pointer, that value is used to fetch the source from the file(s), and as a backup for the files not being there it repeats the bare decompile code.
I think we might be able to clean that up a bit.
tim
tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful Latin Phrases:- Illiud Latine dici non potest = You can't say that in Latin.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful Latin Phrases:- Illiud Latine dici non potest = You can't say that in Latin.
On 2022-12-22, at 1:10 PM, tim Rowledge tim@rowledge.org wrote:
A trivial hack to keep the sources in AdditionalMethodStates for every method, zipping as added, shows it can function.
Hmph; #condenseSources can cause some interesting crashes with this change. Actual VM crashes on both ARM64 & X64.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Oxymorons: Government organization
Hi Tim,
On Dec 22, 2022, at 1:11 PM, tim Rowledge tim@rowledge.org wrote:
Looking briefly at this issue again, we might want to consider merging the CompiledMethodTrailer & AdditionalMethodState concepts. It seems a touch daft to have two mechanisms overlapping so much.
Of course, pretty much all the capabilities provided by the two could be done more cleanly by splitting methods up similarly to the version we did at Interval in '97/98; have a normal object with ivars for the literals, pragma, source info, bytecodes, etc. Yes, it would cost some memory. It would potentially save some time in GC. It would be simpler to extend things. Yada-yada.
No, no, no and no! This works for JIT-only implementations but it is awful for interpreted implementations. One has to traverse two objects to get to the bytecode, not one.
By all means explore the AdditionalMethodState approach. With simple bytecode compiler extensions, inst vars in AMS can be used syntactically as inst vars in CompiledCode, compiling to message sends, e.g. as in Andreas’ Tweak compiler, which does the same for a Tweak object’s properties.
Aside from that pipedream, it would probably be simplest to go for dropping the trailer bytes, and putting the info in the properties object. So that would mean every method would replace the penultimate literal with a pointer to an AdditionalMethodState object, which would contain the usurped selector and some source access object. They also already include a pointer back to the method and then indexed values for pragmas etc. So that's 32 bytes for a bare case, or in a Squeak 6 image about a megabyte & a half, since 1500 methods already have an AdditionalMethodState and we would 'save' the trailer bytes space. I'm not entirely sure why the back-pointer to the method is required after looking at the usage of it. Being able to dump that would save another ~half-MB.
It’s a lot of space to give up when the trailer implementation works.
The back pointer is necessary whenever access to AdditionalMethodState would change the AMS’s method. Look at senders. You’ll see some use cases.
What would this buy us? Simpler is almost always easier to understand and maintain, which is good. Storing the source would become a bit simpler since we could make appropriate classes for remote, in-image, in-image compressed, in-database, ask-another-image, etc.
A quick test of the size the system sources would be if kept in-image-compressed suggests ~11MB, which is rather better than the 52MB of the plain text file. 2MB of that is the 360 methods with more than 1024 bytes of zipped source, things like the sound of coffee cups clinking & car motors, plus some font stuff.
A trivial hack to keep the sources in AdditionalMethodStates for every method, zipping as added, shows it can function.
That’s not the thing to prototype. Extending the ClassBuilder and bytecode compiler so that one can create subclasses of CompiledMethod whose inst vars live in AMS subclasses is the thing to prototype.
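As a loose analogy for that idea, here is a Python sketch in which a method object has no extra slots of its own and forwards reads and writes of extra variables to a separate state object. Everything in it is hypothetical, and it differs from the proposal in one important way: Python forwards at run time through `__getattr__` and `__setattr__`, whereas the suggestion above is for the bytecode compiler to translate inst var references into message sends at compile time.

```python
class MethodState:
    """Hypothetical stand-in for an AdditionalMethodState subclass."""

    def __init__(self, selector):
        self.selector = selector
        self.source = None


class Method:
    """Hypothetical stand-in for a CompiledMethod subclass."""

    _OWN_SLOTS = ("bytecodes", "state")

    def __init__(self, bytecodes, state):
        # The only real slots: the bytecodes and the pointer to the state.
        object.__setattr__(self, "bytecodes", bytecodes)
        object.__setattr__(self, "state", state)

    def __getattr__(self, name):
        # Reached only when normal lookup fails: forward the read to the state.
        return getattr(self.state, name)

    def __setattr__(self, name, value):
        # Writes to anything but the real slots go to the state object.
        if name in self._OWN_SLOTS:
            object.__setattr__(self, name, value)
        else:
            setattr(self.state, name, value)
```

Code using such a method reads `m.selector` or assigns `m.source` as if they were the method's own variables, while the bytecodes stay directly reachable in one object.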
On 2022-12-24, at 1:03 PM, Eliot Miranda eliot.miranda@gmail.com wrote:
No, no, no and no! This works for JIT-only implementations but it is awful for interpreted implementations. One has to traverse two objects to get to the bytecode, not one.
Don't care for the purpose of thinking about this. As best I can recall it actually didn't make any noticeable difference when we did it at Interval but then again, machines were what, 1% of modern performance back then? But I'm not about to mess with deep VM code to try it out again. I still have the scars from the first time, thank you very much.
It’s a lot of space to give up when the trailer implementation works.
Again, not very interesting *at this point*. After all, the default image has a huge amount of stuff that ought to be removed if one is interested in saving memory for a server usage pattern. EToys, the games...
Also, using the trailer bytes stuff means recompiling every method and doing a mass become, which right now blows away the ARM64 VM with extreme prejudice, not even in #condenseSources :-( I'll see if the very latest VM build solves that. Just FYI the offending method is attached.
A trivial hack to keep the sources in AdditionalMethodStates for every method, zipping as added, shows it can function.
That’s not the thing to prototype. Extending the ClassBuilder and bytecode compiler so that one can create subclasses of CompiledMethod whose inst vars live in AMS subclasses is the thing to prototype.
Not sure why I'd need to do that when the AMS class is perfectly happy to have a #source property as-is (See CompiledMethod>>#getSourceFor:in:)
Obviously there are some compiler-related changes needed to make sure the source for new methods gets set properly and so on. Not seeing that as too tricky, so far anyway. The messy bit is most likely the array of tool-related code that assumes source comes from files and so on. That alone might make it impractical to change anything.
My aim is to see if we could have a system where all the code in use is in the image. Any change to a method would still be logged, assuming the changes file is in use. I think for most server-type cases we wouldn't have changes being logged in the normal fashion, though perhaps by some other means alongside general logging. It should help with forking images on a server for example, and that is definitely a matter of some interest.
Maybe none of it will work. Maybe it's Maybelline.
tim
-- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Strange OpCodes: BBL: Branch on Burned out Light
On 2022-12-24, at 1:46 PM, tim Rowledge tim@rowledge.org wrote:
Also, using the trailer bytes stuff means recompiling every method and doing a mass become, which right now blows away the ARM64 VM with extreme prejudice, not even in #condenseSources :-( I'll see if the very latest VM build solves that. Just FYI the offending method is attached.
Nope. Boom.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim C for sinking, java for drinking, Smalltalk for thinking
OK, retried with a loop that does each method's become: replacement one at a time instead of going bulk. Same crash effect, but a more useful-looking crash.dmp.
The highlight is
Smalltalk stack dump:
0x7fc0ca6830 M INVALID RECEIVER>become: 0x5594989328 is a forwarded hdr8 slot size 4d object to 0x5593187798
0x7fc0ca6868 M [] in SmalltalkImage>inboardSources 0x5593b229a0: a(n) SmalltalkImage
0x7fc0ca68b8 M Array(SequenceableCollection)>with:do: 0x55989b00a8: a(n) Array
0x7fc0ca6920 I SmalltalkImage>inboardSources 0x5593b229a0: a(n) SmalltalkImage
.. which might make things more obvious.
On 2022-12-24, at 3:33 PM, tim Rowledge tim@rowledge.org wrote:
On 2022-12-24, at 1:46 PM, tim Rowledge tim@rowledge.org wrote:
Also, using the trailer bytes stuff means recompiling every method and doing a mass become, which right now blows away the ARM64 VM with extreme prejudice, not even in #condenseSources :-( I'll see if the very latest VM build solves that. Just FYI the offending method is attached.
Nope. Boom. <crash.dmp> tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim C for sinking, java for drinking, Smalltalk for thinking
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Useful random insult:- Has nothing to say, but delights in saying it.
Oh, it works fine on the X64 Linux VM:
Open Smalltalk Cog[Spur] VM [CoInterpreterPrimitives VMMaker.oscog-dtl.3185]
Unix built on Jun 2 2022 15:26:05 Compiler: Clang 9.0.0 (tags/RELEASE_900/final)
platform sources revision VM: 202206021410 runner@fv-az125-921:work/opensmalltalk-vm/opensmalltalk-vm Date: Thu Jun 2 16:10:44 2022 CommitHash: c9fd365 Plugins: 202206021410 runner@fv-az125-921:work/opensmalltalk-vm/opensmalltalk-vm
CoInterpreter VMMaker.oscog-dtl.3185 uuid: 0e7f07b8-eed6-4362-b223-86c98594ddb9 Jun 2 2022
StackToRegisterMappingCogit VMMaker.oscog-mt.3179 uuid: c6fbcb07-2a19-ed4f-8b40-9c119a70882a Jun 2 2022
So unfortunately an ARMv8 specific issue. Poo.
tim -- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim Oxymorons: Religious tolerance
squeak-dev@lists.squeakfoundation.org