Hi all,
So I've been plugging away at Environments, and have a new implementation that supports renaming, but I wanted to get some feedback before figuring out the sequence of commits necessary to update the trunk safely.
The idea is to support this kind of scenario:
seaside := (Environment named: #Seaside)
    import: Smalltalk globals;
    importSelf;
    exportSelf;
    yourself.

magma := (Environment named: #Magma)
    import: Smalltalk globals;
    importSelf;
    exportSelf;
    yourself.

app := (Environment named: #AwesomeApp)
    import: Smalltalk globals;
    import: seaside;
    from: seaside import: {#Session -> #SeasideSession};
    import: magma;
    from: magma import: {#Session -> #MagmaSession};
    importSelf;
    exportSelf;
    yourself.
In order to have the decompiled code use the correct names, I've changed the way lookup works. Instead of moving bindings from one environment to another, we now create a separate set of bindings for each environment, and copy the values (i.e., classes and globals) into a new binding when a binding is imported.
I had originally thought to use something like #changed:/#update: to keep all the bindings in sync, but then I realized that ClassBuilder does a #becomeForward: when it updates a class. So it's really only globals that could become out of sync between environments. Since we have very few globals in the image, and they basically never change, an update mechanism may not be necessary in practice.
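As a rough illustration of the copy-on-import scheme and of the way a global can drift out of sync, here is a small model in Python rather than Smalltalk. The `Binding` and `Environment` classes and the `import_from` method are invented for this sketch and are not the Squeak Environments API; it also folds `import:` and `from:import:` into a single call for brevity.

```python
class Binding:
    """A (name, value) pair, standing in for a Smalltalk Association."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


class Environment:
    """Hypothetical model: each environment owns its own set of bindings."""

    def __init__(self, name):
        self.name = name
        self.bindings = {}  # local name -> Binding owned by this environment

    def declare(self, name, value):
        self.bindings[name] = Binding(name, value)

    def import_from(self, other, renames=None):
        """Copy values from `other` into fresh bindings, optionally renamed."""
        renames = renames or {}
        for name, binding in other.bindings.items():
            local = renames.get(name, name)
            self.bindings[local] = Binding(local, binding.value)


seaside = Environment("Seaside")
seaside.declare("Session", "<class Seaside Session>")
seaside.declare("Log", "stdout")  # an ordinary global, not a class

app = Environment("AwesomeApp")
app.import_from(seaside, renames={"Session": "SeasideSession"})

# The copied binding carries the local name, so tools see the renamed key.
assert app.bindings["SeasideSession"].name == "SeasideSession"

# Rebinding the global in the source environment does not reach the copy.
seaside.bindings["Log"].value = "stderr"
assert app.bindings["Log"].value == "stdout"
```

Classes do not have this problem in the real system because #becomeForward: swaps the object identity everywhere at once; the model only shows why plain globals are the remaining case.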
So, here are some possible ways to proceed:
1. Use #changed:/#update: or something similar to keep global variables in sync.
2. Idea from Eliot: adopt the same convention as VW, and send #value to the binding on every access. Then we could have special Alias bindings that forward the #value message to the original binding.
3. Share bindings for global variables between environments and disallow renaming of globals.
4. Special case the traditional globals such as Transcript and Display, and disallow the creation of new globals.
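Option 2 can be pictured with a minimal Python model. This is only a sketch under assumptions: `Binding`, `Alias`, and `set_value` are names made up here, and the real change would live in the compiler and the binding classes of the image.

```python
class Binding:
    """Canonical binding: holds the value itself."""

    def __init__(self, name, value):
        self.name = name
        self._value = value

    def value(self):
        return self._value

    def set_value(self, new_value):
        self._value = new_value


class Alias:
    """Holds a local name but forwards value() to the original binding."""

    def __init__(self, name, source):
        self.name = name
        self.source = source

    def value(self):
        return self.source.value()


transcript = Binding("Transcript", "window")
alias = Alias("Log", transcript)  # imported under a different name

# Because every access goes through value(), a later rebinding of the
# original is seen through the alias without any update notification.
transcript.set_value("stderr")
assert alias.value() == "stderr"
assert alias.name == "Log"
```

The cost is one extra send per access through an alias, in exchange for never needing a #changed:/#update: style synchronisation step.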
Anyhow, I'd appreciate a code review from anybody who's interested in this stuff. To take a look, file the attached (hand edited) change set into an updated trunk image.
Thanks,
Colin
Hello Colin,
good to see you moving on with the design and implementation of Environments.
The scenario you outline with code makes sense to me.
What does importSelf; exportSelf; do?
More comments will follow.
Regards --Hannes
On Sun, Mar 3, 2013 at 6:18 PM, H. Hirzel hannes.hirzel@gmail.com wrote:
What does importSelf; exportSelf; do?
A newly created environment has no imports and no exports. If you compile a method (or doIt) in such an environment, all bindings will be undeclared, and if you import it into another environment, none of its bindings will be visible.
To be able to resolve bindings inside the environment, you need imports. #importSelf tells the environment to import its own contents, to make classes and globals defined in the environment visible to methods compiled in the environment. #exportSelf tells it to make all its contents visible to the outside world.
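The visibility rules described above can be modelled in a few lines of Python. This is a sketch, not the Squeak implementation: the `Environment` class, its `resolve` method, and the `exporting` flag are assumptions made for illustration, and `None` stands in for an undeclared binding.

```python
class Environment:
    """Hypothetical model of import/export visibility."""

    def __init__(self, name):
        self.name = name
        self.contents = {}      # names declared in this environment
        self.imports = []       # environments whose names are visible here
        self.exporting = False  # whether other environments may see contents

    def declare(self, name, value):
        self.contents[name] = value

    def import_self(self):
        self.imports.append(self)

    def export_self(self):
        self.exporting = True

    def import_env(self, other):
        self.imports.append(other)

    def resolve(self, name):
        """Return the value bound to `name`, or None if it is undeclared."""
        for env in self.imports:
            if (env is self or env.exporting) and name in env.contents:
                return env.contents[name]
        return None


lib = Environment("Lib")
lib.declare("Session", "Lib.Session")
app = Environment("App")
app.import_env(lib)

# No imports and no exports yet: the name is undeclared everywhere.
before = (lib.resolve("Session"), app.resolve("Session"))

lib.import_self()                  # now visible to code compiled in Lib
inside = lib.resolve("Session")

lib.export_self()                  # now visible to environments importing Lib
outside = app.resolve("Session")
```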
Colin
On 2 March 2013 09:25, Colin Putney colin@wiresong.com wrote:
In order to have the decompiled code use the correct names, I've changed the way lookup works. Instead of moving bindings from one environment to another, we now create a separate set of bindings for each environment, and copy the values (i.e., classes and globals) into a new binding when a binding is imported.
+1 the last time i thought about it, i came to same conclusion.
- Idea from Eliot: adopt the same convention as VW, and send #value to the binding on every access. Then we could have special Alias bindings that forward the #value message to the original binding.
yeah.. but i would go even further: since you are already sending a message, why not use the looked-up name as the selector?
<obj> Foo -> gives the binding of 'Foo' in the environment.
<obj> Foo: <value> -> sets the new value.
i bet, with some trickery you can almost completely eliminate the overhead of dynamic dispatch (especially under the Cog VM). you can even exploit the VM's lookup logic to search through a hierarchy of environments.
- Share bindings for global variables between environments and disallow
renaming of globals.
Forget it. Message is best. What if you want to delete a binding or create a new one? If you stick with an early-binding scheme, you will have a lot of problems with it. A late-bound solution is natural; i would go with it first, and then see how to optimize it.. instead of exploiting a model which is quite inadequate for the purpose of environments :)
- Special case the traditional globals such as Transcript and Display, and
disallow the creation of new globals.
well, some globals will still be there, like it or not: the special objects array ;) as for 'human-recognized' globals, e.g. those kept in the system dictionary.. who cares? you can replace it with a bunch of environments at any moment without any harm. yeah.. of course you have to change the tools accordingly, so they don't rely on capturing an association object as a name binding, but instead always send a message to the environment object.. but that's something you have to change anyway.. so what's the deal?
On 4 March 2013 03:12, Igor Stasenko siguctua@gmail.com wrote:
Forget it. Message is best. What if you want to delete a binding or create a new one? If you stick with an early-binding scheme, you will have a lot of problems with it. A late-bound solution is natural; i would go with it first, and then see how to optimize it.. instead of exploiting a model which is quite inadequate for the purpose of environments :)
Late binding means being able to change your mind at the last minute.
- Special case the traditional globals such as Transcript and Display, and
disallow the creation of new globals.
well, some globals will still be there, like it or not: the special objects array ;) as for 'human-recognized' globals, e.g. those kept in the system dictionary.. who cares? you can replace it with a bunch of environments at any moment without any harm. yeah.. of course you have to change the tools accordingly, so they don't rely on capturing an association object as a name binding, but instead always send a message to the environment object.. but that's something you have to change anyway.. so what's the deal?
Agreed - for non-GUI applications, having Transcript actually mean FileStream stderr would be great, for instance.
frank
On Sun, Mar 3, 2013 at 7:12 PM, Igor Stasenko siguctua@gmail.com wrote:
yeah.. but i would go even further: since you are already sending a message, why not use the looked-up name as the selector?
<obj> Foo -> gives the binding of 'Foo' in the environment.
<obj> Foo: <value> -> sets the new value.
One of the goals with environments is to avoid syntactic or semantic changes to the language. All existing code should run as-is, and still benefit from the namespace isolation that we get from environments. Your idea above seems feasible for a new language, but it breaks compatibility.
So I'll take this as a vote in favour of Eliot's idea—message send rather than direct access, but preserving the existing syntax and semantics of Smalltalk.
Thanks!
Colin
On 4 March 2013 18:37, Colin Putney colin@wiresong.com wrote:
One of the goals with environments is to avoid syntactic or semantic changes to the language. All existing code should run as-is, and still benefit from the namespace isolation that we get from environments. Your idea above seems feasible for a new language, but it breaks compatibility.
breaks? how?
you mean changing the compiler to compile a "<obj> Foo" message send for accessing the name Foo, instead of direct binding access?
yes, it changes the semantics of global variable access.. but that's the way to go if you wanna introduce late-bound names.
but breaking compatibility... with what?
From a language perspective, you would still be able to compile & run smalltalk code and have it behave the same as canonical ST-80.. The way the compiler provides access to globals (or any variables in a method's external scope) is an implementation detail. And you cannot introduce environments without changing that.. so i don't understand.
So I'll take this as a vote in favour of Eliot's idea—message send rather than direct access, but preserving the existing syntax and semantics of Smalltalk.
hmm.. you confused me.. in what way does my proposal break any semantics/syntax? it uses message sends.. and no change to the VM is needed: you can create a behavior where all its methods are accessors, so when you send a message to an instance of such a behavior you get a variable's value by its name.
Anyways.. i just gave an idea.. you're free to use it or throw it away, no problem :)
On Mon, Mar 4, 2013 at 5:09 PM, Igor Stasenko siguctua@gmail.com wrote:
you mean changing the compiler to compile a "<obj> Foo" message send for accessing the name Foo, instead of direct binding access?
Oh, I see. The source code still reads as a variable access, but the bytecode implements a message send. Sure, that would be equivalent to Eliot's idea. Instead of a different receiver for each variable, it's a different message for each variable.
That's a lot more feasible, but I don't see any advantage over just sending #value to the binding, and it's more work. We already have tools that know how to deal with bindings, and they'd all have to be converted to the new scheme.
Colin
On 2013-03-05, at 02:40, Colin Putney colin@wiresong.com wrote:
Oh, I see. The source code still reads as a variable access, but the bytecode implements a message send.
No, I think Igor is proposing to write something like "self environment Foo" to access Foo. Which is flexible, granted, but looks ugly.
- Bert -
On Tue, Mar 5, 2013 at 3:53 AM, Bert Freudenberg bert@freudenbergs.de wrote:
No, I think Igor is proposing to write something like "self environment Foo" to access Foo. Which is flexible, granted, but looks ugly.
That's what I thought too, but that obviously breaks compatibility with existing code, which Igor claims his proposal does not.
Colin
On 5 March 2013 21:54, Colin Putney colin@wiresong.com wrote:
That's what I thought too, but that obviously breaks compatibility with existing code, which Igor claims his proposal does not.
Can you be more specific: what exactly does it break?
If it is about #bindingOf:, which answers an association holding a key/value pair.. nothing prevents us from answering a "LateBoundBinding" which, when you send #value to it, actually does <special object> perform: #Foo (and similar things for write access).
On Fri, Mar 8, 2013 at 2:44 PM, Igor Stasenko siguctua@gmail.com wrote:
Can you be more specific: what exactly does it break?
Initially, I didn't realize that you were talking about changes to the compiler. I thought that you were proposing that we replace this code:
    someMethod
        ^ Foo

with this:

    someMethod
        ^ self environment Foo
That would break all the existing code that refers to Foo directly.
Colin
On 9 March 2013 02:20, Colin Putney colin@wiresong.com wrote:
Initially, I didn't realize that you were talking about changes to the compiler. I thought that you were proposing that we replace this code:
    someMethod
        ^ Foo

with this:

    someMethod
        ^ self environment Foo
That would break all the existing code that refers to Foo directly.
You mean in the "class references to" sense? You could still find these by looking for senders-of.
frank
On 5 March 2013 12:53, Bert Freudenberg bert@freudenbergs.de wrote:
No, I think Igor is proposing to write something like "self environment Foo" to access Foo. Which is flexible, granted, but looks ugly.
i'm making a prototype implementation right now.. so you can look and see.
The trick is in the compiler: when it sees

    someMethod
        ^ Foo

it compiles it not to a "read from literal binding" bytecode but to "push <special object>, send #Foo".. so global variable access becomes a message send.
Then it is all about that <special object>, which understands and/or handles the #Foo message to answer the proper result.
On Fri, Mar 8, 2013 at 2:41 PM, Igor Stasenko siguctua@gmail.com wrote:
i'm making a prototype implementation right now.. so you can look and see.
I think I understand what you're proposing, but I don't think you've explained why you think it's a good idea. The alternative, sending #value to the binding, is already implemented, works well and is supported by all the tools. What advantage does your idea have that makes it worth the effort?
Colin
On 9 March 2013 03:15, Colin Putney colin@wiresong.com wrote:
I think I understand what you're proposing, but I don't think you've explained why you think it's a good idea. The alternative, sending #value to the binding, is already implemented, works well and is supported by all the tools. What advantage does your idea have that makes it worth the effort?
Well, maybe it's overkill.. it is mainly about speed and reusing the VM's lookup mechanism for searching for a name over multiple scopes (when you have namespaces with imports).
As a first step, you will replace the "read value of <binding>" bytecode with "send #value to <binding>", but that implies having a binding at compile time (you must look up the name at compile time). It also means that you won't change anything semantically: even though you are sending a message, you are still accessing the very same state which you bound early, at compile time.
Then, i wonder if such a change is actually worth doing. Because if you don't do the lookup dynamically, then there is no change.
But if you are going to do a dynamic lookup, your binding will have to do extra work by holding (lookup name and environment object), and so your #value method will look like:

    MyEnvBinding>>value
        "name and env are inst vars"
        ^ env lookupForName: name

and then, depending on the implementation, it will cost extra cycles to do a lookup, but i bet you will end up having a Dictionary somewhere to which you send #at:. But think how many extra message sends you must perform in order to do a lookup (especially when you have deeply nested namespace hierarchies). So, in the end, you will pay a much bigger price for accessing the variable by its name dynamically.
In my case, the <special object> to which you send the #Name message is an instance of Behavior, which already holds a dictionary (the method dictionary).. and the lookup is performed by the VM. You can chain those objects through the <superclass> field so the VM will do a lookup visiting different namespaces.. Now think how much faster it will be, especially with the JIT and inline caches. You send a message, the VM does the lookup, you grab the result. done.
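The shape of this proposal can be imitated in Python, where the runtime's own attribute lookup along a class chain plays the part of the VM's method lookup along the superclass chain. This is an analogy only, not Igor's prototype: `make_namespace` and the namespace names are invented for the sketch, and a single inheritance chain models only a linear sequence of scopes, not arbitrary imports.

```python
def make_namespace(name, parent=object, **bindings):
    """Create a class whose attributes play the role of accessor methods."""
    return type(name, (parent,), dict(bindings))


# Each namespace is a class; the parent link stands in for <superclass>.
Globals = make_namespace("Globals", Transcript="window", Display="screen")
Seaside = make_namespace("Seaside", Globals, Session="Seaside.Session")
App = make_namespace("App", Seaside, Transcript="stderr")

# The <special object> that compiled code would send name messages to.
special_object = App()

# Reading a name is a plain lookup performed by the runtime itself,
# walking App -> Seaside -> Globals until the name is found.
assert special_object.Session == "Seaside.Session"
assert special_object.Transcript == "stderr"   # shadowed in App
assert special_object.Display == "screen"      # found in Globals

# Rebinding in an outer namespace is visible at the next access,
# which is the late-binding behaviour being argued for.
Globals.Display = "headless"
assert special_object.Display == "headless"
```

Nothing is resolved ahead of time here, so adding, removing, or rebinding a name takes effect without recompiling the code that refers to it; the trade-off discussed in the thread is that tools which expect a binding object would need adapting.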
On Sat, Mar 9, 2013 at 6:48 AM, Igor Stasenko siguctua@gmail.com wrote:
Well, maybe it's overkill.. it is mainly about speed and reusing the VM's lookup mechanism for searching for a name over multiple scopes (when you have namespaces with imports).
As a first step, you will replace the "read value of <binding>" bytecode with "send #value to <binding>", but that implies having a binding at compile time (you must look up the name at compile time). It also means that you won't change anything semantically: even though you are sending a message, you are still accessing the very same state which you bound early, at compile time.
Right. The lookup happens at compile time. When the method is actually executed, we're just fetching the value from the binding we found at compile time. I don't see how that's going to be slow. In the most common case, a class reference, it's not even a message send, because we have a dedicated bytecode. For global variables, it'll be two message sends: we send #value to the alias binding, which then sends #value on to the canonical binding in the environment's "contents" dictionary.
Then, i wonder if such a change is actually worth doing. Because if you don't do the lookup dynamically, then there is no change.
Sure there is. We still bind names at compile time, but we have changed the way names are resolved.
Colin
On 10 March 2013 06:47, Colin Putney colin@wiresong.com wrote:
Right. The lookup happens at compile time. When the method is actually executed, we're just fetching the value from the binding we found at compile time. I don't see how that's going to be slow. In the most common case, a class reference, it's not even a message send, because we have a dedicated bytecode. For global variables, it'll be two message sends: we send #value to the alias binding, which then sends #value on to the canonical binding in the environment's "contents" dictionary.
okay, so as i understand it, every namespace will have its own set of alias bindings which can point to some "real" binding.. and then when things need to change, you change those aliases instead of recompiling methods. And only if a method/class moves to a different namespace do you need to recompile it, to rewire the code with different aliases.
Yes, that's quite simple.
This was a thread on environments which started in March 2013.
Were there other updates?
--Hannes
On 3/11/13, Craig Latta craig@netjam.org wrote:
Hi Colin--
I'd appreciate a code review from anybody who's interested in this stuff.
No class comments? I'd like to see a usage summary. thanks,
-C
-- Craig Latta www.netjam.org/resume +31 6 2757 7177 (SMS ok)
- 1 415 287 3547 (no SMS)
Original proposal for Environments by Colin Putney was in June 2012
I put it here http://wiki.squeak.org/squeak/6218
--Hannes
And the March 2013 update is now here
http://wiki.squeak.org/squeak/6220
--HH
squeak-dev@lists.squeakfoundation.org