Multi-core CPUs, ERLANG

Peter William Lount peter at smalltalk.org
Tue Oct 23 14:52:05 UTC 2007


Wolfgang Eder wrote:
> [more stuff snipped]
>
> Hello all,
> I think that Erlang does have mechanisms to share
> stuff between processes. First, the code is shared.
> When I update a module, all processes using the
> code of the module will (eventually) switch to the
> new version.
> And then there is the Mnesia database and its parts
> that can be used to share data between processes.
>
> And, slightly off topic probably:
> One thing that strikes me as remarkable about the
> Erlang system is that, since there is non-destructive
> assignment, you cannot have cycles in your object
> graphs. I think this simplifies the GC tremendously.
> But I can think of no way of doing something similar
> with Smalltalk objects, unfortunately.
>
> Cheers,
> Wolfgang 

Hi,

That's interesting. Thus Erlang DOES IN FACT HAVE SHARED MEMORY between 
processes: for code and for data. I'd like to learn more about that. 
Could anyone provide more details?

One proposal was a "copy-on-write" object space model where objects that 
are about to be written to in a Smalltalk process would be copied to 
that process's private object space - in effect that process's view of 
the "image".

On a typical modern mainstream operating system, implementing a 
copy-on-write technique requires operating system support. It also 
requires the operating system to use a synchronization primitive - if 
I'm not mistaken - at least for the few instructions during which the 
page tables are updated - a critical section.

Implementing copy-on-write requires a language to be able to go beyond 
Erlang-style concurrency capabilities.
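
As a rough illustration of the object space proposal above, here is a 
minimal Python sketch. It works at the object level only and ignores the 
page table mechanism an operating system would use. The names 
SharedImage and ProcessView are hypothetical and are not taken from any 
Squeak or Erlang implementation; objects are modelled as plain dicts.

```python
import copy


class SharedImage:
    """Base object space shared by every process (hypothetical name)."""

    def __init__(self, objects):
        self._objects = dict(objects)

    def read(self, oid):
        return self._objects[oid]


class ProcessView:
    """One process's view of the image: private copies shadow shared objects."""

    def __init__(self, image):
        self._image = image
        self._private = {}  # object id -> private copy

    def read(self, oid):
        # Prefer the private copy if this process has already written to it.
        # Note: callers are trusted not to mutate the returned object directly.
        if oid in self._private:
            return self._private[oid]
        return self._image.read(oid)

    def write(self, oid, field, value):
        # First write to a shared object: copy it into the private space.
        if oid not in self._private:
            self._private[oid] = copy.deepcopy(self._image.read(oid))
        self._private[oid][field] = value


image = SharedImage({"point": {"x": 1, "y": 2}})
a = ProcessView(image)
b = ProcessView(image)

a.write("point", "x", 10)   # only process a sees this change
print(a.read("point"))      # {'x': 10, 'y': 2}
print(b.read("point"))      # {'x': 1, 'y': 2}
```

In this sketch no locking is needed for writes because each write lands 
in a private copy; a real implementation that shares pages between 
operating system processes would still need the critical section 
mentioned above.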

One of the crucial aspects that Alan Kay (and others) have promoted over 
and over again is the ability of a language to be expressed in itself. 
This has a certain beauty to it as well as a mathematical aesthetic that 
has important ramifications that go way beyond those characteristics. To 
have a "mobius" system that can rewrite itself while retaining 
functioning versions across a continuous evolutionary path one requires 
a system that can be expressed in itself. Alan Kay points to a page in 
the Lisp Manual where Lisp is implemented in itself. Since Smalltalk is 
supposed to be a general purpose programming language it is crucial that 
it have this aspect of being able to implement itself with itself. So 
far Squeak comes close to this - at least with respect to the virtual 
machine which is written in the slang subset of Smalltalk. Unfortunately 
Squeak relies upon manually written C files for binding with the various 
operating systems. Co-existence with C based technology has its price 
and it's high in that it blocks access to the entire system from within 
the system; by being blocked one is prevented from online interactive 
exploration and experimentation that we are used to at the Smalltalk 
source code level. At least this is being addressed in the amazing work 
of Ian Piumarta (http://piumarta.com/pepsi/pepsi.html) and the 
incredible work of LLVM (http://llvm.org). In fact I highly recommend 
that Squeak move from its current obsolete C compilers to make use of 
either of these two projects as the bottom of the VM. Apple is funding 
LLVM and Ian's work seems to be part of the work of Alan Kay's 
Viewpoints Research Institute (http://www.vpri.org).

The "non-destructive" assignment aspect of Erlang is typical of 
non-write-in-place functional and object database systems. It's a key 
aspect of the ZokuScript Object Database Management System and 
Technologies. However it's not a panacea that the silver bullet utopians 
think it is. As with any other solution matrix it has its benefits, 
payoffs, minuses and costs. These need to be balanced for every 
application. As Wolfgang points out there are issues with it such as the 
"cycle" problem that need to be overcome via implementation exceptions.

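To make the "cycle" point concrete, here is a minimal Python sketch of 
non-destructive assignment. It is an analogy only, not how Erlang is 
implemented: the Cons class is a hypothetical stand-in for an immutable 
term. Because a cell can never be changed after it is built, it can only 
refer to cells that already existed, so references always point from 
newer to older data and no cycle can be closed.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Optional


@dataclass(frozen=True)
class Cons:
    """Immutable list cell (hypothetical stand-in for an Erlang term)."""
    head: int
    tail: Optional["Cons"] = None


old = Cons(2, Cons(3))
new = Cons(1, old)        # an "update" builds a new cell; old is untouched
assert new.tail is old    # structure is shared rather than copied

try:
    old.tail = new        # trying to close a cycle by writing in place
    cycle_made = True
except FrozenInstanceError:
    cycle_made = False

print(cycle_made)         # False
```

Python itself allows the freeze to be bypassed with 
object.__setattr__, so this only illustrates the rule; a runtime that 
enforces it everywhere is what lets the collector assume an acyclic 
graph.
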
The other issue is how fine do you cut the objects? At what point do you 
say enough is enough? That is at what point does a process say oh, I 
don't really have control of changes to the object in question... as 
that object is private to another object space. Thus control needs to be 
passed to a process in the other object space likely on another compute 
node. For example corporate security constraints may require that 
certain data remain on the server while only permitting some data to be 
shared with a laptop node running remotely.
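
A minimal Python sketch of that hand-off, assuming a simple 
message-passing design: one thread stands in for the process that owns 
the private object space, and other code can only ask it to make changes. 
The owner function, the message layout and the "deposit" operation are 
all hypothetical; a real system would place the owner on another compute 
node and add security checks.

```python
import queue
import threading


def owner(inbox):
    """Owns the private object; applies changes requested via messages."""
    account = {"balance": 100}   # private to this object space
    while True:
        msg = inbox.get()
        if msg is None:          # shutdown signal
            break
        op, amount, reply = msg
        if op == "deposit":
            account["balance"] += amount
        # Reply with a value; the object itself never leaves its owner.
        reply.put(account["balance"])


inbox = queue.Queue()
worker = threading.Thread(target=owner, args=(inbox,))
worker.start()

reply = queue.Queue()
inbox.put(("deposit", 25, reply))   # ask the owner to make the change
result = reply.get()

inbox.put(None)
worker.join()
print(result)                       # 125
```

The caller never holds a reference to the account, which is the 
property a policy such as "this data stays on the server" would rely on.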

It's important to consider the wider issues involved in distributed 
systems that are to be deployed in the real world. For Smalltalk to 
evolve we must get really serious about these issues ahead of the curve 
that others are pursuing now.

It's shocking that a system like Flash MX's JavaScript-compatible 
language has a few features that are more advanced than Smalltalk. It's 
shocking that Flash is so popular even though the language also has 
serious flaws - for example in its handling of exceptions.

One of the tremendous strengths of Smalltalk is shared with Unix 
systems. If you visit the Smalltalk versions page at Smalltalk.org 
(http://Smalltalk.org/versions) you'll see a great many versions of 
Smalltalk. In fact the page isn't complete as there are older historical 
versions of Smalltalk that are missing as well as a slate (no pun 
intended) of Smalltalk and Smalltalk-like languages that are missing 
from the roster listed there. Smalltalk shares this proliferation aspect 
with Unix. Count the Unix variants and it's in the hundreds if not 
approaching thousands of distributions that have been or that are 
available now. Linux alone has hundreds of variants.

Compare this variety with Java and Microsoft. They are stagnant with 
just one thread of evolution. Smalltalk and Unix are undergoing a much 
wider range of co-evolutionary development much of which is parallel and 
much of which is divergent. Both aspects are important.

Divergence is important for strong vendors so that they can distinguish 
their products and meet the needs of their set of vertical markets.

Parallel co-evolution, cross pollination and open sharing of code via 
libraries and the ANSI Standard for Smalltalk (new version in the works 
- please contribute) is important for the language as a unified entity.

Parallelism is one of the low level aspects that needs to be shared 
openly between the vendors for such features to become "standard" 
features. Otherwise parallelism across the vendors' products will become 
or remain hodgepodge (as it is now).

The same goes for the Graphical User Interface but that's an entirely 
different conversation.

The basic point is that for a language to be expressible in itself it 
means that ALL the computer science techniques used to implement the 
language must be expressible in the language. It goes beyond this self 
referential definition since the language must also be able to express 
ANY computer science technique that is needed for the full range of 
systems that will be implemented in it. To do less is to create a 
language that is less than capable.

With the advances in static and just-in-time compiler technologies (LLVM, 
Code-Pepsi, etc...) that can co-exist with the C universe it's possible 
for Smalltalk to become a full-fledged systems language again as it once 
was. To limit the language and prevent this from happening will create a 
version of Smalltalk that simply only addresses the needs of a small 
segment of the market.

Concurrency control issues are a very important aspect of any general 
purpose programming language. To limit the solution space to a tiny 
corner of solutions would be a mistake by design.

Certainly making concurrency easier and foolproof is a laudable goal. 
However the cost might be too high a price if it's not done well or if 
it alters the language beyond its current shape.

One of the reasons that I'm implementing a new language, ZokuScript, is 
that it does change the paradigm beyond that of Smalltalk. Keeping 
connected with Smalltalk is done via ZokuTalk. However the execution 
engine (not a virtual machine) will translate ZokuTalk (i.e. Smalltalk) 
into ZokuScript and then compile it to native code. ZokuTalk and 
Smalltalk are subsets of ZokuScript which is a fusion of many ideas and 
concepts from other languages and - above all else - application 
requirements.

The erlangification (erlangization, or erlangisation) of Smalltalk may 
be a radical enough transformation that it's no longer Smalltalk. If 
that's the way of Squeak that's fine however it seems that a fork is 
likely the result (and yes, the pun of forking was intended).

Since the driver is the requirements and not just technology awe, what 
are the requirements for concurrency in Squeak and in Smalltalk (since 
Squeak is diverging from Smalltalk more and more)?

Inventing the future is fun and hard work. Which future are you inventing?

All the best,

Peter William Lount



More information about the Squeak-dev mailing list