Complexity and starting over on the JVM (community process and values)

Thu Feb 7 03:07:50 UTC 2008

goran at krampe.se wrote:
>> Why can't Squeak also be for us lazy people who are
>> used to how the rest of the world does things, instead of sending us off to
>> Python or Ruby or wherever? :-)
> 
> First of all, Squeak/Smalltalk is not "how the rest of the world does
> things". If it was then it wouldn't be interesting. 

I think that is true, up to a point. Being different for the sake of being
clearly better is one thing. Being different just to be different, or to be
compatible with outdated technology, is much more problematic.

Squeak gets much of the hard stuff right (everything is an object, the GUI
is open ended, all the code is right there, the syntax is extensible in
terms of control structures, and so on), but it only gets about, say, 50% of
the easy stuff right IMHO (the GUI, the assign statement, namespaces). The
problem is that missing the easy stuff makes Squeak unusable for most
people. Kind of like a tennis player who can hit back any impossibly hard
shot but half the time flubs the easy ones because their mind is focused on
preparing for the hard things. :-)

> Sure, we can make things better and easier etc,
> and AFAIK we do try all the time.

Ten or more years later, when I still see many of the same problems
(underscore and assign, licensing, fonts, modularity, newbies confused by
the GUI, etc.) I'm not sure I can agree with that, at least as it applies to
the (theoretically) easy stuff. :-) Of course, if Squeak flubs some of the
easy stuff it must be harder than it looks. :-) Why is that?

Consider this:
   "Debugger use"
http://www.cincomsmalltalk.com/blog/blogView?showComments=true&entry=3237275853
"""
Frans Bouma discusses "edit and continue" in development:
    Debugging isn't about searching for forgotten quotes or a ';' at the
wrong spot. It's about a totally different thing. Let's categorize some
types of bugs to make understanding how to fix them a little easier, shall we?
       1. Functionality bugs. These are the ones at the highest abstract
level: in the functionality the software has to provide. An example of this
kind of bug is the ability to execute script in an email in Outlook
(Express) and enable that feature by default.
       2. Algorithmic bugs. These are the ones at the abstract level below
the functionality bugs. An example of this kind of bug is the (now patched)
flaw in Microsoft's TCP/IP stack which marked the TCP/IP packets with
numbers that weren't random enough which then could lead to data exposure
via sniffing. The code was good, the algorithm used was bad.
       3. Algorithm implementation bugs. This is the kind of bug you'll see
when an algorithm is implemented wrong. This type shouldn't be confused with
the next category, however. Algorithm implementation bugs are bugs which
originate in the developers mind when the developer thinks s/he understands
how the algorithm to implement works and starts cranking out code, however
the developer clearly didn't fully understand the algorithm so a piece of
code was written which will not function as expected, although the developer
thinks it does.
       4. Plain old stupidity bugs.. Everyone knows them: forget to update a
counter, add the wrong value to a variable, etc. etc.
"""

Let me continue counting down from 1 and add a 0 and a -1 to Frans Bouma's
list of bug types, related to writing software as a free and open source
software community. :-)

=== 0. Community process bugs.

These are ones where useful functionality is developed in the community and
then lost, or where confusing, useless, or overly complex functionality is
adopted and not later rejected or refactored when it is not working out
(Traits?). These bugs can happen from lack of communication, lack of person
power, dysfunctional community culture, or as the legacy of historical bad
decisions. Often these problems show up when the core of a community is
unstable for some reason or overall project complexity is untamed, so people
are always fighting fires (often the same fires, repeatedly) instead of
focusing on new and exciting things at the edges (where they don't conflict)
or even just adding fire suppression apparatus and smoke detectors
(regression tests?). Bitrot is a good indicator of this problem.

Personally, I feel the Squeak licensing issue is also an example of this
problem (and as long as, say, Disney does not sign off on the core, and
they probably never will, is this licensing issue ever really going to be
resolved?). See my comments here:
  "Belling the cat of complexity (License issues)" Jul 2 18:55:05 CEST 2000
http://lists.squeakfoundation.org/pipermail/squeak-dev/2000-July/007879.html
or here:
  "Belling the cat of complexity (License issues)" Mon Jul 3 23:07:01 CEST 2000
http://lists.squeakfoundation.org/pipermail/squeak-dev/2000-July/010097.html
At which point the "pro bono" lawyer wrote:
> I believe this colloquy has more than exceeded the patience and
> tolerance level of our colleagues.  I will respond to this, after
> which I would suggest instead taking this offline.
(I think my email might have got filtered by his company after that, as I
got bounces. :-) How many years has Squeak suffered under an unusual
not-quite-free-or-open-source license? How much trouble has it caused (like
not getting into Debian main or not having Squeak people at conferences)?
And still causes? How much cheaper would it have been for the community to
have fixed this seven years ago? Clearly this was a serious community process
*bug*. :-) As much as I can agree with Alan Kay when he says that copyright
laws and legal proceedings don't make much rational sense, why tempt fate?
But this licensing failure also stems from a *value* of the Squeak community,
which is my next point: thinking of Squeak as a personal tool (or small
private group tool) instead of a free and open source software community
collaboration tool.

=== -1. Community vision bugs and related priority setting bugs.

These are ones where the ideals and values of the community result in
undesired outcomes, and this typically comes about when they once made sense
but the world has changed. For example, Smalltalk was originally intended as
a personal creativity amplifier back when a 10MB disk pack was a big thing
and PCs cost $100K, but if now "the network is the computer" and "the
community is the developer" then some ideals might need to change. For
instance, the desire to use ":=" instead of a non-standard remapped
underscore for assignment flows out of a desire to collaborate more easily
with others using Smalltalk-like languages. But unless collaborative
development is part of the vision, this relatively easy changeover isn't a
priority (even after twelve years) and will be resisted by the expected
conservative tendencies of the community.
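
To give a rough sense of the kind of source rewrite such a changeover
involves, here is a minimal Python sketch. It rests on a loud assumption: that
in the source being converted a bare underscore is only ever the assignment
token (no underscores inside identifiers). The function name
`convert_assignments` is hypothetical, invented for this illustration, and a
real conversion would more sensibly go through the system's own parser.

```python
def convert_assignments(source: str) -> str:
    """Rewrite '_' assignment as ':=' (assumes '_' never appears in identifiers).

    String literals ('...'), comments ("..."), and character literals ($_)
    are copied through unchanged.
    """
    out = []
    i = 0
    n = len(source)
    while i < n:
        ch = source[i]
        if ch == "'" or ch == '"':
            # Find the end of the string literal or comment.
            end = i + 1
            while end < n:
                if source[end] == ch:
                    # Inside a string, a doubled quote ('') is an escaped quote.
                    if ch == "'" and end + 1 < n and source[end + 1] == "'":
                        end += 2
                        continue
                    break
                end += 1
            out.append(source[i:end + 1])
            i = end + 1
        elif ch == "$" and i + 1 < n:
            # Character literal: copy the '$' and the following character.
            out.append(source[i:i + 2])
            i += 2
        elif ch == "_":
            out.append(":=")
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


old_style = "x _ 3. y _ 'a_b'. \"c _ d\" z _ $_"
new_style = convert_assignments(old_style)
```

The mechanical part is small, which is the point: the obstacle is agreeing as
a community to do it, not the code.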

See for example, my post to the Squeak list on Dec 25 22:10:47 UTC 2006
"Design Principles Behind Smalltalk, Revisited"
http://lists.squeakfoundation.org/pipermail/squeak-dev/2006-December/112306.html
Excerpt on one of several issues I raise:
"So while I think Dan's original goal [of one person being able to
understand and change the whole system] is a nice ideal, in practice it is
not needed in the extreme, since the creativity of a group sharing, say, a
community mailing list, will still move beyond the creativity of any
individual in the group. So, it is *more* important in the internet age to
have techniques for supporting group creativity, including modularity, than
it is for the system to not have any barriers. That is probably one reason a
system like Python is actually used in practice by more people to do
creative things than Smalltalk. Python makes it easy for many individuals to
do small things. While Python as a language and as an environment in
practice does not scale as well as Smalltalk, the aggregate amount of all
those individuals doing small things overwhelms what any one Smalltalker can
do (or even a small group of them stepping on each others' toes and watching
their contributions suffer from "bit rot")."

Or as I wrote here:
 http://mail.python.org/pipermail/edu-sig/2007-March/007822.html
"There is once again the common criticism leveled at Smalltalk of being
too self-contained. Compare this proposal with one that suggested making
tools that could be used like a telescope or a microscope for relating to
code packages in other languages -- to use them as best possible on
their own terms (or perhaps virtualized internally)."

Those are simply different ideals than Squeak (or Smalltalk in general)
started out with. And they are ideals we are able to have now in part
because of the very *success* of Smalltalk and Dan's and Alan's (and
others') work and ideals over the last few decades, which have allowed us to
grow from working in proprietary isolation to working in a free community.

> Secondly, and don't take this the wrong way - I am all for everyone
> speaking his/her mind on *all* issues - but it is very easy to talk and
> quite a bit different to do the walk. So along that thought - what would
> *you* be prepared to help out with in all this? Because that is what it
> will come down to at the end of the day. :) :)   (extra smileys added)

Fair enough question. And my admittedly perhaps too harsh reply to Keith on
the value of non-developer contributions aside (sorry, Keith), I would not
have raised this specific issue if I had not already put some work into it
and were not willing to put in some more.

I've already spent several person months trying to make Python more
Squeak-ish (and Self-ish). I'm not happy with the results, though I think it
was a productive experiment. See:
  http://patapata.sourceforge.net/critique.html
Essentially, I've also hit the limits of Python not being designed for
dynamic development (as in, modifying the code while it is running, like in
the debugger). So, I can either improve Python's core if I continue down
that route of making it more Squeak-ish, or I can instead try to bring some
of the flavor of Python (especially modularity) back to an already dynamic
Squeak. Or I can just give up on it and use an existing JVM language
(Jython, Scala, Groovy, JavaScript, etc.). :-)
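
A minimal sketch of that distinction, in plain Python (the class and function
names are made up for the example). Rebinding a method on a class at runtime
does work, and existing instances pick up the change on their next call. But
anything that already holds the old code, such as a bound method captured
earlier, keeps running the old version. That is a small-scale stand-in for a
frame already executing in the debugger, not a demonstration of a debugger
itself.

```python
class Counter:
    def __init__(self):
        self.count = 0

    def step(self):
        self.count += 1
        return self.count


counter = Counter()
old_step = counter.step        # bound method captured before the change
first = counter.step()         # runs the original method

def step_by_ten(self):
    self.count += 10
    return self.count

# Replace the method on the class while the program is running.
Counter.step = step_by_ten

second = counter.step()        # existing instance sees the new method
third = old_step()             # the earlier reference still runs the old code
```

Here `first` is 1, `second` is 11, and `third` is 12: the live object picked
up the new behavior, while the previously captured reference did not.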

As another experiment over the next year or so, I'd be willing to spend the
same amount of time trying to get a Squeak-like system on the JVM (though
probably drawing from GNU Smalltalk for the core due to licensing issues),
if there was some serious interest in the idea (and maybe a little help in
testing and such by interested parties).

I've put some time into exploring the possibilities (like the original code
snippet I included at the start). And I'm somewhat familiar with JVM
languages from using Jython and looking through its internals. For reference:
"Re: Exploring & decompiling jython _pyx0.class files"
http://osdir.com/ml/lang.jython.devel/2005-11/msg00045.html
"Essentially, this is a somewhat inside-out abstraction from the one I am
used to on debugging through something like the Squeak Smalltalk VM (or
other VMs I have written). I was expecting to see the machinery of stepping
through bytecodes here in this function as Java moves into interpreting
Jython and starts running a bytecode interpreter. What is really happening
is that the Java part of the Jython interpreter application is already
running on top of the same bytecode interpreter that it is going to use to
process the compiled Jython source code, as it has previously compiled the
Jython code to Java bytecodes. So, when the source disappears, the JVM is
leaping into this newly created Jython-source-derived bytecode. So I never
see a stepping through bytecodes. Presumably if I used a different sort of
debugger or understood the Eclipse one better, I could see this bytecode
stepping, but there would still be no obvious transition between the Java
based InteractiveInterpreter and supporting code and the Jython code. This
all may seem obvious now, but coming from experience writing or debugging
byte code type VMs, it is not what I expected. But obviously it is a good
way to go, especially for speed, and a big part of Jython's magic. "

I think what I have in mind would not be *Squeak* as you know it (i.e.
Smalltalk to the metal) but it could be done with an eye to running
Squeak-like software (like Croquet or eToys or whatever). It would draw from
ideas similar to Spoon or Athena (minimal core, remote development) or Self
(embedded compiler) in some ways, and I already have a Smalltalk-ish parser
in Jython which could be used for bootstrapping (previously linked in
another post).

I could even perhaps just reuse the PataPata SourceForge site
  http://sourceforge.net/projects/patapata
to get started. PataPata means "touch touch" in Xhosa and was supposed to
relate to supporting open ended, end-user-modifiable software (for
non-experts), especially intended for learner-directed self-education like
eToys or HyperCard or Microworld simulations and so on. The current blurb on
SourceForge is: "PataPata is an experiment to support "learner-directed"
educational constructivism and "multi-user" stigmergic connectivism on the
Java platform, using Jython and other JVM languages. It is inspired by
Squeak, Design Science, Augment, and Sugar." It could always be moved
somewhere else or renamed if it develops some critical mass. That definition
is now a bit more general and JVM oriented than what PataPata started out
with, which was originally more Python-centric.

But it seems like with Dan's announcement a lot of the thunder is gone from
my suggestion. Even if Dan doesn't have as hip a name as "PataPata" yet. :-)

I guess a corollary to Dan's(?) "Just do it and you're done" is maybe "Just
be patient and Dan will probably do it if it's a good idea". :-)

Still, Dan's result (as I understand it, not having seen the latest) doesn't
fix for me some of the worrisome issues of Squeak, which are bugs at levels
#1, #0 and #-1. :-) Talks2, another Squeak-derivative for the JVM, has some
of the same issues. These include licensing of the core, factoring of the
core including namespaces, GUI expectations, Java interoperability, building
the image from source, lack of multi-jvm-programming-language development
tools, and so on (I've mentioned some other issues in other replies).

And to get those right, rebuilding from scratch seems to me to be
the way to go. To be clear, by "from scratch" I really mean maybe, say, 75%
of the time importing existing Smalltalk code one bit at a time from
GNU/Smalltalk or Squeak, of course respecting the license and perhaps using
or writing software that helps track that. The missing 25% would probably be
interfacing with Java, including probably using Java's threading and
concurrency model. I don't really know what those percentages would be, of
course. That's just an initial guesstimate. I could hope imports are closer
to 95% with a little tweaking. The current Squeak relicensing initiative may
be good enough IMHO to allow relicensed Squeak-ish packages (including the
debugger and browser, etc.) to be used safely enough at the edges of a core
mostly brought from GNU/Smalltalk. In that way, working from a good faith
intent not to infringe anything, the downside risk of a claim against
Squeak's copyright is most likely just having to have someone else
rewrite a class browser or some such tool, something which a community
could do much more easily than if there were extensive legal problems with
the core. (Although I would be relying on GNU/Smalltalk being clean. :-)

That level of effort would probably be just enough to get something to the
point where the community could decide: is this worth using or porting more
Squeak-ish stuff to? Or, in other words, does it feel Squeak-ish enough to
be worth using? Either way, I am trying to fix the #0 and #-1 bugs I list
above, by cleaning up the core step-by-step and by broadening the community
orientation of the software. Still, Spoon has been worked on for years and
not gotten the attention it deserves IMHO, so I'm not sure this project
would either, if it even got as far as Spoon on the JVM. (For me, Spoon
still has the licensing issue. :-) Squeakers are a tough crowd to please,
given what they are used to. But then so am I. :-)

Anyway, that is how I see it. It still might not be worth doing if people
here don't like the idea or don't see the value in interoperability. Jython
and other JVM languages are pretty nice even if they are not as dynamic as
Squeak, and I've already had success with them.

I can also understand other people have different perceptions of risk vs.
reward and different levels of investment and comfort with Squeak-as-it-is.

--Paul Fernhout