Modular != minimal (was Re: [squeak-dev] Loading FFI is broken)

Eliot Miranda eliot.miranda at gmail.com
Fri Nov 15 20:38:24 UTC 2013


Hi Frank,


On Fri, Nov 15, 2013 at 2:14 AM, Frank Shearar <frank.shearar at gmail.com> wrote:

> On 15 November 2013 02:54, Chris Muller <asqueaker at gmail.com> wrote:
> > On Thu, Nov 14, 2013 at 4:27 PM, Frank Shearar <frank.shearar at gmail.com>
> wrote:
> >> We talk past each other every time we have this argument.
> >
> > Not every time -- I've learned a few things from y'all in this
> community.  :)
> >
> >> On 14 November 2013 20:47, Chris Muller <asqueaker at gmail.com> wrote:
> >>> I know module-heads like to say it's all about modularity and not size
> >>> but I think it being about size is unavoidable.  (And, when I say
> >>> "size" I'm not only talking about disk and memory but also
> >>> coherence, which is a valuable thing).
> >>>
> >>> Because otherwise "so what" if FFI includes the constants and VMMaker
> >>> depends on it solely for that portion of it?  How many methods making
> >>> up FFI are we talking about?  There are plenty of _other_ methods in
> >>> the image which are not being used by VMMaker, what about them?
> >>>
> >>> Acknowledged or not, at some point we're forced to strike a balance
> >>> between the number of extra methods and the number of extra packages.  The
> >>> hand-made-micro-packages approach puts these two metrics in inverse
> >>> relation to each other, trading domain complexity for package complexity.
> >>
> >> We can argue about the granularity of the packages. I don't really
> >> care about that. I argue about small packages in the base image only
> >> because you cannot break the cycles without distinguishing between the
> >> parts.
> >
> > Yes, we're in agreement that should be a criterion for determining
> > package boundaries / granularities.
> >
> >> Please, please show me that I'm wrong so that I stop tilting at
> >> the tangled web of windmills. Just take System. That would be a good
> >
> > Ha, I knew it!  You ALWAYS pick "System" every time we have this
> argument.   :)
>
> It's one of the most egregious offenders, so I'm bound to pick on it :)
>
> >> start. Show me how System makes sense as a package. Because all I see
> >> is a big fat mess of separate things that have no business being
> >> together. Projects? Change notifications? UI? Serialisation?
> >
> > "big fat mess" and "no business being together" are size / coherency
> > judgements.  Busted!  :)
>
> I use a pejorative term here only because System's (probably) the
> worst entangler we have. I want the Squeak base image to be like a
> layer cake. If you draw the dependencies between packages, and Kernel
> sits at the bottom, then all the dependency arrows point either
> sideways (with no cycles) or downwards. System is like a giant
> pineapple sitting in the middle of that cake. It cuts across these
> various layers, because it provides high level functionality (no good
> examples off the top of my head because I'm at work and don't have an
> image open - Project maybe?), low level functionality (change
> notification), and "support" stuff (which largely looks like a "useful
> things that we don't know where else to put" bucket).
>

First let me stress that I agree strongly with your desire to see the
system properly modularized.  Personally I like the image of an onion, but
then I like clams in white wine with onions and coriander much more than I
like cake.  Second, I *think* it's impossible with a system like Smalltalk
to meaningfully onionize the core of the system (i.e. System).  That's
because it is recursively implemented.  None of the core libraries
(arithmetic, collections) can exist without the kernel execution classes
(behavior, method dictionary, compiled method); none of the core execution
classes can function without the core libraries (method dictionary is a
collection, arithmetic is used throughout the core system).
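
To make the circularity concrete, here are a few expressions one can
print-it, one line at a time, in a workspace (class and selector names as
in a current trunk image; adjust if yours differs):

   Object methodDict class.                    "=> MethodDictionary"
   MethodDictionary inheritsFrom: Collection.  "=> true: the execution machinery is built on the collection library"
   (Object >> #printString) class.             "=> CompiledMethod: code is itself an object"
   SmallInteger includesSelector: #+.          "=> true: arithmetic is ordinary methods on an ordinary class"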

Observe that having a separate development environment such as Spoon
doesn't change much here.  We can easily extract the compiler (and a binary
loader) from the system; it is then inert (cannot add more code) but still
functional.  We can use Spoon to create methods in one image and upload
them to another.  But within an image there will always be circularity
which is fundamentally to do with the system being implemented in itself,
with everything (including code) being objects.  And this property of
everything being an object is the single most valuable property of the
system; it leads to the system's liveness.  So at some stage we have to
accept the tangle that lies at the heart of the system.  It doesn't have to
be a tangled mess, and can have clean boundaries.  But IMO at the core of
the system there will inevitably be some small number of packages which
are interrelated, inseparable and not unloadable.

I'm probably teaching you to suck eggs but I had to let that brain fart
free.


> > The truth is, I'm pretty sure we've agreed about System for a while.
> > If all dependency cycles could be removed, I wouldn't care so much
> > about System being "big and fat" because I see it as the "Smalltalk
> > programming system", but I think the cycles probably won't be able to
> > be eliminated without breaking it up and so it's moot to disagree on
> > System anyway.
>
> Parts of System depending on other parts of System in a cyclic manner
> ("intra-package" cycles) don't matter. Unless you try to break up the
> package, of course, in which case you can't load the parts without
> weird preambles and non-MC-friendly things.
>
> I understand why you see System as being "the Smalltalk programming
> system". I'm trying to untangle what exactly "the Smalltalk
> programming system" actually means, and how it's built. I suppose I'm
> looking at the packages through a microscope?
>

As you well know that won't work.  We need to look at it in large chunks.
For me it's the core class libraries (essentially Object (so one can add new
classes that integrate with the system), arithmetic, collections and
streams), the execution classes, environments (Smalltalk, SharedPools),
exceptions and some base error reporting facility.  Above that one can add
System (managing the evolution of the environment) & Compiler.  Above that
the programming tools.  Etc.
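
If it helps to see that layering written down, here is a rough workspace
sketch of it as a simple dependency map; the package names are just labels
for the chunks above, not actual Monticello packages:

   | layers |
   layers := Dictionary new.
   layers at: 'CoreLibraries' put: #().                         "Object, arithmetic, collections, streams"
   layers at: 'Execution'     put: #('CoreLibraries').          "behavior, method dictionary, compiled method"
   layers at: 'Environments'  put: #('CoreLibraries' 'Execution').
   layers at: 'Exceptions'    put: #('CoreLibraries' 'Execution').
   layers at: 'System'        put: #('CoreLibraries' 'Execution' 'Environments' 'Exceptions').
   layers at: 'Compiler'      put: #('CoreLibraries' 'Execution' 'Environments' 'Exceptions').
   layers at: 'Tools'         put: #('System' 'Compiler').
   layers

Every arrow points downwards; the cycles live only inside the bottom-most
chunks.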

This works (I think) by looking at functionality.  Smalltalk is a
programming system used to express programs.  The most elemental programs
use arithmetic and/or collections; the next most elemental programs add new
classes rooted at Object.  All these elements are themselves expressed as
objects.  These programs may run, and in doing so may raise errors which
need to be reported.  Note that I *haven't* included how those programs are
created in the elemental soup.  Whether the image just is, or whether code
is created by the Compiler/ClassBuilder, or loaded via a binary loader
isn't elemental; the fact that there are at least three ways to go about it
proves this.
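
For the sake of argument, an "elemental program" in that sense need be no
more than the following (a workspace sketch; the Accumulator class is
invented for the example):

   | acc |
   Object subclass: #Accumulator
       instanceVariableNames: 'total'
       classVariableNames: ''
       poolDictionaries: ''
       category: 'Example-Elemental'.
   (Smalltalk at: #Accumulator)
       compile: 'add: aNumber
           total := (total ifNil: [0]) + aNumber.
           ^ total'.
   acc := (Smalltalk at: #Accumulator) new.
   #(1 2 3) do: [:each | acc add: each].
   acc add: 'oops'.   "raises an error the base image must at least be able to report"

It uses arithmetic, a collection, a new class rooted at Object, and ends in
an error that wants reporting; note that it happens to reach that point via
the Compiler/ClassBuilder, which is exactly the part I'm arguing isn't
elemental.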

So IMO if you want to break the system into modules you first come up with
a model of the system's functionality, and you design modules around that
model. All forms of shrinkage, unloading, loading, compiling, remote
debugging, etc, etc are merely arcs in the functionality model.  IMO an
elegant functionality model is one where the atom of functionality is small
and can easily be used to create other compound functionalities in as few
steps as possible.

So a base headless image with arithmetic, collections, file streams,
execution classes, exceptions, compiler, read-eval-print-loop and error
reporting to standard out seems close.  Another could be arithmetic,
collections, file streams, execution classes, exceptions, error reporting
to standard out, a binary loader and a command-line argument parser such
that one can specify packages to load.  Another might be arithmetic,
collections, file streams, execution classes, exceptions, and error
reporting to standard out, which requires Spoon to load code into it.
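
For the first of those bases the read-eval-print-loop need be little more
than the following sketch (it assumes the stdio FileStreams and Compiler
evaluate: we have in trunk, and does no error handling at all):

   | line result |
   [ FileStream stdout nextPutAll: '> '; flush.
     line := FileStream stdin nextLine.
     line notNil and: [line ~= 'quit'] ]
       whileTrue: [
          result := Compiler evaluate: line.
          FileStream stdout print: result; cr; flush ]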

Once one's chosen one or more of these bases then other functionalities
such as a squeak trunk image with the programming tools and morphic, or a
squeak trunk with MVC, or a headless scripting environment with a
read-eval-print-loop and lots of file system utilities, can be constructed
by loading modules, and hence those modules can be derived from what it
would take to construct a functionality.

If I'm missing the point of this discussion forgive me, but it seems to me
that without a clear notion of what the atom is there's endless scope for
confusion and delay.


> > You had brought up a one-class package, which is what got this going
> > this time..  :)
>
> Yes, I did kinda deliberately do that, didn't I? :)
>
> >>> This is why I want Spoon to make micro-packaging less important.  Let
> >>> the machine imprint a truly "optimal", application-specific image
> >>> that no amount of human wrangling could ever come close to.
> >>
> >> Shrinking is useless. You have no idea what you deploy. I _do not
> >
> > Dabble DB wanted to run hundreds of images.  I also have cases
> > where I need to run many images.  For that, shrinking is not useless.
>
> By "useless" I mean "you have no idea what makes up a running process,
> except by actually inspecting that process."
>
> > The idea of Spoon is to deploy only one single "fat" image (with
> > everything you know you need and more) from which as many minimal
> > images can imprint from.  Since they only download methods they need,
> > as they get called, memory usage is optimized.
>
> Well. That inflate-as-needed approach requires you to have a persistent,
> reliable network connection between the deployed artifact and some
> server somewhere. That's pretty much exactly the opposite of what I
> consider sane deployment practice. I know I'm taking a really strong
> stance here, and I apologise if I end up sounding harsh. (I can
> sort've half-see a possible use case where the "single fat image" is
> on the same machine as the mini images... in which case I'd rather see
> the mini images constructed explicitly.)
>
> In particular, the kind of thing that I'd like to see is an automated
> and explicit build process that assembles some binary artifact. That
> goes into CI, which throws stones at it. If that binary artifact
> passes muster, it's turned into a Debian package (substitute the
> equivalent concept for your platform) with a hard version.
> That package is then deployed to the target machine, and in prod you
> know _exactly_ what your server's running. (An alternative approach
> would be to use Docker, in which case you assemble and test a "virtual
> machine lite" that you can just start running on your target machine.)
>
> The first step here is actually being able to assemble that artifact,
> and that comes down to a ConfigurationOf/Installer script that takes a
> well-known base image and builds it up to whatever you need.
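
(For what it's worth, the Installer flavour of such a script can be as
small as the following; the repository, project and package names here are
just placeholders:

   Installer ss
       project: 'MyApp';
       install: 'MyApp-Core';
       install: 'MyApp-Tests'.

The point being that the script, not the image, is the record of what
went in.)
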
>
> Now my obsession is applying this same process to the base image itself.
>
> >> care_ about this theoretically minimal image, because otherwise I'd
> >> just copy what Guille Polito's doing, and build up a whole new
> >> object space starting with nil. _That_ is minimal.
> >
> > I'm not aware of his work..
>
> This looks like a good starting point:
>
> http://playingwithobjects.wordpress.com/2013/05/06/bootstrap-revival-the-basics/
>
> >> I've been rolling out clearly versioned code, with well-understood
> >> dependencies, for years now in every language I know except for the
> >> one I love the most. And in every one of these languages (C#, Ruby,
> >> Java, Scala) I have had _no_ serious pain in managing dependencies.
> >> You define your immediate dependencies, and you're _done_.
> >
> > Yes, we seem to be getting there.
>
> We are, slowly.
>
> frank
>
>


-- 
best,
Eliot