[squeak-dev] binary development (was: 3.11 and the trunk)

Eliot Miranda eliot.miranda at gmail.com
Tue Aug 25 20:28:55 UTC 2009


On Tue, Aug 25, 2009 at 12:35 PM, Igor Stasenko <siguctua at gmail.com> wrote:

> 2009/8/25 Eliot Miranda <eliot.miranda at gmail.com>:
> >
> >
> > On Tue, Aug 25, 2009 at 3:57 AM, Igor Stasenko <siguctua at gmail.com>
> wrote:
> >>
> >> 2009/8/25 Eliot Miranda <eliot.miranda at gmail.com>:
> >> >
> >> >
> >> > On Wed, Aug 19, 2009 at 6:56 PM, Igor Stasenko <siguctua at gmail.com>
> >> > wrote:
> >> >>
> >> >> 2009/8/20 Eliot Miranda <eliot.miranda at gmail.com>:
> >> >> > Hi Igor,
> >> >> >
> >> >> > On Wed, Aug 19, 2009 at 6:00 PM, Igor Stasenko <siguctua at gmail.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> 2009/8/20 Jecel Assumpcao Jr <jecel at merlintec.com>:
> >> >> >> > Colin Putney wrote on Wed, 19 Aug 2009 14:25:21 -0700:
> >> >> >> >> On 19-Aug-09, at 10:15 AM, Jecel Assumpcao Jr wrote:
> >> >> >> >>
> >> >> >> >> > For example, I would far prefer to see Squeak move to a binary
> >> >> >> >> > based development model (I would mention Projects and Etoys
> >> >> >> >> > here) than the current source based things we are doing
> >> >> >> >> > (trunk, bob or whatever).
> >> >> >> >>
> >> >> >> >> Forgive me for seizing on a throw-away comment like this, but
> >> >> >> >> would you mind expanding on this a bit? Are you saying you
> >> >> >> >> prefer something spoonish, where CompiledMethods are passed
> >> >> >> >> directly from image to image? Something else?
> >> >> >> >
> >> >> >> > Heh, I got asked about this on IRC as well. Though I had
> >> >> >> > actually started to explain this a little in the original email,
> >> >> >> > I ended up deleting it to keep on topic. With a new subject line
> >> >> >> > I don't feel I have to worry about that. Some details about this
> >> >> >> > (with a few drawings) can be found in the Chunky Squeak wiki
> >> >> >> > page:
> >> >> >> >
> >> >> >> > http://wiki.squeak.org/squeak/584
> >> >> >> >
> >> >> >> > The idea is to be more like the Etoys users which can load
> >> >> >> > binary projects containing not only the code they need but also
> >> >> >> > hand crafted objects which have no source (like a drawing, some
> >> >> >> > nested Morphs or even some text). This is very simplistic
> >> >> >> > compared to Spoon, and my proposal was even more simplistic. In
> >> >> >> > particular, this doesn't handle the case where any changes to
> >> >> >> > bytecodes or object format are needed.
> >> >> >> >
> >> >> >>
> >> >> >> The central question, which arises immediately, is: what is a
> >> >> >> credible way to reproduce such artifacts?
> >> >> >> When we have source code, we can (re)compile it on a different
> >> >> >> system. But what do you propose to do with pure binary data, a soup
> >> >> >> of objects, given that it is incredibly hard to understand which
> >> >> >> bits you need and which you don't when you need to clean up,
> >> >> >> refactor, rewrite or simply analyze what is happening?
> >> >> >> This is what makes a huge difference, for instance, between
> >> >> >> applications with open source code and applications shipped in
> >> >> >> binary form - you can only report bugs, but can't really make any
> >> >> >> suggestions about what is happening.
> >> >> >> I don't think that developers of Squeak should be victims of such
> >> >> >> situation(s).
> >> >> >
> >> >> >     it is possible to have your cake and eat it too.  One can create
> >> >> > a binary format that includes source and includes the meta-source
> >> >> > for its creation.  But including a binary representation allows much
> >> >> > faster loading, loading without a compiler, and source hiding if one
> >> >> > chooses not to include the source.
> >> >> >
> >> >> > There are other advantages, such as not cluttering up the changes
> >> >> > file when one loads a package.  In the VW parcel system, to which I
> >> >> > added source management, we replaced the SourceFiles with a
> >> >> > SourceFileManager whose job was to manage the sources and changes
> >> >> > file and an arbitrary number of source files for parcels, the binary
> >> >> > format.  In
> >> >> > the parcel file the source pointers of compiled methods are the
> >> >> > positions of their source in the parcel source file.  When one loads
> >> >> > a parcel the SourceFileManager adds the file to its set of managed
> >> >> > files and assigns an index for the source file.  The parcel loader
> >> >> > then swizzles all the source pointers so that they include the
> >> >> > source file index along with the position.  So accessing the source
> >> >> > for a method loaded from a parcel accesses that parcel's source
> >> >> > file.  We used a floating-point like format for source pointers,
> >> >> > where the exponent was the source file index, and the mantissa was
> >> >> > the position in the file.
> >> >> > We didn't create a single file format, having two separate files for
> >> >> > binary and source, which is probably a mistake.  A format with a
> >> >> > short header, followed by source, followed by binary, followed by
> >> >> > metasource, would be easier to manage than three separate files.
> >> >> > We didn't include any metasource, but we did include pre-read, load
> >> >> > and unload actions.  I did a very bad job on version numbering and
> >> >> > prerequisite selection.
> >> >> > That's not the whole story but enough to start answering your
> >> >> > question.  If there is a well-defined definition of the objects in a
> >> >> > package and that definition is included in the package as
> >> >> > metasource, then one can comprehend the binary package's contents by
> >> >> > examining the metasource and can reproduce creating the package,
> >> >> > provided that the tools are careful to impose ordering, etc.
> >> >> > best
> >> >> > Eliot
> >> >>
> >> >> I think you inevitably made wrong decisions, because you went this
> >> >> way by allowing arbitrary binary data to be held by a package.
> >> >> In such situations it is much easier to make mistakes.
> >> >> But sure, the one who makes no mistakes is the one who does nothing :)
> >> >
> >> > We didn't disallow representation of arbitrary data but we also didn't
> >> > support it.  The only thing the Parcel system supports (as in the tool
> >> > set, rather than what one can extend the framework to do in specific
> >> > circumstances) is to represent code, which it does very well.
> >> > What are these mistakes?  Can you be specific?  I think the parcel
> >> > system has been a major success.  VW is now deployed as a system of
> >> > components, the base image and a much larger suite of parcels.  Parcels
> >> > are not tied to a particular version or implementation and yet are
> >> > still fast to publish and load.  What's not to like?
> >>
> >> I referred mainly to your own statements about mistake(s).
> >
> > Ah, ok,  Sorry :)
> >
> >>
> >> I don't know enough about parcels to tell exactly where the flaws are.
> >> I'm still wondering how you could unload a parcel if it's no longer
> >> needed, but there are still object(s) used/created by the parcel sitting
> >> in the image.
> >
> > Smalltalk has this problem with or without binary loading; they're called
> > obsolete classes :)  However, the problem of knowing what to remove when
> > the user says "unload" means that a loaded parcel requires a data
> > structure that names the classes and methods it loaded.  In addition we
> > maintain overrides, the older versions of methods and class definitions,
> > in a stack, so that these can be restored when unloading a parcel.  I made
> > lots of mistakes here (not allowing the tools to publish a parcel that has
> > code overridden by others, not integrating source management and browsing
> > queries with overridden code, not compressing the changes correctly with
> > overridden code, etc, etc).  Tests would have helped :/
> > VW did (does?) test for open instances of applications when we unload a
> > parcel so that if the parcel contains a subclass(s) of ApplicationModel
> > (VW's top-level GUI app class) all open applications are tested to see if
> > they contain instances of the class(es) and a warning is issued.
> >>
> >> A basic use case is: a developer needs some specific tool (like a UI
> >> design tool) while he is working on an application. But at the moment
> >> when he ships the application, it is no longer needed.
> >
> > Right.  I don't know of an automatic solution, but a good convention is
> > to split all packages into a development and deployment pair where the
> > deployment half is a prerequisite of the development half.  Sticking to
> > the convention and using good names makes it easier to remember to remove
> > development components and to guess which parts of someone else's
> > components are development only.
>
> Yes, and this is what I really miss in Smalltalk-80 based
> environments: a distinction between development
> and deployment modes & models.
> It would be cool to have some basic things behave differently when in
> deployed mode (like preventing access & data overrides).
> The main problem in an open system (such as a Smalltalk object memory) is
> that when something goes wrong, you often
> have two choices: reboot the system or debug and fix the problem in
> a living environment.
> Often, neither choice is acceptable, because if we are talking
> about an end-user application, we don't expect that
> the user is able to debug & fix the issue. And rebooting an image
> means loss of data and/or interruption of serving other jobs.


Yes, I agree.  One quite nice thing the headless support in VW allows is
taking a snapshot which can then be restarted in headless mode for
debugging.  The snapshot can easily be mailed or ftp'ed back for analysis.

Not quite the same, but very neat:  The other day at Qwaq Craig Latta had a
VM crash while running in a Parallels Linux VM under gdb.  He was able to
give me a copy of the VM snapshot at the point where gdb stopped the
process, giving me the opportunity to debug the live app at my leisure.  A
cool idea.


>
> But if the system were modelled in modular layers, like kernel -> services ->
> interfaces -> working set, then things
> would be much easier to handle.


Yes, yes, yes!!  The system should be like an onion where each layer of the
onion is a set of interlocking tectonic plates of modules of functionality.


>
> > I added a bulk instancesOf primitive that answered all instances of an
> > Array of classes that my colleague Steve Dahl wanted to use in instance
> > migration on class redefinition.  This could be used to look for all
> > instances of the classes defined by a parcel prior to unload.  Do a GC,
> > collect all instances of classes defined (rather than redefined) by a
> > parcel and warn if non-empty (if in a dev image).
>
> I think that independent tiny layers (isles/vats) are the future of system
> organization in Smalltalk-like VMs.
> First, it gives a strong answer to the question of what belongs to what.
> There is no possibility of referencing a foreign object
> other than by far ref. You can count/enumerate them easily, and this
> approach also makes it possible to run code in vats concurrently.
> The problem here is how to handle shared behavior, like Arrays,
> Collections etc, in order to avoid duplication. Since in Smalltalk
> everything is an object, methods & classes included, they can belong
> only to a single island/vat, and therefore only the owning island can
> manipulate them. This creates a major bottleneck for an efficient
> implementation of concurrently (and independently) running code.
> Trade space for speed? Allow each island to have its own Array class with
> its own implementation?
> This question remains open for me.


Yes, this is a cool radical idea that I haven't got my head around yet.  I
need to think about this at length.  The obvious approach to the duplication
is copy-on-write, where any modifications to the root Array class get
propagated to the copies, assuming there is some hierarchical control
organization.  I think this approach is taken in Alex's worlds, where
modifications to a parent world are seen by its children.  But then the merge
problem rears its head when trying to propagate modifications to a child
that has made its own local modifications in the same region.
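
As a rough sketch of the copy-on-write idea and of where the merge problem
shows up (in Python for brevity; World, define, lookup and conflicts are
hypothetical names made up for illustration, not taken from Alex's worlds
or from any existing system):

```python
class World:
    """Hypothetical 'world' holding definitions, e.g. methods of a shared class."""

    def __init__(self, parent=None):
        self.parent = parent
        self.local = {}  # selector -> (version, definition) written in this world
        self.base = {}   # selector -> parent version seen at our first local write

    def _entry(self, selector):
        # Walk up the parent chain until some world has its own copy.
        world = self
        while world is not None:
            if selector in world.local:
                return world.local[selector]
            world = world.parent
        return (0, None)

    def lookup(self, selector):
        # Reads fall through to the parent, so parent changes are seen by
        # children that have not written their own copy.
        return self._entry(selector)[1]

    def define(self, selector, definition):
        # Copy-on-write: the first local write remembers which parent
        # version it diverged from; the parent itself is never touched.
        if selector not in self.local and self.parent is not None:
            self.base[selector] = self.parent._entry(selector)[0]
        version = self._entry(selector)[0] + 1
        self.local[selector] = (version, definition)

    def conflicts(self):
        # Selectors changed locally AND changed again in the parent since
        # we diverged: propagating these would need a real merge.
        return sorted(s for s, v in self.base.items()
                      if self.parent._entry(s)[0] != v)
```

A child that never writes a selector keeps seeing the parent's latest
definition; a child that has written its own copy keeps that copy, and
conflicts() only reports the selectors the parent has changed again since
the child diverged.  How to actually merge those is left open here, as it
is above.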


>
> >>
> >> >> Obviously one side of this problem is the uniform object memory,
> >> >> where each object can reference any other object, limited only by
> >> >> people's imagination.
> >> >> There are no layers or any other means which could establish certain
> >> >> barriers (which we call modules) in Smalltalk.
> >> >> It means that once you have integrated the parcel into the image and
> >> >> started using it, you may have a hard time trying to unload it.
> >> >> It is possible to develop an image as an artifact which contains both
> >> >> binary & sources, but such an approach has drawbacks, which we, by
> >> >> the way, are trying to overcome nowadays.
> >> >> Practice shows that such an approach is credible only for a small
> >> >> group of individuals, but becomes a bottleneck if you adopt such a
> >> >> scheme for a wider community.
> >> >>
> >> >> So, I think that before entering this domain (allowing binary data),
> >> >> first we should solve more basic problems of Smalltalk & its design -
> >> >> modularity, name spaces, layering etc. Only then could we return
> >> >> to the original question and solve it.
> >> >>
> >> >> --
> >> >> Best regards,
> >> >> Igor Stasenko AKA sig.