[SF]VM building project third report

Ian Piumarta ian.piumarta at inria.fr
Thu Jun 13 20:36:34 UTC 2002


Hi Lex,

In a few lines' time I'm going to explain the motivation behind the use of
individual subdirs each with their own "make" in the new process.  (These
aren't really recursive makes, for two reasons, which I'll get to
shortly.)  But first...

> Have you read the following page?
> 
> 	http://www.tip.net.au/~millerp/rmch/recu-make-cons-harm.html
> 
> I don't want to repeat all of it here.

His arguments are flawed (in the context of building Squeak) for two
reasons.  First is that, in order to get the flexibility of recursive make
within a single makefile, he assumes GNU make -- and we don't
have that luxury.  Second is that all of the shortcomings of recursive
makes that he cites stem from one central cause: partitioning the
dependency graph across multiple edges (i.e., at a point where there are
targets "lower down" in the graph that are dependencies for *more than
one* target "higher up" in the graph).  In the context of Squeak this is
not the case, and never will be with the current division of subdirs into
plugins (where the core vm support is also considered a special kind of
"internal plugin").

While his arguments are valid and well-argued in general, they simply
don't apply to building Squeak.  Let me quote a few things he says:

    (Section 2)  "It must be emphasized that this paper does not
    suggest that make itself is the problem. [...] The problem is
    [...] rather in the input given to make -- the way make is
    being used."

    (Section 3.1)  "The use of recursive make [...] causes make
    to construct an inaccurate DAG, and it forces make to traverse
    the DAG in an inappropriate order."

    (Section 4)  "The above analysis is based on one simple action:
    the DAG was artificially separated into incomplete pieces.  This
    separation resulted in all of the problems familiar to recursive
    make builds. [...] Incomplete makefiles are *wrong* makefiles."

Each of the subdirectory targets in Squeak appears as the dependency of
exactly one rule in the top-level makefile, so at the point where the
build crosses into a subdirectory the DAG contains precisely one edge.  No
information in the DAG is lost because I propagate this edge to the
"recursive" make as its target.

So the makefiles I'm using are complete (they build entirely
self-contained units of the VM) and hence the DAG is complete and hence
make does not traverse the DAG in an inappropriate order.  The entire
paper is talking about artifacts related to a condition that simply does
not exist (and never will) in the Unix Squeak build process.
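
To make the shape of this concrete, here's a sketch of what the top-level
makefile looks like in spirit -- the plugin names, the link line and the
FORCE idiom are all invented for illustration, not copied from the
generated 3.2 makefiles:

    # each subdirectory product appears as a dependency in exactly one
    # rule, and the same target is handed to the sub-make, so the single
    # edge crossing the directory boundary stays in the DAG
    PLUGINS = FooPlugin/FooPlugin.a BarPlugin/BarPlugin.a

    squeak: vm/vm.a $(PLUGINS)
            $(CC) -o squeak vm/vm.a $(PLUGINS) $(LIBS)

    FooPlugin/FooPlugin.a: FORCE
            cd FooPlugin && $(MAKE) FooPlugin.a

    BarPlugin/BarPlugin.a: FORCE
            cd BarPlugin && $(MAKE) BarPlugin.a

    FORCE:

(The FORCE prerequisite just makes the top level always ask the sub-make
whether anything needs rebuilding; the sub-make's own dependencies decide.)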

I'd like to take a minute to explain why we want to use subdirectories
at all (3.1 and earlier built everything in a single directory).  The
simple answer is that it eliminates any possibility of name conflicts
between unrelated plugins.  If I have FooPlugin and BarPlugin both of
which contain "baz.c" then there's a problem trying to build
everything in one directory.  (This isn't at all far-fetched: imagine
the case where "baz.c = moduleInitShutdown.c".)

However, this still doesn't explain why separate makefiles are needed. The
short answer is that there are C compilers (or maybe it's the assemblers)
around that are incapable of placing the output file in a directory other
than the cwd.  This is why I avoided subdirectories in 3.1 and earlier.  (I can't
remember the exact reference, but I'm fairly sure it's somewhere in the
documentation for automake.)  So: if we want to avoid name conflicts we
need separate directories for plugins, and if we want to remain portable
then we must chdir to the directory in which we're building the object
before running the compiler.
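
Purely as an illustration (the plugin and file names are invented), the
resulting layout looks like this, with each plugin's objects landing in
its own directory because the compiler is run from inside it:

    FooPlugin/
        Makefile    (instantiated from FooPlugin's Makefile.in by configure)
        baz.o       (built from FooPlugin's baz.c, after "cd FooPlugin")
    BarPlugin/
        Makefile
        baz.o       (built from BarPlugin's baz.c -- no collision)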

Let me just emphasize one more time that Unix Squeak is not using
recursive make in the sense that many people understand it (or in the
sense explicitly described in the above paper).  Each makefile is
"completely configured" and entirely independent: there are no variables
inherited from the parent make, and there are no dependencies between
subdirs which would break the build by causing make to construct an
incomplete graph.  Each subdir makefile builds exactly one target which
appears as a dependency in exactly one rule of the top-level makefile.

> Of course.  I was simply appealing to you as someone who has generated
> makefiles from scripts.  Isn't it nice having a real for loop?  Isn't it
> nice to have subroutines?  Why not allow that power to plugin authors? 
> 
> This is a specific case of Olin Shivers's "little languages" argument:
> don't embed programming commands into little languages, but instead
> drive the little languages from an existing programming language.  Have
> you read it?

My day job is reflexive compilers and meta languages so I'm quite familiar
with his arguments.  But they're probably overkill for constructing a set
of makefiles.  Our problem is to determine the set of object files to link
and the specific options to pass to the compiler when building a plugin.  
Both of these can be handled by the combination of autoconf and
"personalised" makefiles.  (The thing that was missing in 3.1 and earlier
was the autoconf half of the story.) If anything more exotic is required
then it's related to generating parts of the plugin source using programs
other than cc and ld, and the only reasonable way to represent the
required actions is as makefile rules, just as for cc and ld.  At which
point being able to inject your own rules into the makefile (possibly
modified by some earlier [and arbitrary] autoconf behaviour) seems
entirely adequate. (Indeed, not firing these things from rules would
introduce precisely the same "broken makefile" problems that the "recursive
make" paper is talking about.  And the "little languages" that the second
paper is talking about are precisely those [standard] programs which you'd
want to fire from rules in a makefile.)
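
As a made-up example of what such an injected rule might look like (the
tool and file names are invented; @GLUEGEN@ is just an ordinary autoconf
substitution set up by the plugin's acinclude.m4):

    # hypothetical fragment of a plugin's Makefile.in
    GLUEGEN = @GLUEGEN@

    glue.c: glue.spec
            $(GLUEGEN) -o glue.c glue.spec

    glue.o: glue.c
            $(CC) $(CFLAGS) -c glue.c

Because the generator is fired from a rule, make still knows exactly when
glue.c is stale -- which is the point of the paragraph above.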

> Makefile.inc is an improvement, if it does what I think it does.  With
> that file, MPEGPlugin might not need a custom Makefile.in any longer.

No, it still needs it, because it only wants to compile a subset of the .c
files into the final plugin.  The difference between providing a Makefile.in
template (as is done now) and having some other mechanism (such as a
"make.files" file containing the list of things you actually want to
compile and link) is minimal.  I chose to go with the former because the
Makefile.in is tiny (it's a template in which all configuration and
dependency information is substituted automatically), the mechanism was
already in place, it's more general, and it's required for other things
anyway (so going with the second option would make the build process more
complicated, not less).
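
To give a feel for how tiny such a template can be -- the file names and
the bracketed placeholder below are invented, not the actual keywords -- a
per-plugin Makefile.in need say little more than which objects it wants:

    # hypothetical per-plugin Makefile.in
    OBJS = MpegPlugin.o video.o audio.o   # a subset of the shipped sources

    [...]   # generic rules, configuration and dependencies substituted here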

> (Which means the script vs. raw file debate becomes fairly academic!)

I'm not convinced.  These are still all good arguments.  I'm simply being
stubborn and refusing to be convinced (easily) that there are any
*realistic* cases that cannot be handled *elegantly* by a combination of
per-plugin acinclude.m4 and Makefile.in template.

> Aside from that, my scheme is about the same except that Makefile.in is
> a script.  In the simplest case you can just put "cat <<EOF" at the top
> and "EOF" at the end and you have a script.   But you can also do more
> powerful things, if you want to.  I'd think it very likely that you'd
> want that power, if you are bothering to write custom build rules at
> all.

The VM itself has custom build rules (the decision whether or not to
gnuify interp.c).  This is handled elegantly by setting a variable in
acinclude.m4 which in turn affects which of two subtargets appears in the
dependency list for the final binary.  The different actions
along the different paths are then handled by having two mutually
exclusive sets of rules that lead to one or other of the subtargets as
selected by configure.  My (pig-headed) assertion is that this is totally
sufficient for any realistic eventuality.
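
Roughly sketched -- the target names, variable names and the way gnuify is
invoked are all invented here; only the idea of letting configure pick the
subtarget is the real mechanism:

    # configure substitutes @interp_o@ as either "interp.o" or "gnu-interp.o"
    squeak: @interp_o@ $(OBJS)
            $(CC) -o squeak @interp_o@ $(OBJS) $(LIBS)

    interp.o: interp.c
            $(CC) $(CFLAGS) -c interp.c

    gnu-interp.c: interp.c
            ./gnuify interp.c > gnu-interp.c

    gnu-interp.o: gnu-interp.c
            $(CC) $(CFLAGS) -c gnu-interp.c

Only one of the two chains is ever reachable from the final binary's
dependency list, so the two rule sets never interfere.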

> Note, by the way, that this script handles three cases: internal
> compilation, external compilation, and compiling libmpeg by itself.

Just an observation: adding one line to the Makefile.in would let my
current scheme build an independent libmpeg by itself.  (Instead of
sucking $(LIBPMEG) [the list of .o files in the lib] into the $(LINK)
line along with the plugin stuff, you'd suck libmpeg.a into the
$(LINK) line along with the plugin stuff and add a rule to make
libmpeg.a from $(LIBPMEG) separately from the plugin stuff.)
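
Sketched out (keeping the variable name from the paragraph above, with
everything else invented for illustration):

    # build the library on its own, then link the archive into the plugin
    # instead of the individual objects
    libmpeg.a: $(LIBPMEG)
            $(AR) rc libmpeg.a $(LIBPMEG)
            $(RANLIB) libmpeg.a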

> Consider JPEG and libjpeg, instead.  With JPEG, it would be nice to use
> the system's libjpeg if it is present.

You chose an unfortunate example.  As far as I know the libjpeg that comes
with the plugin has been modified to work with Squeak (something to do
with working on files vs. on memory, I believe).  (The first thing I tried
when creating the 3.2 process was to install the libjpeg package and link
against the system library.  It didn't work.)

But I agree that your argument is valid.  Then again, the norm so far
(when using non-standard system libs) is not to distribute the source
along with the plugin (e.g., OpenGL and libffi).  This seems perfectly
reasonable.  (Why go to the trouble of maintaining the sources and then
having to compile them yourself when you can just install an up-to-date copy
of the library with apt-get, yup, rpm or whatever?)  So if there are explicit
sources for a lib in Squeak then I think we can safely assume they've been
modified w.r.t. the original lib, and so the problem becomes academic.

(FWIW, the plugin's acinclude.m4 could detect the lib and make a different
substitution for one variable in makefile.inc to cause the final make to
select between linking a sys lib and compiling + linking against a set of
explicit sources in the plugin's "cross" dir.)
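
A rough sketch of that kind of test -- the variable names are invented;
AC_CHECK_LIB and AC_SUBST are just the standard autoconf macros:

    AC_CHECK_LIB(jpeg, jpeg_read_header,
      [JPEG_LIBS="-ljpeg";    JPEG_DEPS=""],
      [JPEG_LIBS="libjpeg.a"; JPEG_DEPS="libjpeg.a"])
    AC_SUBST(JPEG_LIBS)
    AC_SUBST(JPEG_DEPS)

The makefile template then just uses @JPEG_LIBS@ on its link line and
@JPEG_DEPS@ in its dependency list, and the right thing happens either way.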

> I think I see how your system works.  It's not actually much different
> from your 3.1 system, is it?

It's similar in some respects, but quite different in others.  The use of
the "[...]" keywords in the makefile templates is new (and very powerful
for almost no work), as is the possibility of having per-plugin
specialisation of both configure and makefile.  None of this was needed in
3.1 (which was consequently incapable of building the MpegPlugin).

> I don't see how you can say it is
> significantly simpler.  Changing scripts to raw files is only a tiny
> simplification that comes with a significant reduction in power.

It's simpler for the plugin developer because the stuff they don't care
about is now opaque.  (Plugins can specify arbitrarily complex extensions
to the configure/make behaviour without having to worry about how the vast
majority of the stuff in the makefile is constructed and used.)

> Using recursive make seems no simpler than using integrated make --
> perhaps it's even more complicated, depending on how you look at it.

It's simpler because the makefile generation phase no longer has to worry
about whether each plugin is internal or external.  Everything is
symmetric: a one-line change (sucking make.int or make.ext into the final
makefile) takes care of about half of the complexity that appeared in the
old mkfrags script.
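
One plausible rendering of that one-liner (the variable names are invented;
make.int and make.ext are the fragments mentioned above):

    cat make.$linkage >> $makefile   # $linkage is "int" or "ext"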

> To summarize, mine has:
> 
> 	1. Including sub-makefiles into the top-level makefile, to avoid
> recursive make invocation.

In our context, recursive make suffers from none of the problems that
are cited as reasons not to use it.

> 	2. Allowing scripts for the sub-makefiles.

A combination of acinclude.m4 + per-plugin makefile template (and the use
of configured targets and their dependent rules to trigger alternative
behaviours in that makefile) requires far less work from a plugin
developer than would writing a script, is equally powerful (because you
can drop any "little languages" you like into your personal rules) and is
almost guaranteed not to suffer from the problems of broken makes because
of things not being rebuilt when they should be (moving things outside the
rule-driven model immediately opens "holes" in the construction of make's
dependency graph -- reintroducing the possibility for all the problems
cited in the "recursive make" paper).

But here's the bottom line: construct for me just one plausible case in
which the configure + template approach cannot work elegantly and I'll
add a keyword "[make.sh]" which runs a script and substitutes its output
into the final makefile.  (This is a three-liner, one line of which is
"fi". ;-)  That way we'd have a superset of the two schemes, allowing
arbitrary makefile content to be built programmatically while retaining
all the opportunities for simplicity and genericity that we get from the
template scheme.
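
For what it's worth, the three-liner would presumably look something like
this (the variable names are invented; "[make.sh]" is the keyword proposed
above):

    if test -f $plugin/make.sh; then
      sh $plugin/make.sh >> $makefile
    fi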

Deal?

Ian



