contemplating a VM fork.

Ian Piumarta ian.piumarta at inria.fr
Thu Apr 7 23:12:22 UTC 2005


I refuse to get drawn into an(other) argument about any of this.  But 
here's a little history/rationale anyway, for those who have 
short/selective memories.

On Apr 7, 2005, at 08:22, Lex Spoon wrote:

> Even so, you must admit that this is a lot of if's.

Well...

> Ned Konz <ned at bike-nomad.com> wrote:
>
> Assuming your system has a properly matched set of tools, installed
> properly, the configure script should work properly.

Not really.  If you install completely incompatible versions of 
autoconf, automake and libtool then you can expect the compilation of 
lots of things (not just squeak) to break.

> If you have the wrong awk, you don't get gnuification.

A while back you told me about Debian not using GNU awk, so I fixed the 
problem:

2002-09-27
         * platforms/unix/config/acinclude.m4 (AC_GNU_INTERP): force
         AWK=gawk if --with-gnu-awk.

A little later I modified the search rules to find gawk before [mn]awk 
on almost every conceivable platform.  Even then, setting AWK=blah on 
the configure command line would override automatic detection (and 
setting INTERP=gnu-interp would force gnuification even if configure 
thought you were crazy to attempt it).
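
In other words, something along these lines has been possible for a
long time (an illustrative invocation, not a recipe; AWK and INTERP are
the variables described above):

    ./configure AWK=gawk INTERP=gnu-interp

which overrides the awk detection and forces gnuification in one go.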

More recently you told me about a few things in the gnuify script which 
were not particularly portable, so I fixed them too:

2004-04-10
         * platforms/unix/config/gnuify: Remove gnu specifics.

2004-04-11
         * platforms/unix/config/gnuify: Escape all occurrences of '{'
         within regular expressions.

I don't think gnuify cares which awk you use any more.
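
For the curious, the issue was roughly of this kind (an illustrative
one-liner, not a line taken from gnuify itself).  An unescaped '{'
inside an extended regular expression is undefined behaviour in POSIX
awk, so

    awk '/\{/ { print "brace" }' file   # portable: the brace is escaped
    awk '/{/  { print "brace" }' file   # some awks warn about or mis-parse this

and gnuify now escapes every such brace.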

> If you have the wrong version of autoconf, then the configure script 
> may not be rebuildable.

autoconf has gone through at least two phase transitions since I 
adopted it (under duress) for squeak.  At each transition, previous 
configure.{in,ac} became incompatible with current and future versions 
of autoconf, and automake (which provides aclocal) and libtool 
contained attendant more-or-less configure-breaking changes.  I have a 
fully-updated Debian 'testing' system on which autoconf produces a 
configure script that runs fine on Debian, Yellow Dog, 
{Free,Open,Net}BSD, Solaris and Mac OS X.  (I routinely regenerate 
configure on the latter too, which is also a fully-updated system.)  As 
Ned says: It Just Works.  If it's broken, then either you have an 
incompatible (or obsolete) collection of autothings, or Debian and 
Apple are both shipping broken development toolsets (which I doubt is 
the case).
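
If you do need to regenerate configure, the sequence is the usual
autotools one (a sketch only; run it in the directory holding the
autoconf inputs, which for the unix tree should be
platforms/unix/config, alongside acinclude.m4):

    aclocal && autoconf     # rebuild aclocal.m4, then configure
    # or, with reasonably recent autotools:
    autoreconf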

> Wrong version of gcc?  Your compile can fail.  etc.

There are only a handful of broken versions of gcc.  The bugs were 
fixed long (18 months or more) ago.  The current 3.3 and 3.4 series 
gccs can compile it all just fine on at least four different CPU 
architectures (and a myriad of different software architectures), and 
2.95 was still working fine the last time I checked.  You cannot hold up 
a compiler bug and then point to the source code you're feeding to the 
compiler as the cause of the bug.

> FWIW, the scripts are still not perfect even with all the if's
> satisfied.  My current system  builds without complaint, but the
> resulting VM cannot open an X window.

Again, all I can say is that it works fine on my three fully-updated 
Debian systems running on a couple of different architectures.

Most likely you have an unresolved dependency in the plugin (due to a 
bug in the source code, or something being rearranged in the x11 libs).  
These are hard to find because error reporting is turned way down 
while loading the display and sound drivers.  Fetch the SVN version of 
sqUnixExternalPrims.c (which has better diagnostics), set DEBUG to 1 
near the top, and check if it's failing to load the plugin (and 
report/fix the error indicated by the message it spews).
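
A quick way to check whether that is what is happening (assuming the
display module was built as something like vm-display-X11.so; adjust
the name to whatever your build actually produced):

    ldd vm-display-X11.so | grep 'not found'    # missing shared libraries
    nm -D --undefined-only vm-display-X11.so    # symbols wanted at load time

If ldd reports anything 'not found', that is almost certainly your
problem.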

> Regarding the complexity of the script, I have several ideas.

Cast your mind back to 1996 or so.  The build process was wonderfully 
simple.  There was one Makefile parameterised with bizarre compiler 
flags for each supported architecture.  Several source files began with 
sequences of symbol tests to determine which architecture they were on, 
and how to get definitions (or workarounds) for idiosyncratic (or 
missing) features.  As the number of architectures grew, the number of 
combinations of bizarre flags and #ifdef trees grew.  When plugins came 
along the build process became more complex because the Makefile had to 
include instructions for building only the plugins that were generated, 
while supporting 'plug and play' addition of home-made plugins.  This 
spawned a handful of scripts for stitching everything together into a 
working Makefile.  Then a big grass roots movement (I will name no 
names) started screaming and yelling very publicly on squeak-dev that 
it was all horribly complex and not being done the Open Source Way 
(apparently that equates with The GNU Way for many people) and that 
life would be oh so simple if we would all move to using autoconf (and 
some even suggested automake -- no [publicly publishable] comment).  
When I got fed up with the derogatory remarks, I moved it all to 
autoconf.  I doubt I'm the only person who has noticed that it has been 
several times more complex, and many times more fragile, ever since.  
But whatever else happened, the derogatory remarks ceased (and so I was 
lots happier).

> In fact, I have actually coded several of them up in the past, for
> whatever credibility that gives me.  Since those changes have been
> rejected--

I can't comment on them, since I only ever recall seeing (parts of) one 
of them.  The main reasons I reject things are non-portability, 
invasiveness on code outside the Unix tree, shifting (rather than 
reducing) complexity, or reliance on non-standard tools.  (Although I'm 
as guilty as anyone of relying on autoconf and friends, which have 
failed spectacularly to standardise on anything in any given 5-year 
period.)

> 1. Dump make.  We already depend on GNU make.  It is no more onerous to
> depend on Jam

Unless this is a demonstrable net reduction in complexity (rather than 
just moving onerousity from one place to another), and until writing 
Jamfiles becomes as natural to a randomly-selected developer as writing 
a Makefile, this is not a good idea.  (I would reject Imake for the 
same reason.)  I will, however, look at using Jam again.  Maybe a 
parallel Jamfile in the trunk would be a good thing to at least 
generate some discussion on whether to dump make entirely.

> 2. Use autoconf only for its purpose: *guessing* about configuration.
> Always let the user override the guess.

For the things that really matter, you can.  --with-x-includes=blah, 
etc., plus the various environment variables (CC, CFLAGS, LDFLAGS, 
etc.) that you can set on the command line when running configure.  I 
routinely use the latter to do sanity checks of the configuration, 
where I want to turn off all optimisation and just config/compile as 
fast as possible.
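
The sanity checks I mean look something like this (paths and flags
purely illustrative):

    ./configure CC=gcc CFLAGS="-g -O0" --with-x-includes=/usr/X11R6/include

which turns off optimisation and points configure at a particular set
of X headers instead of whatever it would otherwise have guessed.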

> 3. Reconsider libtool.  It's a nice idea, but it is really confusing,
> and I'm not sure it is truly helping out.  It sucks having to read an
> extra manual just because we are too afraid to hard code "gcc -shared".

'-shared' works with gcc on GNU-based operating systems only.  I know 
of no other Unix-like system on which generating a shared lib, with or 
without gcc, is that simple.  I think you might benefit greatly from 
using a Unix-like system from another (non-GNU-based) vendor for a 
while.

> It would not be terrible, if the standard build scripts can only build
> external plugins if you are using gcc.

IOW, it would not be so terrible if building external plugins (which 
include the display and sound drivers, remember) only worked on 
GNU/Linux?

(I have explained elsewhere the details of why the display drivers 
absolutely cannot be compiled internally because of logical [not 
procedural] conflicts.  I will not repeat myself here.)

'-shared' is not a trivial panacea provided by gcc.  There are plenty 
of platforms on which gcc runs fine, but '-shared' is totally 
unsupported.  '-shared' is a feature that relies far more on the 
linker's behaviour than the compiler's.  If the object format is not 
quite right (or, obviously, if you're forced to use the vendor's ld 
instead of gnu ld) then '-shared' is immediately a non-starter.
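
To give a feel for the variation (flags quoted from memory and only
indicative; the exact options differ between OS and toolchain
versions):

    gcc -shared -fPIC -o foo.so foo.c    # GNU-based systems with GNU ld
    cc -G -KPIC -o foo.so foo.c          # Solaris, vendor compiler and linker
    gcc -bundle -o foo.bundle foo.c      # Mac OS X loadable bundle

Same compiler on the first and last lines, entirely different linker
behaviour underneath.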

The move to libtool came after the move to autoconf and (again) was in 
reaction to many complaints about builds failing on obscure platforms 
or compilers that were not detected during configuration.  The problem 
with explicit configuration of shared library compiler/linker flags is 
that you need an expert on each platform who can determine just the 
right flags to use.  On several platforms at the time, this was not the 
case.  (And adopting flags supplied by one person could break the 
compile for another person on the same platform running a slightly 
different version of the OS or compiler -- this happened several times 
for MIPS and Irix.)  The number of libtool-related complaints I have 
received since moving to it is significantly smaller than the number of 
library-related build failure complaints that I used to receive before 
moving to libtool.
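
The idea is that a two-step like the one below (file names and the
install path are placeholders) should work unchanged across those
platforms, with libtool supplying whatever PIC and shared-library
flags the host actually needs:

    libtool --mode=compile gcc -c foo.c
    libtool --mode=link gcc -module -avoid-version \
            -rpath /usr/local/lib -o foo.la foo.lo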

FWIW, I hate libtool as much as anyone.  But it has created a local 
minimum in the quantity of hate mail that I receive.

> 4. Heck, while at it, reconsider the default of external plugins.
> There are clear advantages to external plugins, but they cause an
> installation hassle.  Especially it is annoying, if you want to compile
> an experimental VM and test it out without installing it on your
> system.  Make internal the default, if VMMaker allows this.

You can make anything you like internal when you run VMMaker.  Internal 
is the default in the VMs that I ship.  The only external plugins are 
FFI (so that you can secure your system by removing the plugin, as 
requested by lots of people) and things that link with non-standard 
libraries (such as X11DpyCtl) or which have to defer their choice of 
platform support to runtime (such as B3D, which requires OpenGL from 
whatever source corresponds to the display driver you've chosen).

> They are complicated,

They are what people wanted (passionately and vocally) a long time ago. 
(But wants are somewhat like political parties: the grass is always 
greener on the side you're currently facing...  But the devil almost 
always comes out in the details once you get there.)

> few people understand them (maybe just 1?)

The autoconf stuff and attendant scripts are as simple as I could make 
them.  Not one line of code in there is redundant.

The entire process is described to the best of my ability in the 
HowToBuildFromSource document.  If something critical is not explained 
in there, then nobody has complained about the omission.  So either the 
document suffices, or it's so impenetrable that nobody (other than 1 
person) understands any of it.

> they sometimes fail, they often require people to
> install extra dependencies, and when the build fails, the error
> messages (or lack thereof) are frequently impenetrable.  Most of this
> is not necessary

I think I already made it quite clear that this complexity is not 
necessary (and was not there in the pre-autoconf days, when the 
complexity that people were complaining about was mostly perceived 
rather than actual, and likely inspired by having to do something other 
than './configure' as the first part of building a VM).

> and so we should welcome discussion (and code!) about how to improve 
> them.

I will look again at Jam, but don't hold your breath.  (If it entirely 
eradicates configure scripts [for features, headers and libraries] and 
compiler/linker flag complexities, and supports dynamic selection of 
build products based on the stuff VMM spits out [without having to run 
external scripts] then I probably would consider that a net reduction 
in complexity.)

Cheers,
Ian

PS --

> PS -- Alan, there are no Squeak Police that will come after you if you
> just do something like gcc *.c */*.c.

Try it.  It won't work.



