[Vm-dev] 64-bit FAQ

Andrew Gaylard ag at computer.org
Mon Oct 29 18:32:12 UTC 2007


After waiting more than a week, and seeing no further
feedback, here's the 64-bit FAQ as it currently stands.
I'd be grateful if someone (Ian?) could park it safely on
squeakvm.org.

Thanks,
Andrew.

1. What is a 32-bit image?

2. What is a 64-bit image?

3. Can I run a 32-bit image on my 64-bit computer?

4. Can I run a 64-bit image on my 32-bit computer?

5. Can a single VM run both 32-bit and 64-bit images?

6. What is a 64-bit VM?

7. Can I run a 32- or 64-bit VM on my computer?

8. For which hardware/OS combinations is there a 64-bit VM?

9. Does my 64-bit VM run both 32-bit and 64-bit images?

10. I have a 64-bit computer; should I use a 64-bit VM to run my 32-bit
Squeak images?

11. What are the advantages of using 64-bits?

12. What are the disadvantages of using 64-bits?

13. My OS allows a single process to grow to 3.75GB; if I need a lot of
objects, can I use a 32-bit image / VM, or must I use a 64-bit one?

14. Where can I find a 64-bit image?

15. Why isn't there an officially-released 64-bit image?

16. How can I make a 64-bit image from a 32-bit image?

17. How does a 32-bit VM manage to run a 64-bit image if pointers are only
32-bits wide?

18. What sizes and alignment does the new 64-bit image format use for
pointers and integers?

19. How do I tell if a given image file is 32-bit or 64-bit?


1/
What is a 32-bit image?
- A 32-bit image is an image in which the object memory uses a 32-bit word
  size for object pointers, limiting its total size to a maximum of 4GB of
  memory.  The formats of object memory and object pointers are defined in
  class ObjectMemory (see the class comment for a basic explanation).  As
  of this writing, all Squeak images of practical interest are 32-bit
  images.

2/
What is a 64-bit image?
- A 64-bit image is an image in which the object memory uses a 64-bit word
  size for object pointers, allowing the size of the image to grow beyond
  4GB of memory.  Squeak now supports a 64-bit image format that is
  sufficient to produce a working system, but which is intentionally
  simple.  It is expected to be modified and extended to take advantage of
  additional 64-bit capabilities in the future.

3/
Can I run a 32-bit image on my 64-bit computer?
- Yes.  A 32-bit image can be run on either a 32-bit VM or a 64-bit VM.
  Some computer platforms (e.g. 64-bit Linux) can run both the 32-bit VM
  and the 64-bit VM on the same system.

4/
Can I run a 64-bit image on my 32-bit computer?
- Yes.  If you build a VM with the "64-bit VM?" check box selected in
  VMMaker, you will create a VM that runs 64-bit images.  This will work
  on 32-bit host systems as well as on 64-bit host systems.

5/
Can a single VM run both 32-bit and 64-bit images?
- No.  For any given computer platform, two different VMs are required to
  run 32-bit and 64-bit images.  The type of VM that you build is governed
  by the "64-bit VM?" check box in VMMaker, and is independent of the word
  size of your computer.  While it would be possible to create a VM that
  is "smart" enough to run both 32-bit and 64-bit images, this is
  currently of little practical value due to Squeak's reliance on plugins
  that are linked to either 32-bit or 64-bit external library code.

  Any combination of 32/64 bit VM and 32/64-bit image is possible, but note
  that all currently available Squeak images are still in 32-bit format, and
  most (perhaps all) pre-built VMs are 32-bit applications.

6/
What is a 64-bit VM?
- A 64-bit VM is one which is compiled with the LP64 or ILP64 data model.
  This means, in C terms, that pointers and longs are 64 bits wide.
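
  As a rough illustration (not taken from the VM sources), a data model
  can be named from the widths of C's int, long and pointer types.  The
  sketch below uses Python's ctypes module to query the interpreter it
  runs under; the helper names data_model and host_data_model, and the
  table of models, are assumptions made for this example.

```python
import ctypes

# Widths in bits of (int, long, pointer) for the common C data models.
MODELS = {
    (32, 32, 32): "ILP32",
    (32, 64, 64): "LP64",
    (64, 64, 64): "ILP64",
    (32, 32, 64): "LLP64",  # pointers are 64 bits but long stays 32 bits
}

def data_model(int_bits, long_bits, pointer_bits):
    """Name the C data model for the given type widths (hypothetical helper)."""
    return MODELS.get((int_bits, long_bits, pointer_bits), "unknown")

def host_data_model():
    """Classify the C data model used by the running Python interpreter."""
    return data_model(
        ctypes.sizeof(ctypes.c_int) * 8,
        ctypes.sizeof(ctypes.c_long) * 8,
        ctypes.sizeof(ctypes.c_void_p) * 8,
    )

if __name__ == "__main__":
    print(host_data_model())
```

  A result of LP64 or ILP64 corresponds to what this FAQ calls a 64-bit
  build; ILP32 corresponds to a 32-bit build.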

7/
Can I run a 32- or 64-bit VM on my computer?
- It depends.  Some current architectures, such as x86-64 and UltraSPARC,
  can run 32-bit as well as 64-bit applications; these are known as
  "bi-arch" systems.  However, some systems, such as the Alpha, can only
  run 64-bit applications.  For bi-arch systems, you can choose whether to
  run a 32-bit or a 64-bit VM.  For 64-bit-only systems, you don't have
  that choice; you can only run a 64-bit VM, since there's no way of
  compiling a 32-bit application.

8/
For which hardware/OS combinations is there a 64-bit VM?
- Linux on 64-bit architectures: x86-64, SPARC64, Alpha, Power64, etc.
- Solaris on x86-64 and SPARC64
- MacOS on Power64
- Windows on x86-64

9/
Does my 64-bit VM run both 32-bit and 64-bit images?
- No.  Any given VM will run either 32-bit or 64-bit images, but not both.
  You can select one or the other when you generate sources with VMMaker,
  and you can install both flavors of VM on your system (one each for
  32-bit images and 64-bit images).

  If you try to run a 64-bit image with a VM built for 32-bit images, you
  will get an error message such as this:

    This interpreter (vers. 6502) cannot read image file (vers. 68000).

  If you try to run a 32-bit image using a VM built for 64-bit images, you
  will get an error message such as this:

    This interpreter (vers. 68000) cannot read image file (vers. 6502).


10/
I have a 64-bit computer; should I use a 64-bit VM to run my 32-bit Squeak
images?
- It depends.  Either one will work, but if your image depends on plugins
  that are only available for 32-bit systems, use the 32-bit VM.
  Otherwise, if you are building your own VM, go ahead and use the 64-bit
  version.

11/
What are the advantages of using 64-bits?
- The first advantage is that your image size can be enormous.  If you need
  the size of your VM code plus in-memory image to exceed 4GB, then a
  64-bit image running on a 64-bit VM is for you.  Note that writing an
  image this big out to disk will take a long time.  The sort of
  applications that need this are those which load a small(ish) image and
  run code that creates millions of objects, but don't save them back to
  disk in the image.  Keep in mind that the garbage collector is probably
  not up to the task of collecting multiple gigabytes.
- Another advantage is that certain architectures (e.g. the Alpha) don't
  offer a 32-bit mode; they are 64-bit only.  For such machines, a 64-bit
  VM is required; the image may be 32- or 64-bit.
- Another advantage is that when the 64-bit VM is built, the C compiler
  targets a different ABI from the 32-bit one.  The x86-64 case is an
  interesting example: the old i386 ABI offered few registers, used i387
  floating-point, and passed parameters on the stack (remember, memory
  writes are slower than register moves).  The x86-64 ABI and
  architecture, on the other hand, have many more registers, use SSE,
  SSE2, etc. for FP, and pass parameters in registers where possible.
  They also offer additional instructions (MMX et al).  All of these CPU
  and ABI features may make for a VM that runs faster, but only if (a) the
  compiler is able to make use of them and (b) it is told to do so at
  compile-time.  However, the gains are unlikely to be large, and will
  also be offset by the cost of larger pointers (see below).  If you're
  looking for performance, it's important to measure a 32-bit VM with a
  32-bit image against a 64-bit VM with a 64-bit image before assuming
  anything.

12/
What are the disadvantages of using 64-bits?
- A disadvantage of 64-bit code is that pointers are 8 bytes instead of 4;
  they are also aligned on 8-byte boundaries, meaning that some space
  around them, known as "padding", is wasted.  This means that pointers
  (a) take more space in RAM, (b) take more memory bandwidth when the CPU
  loads and stores them, (c) take up valuable space in on-chip caches, and
  (d) incur greater wastage due to padding than 32-bit pointers aligned on
  4-byte boundaries.  For most users, the upper 32 bits of each pointer
  will always be zero, so it makes little sense to load, process and store
  pointers that are double the size but only half-used.  So for these
  users, a 32-bit VM and image is a good choice.
- Another disadvantage is that most users use the 32-bit VM and a 32-bit
  image.  This combination is therefore the most tested, and so the most
  stable.
- A third disadvantage is that code for many of the plugins is not yet
  ported to a 64-bit VM.
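
  The padding cost described above can be sketched with a small layout
  calculator.  This is an illustration only: the function names and the
  example layout (a char, a pointer and a 4-byte int) are made up for
  this sketch and are not taken from the VM, and it assumes each field is
  aligned to its own natural alignment.

```python
def struct_size(fields):
    """Size in bytes of a C-style struct laid out with natural alignment.

    fields is a list of (size, alignment) pairs, in declaration order.
    """
    offset = 0
    max_align = 1
    for size, align in fields:
        # Round the offset up to the field's alignment (this is the padding).
        offset = (offset + align - 1) // align * align
        offset += size
        max_align = max(max_align, align)
    # Tail padding, so that arrays of the struct stay aligned.
    return (offset + max_align - 1) // max_align * max_align

def example_layout(pointer_bytes):
    """A char, a pointer and a 4-byte int, in that order (made-up layout)."""
    return [(1, 1), (pointer_bytes, pointer_bytes), (4, 4)]

if __name__ == "__main__":
    print(struct_size(example_layout(4)))  # with 4-byte pointers
    print(struct_size(example_layout(8)))  # with 8-byte pointers
```

  With this layout the struct occupies 12 bytes when pointers are 4 bytes
  wide and 24 bytes when they are 8 bytes wide, although only 4 more
  bytes of actual data are stored.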

13/
My OS allows a single process to grow to 3.75GB;  if I need a lot of objects
can I use a 32-bit image / VM, or must I use a 64-bit one?
- In the past there were problems related to the so-called "2-GB limit".
  These were due to conversions between signed 32-bit integers and 32-bit
  pointers in the VM code.  The problems appeared when the operating
  system loaded the image into memory at addresses above the 2GB mark, and
  could occur with normal-sized images, not only with images larger than
  2GB.  These issues should be a thing of the past.  Use the most recently
  released VM for your platform, and report any problems that you see.
  You should only *need* a 64-bit VM and 64-bit image if your image size
  will exceed 4GB.
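
  To see why addresses above the 2GB mark were troublesome, consider what
  happens when a 32-bit address is reinterpreted as a signed 32-bit
  integer.  The sketch below illustrates the general failure mode only;
  it does not reproduce any particular VM bug, and the helper name
  as_signed32 is made up for this example.

```python
import struct

def as_signed32(address):
    """Reinterpret an unsigned 32-bit address as a signed 32-bit integer."""
    return struct.unpack("<i", struct.pack("<I", address))[0]

if __name__ == "__main__":
    below = 0x7FFFFFF0  # just under the 2GB mark
    above = 0x80000010  # just over the 2GB mark
    print(as_signed32(below))
    print(as_signed32(above))
    # As unsigned addresses, below < above; as signed integers the
    # comparison comes out the other way round.
    print(as_signed32(below) < as_signed32(above))
```

  Any address at or above 0x80000000 becomes negative, so comparisons and
  arithmetic that treat addresses as signed values give wrong answers
  once the image is loaded above 2GB.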

14/
Where can I find a 64-bit image?
- These are scarce.  The original 64-bit port project (from Ian and Dan)
  includes a 64-bit image that worked with the VM distributed at that
  time.  A current VM cannot execute the original 64-bit image due to
  changes in the interpreter since that time.  It is possible to update
  that original image using a modified VM, and the resulting image is
  executable using a current unmodified VM.  However, there are no
  official or supported releases of 64-bit images at this time.

15/
Why isn't there an officially-released 64-bit image?
- Lack of interest: most people don't need a 64-bit image.
- There may yet be some changes to the 64-bit image format to take
  advantage of features of 64-bit CPUs.  For instance, 63-bit tagged
  integers might be possible.

16/
How can I make a 64-bit image from a 32-bit image?
- Use the SystemTracer (SystemTracerV2 on SqueakMap).  The original 64-bit
  Squeak image was created using this tool, and a sufficiently motivated
  person should be able to reproduce the job.  However, the SystemTracer
  does not currently work on little-endian computers (including Intel), so
  expect to do some work on SystemTracer before a successful conversion
  is possible.

17/
How does a 32-bit VM manage to run a 64-bit image if pointers are only
32-bits wide?
- The short answer: It relies on the image size being smaller than 4GB.
- The long answer: Object pointers within the object memory are not
  pointers in the C sense of the word.  The VM needs to be able to convert
  the object pointers into C pointers, and this can be done on either a
  32-bit host or a 64-bit host.  The only caveat is that if a 64-bit image
  grew large enough to use object pointers beyond the 32-bit limit (i.e.
  an image approaching 4GB in size), then a 64-bit VM would be required.

18/
What sizes and alignment does the new 64-bit image format use for
pointers and integers?
- Object pointers are 64 bits wide, allowing for memory up to 2^64 bytes to
  be directly addressable.  They are aligned on 8-byte boundaries.
  Integers are still implemented as tagged 31-bit values, but are
  sign-extended to use the full 64-bit object word size and therefore are
  also aligned on 8-byte boundaries.  These alignments were chosen as most
  64-bit CPUs require them.  Future enhancements to 64-bit Squeak will
  probably make use of the larger word size to increase the range of
  SmallInteger values, which will require further changes to both the VM
  and the image in order to be effective.

- The object header and pointer formats for both 32-bit and 64-bit images
  are documented in the class comment of ObjectMemory (in the VMMaker
  package).  The conversions to and from host data types are done in
  platforms/Cross/vm/sqMemoryAccess.h using either macros or inline
  functions.  The actual conversions vary from host to host, and are
  controlled by macros such as SQ_HOST64 and SQ_IMAGE32 which must be set
  for that host.  In the case of a Unix VM, the configure utility is used
  to specify the characteristics of the host platform.

- The word size to be used in the object memory is specified by the
  SQ_VI_BYTES_PER_WORD macro in src/vm/interp.h.  This file is created in
  the VMMaker code generation process, and the value of
  SQ_VI_BYTES_PER_WORD is determined by the "64-bit VM?" check box in the
  VMMaker tool.

- In summary, the object word format is described in class ObjectMemory,
  the host data type conversions are specified in sqMemoryAccess.h, and
  the image word size is specified in interp.h.
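
  The tagged-integer scheme mentioned above can be sketched as follows.
  This is a simplified model rather than the VM's own code: it assumes
  that the tag is a 1 in the lowest bit of the object word with the value
  held in the bits above it, and the function names are made up for this
  example.  ObjectMemory's class comment remains the authoritative
  description.

```python
SMALLINT_MIN = -(1 << 30)      # smallest 31-bit signed value
SMALLINT_MAX = (1 << 30) - 1   # largest 31-bit signed value

def encode_small_integer(value, word_bits):
    """Tag a 31-bit signed value into an object word of word_bits bits.

    Assumption: the tag is a 1 in the lowest bit, value in the bits above.
    """
    if not SMALLINT_MIN <= value <= SMALLINT_MAX:
        raise ValueError("value does not fit in a 31-bit SmallInteger")
    tagged = (value << 1) | 1
    # Masking a Python int to word_bits yields the two's-complement bit
    # pattern, i.e. the tagged value sign-extended to the full word.
    return tagged & ((1 << word_bits) - 1)

def decode_small_integer(word, word_bits):
    """Recover the signed value from a tagged object word."""
    if (word & 1) != 1:
        raise ValueError("not a tagged integer")
    if word >> (word_bits - 1):   # top bit set: the value is negative
        word -= 1 << word_bits
    return word >> 1

if __name__ == "__main__":
    for bits in (32, 64):
        word = encode_small_integer(-5, bits)
        print(bits, hex(word), decode_small_integer(word, bits))
```

  Under these assumptions the same range of values round-trips in both
  word sizes; the 64-bit word simply carries 32 extra sign bits, which is
  the unused range that a future format change could put to work.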

19/
How do I tell if a given image file is 32-bit or 64-bit?
- The image data begins with a four-byte "magic" value that indicates the
  image word size.  This value is specified in
  Interpreter>>imageFormatVersion.  For most images, these will be the
  first four bytes of the image file, although in some cases the image
  data may be offset by 512 bytes in order to permit an image file to be
  treated as an executable program on Unix platforms (see
  http://en.wikipedia.org/wiki/Shebang_(Unix)).

  For instance, if I load VMMaker-3.8b6 into a stock Squeak3.9a-7024.image
  file,  I see this:

        imageFormatVersion
                "Return a magic constant that changes when the image format
                changes. Since the image reading code uses this to detect
                byte ordering, one must avoid version numbers that are
                invariant under byte reversal."

                BytesPerWord == 4
                        ifTrue: [^6502]
                        ifFalse: [^68000]

  Examining the file itself gives this:

        apg@breakfast: ~/squeak xxd Squeak3.9a-7024.image | head -1
        0000000: 0000 1966 0000 0040 011c 7ee0 0427 b000  ...f...@..~..'..

  Looking at the first four bytes gives this:

          apg@breakfast: ~/squeak perl -e 'print 0x1966'
          6502

  (or do "16r1966 <Alt-P>" in a workspace; it also returns 6502.)

  So Squeak3.9a-7024.image is a 32-bit image file (since BytesPerWord == 4).
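
  The same check can be scripted.  The following is a minimal sketch, not
  an official tool: it assumes that only the two magic values shown in
  imageFormatVersion above exist, tries both byte orders, and looks at
  offset 0 and then at offset 512.  The function names are made up for
  this example.

```python
import struct

FORMATS = {6502: 32, 68000: 64}  # magic values from imageFormatVersion

def image_word_size(header):
    """Return (bits, byte_order) for the first four bytes of image data.

    byte_order is "big" or "little"; raises ValueError if the magic value
    is not recognised in either byte order.
    """
    for fmt, order in ((">I", "big"), ("<I", "little")):
        (magic,) = struct.unpack(fmt, header[:4])
        if magic in FORMATS:
            return FORMATS[magic], order
    raise ValueError("unrecognised image format")

def probe_image_file(path):
    """Check offset 0, then offset 512, of the file at path."""
    with open(path, "rb") as f:
        data = f.read(516)
    for offset in (0, 512):
        chunk = data[offset:offset + 4]
        if len(chunk) == 4:
            try:
                return image_word_size(chunk)
            except ValueError:
                pass
    raise ValueError("no image header found at offset 0 or 512")

if __name__ == "__main__":
    # The first four bytes from the xxd output above.
    print(image_word_size(bytes.fromhex("00001966")))
```

  For the bytes 00 00 19 66 shown in the xxd output, image_word_size
  reports a 32-bit image stored in big-endian byte order.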