[squeak-dev] Deltastreams update
Göran Krampe
goran at krampe.se
Thu Mar 12 10:28:58 UTC 2009
Hi folks!
Wanted to give interested people (if there are any!) a heads up on
Deltastreams development.
History
=======
I started the Deltastreams "project" a few years back. There is quite a
bit of info on the wiki and even a movie from OOPSLA where I present it
and demo it. The idea of Deltas and streams of them is an evolution of
the old changeset and update stream. The concept and idea also borrow
from experience in using MC (which is not a direct competitor) and other
distributed SCMs outside of Squeak. Think of a Delta as a "super
changeset". The streams part has not been coded yet.
Work
====
After a while Matthew Fulmer started helping me with the code and he has
done a LOT on the code base including lots and lots more tests, lots of
fixes in SystemEditor (from Colin Putney, used in MC2 I think) which we
depend on. In fact, the Deltastreams codebase was probably the first to
really stress SystemEditor. Matthew also created ICS - an advanced file
format for Deltas. Matthew has lately been working in MC a lot, which
gives Matthew a unique perspective that I don't have. Matthew is also
involved a lot in Croquet - which is one primary potential fork to use
DS with.
I have started working on it again in the last few days and the "itch" is
back for real. :) I am focusing on the "replace changesets" part and the next
step for me is probably to make a "dual change sorter"-like UI and a new
file format (see below). And make tests green. And also make ICS format
work. :)
Code today
==========
Deltastreams is hosted on SS. I currently develop it in 3.10.2, and the
dependencies are SystemEditor and InterleavedChangeset (ICS). Both of
them could be replaced with other packages taking those roles ("file
format for Deltas" and "tool to atomically apply code changes to a live
image"). We want the code to have very few dependencies and to work
in "all" Squeaks.
Status
======
We have lots of broken tests right now, and I intend to make it all
GREEN and keep it that way. We have been sloppy and have added lots of
tests without implementing them - this tactic works for a while but when
the code base gets complex they really need to be GREEN. Otherwise you
lose the ability to see if you actually broke something :).
The good part is that there are about 420 tests, and lots of aspects of
Deltas are thoroughly covered. Logging, applying and reverting Deltas
(code mechanisms) are 99% working. Currently I think the only bit
missing is category reorganization.
The ICS file format is partially working, but I haven't gotten into the
code base fully yet. The format is very "clever", which may be its main
problem. It tries to do a really cool trick - being compatible with
Changesets! Or in other words, the same file contains both a binary
representation of a Delta that Deltastream code uses AND a changeset
representation that old images can use. This means that an ICS file can
be filed into an old image without ANY modification to that image. It
then simply looks like a changeset.
There is a UI built by Matthew that works on SystemEditor "models", but I
know too little of its status right now. I intend to build another
complementary UI working much more like the "dual change sorter".
A new format
============
ICS is cool. :) But... sorry Matthew, I think I will spend some time on
another format for Deltas too. One that is NOT backwards compatible in
that way. This is an area I really want some feedback on! Both on making
another format available and what that format would be. :)
I would like this "native" Delta format to be:
- Human readable, just like a cs. We just gzip them and make up some
nice extension like .dz or something. :)
- Editable in a text editor. This means it can not be too complex.
- Easy to extend. This means the base syntax should leave room for new
elements and "relaxed parsing" that can ignore unknown elements.
- Very easy to parse. This means it needs to be simple, simple, simple.
I don't want to depend on YAXO or a similarly large package for parsing.
- Not "compiler driven". I want the format to be safe and fast to load.
This means the regular Smalltalk Compiler is out of the picture.
My current idea of a format that I think covers the above is:
JSON
...possibly using netstrings for source code (thus not strictly JSON).
JSON offers a very readable "XML-ish" generic format that is very easy
to parse and produce. It can be easily edited in a text editor if
needed. It is compact. If used correctly it should be easy to extend.
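
To make the "easy to extend" and "relaxed parsing" points a bit more
concrete, here is a rough sketch in Python (not Squeak, just for
illustration). Everything in it is an assumption made up for the example:
the keys "delta", "changes" and "kind", the kind names, and the function
read_changes are all hypothetical, since no Delta schema has been decided.
The only point is that a reader can keep the elements it understands and
skip the rest without failing.

```python
import json

# Hypothetical element kinds; the real Delta format is not defined yet.
KNOWN_KINDS = {"methodAdded", "methodRemoved", "classAdded"}

def read_changes(text):
    """Parse a JSON Delta and return (understood, skipped) change lists."""
    doc = json.loads(text)
    understood, skipped = [], []
    # Unknown top-level keys are never looked at, so they are ignored for free.
    for change in doc.get("changes", []):
        if change.get("kind") in KNOWN_KINDS:
            understood.append(change)
        else:
            skipped.append(change)  # unknown element: skip it, do not fail
    return understood, skipped

sample = """
{
  "delta": "example",
  "futureField": 42,
  "changes": [
    {"kind": "methodAdded", "class": "Foo", "selector": "bar"},
    {"kind": "somethingNewer", "payload": [1, 2, 3]}
  ]
}
"""

understood, skipped = read_changes(sample)
```

An older reader would load the "methodAdded" change here and simply carry
along (or drop) the "somethingNewer" one, which is the kind of forward
compatibility the list above asks for.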
One substantial part of the file will be Smalltalk source code. I am not
keen on having to do character-by-character escaping to comply with JSON
Strings though... thus - netstrings. A netstring is a trivial construct:
<length-in-ascii> ":" <binary-data> ","
For example:
30:Sentence of thirty characters.,
Which then would be used for the source code. Advantages would be not
having to do character-by-character escaping. Is this worth "breaking"
JSON? Hmmm, thinking more about it I think we need to "break it" anyway,
because a JSON String can't contain a raw CR; it has to be escaped as \r. :)
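
For anyone who has not seen netstrings before, a minimal sketch of
encoding and decoding one, again in Python purely for illustration. The
function names encode_netstring and decode_netstring and the sample
method source are made up for this example; the framing itself is just
the length, a colon, the raw bytes and a trailing comma as described
above. The last line shows, for comparison, what the standard json module
has to do to the same source to make it a legal JSON String.

```python
import json

def encode_netstring(data: bytes) -> bytes:
    """Return <length-in-ascii> ":" <data> ","."""
    return str(len(data)).encode("ascii") + b":" + data + b","

def decode_netstring(buf: bytes, start: int = 0):
    """Decode one netstring at buf[start:]; return (payload, next_index)."""
    colon = buf.index(b":", start)
    length = int(buf[start:colon].decode("ascii"))
    end = colon + 1 + length
    if buf[end:end + 1] != b",":
        raise ValueError("netstring is missing its trailing comma")
    return buf[colon + 1:end], end + 1

# Made-up Smalltalk method source with CRs, a tab and both quote styles.
source = b'foo\r\t^ \'a "quoted" string\'\r'

encoded = encode_netstring(source)          # payload bytes are untouched
payload, next_index = decode_netstring(encoded)

# The JSON route: every CR, tab and double quote gets a backslash escape.
as_json = json.dumps(source.decode("ascii"))
```

The decoder never inspects the payload, it just reads the length and
slices, so the source code comes back byte for byte. With json.dumps the
CRs, the tab and the double quotes all come out escaped, which is exactly
the character-by-character work I would like to avoid.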
Ok, sorry for the long post.
regards, Göran