Hi folks!
Wanted to give interested people (if there are any!) a heads up on Deltastreams development.
History
=======
I started the Deltastreams "project" a few years back. There is quite a bit of info on the wiki and even a movie from OOPSLA where I present it and demo it. The idea of Deltas and streams of them is an evolution of the old changeset and update stream. The concept also borrows from experience in using MC (which is not a direct competitor) and other distributed SCMs outside of Squeak. Think of a Delta as a "super changeset". The streams part has not been coded yet.
Work
====
After a while Matthew Fulmer started helping me with the code and he has done a LOT on the code base including lots and lots more tests, lots of fixes in SystemEditor (from Colin Putney, used in MC2 I think) which we depend on. In fact, the Deltastreams codebase was probably first out in stressing SystemEditor. Matthew also created ICS - an advanced file format for Deltas. Matthew has lately been working in MC a lot, which gives Matthew a unique perspective that I don't have. Matthew is also involved a lot in Croquet - which is one primary potential fork to use DS with.
I have started working on it again over the last few days and the "itch" is back for real. :) I am focusing on the "replace changesets" part and the next step for me is probably to make a "dual change sorter"-like UI and a new file format (see below). And make tests green. And also make ICS format work. :)
Code today
==========
Deltastreams is hosted on SS. I currently develop it in 3.10.2; the dependencies are SystemEditor and InterleavedChangeset (ICS). Both of them could be replaced with other packages taking those roles ("file format for Deltas" and "tool to atomically apply code changes to a live image"). We want the code to have very few dependencies and to work in "all" Squeaks.
Status
======
We have lots of broken tests right now, and I intend to make it all GREEN and keep it that way. We have been sloppy and have added lots of tests without implementing them - this tactic works for a while but when the code base gets complex they really need to be GREEN. Otherwise you lose the ability to see if you actually broke something :).
The good part is that there are about 420 tests, and lots of aspects of Deltas are thoroughly covered. Logging, applying and reverting Deltas (code mechanisms) are 99% working. Currently I think the only bit missing is category reorganization.
The ICS file format is partially working, but I haven't gotten into the code base fully yet - the format is very "clever" which may be its main problem. It tries to do a really cool trick - being compatible with Changesets! Or in other words, the same file contains both a binary representation of a Delta that Deltastreams code uses AND a changeset representation that old images can use. This means that an ICS file can be filed into an old image without ANY modification to that image. It then simply looks like a changeset.
There is a UI built by Matthew that works on SystemEditor "models", but I know too little of its status right now. I intend to build another complementary UI working much more like the "dual change sorter".
A new format
============
ICS is cool. :) But... sorry Matthew, I think I will spend some time on another format for Deltas too. One that is NOT backwards compatible in that way. This is an area I really want some feedback on! Both on making another format available and what that format would be. :)
I would like this "native" Delta format to be:
- Human readable, just like a cs. We just gzip them and make up some nice extension like .dz or something. :)
- Editable in a text editor. This means it can not be too complex.
- Easy to extend. This means the base syntax should leave room for new elements and "relaxed parsing" that can ignore unknown elements
- Very easy to parse. This means it needs to be simple, simple, simple. I don't want to depend on YAXO or similarly large package for parsing.
- Not "compiler driven". I want the format to be safe and fast to load. This means the regular Smalltalk Compiler is out of the picture.
My current idea of a format that I think covers the above is:
JSON
...possibly using netstrings for source code (thus not strictly JSON).
JSON offers a very readable "XML-ish" generic format that is very easy to parse and produce. It can be easily edited in a text editor if needed. It is compact. If used correctly it should be easy to extend.
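To make "easy to extend" and "relaxed parsing" a bit more concrete, here is a rough sketch in Python (only because it has a JSON parser built in, not because Deltas would be read that way). The whole layout is made up for illustration: the keys "format", "version", "changes", "type", "class", "selector" and "source", and the change types "addMethod" and "removeClass", are assumptions and not anything Deltastreams defines. The point is only that a reader can skip elements and keys it does not know and still load the rest.

```python
import json

# Hypothetical Delta layout. None of these keys or change types come from
# Deltastreams itself; they only illustrate the "relaxed parsing" idea.
SAMPLE = """
{
  "format": "delta",
  "version": 1,
  "changes": [
    {"type": "addMethod", "class": "Foo", "selector": "bar", "source": "bar ^42"},
    {"type": "futureThing", "payload": {"x": 1}},
    {"type": "removeClass", "class": "Baz", "comment": "a key this reader ignores"}
  ]
}
"""

# Change types this (imaginary) reader understands, with the fields it needs.
KNOWN = {
    "addMethod": ("class", "selector", "source"),
    "removeClass": ("class",),
}


def load_changes(text):
    """Return (kept, skipped): known changes, and the types that were ignored."""
    doc = json.loads(text)
    kept, skipped = [], []
    for change in doc.get("changes", []):
        fields = KNOWN.get(change.get("type"))
        if fields is None:
            # Unknown element: remember its type and move on instead of failing.
            skipped.append(change.get("type"))
            continue
        # Copy only the fields we know; extra keys are silently dropped.
        kept.append({"type": change["type"], **{f: change[f] for f in fields}})
    return kept, skipped


changes, skipped = load_changes(SAMPLE)
print(len(changes), "changes loaded, skipped:", skipped)
```

An older reader would then load the two changes it knows and report "futureThing" as skipped. Whether silently skipping is actually safe for something that modifies a live image is a separate question.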
One substantial part of the file will be Smalltalk source code. I am not keen on having to do character-by-character escaping to comply with JSON Strings though... thus - netstrings. A netstring is a trivial construct: <length-in-ascii> ":" <binary-data> ","
For example:
30:Sentence of thirty characters.,
Which then would be used for the source code. The advantage would be not having to do character-by-character escaping. Is this worth "breaking" JSON? Hmmm, thinking more about it I think we need to "break it" anyway, because a JSON String can't contain a raw CR (it would have to be escaped as \r). :)
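For anyone who wants to play with the trade-off, here is a small sketch, again in Python just for convenience. The function names (netstring_encode, netstring_decode, json_accepts) are invented for this example. It encodes and decodes a netstring, and then shows what a stock JSON library does with a CR in a string: it escapes it on output and refuses a raw one on input.

```python
import json


def netstring_encode(data: bytes) -> bytes:
    """<length-in-ascii> ":" <binary-data> "," """
    return str(len(data)).encode("ascii") + b":" + data + b","


def netstring_decode(buf: bytes, pos: int = 0):
    """Decode one netstring starting at pos; return (payload, next position)."""
    colon = buf.index(b":", pos)
    digits = buf[pos:colon]
    if not digits.isdigit():
        raise ValueError("netstring length must be ASCII digits")
    start = colon + 1
    end = start + int(digits)
    if buf[end:end + 1] != b",":
        raise ValueError("netstring must end with a comma")
    return buf[start:end], end + 1


def json_accepts(text: str) -> bool:
    """True if the standard JSON parser accepts text as-is."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


# Method source with CR line ends, tabs and double quotes, copied verbatim.
source = b'foo\r\t"a quoted string" size.\r\t^ 42'
encoded = netstring_encode(source)
decoded, next_pos = netstring_decode(encoded)

# The same source as a JSON String: CR, tab and quotes all get escaped.
as_json = json.dumps(source.decode("ascii"))

print(encoded)
print(as_json)
print(json_accepts('"a\rb"'))   # a raw CR inside a JSON String
```

Note that the netstring length counts bytes, not characters, so with a multi-byte encoding such as UTF-8 a hand edit in a text editor has to keep the byte count in sync. That cuts a bit against the "editable in a text editor" goal.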
Ok, sorry for the long post.
regards, Göran