[Pharo-project] [vwnc] [squeak-dev] Re: [ANN] StOMP - Yet another multi-dialect object serializer

Stéphane Ducasse stephane.ducasse at inria.fr
Mon Jun 20 18:56:48 UTC 2011


Thanks, Mariano.
I like your smart attitude and answers :)

Stef

On Jun 20, 2011, at 8:50 PM, Mariano Martinez Peck wrote:

> 
> 
> On Mon, Jun 20, 2011 at 5:48 PM, Paul Baumann <paul.baumann at theice.com> wrote:
> If you are going to compare object serializing tools then State Replication Protocol (SRP) should be added to that list.
> 
> 
> Well, this thread was about StOMP, but I will answer anyway about Fuel. We did take a look at SRP. In fact, I sent you an email asking a lot of questions, and you kindly answered all of them in detail.
>  
> SRP has not been promoted much, but after many years it is still a good cross-dialect, cross-platform binary serialization tool. It was originally ported to about seven Smalltalk dialects.  Every aspect of SRP is context-configurable.
> 
> 
> That's one of the reasons that can make it a little bit slower than others.
>  
> SRP encoding is unique, simple, fast, and unlimited. The user base for SRP is not well known, but I hear from several people that use it for production applications and I have personal experience with one deployment.
> 
>  
> 
> The default configuration for SRP is to use a portable mapping layer and to encode metastate into the data stream. Even with these costs, SRP is comparable in performance to serialization tools that do not do this. The (optional) portable mapping layer is used to represent common Smalltalk objects in a way that can be loaded into any Smalltalk dialect. Metastates describe the structure of the object state so that data load is data driven rather than code dependent. SRP can actually load state for which a class is not defined or has significantly changed. Metastates can be stored in metastate tables that can be reused and referenced to reduce data size and improve performance. When you use metastate tables, SRP stores more compactly than any other binary serialization tool is capable of. Whoever compares performance of SRP with other binary serialization tools should keep in mind that they will have to disable SRP features like these to have a fair comparison.
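To make the metastate idea concrete, here is a minimal Python sketch. It is not SRP code and not SRP's format: `save`, `load`, and the `Point` class are hypothetical names invented for illustration. Each distinct (class name, field names) pair is written once into a table, every record only carries an index into that table, and loading is driven by the table, so state can be read back even when the class is not available.

```python
class Point:
    """Hypothetical example class with two fields."""
    def __init__(self, x, y):
        self.x = x
        self.y = y


def save(objects):
    """Return (metastate table, records) for a list of plain objects."""
    table = []    # one (class name, field names) entry per distinct shape
    index = {}    # metastate -> position in the table
    records = []  # (metastate index, field values) per object
    for obj in objects:
        state = vars(obj)
        meta = (type(obj).__name__, tuple(state))
        if meta not in index:
            index[meta] = len(table)
            table.append(meta)
        records.append((index[meta], tuple(state.values())))
    return table, records


def load(table, records, known_classes=None):
    """Rebuild objects from the table; unknown classes load as raw state."""
    known_classes = known_classes or {}
    loaded = []
    for meta_id, values in records:
        class_name, field_names = table[meta_id]
        state = dict(zip(field_names, values))
        cls = known_classes.get(class_name)
        if cls is None:
            # Class not defined on the loading side: keep name and state.
            loaded.append((class_name, state))
        else:
            obj = cls.__new__(cls)      # bypass __init__, fill state directly
            obj.__dict__.update(state)
            loaded.append(obj)
    return loaded


table, records = save([Point(1, 2), Point(3, 4)])
raw = load(table, records)                        # no classes supplied
points = load(table, records, {"Point": Point})   # class available
```

The two `Point` instances share a single table entry, which is where the size saving comes from in this sketch; how SRP itself lays out and references its metastate tables is not shown here.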
> 
> 
> How can I disable the portable mapping layer (exactly, in code)?  Can I disable it and still support class shape changes?
> 
> 
> 
>  
> 
> SRP is maintained with a single code base that is designed to work for all Smalltalk dialects. SRP does this by directing less-portable behavior through a "portal" that is configured to accommodate the dialect the code is being used with.
> 
>  
> 
> I find it funny when I see some binary encodings that are still code-bound. If the data does not somehow indicate the data encoding and layout in some standard way, then you can render encoded streams unreadable with something as simple as a class schema change. They do that to save the cost of a data type code. SRP would never make a mistake like that, and the cost that SRP experiences for this data type code is typically only one byte.
> 
> 
> We also store the type in one byte. But in our case, objects are grouped together in clusters, so it is only one byte per cluster.
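A rough Python sketch of the clustering idea, assuming nothing about Fuel's real stream layout: `to_clusters` is a hypothetical helper that groups values by type, so the type tag is written once per group instead of once per object.

```python
from collections import defaultdict


def to_clusters(values):
    """Group values by type name; one (tag, items) pair per type."""
    clusters = defaultdict(list)
    for value in values:
        clusters[type(value).__name__].append(value)
    # dicts keep insertion order, so clusters appear in first-seen order
    return list(clusters.items())


values = [1, "a", 2, "b", 3]
clusters = to_clusters(values)
tags_when_tagging_each_object = len(values)
tags_when_clustering = len(clusters)
```

With five values of two types, per-object tagging needs five tags and clustering needs two. The sketch drops the original ordering of the values; a real serializer would also have to record how the objects reference each other.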
>  
> 
>  
> 
> SRP encoding is fundamentally a sequence of unsigned integers of unbounded size. This is the most compact representation possible. An object type header is commonly only one byte and yet is still flexible enough to be unlimited and extended any way imaginable. SRP encoding supported four-byte character strings before they were invented and stores them as compactly as possible. SRP allows direct and data-width encodings for things like floats and embedded data. Even direct encoding of some data doesn't break the readability of the object graph. SRP also has features for object annotation, for example if you want to remember the oop of an object or its dependents. The encoding is what is most special and portable about SRP. Financial markets now exchange data using encoding standards (Fast FIX) for some data types that had been pioneered by SRP, but none that I'm aware of are as consistent and pure as SRP.
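For readers who have not seen a stream of unbounded unsigned integers before, here is a minimal Python sketch of one common scheme (base-128 with a continuation bit). This is an assumption chosen for illustration: the thread does not describe SRP's actual byte layout, and `encode_uint` and `decode_uint` are hypothetical names.

```python
def encode_uint(n):
    """Encode a non-negative integer, 7 bits per byte, low bits first."""
    if n < 0:
        raise ValueError("only unsigned integers are supported")
    out = bytearray()
    while True:
        low = n & 0x7F
        n >>= 7
        if n:
            out.append(low | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low)         # high bit clear: last byte
            return bytes(out)


def decode_uint(data, pos=0):
    """Decode one integer starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


stream = encode_uint(5) + encode_uint(300) + encode_uint(2 ** 100)
first, pos = decode_uint(stream)
second, pos = decode_uint(stream, pos)
third, pos = decode_uint(stream, pos)
```

Values below 128 take a single byte, which fits the remark that a type header is commonly only one byte, and there is no upper limit on the size of a value.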
> 
>  
> 
> SRP is a solid base of code that is intended to be tailored and configured to your needs. It is fast, but the main goal of SRP was portability. SRP provides a good configuration out of the box that you can easily tune and configure to meet your needs. The most recent tuning SRP has received was for the GS/S dialect, to use GS/S-specific optimizations. That GS/S-specific code can be found here:
> 
>  
> 
> http://techsupport.gemstone.com/entries/181657-srp-3-1-010-0
> 
>  
> 
> SRP can serialize objects like a ComplexBlock, but does not attempt to do so in a dialect-portable way. It is simply that I had not defined a portable representation of a complex block in the portability layer. A common way to do that would be to determine the source of the block (for all dialects) and compile that code on load.
> 
> 
> Yes, but that may not work, because closures point to another context, which can be a CompiledMethod for example. And a closure can have references to variables defined outside the closure...
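The same problem can be shown outside Smalltalk. In this small Python sketch (`make_adder` and `captured_variables` are hypothetical names for illustration), recompiling a closure from its source text alone fails because the captured variable is gone, and it works again only once the captured environment is saved and restored along with the source.

```python
def make_adder(n):
    """Return a closure that captures n from the enclosing scope."""
    return lambda x: x + n


def captured_variables(fn):
    """Map each free variable name of fn to its currently captured value."""
    cells = fn.__closure__ or ()
    return dict(zip(fn.__code__.co_freevars, (c.cell_contents for c in cells)))


add5 = make_adder(5)
source = "lambda x: x + n"   # the closure's source text, taken on its own

# Recompiling only the source loses the binding of n.
rebuilt = eval(source, {})
try:
    rebuilt(1)
    source_alone_works = True
except NameError:
    source_alone_works = False

# Saving the captured variables too makes the rebuilt closure usable.
environment = captured_variables(add5)
restored = eval(source, dict(environment))
```

Note that this restores a copy of the value of `n`, not the original shared binding, so closures that mutate or share outer variables would need more than this.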
>  
> It gets tricky if you attempt to support more than simple blocks or if you want to translate bytecodes (which I'd also prototyped). If you really think you need to serialize blocks then SRP is flexible enough to let you define how you want it done.
> 
>  
> 
> 
> excellent.
>  
> 
> Some Smalltalk dialects (like VA in particular) do not have an efficient two-way become. You'll find that most serialization tools expect there to be an efficient two-way become to substitute one object for another on load. SRP however has a unique way to fix-up references that is efficient for all dialects. SRP has a wide variety of object substitution hooks for both saving and loading that preserve graph relationship integrity without screwing up original objects. SRP also has support for proxy objects that can be managed by application code.
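One generic way to avoid needing become on load is to resolve references through a table in two passes. The Python sketch below shows that general technique only; it is not a description of SRP's own fix-up mechanism, which the thread does not detail, and `Ref` and `load_graph` are hypothetical names.

```python
class Ref:
    """Marker stored in a record in place of a pointer to another record."""
    def __init__(self, target_id):
        self.target_id = target_id


def load_graph(records):
    """Rebuild a possibly cyclic graph of dict-based nodes from records."""
    # Pass 1: create one empty shell per record so every id has an object.
    shells = [dict() for _ in records]
    # Pass 2: fill in fields, replacing each Ref by the shell it names.
    for shell, record in zip(shells, records):
        for field, value in record.items():
            if isinstance(value, Ref):
                shell[field] = shells[value.target_id]
            else:
                shell[field] = value
    return shells


records = [
    {"name": "a", "next": Ref(1)},
    {"name": "b", "next": Ref(0)},   # points back to the first node: a cycle
]
nodes = load_graph(records)
```

Because every reference is resolved against the table of shells before any caller holds a pointer to a placeholder, no object ever has to be swapped for another after the fact.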
> 
> 
> Where (classes/methods/tests) can I take a look at how you manage those proxies? It sounds interesting. The same for the object substitution hooks.
>  
> 
>  
> 
> The main thing wrong with SRP is that it is not the framework that "you" created. SRP was the first binary serialization tool to focus on Smalltalk dialect portability. I'd argue that it is still the only one that truly accomplished that in a meaningful way. I created SRP by combining proven techniques from the best tools of the time and adding features for portability. SRP was superior to even the dialect-specific frameworks at the time. SRP is not something that I intend to maintain and promote. I released it open source some ten years ago in the hope that others would do that. A lot of effort and sacrifice was put into SRP "for the benefit of others". SRP taught me a painful lesson about human nature and the perception of value. Programmers (myself included) love to solve problems more than learn about existing solutions. Everyone wants to solve problems like this their own way and thinks they have a good reason that they must do it their way. "Yet another" was an excellent subject line.
> 
> 
> I will speak just for Fuel. I don't think this is really a problem. What you mention is so well known that it even has a name: trade-off. If you find a way to be really fast in serialization and materialization and be portable at the same time, then I am all ears. For me it is perfect to have different kinds of serializers. Do you want something portable that you can even edit with a text editor? Then use SIXX. Do you want a portable solution with more or less good performance? Then use StOMP, SRP, etc. Do you want something really fast (mostly at materialization time) which is not focused on portability? Then use Fuel. Is that bad? Now in Pharo people are working on the Opal compiler, which is 3 times slower than the old one. Why are we not against that? Again, trade-off. The old Compiler is really difficult to understand and maintain. We want something more OO that is easy to maintain, understand, and experiment with.
> 
> Now, I don't know the reasons, but Colin ported SRP to Squeak and then he finally implemented his own S&M serializer. Masashi has now implemented StOMP, but he also took a look at SRP. In fact, check the commits in http://www.squeaksource.com/SRP: he fixed it, and I asked him a couple of questions to make it work. Since this week (a couple of days ago), SRP tests are green in Pharo. So these guys took a look at SRP, as did we.
> 
> In our case, we even created benchmarks (check package FuelBenchmarksSRP in Fuel repo) to compare Fuel against the rest. I can share the results with you if you want, but tell me first how to disable the mapping layer that makes it slower. 
> 
> Cheers
> 
>  
> 
>  
> 
> Paul Baumann
>
> From: vwnc-bounces at cs.uiuc.edu [mailto:vwnc-bounces at cs.uiuc.edu] On Behalf Of Mariano Martinez Peck
> Sent: Monday, June 20, 2011 08:54
> To: The general-purpose Squeak developers list
> Cc: VWNC; Pharo-project at lists.gforge.inria.fr
> Subject: Re: [vwnc] [squeak-dev] Re: [ANN] StOMP - Yet another multi-dialect object serializer
> 
>  
> 
>  
> 
> 2011/6/20 Janko Mivšek <janko.mivsek at eranova.si>
> 
> Hi Masashi,
> 
> Now we have a competition, Fuel vs. StOMP :) Big advantage of StOMP is
> that it is portable and already ported to VW. Which are other
> advantages? Disadvantages?
> 
> Also question for Fuel developers, do you plan to port it to other
> Smalltalks too? Portability is namely something which is very high on
> checking list for a serializer to use in portable projects, like most of
> web ones are.
> 
> 
> Hi Janko. I think "portability" is too wide a term to discuss without details. For me, portability in this case means two things: a) being able to materialize, in a dialect XXX, a stream which was serialized in a dialect YYY; b) that the code of the serializer can also work in another dialect (not necessarily including a).
> 
> Fuel will not support a) for sure. At least, we will not make an extra effort to support that. Regarding b), being portable to other dialects is not Fuel's first priority. But let me explain why:
> - We want to be able to serialize ANY kind of object, and that includes BlockClosure, CompiledMethod, MethodContext, Class, Trait, etc. Finding an abstract and portable representation for those objects across dialects is complicated.
> - We want to be as fast as possible. That means that if we find a way to be faster which only works in Pharo, we don't care. We will go ahead with that.
> 
> That being said, I have to say that Fuel's OO design, from my point of view, is quite nice, easy to understand, and not difficult to port. As an example, Eliot Miranda easily ported Fuel, not just to another dialect but to Newspeak. Moreover, he needed special handling for Newspeak data, and he was able to adapt Fuel to his needs easily. So in this case Fuel was portable (in the sense of b) and flexible.
> 
> 
> Another difference is that we try to be a little faster in materialization than in serialization (which is not the case of StOMP). So in summary, the differences I can see are:
> 
> 1) StOMP is focused on portability across dialects and on being able to materialize the same stream in different dialects. Fuel is not focused on portability, even if its code could be ported.
> 2) StOMP is faster in serializing small/medium graphs. Fuel is faster with large graphs.
> 3) StOMP is faster in serializing, while Fuel is faster in materializing.
> 4) StOMP can serialize some objects (right now it cannot serialize BlockClosures or things like that); Fuel can (or should) be able to serialize all of them.
> 
> That's all I can see for the moment. But don't worry, there is no fight. We have been sending each other several mails this week and last, and have tried to share knowledge with each other :)
> 
> Cheers
> 
> 
>  
> 
> 
> Best regards
> Janko
> 
> S, Masashi UMEZAWA wrote:
> 
> > Hello all,
> >
> > I have recently developed a new serialization library called
> > StOMP(Smalltalk Objects on MessagePack).
> > http://stomp.smalltalk-users.jp/
> >
> > StOMP is a binary serializer for major Smalltalk dialects. For those
> > who know SIXX, StOMP can be seen as a binary SIXX. While SIXX
> > represents object data as XML, StOMP uses MessagePack. By combining
> > the flexibility of SIXX with the compactness of MessagePack, StOMP
> > aims to be a unique, next-generation portable serializer for
> > Smalltalk.
> >
> > Features:
> > - Implementation is compact and portable
> > - Shared/circular references support
> > - "Class shape changes" support
> > - Data is interchangeable between Smalltalk dialects
> > - Good performance for small object graphs
> >
> > StOMP is now available for Squeak, Pharo, and VisualWorks.
> >
> > There is ConfigurationOfStOMP, so the installation is easy.
> >
> > Gofer new
> >   squeaksource: 'MetacelloRepository';
> >   package: 'ConfigurationOfStOMP';
> >   load.
> > (Smalltalk at: #ConfigurationOfStOMP) perform: #load.
> >
> > Enjoy!
> 
> --
> 
> Janko Mivšek
> Aida/Web
> Smalltalk Web Application Server
> http://www.aidaweb.si
> 
> 
> 
> 
> -- 
> Mariano
> http://marianopeck.wordpress.com
> 
> 
> 
> 
> 
> 



