[squeak-dev] Re: Why would one prefer CouchDB or TokyoT/C over plain old SQL? (was Re: Re: Squeak packages for accessing SQL databases and for report generation)

Andreas Raab andreas.raab at gmx.de
Sat Mar 28 18:47:29 UTC 2009


Wow. What a fascinating read. Thanks so much. How's tool support for 
these dbs? One of the major reasons for us going with MySQL was simply 
because of tool support (i.e., management can extract the relevant 
numbers like new user signups, usage hours, growth curve, etc. for 
themselves). Turns out this is just as important as the technical 
details ;-)

Cheers,
   - Andreas

Göran Krampe wrote:
> Hi!
> 
> (I almost missed this one, sorry)
> 
> Andreas Raab wrote:
>> Göran Krampe wrote:
>>> Secondly - the original question was about SQL and how to access 
>>> legacy stuff I think, but there are several new interesting database 
>>> alternatives around that are worth mentioning. I have toyed with two 
>>> of them lately:
>>
>> Interesting. When would you prefer either of them over a plain old SQL 
>> database? I'm not familiar with either CouchDB or 
>> TokyoCabinet/TokyoTyrant but I'd be interested to find out more about 
>> their application areas.
> 
> Let me give you a quick take on this rather large subject :). First, a 
> summary of my little efforts on both these products:
> 
> - I toyed with CouchDB, there was already a Curl-based API at SS for it. 
> I also implemented a "view server" in Squeak for it, haven't released it 
> yet, should do that. I track it, it moves. It's hip.
> 
> - I recently started playing with TT/TC and have built a Squeak API for 
> it, I just threw it up on SM and yesterday I posted a lengthy blog 
> article about it in fact:
> 
>     http://goran.krampe.se/blog/Squeak/TokyoTyrant.rdoc
> 
> 
> History:
> 
> - Most of these new dbs have been built as responses to pragmatic needs 
> to scale a LOT. TT/TC comes from Mixi.jp ("Facebook of Japan"). Then you 
> have a whole list of these things from Amazon (Dynamo - closed), Google 
> (BigTable), Facebook (Cassandra) etc etc.
> 
> - CouchDB is also built to scale like crazy, but started as a single man 
> hobby project. It is one of the few projects with a really strong 
> developer community since it was NOT built inside a company. Built in 
> Erlang as are MANY of these new dbs.
> 
> Back to the reasons why one would prefer them (any of these), my take on 
> it:
> 
> 1. Performance/hardware ratio. Most of these are variants of "key-value 
> stores" or "document centric dbs". They focus a lot on speed. As you can 
> see in my blog entry TT/TC is awfully fast, well, I haven't compared yet 
> to say PGSQL, but I can't imagine doing 2000-3000 inserts/sec stuffing 
> 18Mb/sec into an SQL db on this little mini laptop of mine. I really 
> hope I am not lying through my teeth. :)
> 
> 2. Avoid the ORM/"impedance mismatch" swamp. CouchDB is a key-value 
> store which stores JSON objects (a "document" in their lingo). Thus it 
> can store/load object graphs/hierarchies in one "clump" quite easily - 
> so in some respects these databases are similar to OODBs IMHO. TT/TC 
> stores "more or less" binary blobs (unless you use the table extension).
> 
> 3. Dynamics. Both CouchDB and TT/TC (using the table extension) talk 
> about a "schema less" model. This translates to the fact that CouchDB 
> can store *any* JSON object, there is no schema. And using map/reduce 
> you can still work with "views" on them etc. TT/TC using the table 
> extension more or less stores a Dictionary as value: "<key> $00 <value> 
> $00 <key2> $00 <value2>". And then it has support for adding indexes on 
> these keys, a query engine, a Lua scripting extension inside TT to do 
> "stored procedures"-stuff etc. But key here is the fact that these 
> databases are made to deal with a changing world and do not rely on 
> strict schemas or advanced types, and when you have it running on say 
> 100 servers these aspects seem to become very important (not talking 
> from experience, just from what I hear in these forums).
> 
> 4. Scale horizontally. A lot. CouchDB aims at mega-scaling using 
> replication and multi-version logic - "eventual consistency". It also 
> implements the map/reduce pattern where you can define map and reduce 
> functions in JS running on the server in a so-called "view server". 
> TT/TC also has replication, dual-master failover, 
> single-master-multi-readers etc. It does not have mega-scaling goals as 
> CouchDB has, but there are already "layers on top" like LightCloud that 
> form a hash-ring of TT/TC servers for scaling. And since it is so darn 
> fast on a single box it covers a lot of use cases without large scaling.
> 
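
The "hash-ring" idea in point 4 can be sketched in a few lines of Python. This is a generic consistent-hashing illustration, not LightCloud's actual code: the class name HashRing, the replicas parameter and the server addresses are all made up for the example.

```python
import bisect
import hashlib


class HashRing:
    """Generic consistent-hash ring (illustrative, not LightCloud's code)."""

    def __init__(self, nodes, replicas=64):
        if not nodes:
            raise ValueError("at least one node is required")
        # Place several virtual points per node on the ring so that keys
        # spread more evenly across the servers.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(text):
        # First 8 bytes of an MD5 digest, read as an unsigned integer.
        digest = hashlib.md5(text.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key):
        # Walk clockwise to the first point after the key's hash,
        # wrapping around at the end of the ring.
        index = bisect.bisect_right(self._points, self._hash(key))
        return self._ring[index % len(self._ring)][1]


# Hypothetical server addresses, for illustration only.
nodes = ["tt1:1978", "tt2:1978", "tt3:1978"]
ring = HashRing(nodes)
print(ring.node_for("user:42"))
```

The point of the ring is that dropping one server only moves the keys that lived on that server; everything else stays where it was.
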
> From a more personal "touchy feely" perspective these things (and 
> several others like Dynomite) are a breath of fresh air! They are very 
> simple to use. They are FAST. They are robust. They are small. They 
> often embrace the "web 2.0" world by using JSON, HTTP-REST APIs, 
> memcached protocol etc etc.
> 
> For the moment I am focusing on TT/TC but CouchDB has some very 
> interesting things going for it - like Erlang OTP, promised transparent 
> replication and its map/reduce stuff. And the CouchDB API on SS seems to 
> work fine if you get the Curl plugin.
> 
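
To make the map/reduce "view" idea concrete, here is a small in-memory Python imitation. Real CouchDB views are written in JS and run in the view server, so this only mirrors the pattern; run_view, signups_by_month and the sample documents are assumptions for illustration.

```python
from collections import defaultdict


def run_view(docs, map_fn, reduce_fn):
    """Apply map_fn to every document, group emitted values by key,
    then reduce each group. Keys come back in sorted order."""
    grouped = defaultdict(list)
    for doc in docs:
        for key, value in map_fn(doc):
            grouped[key].append(value)
    return {key: reduce_fn(values) for key, values in sorted(grouped.items())}


def signups_by_month(doc):
    # Map step: emit (YYYY-MM, 1) for each signup document.
    # Documents of other types emit nothing, so no schema is required.
    if doc.get("type") == "signup":
        yield doc["date"][:7], 1


# Hypothetical schema-less documents.
docs = [
    {"type": "signup", "user": "anna", "date": "2009-02-27"},
    {"type": "signup", "user": "bert", "date": "2009-03-02"},
    {"type": "login", "user": "anna", "date": "2009-03-05"},
    {"type": "signup", "user": "cleo", "date": "2009-03-28"},
]

result = run_view(docs, signups_by_month, sum)
print(result)
```
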
> Well, that turned into a long post, but hopefully I answered some of it. 
> For more details read my blog article! :) I also plan to write another 
> soon about the table extension and its Lua mechanisms.
> 
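
The "<key> $00 <value> $00 ..." layout from point 3 can be illustrated with a short Python sketch. It assumes UTF-8 text and no NUL bytes inside keys or values; encode_columns and decode_columns are hypothetical helper names, not part of any TT/TC client API.

```python
def encode_columns(columns):
    """Flatten a dict of strings into NUL-separated key/value bytes."""
    parts = []
    for key, value in columns.items():
        for text in (key, value):
            raw = text.encode("utf-8")
            if b"\x00" in raw:
                raise ValueError("NUL bytes cannot appear in keys or values")
            parts.append(raw)
    return b"\x00".join(parts)


def decode_columns(blob):
    """Rebuild the dict from NUL-separated key/value bytes."""
    if not blob:
        return {}
    fields = blob.split(b"\x00")
    if len(fields) % 2:
        raise ValueError("expected an even number of NUL-separated fields")
    pairs = iter(fields)
    # zip(pairs, pairs) consumes the fields two at a time: key, then value.
    return {k.decode("utf-8"): v.decode("utf-8") for k, v in zip(pairs, pairs)}


record = {"name": "Goran", "lang": "Squeak"}
blob = encode_columns(record)
print(blob)
print(decode_columns(blob))
```
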
> regards, Göran
> 
> 
> 



