From: Howard Stearns <hstearns@wisc.edu>
Reply-To: The general-purpose Squeak developers list <squeak-dev@lists.squeakfoundation.org>
To: The general-purpose Squeak developers list <squeak-dev@lists.squeakfoundation.org>
Subject: Re: relational for what? [was: Design Principles Behind Smalltalk, Revisited]
Date: Tue, 02 Jan 2007 14:36:22 -0600
Yes, I'm quite serious. I'm asking what kinds of problems RDBMS are uniquely best at solving (or at least no worse). I'm not asking whether they CAN be used for this problem or that. I'm asking this from an engineering/mathematics perspective, not a business ("we've always done things this way" or "we like this vendor") perspective.
<horror story omitted>
Honestly, it just seems to me like someone architected an awful system. I know, for example, that some databases (Oracle, I think) can span a single database across boxes with different methods of partitioning (e.g., some tables here, some tables there, foreign keys between them, etc.).
You certainly shouldn't have to be copying data between tables. If nothing better is available, you could install MySQL everywhere and turn on replication.
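For what it's worth, turning on MySQL replication (in the 5.x era being discussed) is roughly a matter of enabling the binary log on the master and pointing each replica at it. A sketch, with host names, credentials, and file paths as placeholders, not values from any real deployment:

```sql
-- On the master (my.cnf fragment, shown here as comments):
--   [mysqld]
--   log-bin   = mysql-bin   -- enable the binary log replicas read from
--   server-id = 1           -- each server in the topology needs a unique id

-- On each replica, point at the master and start replicating
-- (MySQL 5.x "CHANGE MASTER TO" syntax; placeholders throughout):
CHANGE MASTER TO
  MASTER_HOST     = 'master.example.edu',
  MASTER_USER     = 'repl',
  MASTER_PASSWORD = 'secret';
START SLAVE;
```

The point being: read-only copies of the data on other boxes come essentially for free, without hand-copying between tables.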
Maybe this isn't typical, but it is the architecture that Oracle and its PeopleSoft division push on us in their extensive training classes. And it appears to be the architecture discussed at higher-education IT conferences and Web sites in the U.S.
Well, the big companies tend to push the most expensive option, not the one best suited to the data model. In my experience so far, I can think of no case where we accepted what the vendors proposed without some serious threats first.
Anyway, either the data AS USED fits into memory or it doesn't. If it does, then what benefit is the relational math providing? If it doesn't, then we have to ask whether the math techniques that were developed twenty years ago to provide efficient random access over disks are still valid. Is this still the fastest way? (The answer is no.) Is there some circumstance in which it is the fastest? Or the safest? Or in which it allows us to do something that we could not do otherwise?
I still don't think the question has anything to do with "in memory" vs. "not in memory", or with the quickest way to access the disk. You can tune your RDBMS to cache as much as possible in memory, and then it becomes a contest: is it faster for me to write all the code to do the joins, etc. myself, or to take what they already have and possibly pay a run-time speed hit?
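To make "write all the code to do the joins" concrete, here is a minimal hash-join sketch in Python; this is roughly the build/probe algorithm an RDBMS executes for an equi-join, and the table and column names are invented for illustration:

```python
# Minimal hash-join sketch: what "writing the joins yourself" amounts to.
# Tables are lists of dicts; names and columns are invented for illustration.

def hash_join(left, right, key):
    """Equi-join two tables on a shared key, like SQL INNER JOIN."""
    # Build phase: index one table by the join key.
    index = {}
    for row in left:
        index.setdefault(row[key], []).append(row)
    # Probe phase: stream the other table, emitting matching pairs.
    joined = []
    for row in right:
        for match in index.get(row[key], []):
            joined.append({**match, **row})
    return joined

students = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bo"}]
grades = [{"id": 1, "grade": "A"}, {"id": 2, "grade": "B"}, {"id": 3, "grade": "C"}]
result = hash_join(students, grades, "id")
```

Ten lines for one join strategy; the RDBMS already ships this plus sort-merge and nested-loop variants, and picks among them per query, which is the "what they already have" in the trade-off above.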
Or maybe a speed gain, since the RDBMS can break the table up into different "spaces" and run the query simultaneously on different threads. Of course you can do that by hand, but then you are falling even further behind what they already have.
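Doing that by hand looks something like the following sketch: split a table into partitions and scan them concurrently, then merge the results. The data, the 4-way round-robin split, and the predicate are all invented for illustration, not a description of any particular RDBMS's internals:

```python
# Sketch of partition-parallel scanning: the kind of work an RDBMS does
# internally when it splits a table across "spaces" and queries them at once.
from concurrent.futures import ThreadPoolExecutor

def scan_partition(rows, predicate):
    """Filter one partition; each call can run on its own thread."""
    return [r for r in rows if predicate(r)]

table = list(range(1000))                    # stand-in for a table of rows
partitions = [table[i::4] for i in range(4)]  # 4-way round-robin split

with ThreadPoolExecutor(max_workers=4) as pool:
    parts = list(pool.map(lambda p: scan_partition(p, lambda r: r % 7 == 0),
                          partitions))

# Merge the per-partition results back into one answer.
hits = sorted(x for part in parts for x in part)
```

And this is only a parallel scan; keeping such hand-rolled machinery current with a query planner that also parallelizes joins and aggregates is exactly the "getting further behind" problem.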
I tried briefly to combine JJ's answer with Peter's to find an appropriate niche. (Again, I'm trying to look at the math, not fit and finish, availability of experienced programmers, color of brochure...) For example, there could be a class of problems for which the data set is a few tens of gigabytes and needs to be operated on as a whole, and for which queries are fairly arbitrary and exploratory, not production-oriented. Etc. But I haven't been able to come up with one that doesn't have better characteristics as a distributed system. Maybe if we define the problem as "and you only have one commodity box to do it on." That's fair. Maybe that's it? (Then we need to find an "enterprise" with only one box...)
Well, it's not going to be that ("you only have one commodity box"). When I said I thought you could do what Google is doing with an RDBMS if you really wanted to, I wasn't thinking of a few commodity boxes. I was thinking of 4-10 really enormous boxes (but my understanding is that Google uses *lots* of computers to do their work, no?).
In other words, the RDBMS solution would be much more expensive, in both hardware and software, than what Google did.