[Newbies] Is Squeak/Pharo an appropriate language choice?

Fri Nov 1 13:53:51 UTC 2013

Hi Charles,

I don't know enough about Magma to answer your questions.  I'm really a VA
Smalltalk guy and only play a little with Squeak.  I knew just enough about
Magma to point you to it.  I'm sure there are a lot of Squeakers that know
about Magma and can probably answer your questions but they are probably
not reading this list.  Try re-posting on:
gmane.comp.lang.smalltalk.squeak.general.  There may also be a
Magma-specific list, but I'm not sure about that.

Before you decide for sure that you need a database, maybe you could experiment
with creating a lot of data in your image and see how long it takes to
load/save.  If it isn't too long, then letting the OS page pieces out and
back in when needed might not be too bad.

Also, if you are going to use a database, maybe you could use a (big) hash
for the ids?
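
A rough sketch of what I mean by a big hash for the ids, in Python since you
mentioned it.  The names node_id, store and put are made up for illustration,
the dict only stands in for whatever database you pick, and it assumes every
node has some stable, unique label to hash.  Treat it as an idea, not a design.

```python
import hashlib

def node_id(label: str) -> int:
    """Return a 128-bit integer id derived from the node's label."""
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:16], "big")

# A plain dict standing in for the database: id -> record.
store = {}

def put(label, neighbours=()):
    """Store a node whose links are ids rather than object references."""
    nid = node_id(label)
    store[nid] = {"label": label, "links": [node_id(n) for n in neighbours]}
    return nid

a = put("a", ["b"])
b = put("b", ["a"])  # a cycle is just two records holding each other's id
```

Because an id can be computed from the label alone, a node can point at
another node that hasn't been stored or loaded yet, so cycles don't need any
special handling.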

Lou

On Thu, 31 Oct 2013 14:53:03 -0700, Charles Hixson
<charleshixsn at earthlink.net> wrote:

>On 10/31/2013 01:28 PM, Louis LaBrunda wrote:
>> Hi Charles,
>>
>>> If I'm going to need to use a database, and handle my own rolling in and
>>> out anyway, then Smalltalk isn't a good choice.  And while multiple
>>> processing is only a speed-up thing, that's a pretty important thing in
>>> and of itself.
>> I think you may need an OODB, you should take a look at Magma
>> http://wiki.squeak.org/squeak/2665.  You may not need to do as much rolling
>> in and out on your own as you think.
>>
>> Lou
>> -----------------------------------------------------------
>> Louis LaBrunda
>> Keystone Software Corp.
>> SkypeMe callto://PhotonDemon
>> mailto:Lou at Keystone-Software.com http://www.Keystone-Software.com
>Short answer:
>Probably not sufficient.
>
>Long answer (excuse the rambling, I was thinking it through as I wrote it):
>If I'm understanding http://wiki.squeak.org/squeak/2639 correctly, which 
>I may not be, I'd still need to recode the entire graph structure to be 
>designed in terms of id#s (keys) rather than direct references.
>I.e., I'd need to code it in terms of two collections one of which would 
>contain keys that, when interpreted, referenced itself.  This does 
>appear to move the plan into the area of the possible, but at the cost 
>of the advantage that I'd hoped Smalltalk would provide of a large 
>persistent image.  I thought at first when it was talking about 
>transparency that this wouldn't be necessary, but:
>
>> Magma *can* maintain and quickly "search" large, flat structures, but 
>> the normal Smalltalk collections such as Bag or OrderedCollection are 
>> not suitable for this. The contiguous ByteArray records Magma uses to 
>> store and transport Smalltalk objects would be impractical for a large 
>> Smalltalk Collection
>Seems to mean that the Graph couldn't be stored as something that Magma 
>would recognize as a graph.  So does "Objects are persisted by 
>reachability", though that has other possible interpretations.  But 
>since the graph would contain a very large number of cycles in multiple 
>"dimensions"...  OTOH http://wiki.squeak.org/squeak/2638 on Read 
>Strategies appears to mean that it wouldn't automatically (or rather 
>could be set to not automatically) pull in items that are references 
>within the object being read.
>
>Again, http://wiki.squeak.org/squeak/5722 may mean that a class with 
>named variables holding 4 arrays of arrays of length 3 (reference float 
>float) and a few other variables containing things like bools and 
>strings and ints, would be handled without problem. But note that each 
>of those references is to an item of the same type, and it could include 
>cycles.  So I can't decide WHAT it means.  Do I need to recode the 
>references as id#s? Does that even suffice?  (If it does, then it's 
>still a good deal.  But if I must name each entry separately, it's not a 
>good deal at all, as the number of entries in each of the 4 outer level 
>arrays is highly variable, and though I intend to apply an upper limit, 
>only experiment can determine what a reasonable upper limit is.)
>
>And yet again (if I'm understanding correctly) I'm going to need to 
>violate just about every one of the hints on performance in 
>http://wiki.squeak.org/squeak/2985 .  I'm not sure how much MagmaArray 
>keeps in RAM of things that aren't currently in use.  At one point it 
>sounded like 6 bytes.  This is actually a lot of overhead in this kind 
>of a system.
>
>Additionally, it appears that Magma doesn't have any way to detect that a 
>reference is "stale" (i.e., hasn't been referenced in a long time), and 
>use that to decide to roll it out.  It looks as if this needs to be done 
>by the program...but that time-stamp (and a few other items) mustn't 
>(well, needn't...but I sure would need to overwrite it when I read it 
>in) itself be included in the items rolled out.  So I need to solve THAT 
>problem.
>
>Magma seems to be a good object database, but I can't see that it makes 
>Smalltalk a desirable choice for this project.  (It may; this could be a 
>documentation problem...either my not understanding it or the 
>information not being clear.)  If I'm going to recode the references 
>into id#s, then either Ruby or Python makes it trivial to turn the object 
>into a string (and to reconstitute it later), and they also make it 
>trivial to leave out any volatile variables. Perhaps Magma does the 
>latter, but this wasn't clear.
>
>Definitely a part of my problem is that I don't have a clear image of 
>how I would proceed.  The only examples given were small fragments, 
>extremely useful in clarifying points, but insufficient to yield a 
>larger idea of how to use things.  (E.g., I have no idea how to do Ma 
>Object Serialization, but I may need to implement it anyway.)
>
>Perhaps this is all because I don't really know Smalltalk well...which I 
>assuredly don't.  I was hoping to use Smalltalk to avoid the database 
>problem, trading RAM (including virtual RAM) consumption for capacity, 
>but it looks as if I end up at a database anyway.  And in that case I 
>should use a language that I'm already familiar with.  (I'd really been 
>hoping that the persistent image would be the answer.)  If I do a 
>decomposition I could even get away with using a key-value store.  The 
>only problem is that the id# requires lookup via an indirect reference.  
>(Is it in the Directory?  If not, get it from the database; if it isn't 
>there either, it's a new value.)  Once I do the recoding of references to id#s, the 
>database portion is "trivial, but annoying". But now I've added 
>thousands of additional indirections/second.  However, IIUC, Magma would 
>be doing that under the hood anyway (as opposed to the image, which 
>would be handled in hardware memory translation), and if I code it, I 
>can put in things like automatically rolling out when it's stale.  (By 
>the way, does "stub" mean remove from memory, or remove from the 
>database?  From context I decided it probably meant remove from memory, 
>but I couldn't decide whether dirty data would be written before being 
>removed from memory, and I couldn't be really sure it wasn't just being 
>deleted.  That needs rephrasing by someone who knows what it's supposed 
>to mean.)
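
The lookup order and stale roll-out described above could be sketched roughly
like this in Python.  The Directory class, its method names and the dict used
as the "database" are all hypothetical, and this says nothing about how Magma
does it.  The last-used time lives only in memory, so it is never part of what
gets rolled out.

```python
import time

class Directory:
    """In-memory directory in front of a key-value store (a dict here)."""

    def __init__(self, database, max_idle_seconds):
        self.database = database      # id -> stored value
        self.max_idle = max_idle_seconds
        self.live = {}                # id -> [value, last_used, dirty]

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self.live.get(key)
        if entry is None:
            if key in self.database:
                entry = [self.database[key], now, False]
            else:
                entry = [{}, now, True]  # new value, must be written later
            self.live[key] = entry
        entry[1] = now                   # refresh the in-memory time-stamp
        return entry[0]

    def mark_dirty(self, key):
        self.live[key][2] = True

    def roll_out_stale(self, now=None):
        now = time.monotonic() if now is None else now
        stale = [k for k, e in self.live.items() if now - e[1] > self.max_idle]
        for key in stale:
            value, _, dirty = self.live.pop(key)
            if dirty:                    # write back before dropping from RAM
                self.database[key] = value
```

Every get is one extra dictionary lookup compared with a direct reference,
which is the indirection cost mentioned above.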
>
>To me this appears to be, again, not the project that justifies 
>implementation in Smalltalk.  Perhaps if I were already experienced in 
>Smalltalk I wouldn't see things that way, as Magma clearly means that 
>Smalltalk *can* handle doing the project.
>
>Thank you for your suggestion.
-----------------------------------------------------------
Louis LaBrunda
Keystone Software Corp.
SkypeMe callto://PhotonDemon
mailto:Lou at Keystone-Software.com http://www.Keystone-Software.com