[Newbies] Re: Is Squeak/Pharo an appropriate language choice?

Charles Hixson charleshixsn at earthlink.net
Thu Oct 31 19:55:40 UTC 2013


I think you *did* answer my questions, though in a way that means a lot 
of extra work for me.
Too much of what I want to do depends on things that are currently 
experimental in Smalltalk.  It sounds like the image can't load lazily, 
which would probably be necessary if this were to work at all.  (Yeah, 
the 64-bit image could hold enough, but I don't have the RAM to hold it 
all, and getting that much RAM is ridiculous, when most of it would be 
rolled out most of the time.)

If I'm going to need to use a database, and handle my own rolling in and 
out anyway, then Smalltalk isn't a good choice.  And while 
multiprocessing is only a speed-up, that's a pretty important thing in 
and of itself.

GemStone isn't a good choice as I need a FOSS distributable. (Actually, 
if I'm reading the web site properly they don't mention what their 
license is, and it seems as if their Smalltalk version is Pharo...which 
we've already covered.)

FWIW, I'm well aware that I'm trying to run too much program on too 
small a system.  I know this implies a massive speed penalty. But that's 
true whatever approach I take.  I was hoping that I could avoid doing my 
own memory management, and for that Smalltalk appeared the only feasible 
choice.  Apparently, however, I'm trying something a bit beyond the 
bleeding edge at the current state of the art.

As for more details on what I'm planning:
So what I'm going to need to do is connect the graph nodes by id#s, and 
roll them in from a database and stick them in a dictionary (indexed by 
id#, as most of the nodes won't have any other unique and persistent 
id).  This is necessary as each node will link to up to around 80 other 
nodes, with some of the links being bidirectional, but not dependably 
so.  And I'll need another index of "words" which are indexes from 
external symbols into nodes.  Doing it this way, most of it can be kept 
rolled out most of the time, but there's an obvious speed penalty.  So 
I'll need to track which references are stale and roll them out to disk 
(or just drop them, if they aren't dirty).  Etc.  Much of this would 
have been handled automatically in Smalltalk, but not the automatic roll 
out, apparently.  (In Smalltalk I'd use references rather than id#s; in 
fact, id#s wouldn't have been needed.)
I'll probably write the first version in Python (rather than Ruby, 
because Doxygen documentation for Python is better than I can generate 
for Ruby, though Ruby is in some other ways better). Then, when it's 
working I'll translate it into D or Ada. (Not yet decided, though D has 
the inside track.  Ada has wider support, but D is garbage collected and 
has variable sized arrays and built-in hash tables.  Ada currently has a 
better interface to databases, but D is improving much more rapidly.  
And D program design structures are more similar to those of Python.  Of 
course Vala is an outside chance.  But it's been developing quite 
slowly.  And Go seems headed in a different direction, even though it 
has easier support for concurrency.)
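
The roll-in/roll-out scheme described above can be sketched in Python 
roughly as follows.  This is only an illustration under assumptions that 
are not part of the original plan: SQLite as the backing store, a `nodes` 
table holding JSON-encoded link lists, and a least-recently-used rule for 
deciding which nodes are stale.  The names `NodeCache`, `get`, `link` and 
`flush` are hypothetical.  The index of "words" is left out; it would be 
a second dictionary mapping each word to an id#.

```python
import json
import sqlite3
from collections import OrderedDict


class NodeCache:
    """Keeps at most `capacity` nodes in RAM; the rest live in SQLite.

    Hypothetical sketch: assumes capacity >= 1 and that a node is fully
    described by the list of id#s it links to.
    """

    def __init__(self, db_path=":memory:", capacity=1000):
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS nodes (id INTEGER PRIMARY KEY, links TEXT)"
        )
        self.capacity = capacity
        self.live = OrderedDict()  # id# -> list of linked id#s, stalest first
        self.dirty = set()         # id#s changed since they were last written

    def get(self, node_id):
        """Return the link list for node_id, rolling it in if needed."""
        if node_id in self.live:
            self.live.move_to_end(node_id)  # mark as recently used
            return self.live[node_id]
        row = self.db.execute(
            "SELECT links FROM nodes WHERE id = ?", (node_id,)
        ).fetchone()
        links = json.loads(row[0]) if row else []  # unknown id#: empty node
        self.live[node_id] = links
        self._evict()
        return links

    def link(self, src, dst):
        """Add a one-way link src -> dst and mark src dirty."""
        links = self.get(src)
        if dst not in links:
            links.append(dst)
            self.dirty.add(src)

    def _write(self, node_id):
        """Store one in-memory node in the database and mark it clean."""
        self.db.execute(
            "INSERT OR REPLACE INTO nodes (id, links) VALUES (?, ?)",
            (node_id, json.dumps(self.live[node_id])),
        )
        self.dirty.discard(node_id)

    def _evict(self):
        """Roll out the stalest nodes: write dirty ones, just drop clean ones."""
        while len(self.live) > self.capacity:
            node_id = next(iter(self.live))  # least recently used
            if node_id in self.dirty:
                self._write(node_id)
            del self.live[node_id]

    def flush(self):
        """Write every dirty node and commit the transaction."""
        for node_id in list(self.dirty):
            self._write(node_id)
        self.db.commit()
```

With `capacity=2`, linking 1 -> 2, 2 -> 3 and 3 -> 1 forces node 1 out to 
the database, and a later `get(1)` rolls it back in with its link intact.  
A bidirectional link is simply two calls to `link`, one in each direction.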

P.S.:  Were Smalltalk suitable I'd be needing to repartition my disk to 
give me a much larger virtual memory space.  Currently I'm only set up 
for around 1.5 Gigabytes, which should be enough for the first few 
months, but would limit what else I could be doing towards the end of 
that time.

P.P.S.:  I also considered a graph database, Neo4j, but it doesn't 
support enough information on the links...though I could coerce integers 
into floating point, the loss of precision was worrying. This isn't a 
problem that would show up until the id#s started to get large, but 
that's not very reassuring.  Also, too much appears to need to be decided 
at compile time rather than at run time, and this is a very dynamic 
system (or it had better be!).
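
A quick illustration of the precision worry, assuming the floating point 
values are IEEE 754 doubles (an assumption; the message does not say which 
float type would be involved).  The helper name `survives_float` is 
hypothetical.

```python
def survives_float(n: int) -> bool:
    """True if the integer n comes back unchanged after a trip through float."""
    return int(float(n)) == n


LIMIT = 2**53  # a double carries 53 bits of significand

print(survives_float(LIMIT))      # True
print(survives_float(LIMIT + 1))  # False: rounds to the neighbouring even value
```

Under that assumption every id# up to 2**53 (about 9 * 10**15) survives 
the round trip, and only some of the larger ones do.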

Thank you for your help, and for the good report on the current state of 
the environment.

On 10/31/2013 11:40 AM, Levente Uzonyi wrote:
> On Thu, 31 Oct 2013, Charles Hixson wrote:
>
>> I'm contemplating a project that would benefit greatly by a 
>> persistent memory image, though I'll eventually (in a year or so) 
>> need the 64-bit image, but:
>> The image will be a lot larger than RAM.  It would include a directed 
>> graph
>
> The current garbage collector is not suitable for large images. GC 
> delays become noticeable when the image grows over a few hundred MBs. 
> Eliot is working on a better one, but we don't know how it performs 
> until it's ready.
>
> I don't see how your image could be a lot larger than RAM. It's 
> technically possible, but it's pretty likely that it would be too slow 
> to be practical.
>
>> that had an index of a million or so entries, and most nodes wouldn't 
>> be indexed.  So in order to even load it would need to use some sort 
>> of lazy access.  And I'm not even sure that a Dictionary of over a 
>> million items is reasonable.  (Naturally none of the examples address 
>> this problem.)
>
> The performance of Dictionary mainly depends on the implementation of 
> #hash and #= of the objects you want to store in it.
>
>>
>> Additionally, all of my (written) documentation is so old that it 
>> doesn't even discuss multi-processor systems, so I don't know whether 
>> modern Smalltalks make any use of additional available processors.
>
> Squeak/Pharo don't support them from a single image. There are 
> experimental VMs designed for multi-processor systems (RoarVM, 
> HydraVM), but AFAIK none of them is ready for production use.
>
>>
>> I'd really like some advice, and possibly some references.  I know 
>> that Smalltalk has the reputation for being slow (yes, I've been 
>> reading about the recent speed-ups), but much of what I'd need to 
>> write in any other language seems like it may already be present in 
>> Smalltalk, so if it would work, I'd like to choose it.  But I won't 
>> be able to test this until the application has been running for quite 
>> a while, so it would be very desirable that I know ahead of time.
>
> It's hard to tell more without knowing more details about the project.
>
>
> Levente
>
> P.S.: you might want to check out GemStone/S 
> http://gemtalksystems.com/index.php/products/gemstones/
>
>>
>> -- 
>> Charles Hixson
>>
>> _______________________________________________
>> Beginners mailing list
>> Beginners at lists.squeakfoundation.org
>> http://lists.squeakfoundation.org/mailman/listinfo/beginners
>>


-- 
Charles Hixson


