[Newbies] Re: Is Squeak/Pharo an appropriate language choice?
Levente Uzonyi
leves at elte.hu
Thu Oct 31 20:42:53 UTC 2013
On Thu, 31 Oct 2013, Charles Hixson wrote:
> I think you *did* answer my questions. In a way that means a lot of extra
> work for me.
> Too much of what I want to do depends on things that are currently
> experimental in Smalltalk. It sounds like the image can't load lazily, which
> would probably be necessary if this were to work at all. (Yeah, the 64-bit
> image could hold enough, but I don't have the RAM to hold it all, and getting
> that much RAM is ridiculous, when most of it would be rolled out most of the
> time.)
>
> If I'm going to need to use a database, and handle my own rolling in and out
> anyway, then Smalltalk isn't a good choice. And while multiple processing is
> only a speed-up thing, that's a pretty important thing in and of itself.
If I understand correctly, Magma[1] and Glorp[2] can both help you with
this project. The former is a pure Smalltalk object database, the latter
is an ORM that works with relational databases such as PostgreSQL.
>
> Gemstone isn't a good choice as I need a FOSS distributable. (Actually, if
> I'm reading the web site properly they don't mention what their license is,
> and it seems as if their Smalltalk version is Pharo...which we've already
> covered.)
GemStone/S is not open source, but there's a free version with some
resource limits (1 CPU/16 GB IIRC). They use their own Smalltalk
implementation, but it has no GUI, so they wrote tools in Squeak/Pharo
that let you develop your code.
>
> FWIW, I'm well aware that I'm trying to run too much program on too small a
> system. I know this implies a massive speed penalty. But that's true
> whatever approach I take. I was hoping that I could avoid doing my own
> memory management, and for that Smalltalk appeared the only feasible choice.
Magma and Glorp can both help you with this.
Levente
[1] http://wiki.squeak.org/squeak/2665
[2] http://glorpwiki.wikispaces.com/How+Glorp+Works
> Apparently, however, I'm trying something a bit beyond the bleeding edge at
> the current state of the art.
>
> As to more details as to what I'm planning:
> So what I'm going to need to do is connect the graph nodes by id#s, and roll
> them in from a database and stick them in a dictionary (indexed by id#, as
> most of the nodes won't have any other unique and persistent id). This is
> necessary as each node will link to up to around 80 other nodes, with some of
> the links being bidirectional, but not dependably so. And I'll need another
> index of "words" which are indexes from external symbols into nodes. Doing
> it this way, most of it can be kept rolled out most of the time, but there's
> an obvious speed penalty. So I'll need to track which references are stale
> and roll them out to disk (or just drop them, if they aren't dirty). Etc.
> Much of this would have been handled automatically in Smalltalk, but not the
> automatic roll out, apparently. (In Smalltalk I'd use references rather than
> id#s, in fact id#s wouldn't have been needed.)
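The roll-in/roll-out scheme described above can be sketched in Python
roughly as follows. This is only a minimal sketch under my own
assumptions: the class name NodeStore, the SQLite table layout, the JSON
encoding of the links, and the least-recently-used eviction policy are
all made up for illustration and are not taken from any existing design.
Nodes are reduced to a list of linked ids, and capacity is assumed to be
at least 1.

```python
import json
import sqlite3
from collections import OrderedDict


class NodeStore:
    """Hypothetical write-back cache of graph nodes keyed by integer id."""

    def __init__(self, path=":memory:", capacity=1000):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS nodes (id INTEGER PRIMARY KEY, links TEXT)"
        )
        self.capacity = capacity      # assumed to be >= 1
        self.cache = OrderedDict()    # id -> list of linked ids, oldest first
        self.dirty = set()            # ids changed since they were last written

    def get(self, node_id):
        """Return the link list for node_id, rolling it in if needed."""
        if node_id in self.cache:
            self.cache.move_to_end(node_id)   # mark as recently used
            return self.cache[node_id]
        row = self.db.execute(
            "SELECT links FROM nodes WHERE id = ?", (node_id,)
        ).fetchone()
        links = json.loads(row[0]) if row else []
        self.cache[node_id] = links
        self._evict()
        return links

    def link(self, src, dst):
        """Add a one-way link src -> dst and mark src dirty."""
        links = self.get(src)
        if dst not in links:
            links.append(dst)
            self.dirty.add(src)

    def _write(self, node_id):
        """Write one cached node back to the database."""
        self.db.execute(
            "INSERT OR REPLACE INTO nodes (id, links) VALUES (?, ?)",
            (node_id, json.dumps(self.cache[node_id])),
        )
        self.dirty.discard(node_id)

    def _evict(self):
        """Roll out least recently used nodes; clean ones are just dropped."""
        while len(self.cache) > self.capacity:
            node_id = next(iter(self.cache))  # oldest entry
            if node_id in self.dirty:
                self._write(node_id)
            del self.cache[node_id]

    def flush(self):
        """Write every dirty node and commit."""
        for node_id in list(self.dirty):
            self._write(node_id)
        self.db.commit()
```

A bidirectional link would simply be two calls, link(a, b) and
link(b, a). The index of "words" could be a second table mapping a
symbol to a node id, which I left out to keep the sketch short.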
> I'll probably write the first version in Python (rather than Ruby, because
> Doxygen documentation for Python is better than I can generate for Ruby,
> though Ruby is in some other ways better). Then, when it's working I'll
> translate it into D or Ada. (Not yet decided, though D has the inside track.
> Ada has wider support, but D is garbage collected and has variable sized
> arrays and built-in hash tables. Ada currently has a better interface to
> databases, but D is improving much more rapidly. And D program design
> structures are more similar to those of Python. Of course Vala is an outside
> chance. But it's been developing quite slowly. And Go seems headed in a
> different direction, even though it has an easier support for concurrency.)
>
> P.S.: Were Smalltalk suitable I'd be needing to repartition my disk to give
> me a much larger virtual memory space. Currently I'm only set up for around
> 1.5 Gigabytes, which should be enough for the first few months, but would
> limit what else I could be doing towards the end of that time.
>
> P.P.S: I also considered a graph database, Neo4j, but they don't support
> enough information on the links...though I could coerce integers into
> floating point, the loss of precision was worrying. This isn't a problem that
> would show up until the id#s started to get large, but that's not very
> reassuring. Also too much appears to need to be decided at compile time
> rather than at run time, and this is a very dynamic system (or it had better
> be!).
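The precision worry is easy to demonstrate, assuming the floating point
values in question are 64-bit IEEE 754 doubles (I have not checked what
Neo4j actually stores). The helper name below is made up for
illustration. Integers up to 2**53 survive a round trip through a
double, but not all larger ones do:

```python
# Assumption: ids would be stored as 64-bit IEEE 754 doubles.
LIMIT = 2 ** 53  # above this, not every integer has an exact double


def survives_float_round_trip(node_id: int) -> bool:
    """Return True if node_id is unchanged after conversion to float and back."""
    return int(float(node_id)) == node_id


print(survives_float_round_trip(LIMIT))      # True
print(survives_float_round_trip(LIMIT + 1))  # False
```

So the problem would only appear once id#s pass roughly 9 * 10**15,
which matches the concern that it would not show up until the id#s get
large.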
>
> Thank you for your help, and good reporting of the current state of the
> environment.
>
> On 10/31/2013 11:40 AM, Levente Uzonyi wrote:
>> On Thu, 31 Oct 2013, Charles Hixson wrote:
>>
>>> I'm contemplating a project that would benefit greatly by a persistent
>>> memory image, though I'll eventually (in a year or so) need the 64-bit
>>> image, but:
>>> The image will be a lot larger than RAM. It would include a directed
>>> graph
>>
>> The current garbage collector is not suitable for large images. GC delays
>> become noticeable when the image grows over a few hundred MBs. Eliot is
>> working on a better one, but we don't know how it performs until it's
>> ready.
>>
>> I don't see how your image could be a lot larger than RAM. It's technically
>> possible, but it's pretty likely that it would be too slow to be practical.
>>
>>> that had an index of a million or so entries, and most nodes wouldn't be
>>> indexed. So in order to even load it would need to use some sort of lazy
>>> access. And I'm not even sure that a Dictionary of over a million items
>>> is reasonable. (Naturally none of the examples address this problem.)
>>
>> The performance of Dictionary mainly depends on the implementation of #hash
>> and #= of the objects you want to store in it.
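The same effect can be shown with a rough Python analogue, where
__hash__ and __eq__ play the roles of #hash and #=. This is only an
illustration of the general hashing behaviour, not a measurement of
Squeak's Dictionary; the Key class and the helper function are made up
for the example.

```python
class Key:
    """Hypothetical key whose hash quality is configurable."""

    eq_calls = 0  # counts equality comparisons made by the dict

    def __init__(self, ident, good_hash=True):
        self.ident = ident
        self.good_hash = good_hash

    def __hash__(self):
        # A constant hash puts every key into the same collision chain.
        return hash(self.ident) if self.good_hash else 0

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.ident == other.ident


def comparisons_for_lookup(good_hash, n=500):
    """Build a dict of n keys and count the comparisons for one lookup."""
    table = {Key(i, good_hash): i for i in range(n)}
    Key.eq_calls = 0
    assert table[Key(n - 1, good_hash)] == n - 1
    return Key.eq_calls


print(comparisons_for_lookup(True), comparisons_for_lookup(False))
```

With a well-distributed hash the lookup needs very few comparisons, while
the constant hash makes it compare against a large share of the stored
keys. A million entries is therefore not a problem in itself as long as
the hash values spread well.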
>>
>>>
>>> Additionally, all of my (written) documentation is so old that it doesn't
>>> even discuss multi-processor systems, so I don't know whether modern
>>> Smalltalks make any use of additional available processors.
>>
>> Squeak/Pharo don't support them from a single image. There are experimental
>> VMs designed for multi-processor systems (RoarVM, HydraVM), but AFAIK none
>> of them is ready for production use.
>>
>>>
>>> I'd really like some advice, and possibly some references. I know that
>>> Smalltalk has the reputation for being slow (yes, I've been reading about
>>> the recent speed-ups), but much of what I'd need to write in any other
>>> language seems like it may already be present in Smalltalk, so if it would
>>> work, I'd like to choose it. But I won't be able to test this until the
>>> application has been running for quite a while, so it would be very
>>> desirable to know ahead of time.
>>
>> It's hard to tell more without knowing more details about the project.
>>
>>
>> Levente
>>
>> P.S.: you might want to check out GemStone/S
>> http://gemtalksystems.com/index.php/products/gemstones/
>>
>>>
>>> --
>>> Charles Hixson
>>>
>>> _______________________________________________
>>> Beginners mailing list
>>> Beginners at lists.squeakfoundation.org
>>> http://lists.squeakfoundation.org/mailman/listinfo/beginners
>>>
>
>
> --
> Charles Hixson
>