On 02.09.2014, at 17:03, Eliot Miranda <eliot.miranda@gmail.com> wrote:

Hi Bert,


On Mon, Sep 1, 2014 at 7:13 AM, Bert Freudenberg <bert@freudenbergs.de> wrote:
On 01.09.2014, at 15:56, Eliot Miranda <eliot.miranda@gmail.com> wrote:

>> Another thought is that given the abundance of memory these days, we might cache both sources and changes in main memory (which would also speed up full-text searches).
>
> Pharo is planning to eliminate them altogether, which is more coherent than caching them.

That's another discussion, but it might be a step towards that.

>  But IMO the solution is easy: maintain a *single* read-only copy of the sources and changes files in SourceFilesArray (or whatever the class is called; I'm on my phone) and read source through them instead of reopening the damn things all the time.

Except that Squeak files do not maintain an independent file position pointer, so reading from different positions in the same file is not thread-safe. That's why the file is opened again.

Can you explain more?  I don't understand.

I think we're in violent agreement ;) This was a comment on the image-side file handling, not about the VM. (Although we might need better file prims. Tim appears to have ideas)

 As I see it every file instance has its own FILE structure so I don't understand how this can be so.  Files are derived from FilePlugin's primitiveFileOpen function.  That is implemented in terms of fileOpenNamesizewritesecure, which allocates a ByteArray to hold state (so no two Smalltalk files share state) and then uses sqFileOpen to open the underlying file and fill in the state.  The C library implementation of sqFileOpen in platforms/Cross/plugins/FilePlugin/sqFilePluginBasicPrims.c uses fopen et al, again creating unique state.

Precisely. Every Squeak file-open also opens an OS file via fopen(). The number of files a process can open is limited. That is why we run out of file handles unless the files get closed properly.
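
The point that each fopen() produces its own independent position can be checked with a minimal C sketch. This is not code from the VM or FilePlugin; the helper name independent_positions and the scratch file it writes are made up for illustration, and it uses only standard C stdio calls.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not from the VM sources): writes a small scratch
   file at `path`, opens it twice for reading, and reports whether reading
   through one FILE leaves the other's position untouched.
   Returns 1 if the positions are independent, 0 if not, -1 on I/O error. */
int independent_positions(const char *path)
{
    assert(path != NULL);

    FILE *w = fopen(path, "wb");
    if (w == NULL)
        return -1;
    fputs("0123456789", w);
    fclose(w);

    /* Two separate opens: each FILE carries its own position. */
    FILE *a = fopen(path, "rb");
    FILE *b = fopen(path, "rb");
    if (a == NULL || b == NULL) {
        if (a != NULL) fclose(a);
        if (b != NULL) fclose(b);
        remove(path);
        return -1;
    }

    char buf[4];
    size_t n = fread(buf, 1, sizeof buf, a);  /* advances only a */
    long pos_a = ftell(a);
    long pos_b = ftell(b);

    fclose(a);
    fclose(b);
    remove(path);
    return (n == 4 && pos_a == 4 && pos_b == 0) ? 1 : 0;
}
```

Each of the two opens also consumes one of the process's limited file handles, which is the cost being discussed here.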

 So I don't see how having a writable Smalltalk file and a separate read-only Smalltalk file on the sources and changes can result in anything other than two separate independent file pointers.

What I was getting at is that it would be much better to actually open the file only once (or twice, once for writing and once for read-only) and then maintain a file pointer in the image independently of the OS's file position. That is how we could share a single read-only file for many readers. The problem is that the OS file also has a file position, and the file positions we maintain in the image would easily get out of sync with that.
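
A rough C sketch of that idea, under assumptions: the Reader struct and the functions reader_read and shared_handle_demo are hypothetical names, not existing Squeak or VM code. Each reader keeps its own position and seeks the one shared FILE to it before every read, so the handle's own position never matters between reads.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-reader state: many readers, one shared open file. */
typedef struct {
    FILE *shared;   /* the single read-only handle */
    long position;  /* this reader's own position, kept outside the OS */
} Reader;

/* Seek the shared handle to this reader's position, read, then advance
   only this reader's position. Returns the number of bytes read. */
size_t reader_read(Reader *r, char *buf, size_t count)
{
    assert(r != NULL && r->shared != NULL);
    if (fseek(r->shared, r->position, SEEK_SET) != 0)
        return 0;
    size_t n = fread(buf, 1, count, r->shared);
    r->position += (long)n;
    return n;
}

/* Demo: two readers interleave reads on one handle without disturbing
   each other. Returns 1 on success, 0 on mismatch, -1 on I/O error. */
int shared_handle_demo(const char *path)
{
    FILE *w = fopen(path, "wb");
    if (w == NULL)
        return -1;
    fputs("abcdefghij", w);
    fclose(w);

    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        remove(path);
        return -1;
    }

    Reader r1 = { f, 0 };
    Reader r2 = { f, 5 };
    char a[2], b[2], c[2];
    size_t na = reader_read(&r1, a, 2);  /* r1 reads from offset 0 */
    size_t nb = reader_read(&r2, b, 2);  /* r2 reads from offset 5 */
    size_t nc = reader_read(&r1, c, 2);  /* r1 continues at offset 2 */

    int ok = na == 2 && nb == 2 && nc == 2
          && memcmp(a, "ab", 2) == 0
          && memcmp(b, "fg", 2) == 0
          && memcmp(c, "cd", 2) == 0
          && r1.position == 4 && r2.position == 7;

    fclose(f);
    remove(path);
    return ok ? 1 : 0;
}
```

The seek and the read are two separate steps, so with concurrent readers the pair would still need a lock around it. On POSIX systems, pread() takes the offset as an argument and avoids that, but it works on file descriptors rather than FILE streams.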

There *is* an issue with the structure of file access in the debugger.  If one were to step through execution of accessing source from e.g. the read-only sources file then the very act of fetching sources for the methods being displayed would confuse the state in the file one was observing through the debugger.  But that's hardly a new situation (one can get into a similar situation with the current setup), and the debugger could ease the situation by wrapping source access with something that reset the file's buffer etc.
 

> Then the file's own buffers will provide some caching.  Annoying that I wrote this code in 2008 for newspeak but we still rely on the mad "run the GC to finalize files when open fails" approach.

Well, you writing this for newspeak does not immediately benefit Squeak. But if you point us to the code maybe someone can port it to Squeak?

I already did a year or two ago and it got shot down on the debugger grounds I reiterated above.

Well, maybe it's time to revisit, then.

- Bert -