[squeak-dev] CurrentReadOnlySourceFiles (was: Re: Question about inlining | How to access named temps in FullBlockClosure?)

Wed Apr 1 18:39:15 UTC 2020

Hi Levente,

On Tue, Mar 31, 2020 at 9:25 PM Levente Uzonyi <leves at caesar.elte.hu> wrote:

> Hi Eliot,
>
> On Tue, 31 Mar 2020, Eliot Miranda wrote:
>
> > Hi Levente,
> >
> > On Tue, Mar 31, 2020 at 7:38 PM Levente Uzonyi <leves at caesar.elte.hu>
> wrote:
> >       Hi Eliot,
> >
> >       On Tue, 31 Mar 2020, Eliot Miranda wrote:
> >
> >       > Hi Levente,
> >       >
> >       > On Tue, Mar 31, 2020 at 3:33 PM Levente Uzonyi <
> leves at caesar.elte.hu> wrote:
> >       >       Hi Eliot,
> >       >
> >       >       On Mon, 30 Mar 2020, Eliot Miranda wrote:
> >       >
> >       >       > Hi Levente,
> >       >       >
> >       >       >> On Mar 30, 2020, at 2:21 PM, Levente Uzonyi <
> leves at caesar.elte.hu> wrote:
> >       >       >>
> >       >       >> Hi Eliot,
> >       >       >>
> >       >       >>> On Mon, 30 Mar 2020, Eliot Miranda wrote:
> >       >       >>>
> >       >       >>> Well, that's not what I meant by a search.  However,
> as Levente pointed out, textual searches should be surrounded with
> CurrentReadOnlySourceFiles cacheDuring:.  I think this is an awful
> implementation
> >       and would
> >       >       implement it
> >       >       >>> very differently but that's the work-around we have in
> place now,
> >       >       >>
> >       >       >> How would you implement it?
> >       >       >>
> >       >       >> <history>
> >       >       >> When I introduced CurrentReadOnlySourceFiles, I wanted
> to solve the issue of concurrent access to the source files.
> >       >       >> I had the following options:
> >       >       >> 1) controlled access to a shared resource (a single
> read-only copy of the SourceFiles array) with e.g. a Monitor
> >       >       >> 2) same as 1) but with multiple copies pooled
> >       >       >> 3) exceptions to define the scope and lifetime of the
> resources (the read-only copies) within a process
> >       >       >>
> >       >       >> I chose the third option because it was possible to
> introduce it without exploring and rewriting existing users: you could
> leave all code as-is
> >       >       >> and sprinkle CurrentReadOnlySourceFiles cacheDuring: [
> ... ] around code that needed better performance.
> >       >       >> It's obviously not a perfect solution, but I still
> think it was the best available at the time.
> >       >       >>
> >       >       >> Later ProcessLocalVariables were added to Trunk. Which
> could be used to solve the concurrent access issue by using process-local
> copies of the source files. The only challenge is to release them after
> >       they are
> >       >       not needed any more. Perhaps a high priority process could
> do that after a few minutes of inactivity. Or we could just let them linger
> and see if they cause any problems.
> >       >       >> </history>
> >       >       >
> >       >       > I think the key issue (& this from a discussion here
> with Bert) is access to source in the debugger while one is debugging
> file access.  As the debugger asks for source so the file pointer is moved
> and
> >       hence
> >       >       screws up the access one is trying to debug.
> >       >
> >       >       I don't think that's the only issue. Have a look at the
> senders of
> >       >       #readOnlyCopy. Many of them were added 10+ years ago, well
> before
> >       >       CurrentReadOnlySourceFiles was introduced. Most of those
> could use
> >       >       CurrentReadOnlySourceFiles too but are unrelated to the
> debugger.
> >       >
> >       >
> >       > Yes, but IIRC that issue was to separate the writable file from
> the read-only file.  I remember dealing with this when working on Newspeak
> in 2007/2008. So SourceFiles can easily maintain a writable file and a
> >       read-only copy
> >       > of the file for both sources and changes and do writes through
> the writable one.
> >       >
> >       >
> >       >       >
> >       >       > So I would provide something like
> >       >       >   SourceFiles withSubstituteCopiesWhile: aBlock
> >       >       > which would install either copies of the files or
> read-only copies of the files for the duration of the block, and have the
> debugger use the method around its access to method source.
> >       >       >
> >       >       > The IDE is essentially single threaded as far as code
> modification goes, even if this isn’t enforced. There is no serialization
> on adding/removing methods and concurrent access can corrupt method
> >       dictionaries,
> >       >       and that limitation is fine in practice.  So having the
> same restriction on source file access is fine too (and in fact I think the
> restriction already applies; if one were to fork compiles then source
> >       update to
> >       >       the changes file could get corrupted too).
> >       >       >
> >       >       > So I think not using read-only copies to read source,
> and having the debugger use copies for its access would be a good
> lightweight solution.
> >       >
> >       >       I agree with what you wrote about method changes, but
> reading the sources
> >       >       concurrently is still a possibility, especially when
> multiple UI processes
> >       >       can exist at the same time (e.g. that's what opening a
> debugger does,
> >       >       right?).
> >       >
> >       >
> >       > My assertion is that the IDE is essentially single-threaded and
> this doesn't have to be supported.  In any case, concurrent access will
> work if processes of the same priority level are cooperating.  But I
> >       just answered the
> >       > debugger issue.  I'm assuming that the debugger guards all its
> source access by substituting a different file.  So it, and only it,
> accesses the sources files through copies, and it, and only it pays the cost
> >       for substituting
> >       > the copies.  Normal queries can use a single read-only copy.
> That gives us the functionality of cacheDuring: without having to invoke it.
> >
> >       The IDE is single-threaded but source files may be read outside the
> >       context of the IDE.
> >
> >
> > Can you give me a for instance.  I simply don't believe you.  And even
> if it's true I don't see that it has to be supported.  Please don't be
> vague.  This is important.
>
> For example, Seaside has a web-based code browser. The webserver, no
> matter which one is used by Seaside, will read the code from a process
> different than the UI process.
>

Yes, but there's still no implication that source access should be
thread-safe.  The Seaside access to the IDE is still happening in the
context of a cooperatively threaded Smalltalk, and there are places in
Seaside where access to the IDE could be serialized without relying on
support for thread-safe source access when there is no thread-safe access
to adding/removing methods.  So for the Seaside browser to function
properly synchronization needs to be added to the general interface between
Seaside and the IDE, not just source access.  For example, if we had a
Seaside Squeak IDE server that allowed sharing between multiple programmers
I suggest that the right way to serialize access is to provide some kind of
synchronized queue between Seaside and the IDE, not to try and make the IDE
thread-safe.  Updating things like class definitions, which potentially
implies recompiling all methods in a class hierarchy, requires that no
other modifications to the class hierarchy are occurring while a class and
its subclasses are being redefined.
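
As a rough sketch of such a queue (not code from Seaside or from trunk;
the names requests, worker, done and answer are made up for illustration),
a single process could drain a SharedQueue of request blocks, so that
requests coming from other processes are evaluated one at a time:

      | requests worker done answer |
      requests := SharedQueue new.
      "The only process that touches the IDE on behalf of clients."
      worker := [[requests next value] repeat]
            forkAt: Processor userSchedulingPriority.
      "A client process enqueues its request and waits for the answer."
      done := Semaphore new.
      requests nextPut: [
            answer := (Object >> #at:) timeStamp.
            done signal].
      done wait.
      worker terminate.
      answer

Because every request goes through the one worker, reads and any
modifications are ordered by the queue rather than by locks inside the
source file code.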

> >       > So let me reiterate.
> >       >
> >       > SourceFiles is modified to have a single writable version of the
> changes file and a single read-only version of sources and changes files.
> Source code is read through the readable copy and new source written
> >       through the
> >       > writable copy.  Whenever the debugger accesses source it does so
> through a method that first saves the files, substitutes copies in
> SourceFiles, evaluates its block (that will access source through the
> copies),
> >       and then
> >       > ensures that the original files are restored.  There can be
> error checking for writing to the changes file in the debugger while writes
> are in progress to the original writable changes file, although I'm not
> >       sure this is
> >       > necessary; folks debugging source file access usually know what
> they're doing.
> >       >
> >       > The result is that
> >       > - normal source reading does not require creating a read-only
> copy; it already exists.
> >
> >       Do you mean that the existence of #readOnlyCopy is satisfactory?
> >
> >
> > Yes.
> >
> >       Creating a copy for every single file access is painfully slow.
> >
> >
> > Exactly.
> >
> >       CurrentReadOnlySourceFiles only exists to remedy that by reusing
> the same
> >       read-only copy.
> >
> >
> > I feel like you're not understanding my proposal.  Apologies if I'm
> presuming.  With my proposal the only time a new copy is created is when
> the debugger wants to access the source of a method.  That happens on the
> order of
> > seconds, not microseconds as happens when scanning for source.
>
> You say that each tool should use the same shared file streams. If so,
> that implies that all sends of #readOnlyCopy have unnecessarily been added
> over the years except for those which are related to the debugger.
>

That's my opinion, yes.

> And if I were to remove them along with CurrentReadOnlySourceFiles,
> everything would stay normal, right?
>

I think so.  Provided the debugger accesses source carefully, everything
should be OK.  It's certainly worth an experiment, right?

> >       > - the debugger does not interfere with source access because it
> is careful to use copies and leave the originals undisturbed
> >
> >       That's exactly what I tried to imply by stating there being no
> problem
> >       with the debugger before the mass use of read-only copies were
> introduced.
> >
> >
> > Can you not see that in any scheme there is the potential for chaos if
> the debugger is accessing the source as one steps through code of methods
> that themselves are in the process of accessing source?  And so it is key
> that
> > the debugger *not* perturb the system when it itself accesses source?
> >
> > I'm confused.  We seem to be talking past each other.  I feel like
> you're blocking a reasonable proposal but I don't really understand what
> your objections are.  I apologize.  I'm not trying to be confrontational,
> but I do
> > think my proposal is important and has merit and I feel frustrated by
> you because I can't quite understand why you're against it.  If you can
> identify a serious flaw I'll happily abandon it.  But I need to understand
> the flaw
> > first.
>
> I think that we have different ideas about what the problem is:
>
> 1) You say that the debugger is a key source of problems right now.
> I say that I'm not aware of issues with the debugger. I know what the
> potential problem could be, but I don't know whether it exists or not
> right now because I have not seen any issues with the debugger lately.
> If you have a reproducible case, please share it here.
>

No, I'm not saying that the debugger is a source of problems.  I'm saying
that if one wants to debug source access then w.r.t. source access the
debugger must not interfere.  Since the debugger necessarily accesses
source as it is used, it must not perturb source access while it is being
used to debug source access.

Let me be clear.  Let's say we want to step through

      (Object >> #at:) timeStamp

As we stepped through, the debugger would access several methods in
CompiledMethod and FileStream.  If we only had one read-only copy for the
source file then the potential exists for the source pointer of the
read-only copy to be changed after it has been set to point at the chunk
for Object>>#at:, and hence get the wrong answer, maybe answering the
timeStamp for CompiledMethod>>preamble or some such.  So every time the
debugger accesses source it must be careful not to disturb the single
read-only copy of the source file.  It can easily do this by using a
private copy of the source file.
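
A minimal sketch of that, assuming SourceFiles is the usual two-element
Array of file streams (the temporaries originals and copies are only
illustrative, and this is not what the debugger does today):

      | originals copies |
      originals := SourceFiles copy.
      "Private read-only streams, each with its own file position."
      copies := originals collect: [:file |
            file ifNotNil: [file readOnlyCopy]].
      [SourceFiles replaceFrom: 1 to: copies size
            with: copies startingAt: 1.
       (Object >> #at:) timeStamp]
            ensure: [
                  "Put the shared streams back, then drop the copies."
                  SourceFiles replaceFrom: 1 to: originals size
                        with: originals startingAt: 1.
                  copies do: [:file | file ifNotNil: [file close]]]

The shared streams are never repositioned inside the block, so whatever
was reading through them before the debugger stepped in finds its file
position unchanged afterwards.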

> 2) You say that the source files can safely be treated as
> only-to-be-used-by-the-IDE. I say that's not the case. IMO, I should
> be able to fork a process scanning the sources and do something else while
> it's processing the code. I also think that external tools like Seaside
> should be able to read the source files without messing up the image.
>

Have you ever done this in practice?

>
> Did I understand you correctly based on these two points above?
>

First, no.  Second, yes.  Is my clarification adequate?

> Levente
>
> >
> >
> >
> >       Levente
> >
> >       > - CurrentReadOnlySourceFiles and cacheDuring: can be discarded
> >       >
> >       >
> >       >
> >       >
> >       >
> >       >       Levente
> >       >
> >       >       >
> >       >       >> Levente
> >       >       >
> >       >       > Eliot
> >       >
> >       >
> >       >
> >       > --
> >       > _,,,^..^,,,_
> >       > best, Eliot
> >       >
> >       >
> >
> >
> >
> > --
> > _,,,^..^,,,_
> > best, Eliot
> >
> >
>

-- 
_,,,^..^,,,_
best, Eliot