[squeak-dev] Server timeouts and 504 return codes

Chris Muller ma.chris.m at gmail.com
Mon Jan 28 03:04:58 UTC 2019


Whew!   :)

> >>>>> Yes, the SqueakMap server image is one part of the dynamic, but I
> >>>>> think another is a bug in the trunk image.  I think the reason Tim is
> >>>>> not seeing 45 seconds before error is because the timeout setting of
> >>>>> the high-up client is not being passed all the way down to the
> >>>>> lowest-level layers -- e.g., from HTTPSocket --> WebClient -->
> >>>>> SocketStream --> Socket.  By the time it gets down to Socket which
> >>>>> does the actual work, it's operating on its own 30 second timeout.
> >>>>
> >>>> I would expect subsecond response times. 30 seconds is just unacceptably
> >>>> long.
> >>>
> >>> Well, it depends on if, for example, you're in the middle of
> >>> Antarctica with a slow internet connection in an office with a fast
> >>> connection.  A 30 second timeout is just the maximum amount of time
> >>> the client will wait for the entire process before presenting a
> >>> debugger, that's all it can do.
> >>
> >> We can be sure that Tim should get subsecond response times instead of
> >> timeouts after 30 seconds.
> >
> > Right, but timeout settings are a necessary tool sometimes, my point
> > was that we should fix client code in trunk to make timeouts work
> > properly.
> >
> > Incidentally, 99% of SqueakMap requests ARE subsecond -- just go to
> > map.squeak.org and click around and see.  For the remaining 1% that
> > aren't, the issue is known and we're working on a new server to fix
> > that.
>
> Great! That was my point: the image needs to be fixed.

But you're referring to the server image when you say "the image needs
to be fixed", which I've already conceded, whereas I'm referring to
the client image -- our trunk image -- which also needs the suspected
bug(s) in WebClient (et al.) fixed.
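To make the suspected bug concrete, here is a minimal sketch (in Python, for illustration only -- the class and parameter names mimic, but are not, Squeak's HTTPSocket/WebClient/SocketStream/Socket) of a timeout that is set at the top layer but never forwarded to the bottom one:

```python
DEFAULT_SOCKET_TIMEOUT = 30  # the low-level default the client falls back to


class LowLevelSocket:
    """Stands in for Socket: uses its own default if no timeout is passed."""
    def __init__(self, timeout=None):
        self.timeout = timeout if timeout is not None else DEFAULT_SOCKET_TIMEOUT


class BuggyHTTPClient:
    """Stands in for the layered client where the timeout gets dropped."""
    def __init__(self, timeout=45):
        self.timeout = timeout

    def open_socket(self):
        return LowLevelSocket()  # bug: self.timeout is never passed down


class FixedHTTPClient(BuggyHTTPClient):
    def open_socket(self):
        return LowLevelSocket(timeout=self.timeout)  # fix: forward it


print(BuggyHTTPClient(timeout=45).open_socket().timeout)  # 30, not 45
print(FixedHTTPClient(timeout=45).open_socket().timeout)  # 45
```

The point is only that a 45-second setting at the top is useless unless every layer hands it to the next one down.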

> >>>>> It is a fixed amount of time, I *think* still between 30 and 45
> >>>>> seconds, that it takes the SqueakMap server to save its model after an
> >
> > and so if in the meantime it can simply be made to wait 45s instead of
> > 30s, then current SqueakMap will only be that occasional delay at
> > worst, instead of the annoying debugger we currently get.
>
> I don't see why that would make a difference: the user would get a
> debugger anyway, but only 15 seconds later.

No!  :)  As I said:

> >>>>> It is a fixed amount of time, I *think* still between 30 and 45
> >>>>> seconds, that it takes the SqueakMap server to save its model

So they would get a response < 15s later, not a debugger.

The server needs the same amount of time to save every time it
happens -- it's very predictable -- and right now, to avoid a
debugger, the Squeak trunk image simply needs to be fixed to honor the
45s timeout instead of ignoring it and always defaulting to 30.

> > Are alan and andreas co-located?
>
> They are cloud servers in the same data center.
>
> >
> >> The file doesn't have to be read from the disk either.
> >
> > I assume you mean "read from disk" on alan?  What about after it's
> > cached so many mcz's in RAM that it's paging out to the swap file?
> > To me, wasting precious RAM (of any server) to cache old MCZ file
> > contents that no one will ever download (because they become old
> > very quickly) feels wasteful.  Dragster cars are wasteful too, but
> > yes, they are "faster"... on a dragstrip.  :)  I guess there'd have
> > to be some kind of application-specific smart management of the cache...
>
> Nginx's proxy_cache can handle that all automatically. Also, we don't need
> a large cache. A small, memory-only cache would do it.

How "small" could it be and still contain all the MCZ's you want to
use to update an an "old" image?
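For reference, here is a sketch of the kind of proxy_cache setup Levente describes. The paths, sizes, upstream port, and server name are my assumptions, not an actual config from alan; pointing the cache path at a tmpfs mount like /dev/shm approximates a "memory-only" cache, since nginx stores cached response bodies on disk:

```nginx
# Hypothetical nginx front-end for the Squeak server image.
proxy_cache_path /dev/shm/nginx-mcz levels=1:2 keys_zone=mcz:10m
                 max_size=256m inactive=60m;

server {
    listen 80;
    server_name source.squeak.org;          # assumption

    location / {
        proxy_pass http://localhost:8080;   # the Squeak image (assumed port)
        proxy_cache mcz;
        proxy_cache_valid 200 10m;          # cache good responses briefly
        add_header X-Cache-Status $upstream_cache_status;
    }
}
```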

> > Levente, what about the trunk directory listing, can it cache that?
>
> Sure.
>
> > That is the _#1 thing_ source.squeak.org is accessing and sending back
> > over, and over, and over again -- every time that MC progress box that
> > says, "Updating [repository name]".
>
> Right, unless you update an older image.

System resources should not be allocated to optimizing "build" and
"initialize" use-cases.  Those UC's are one-offs run by developers,
typically even in the background.

System resources should be optimized around actual **end-users
interacting with UI's**...

> >> If the client does have the mcz, then we save the complete file transfer.
> >>
> >>>
> >>> I don't know what the speed between alan <---> andreas is, but I doubt
> >>> it's much slower than client <---> alan in most cases, so the savings
> >>> would seem to be minimal..?
> >>
> >> The image wouldn't have to open a file, read its content from the disk and
> >> send that through a socket.
> >
> > By "the image" I assume you mean the SqueakSource server image.  But
> > opening the file takes very little time.  Original web-sites were
> > .html files, remember how fast those were?  Plus, filesystems "cache"
> > file contents into their own internal caches anyway...
>
> Each file uses one external semaphore, each socket uses three. If you use
> a default image, there can be no more than 256 external semaphores which
> is ridiculous for a server,

So, is that (256 / 4 = 64) concurrent requests for an MCZ before the
table is full?   Probably enough for our small community, but you also
said that's just a default we can increase?  Something I'd like to
know in case I need it for Magma too -- where can I find this setting?
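For what it's worth, I believe the limit is adjustable from within the image; if memory serves, the selectors below exist on SmalltalkImage, but please treat this as an assumption to verify in your own trunk image rather than a confirmed API:

```smalltalk
"Hedged sketch: inspect and raise the external semaphore table size."
Smalltalk maxExternalSemaphores.        "inspect the current limit"
Smalltalk maxExternalSemaphores: 8192.  "raise it for a busy server"
```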

> and it'll just grind to a halt when some load
> arrives. Every time the external semaphore table is full, a GC is
> triggered to try to clear it up via the finalization process.
> Reading a file into memory is slow, writing it to a socket is slow.
> (Compared to nginx which uses sendfile to let the kernel handle that).
> And Squeak can only use a single process to handle everything.

To me, it comes back to UX.  If we ever get enough load for that to be
an issue, it might be worth looking into.

> > Yes, it still has to return back through alan but I assume alan does
> > not wait for a "full download" received from andreas before it's
> > already piping back to the Squeak client.  If true, then it seems
> > like it only amounts to saving one hop, which would hardly be
> > noticeable over what we have now.
>
> The goal of caching is not about saving a hop, but to avoid handling files
> in Squeak.
>
> >
> >> Nginx does that thing magnitudes faster than
> >> Squeak.
> >
> > The UX would not be magnitudes faster though, right?
>
> Directly, by letting nginx serve the file, no, but the server image would
> be less likely to get stalled (return 5xx responses).

SqueakMap and SqueakSource.com are still on old code, with plans for
upgrading, but are you still getting 5xx's on source.squeak.org?

> But the caching scheme I described in this thread would make the UX a lot
> quicker too, because data would not have to be transferred when you
> already have it.

I assume you mean "data would not have to be transferred" from andreas
to alan... from within the same data center..!   :)

> >>>>>> That would also let us save bandwidth by not downloading files already
> >>>>>> sitting in the client's package cache.
> >>>>>
> >>>>> How so?  Isn't the package-cache checked before hitting the server at
> >>>>> all?  It certainly should be.
> >>>>
> >>>> No, it's not. Currently that's not possible, because different files can
> >>>> have the same name. And currently we have no way to tell them apart.
> >>>
> >>> No.  No two MCZ's may have the same name, certainly not within the
> >>> same repository, because MCRepository cannot support that.  So maybe
> >>
> >> Not at the same time, but it's possible, and it just happened recently
> >> with Chronology-ul.21.
> >> It is perfectly possible that a client has a version in its package cache
> >> with the same name as a different version on the server.
> >
> > But we don't want to restrict what's possible in our software design
> > because of that.  That situation is already a headache anyway.  Same
> > name theoretically can come only from the same person (if we ensure
> > unique initials) and so this is avoidable / fixable by resaving one of
> > them as a different name...
>
> It wasn't me who created the duplicate. If your suggestion had been in
> place, some images out there, including mine, would have been broken by
> the update process.

I don't think so, since I said it would open up the .mcz in
package-cache and verify the UUID.
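To illustrate the check I mean, here is a sketch (Python, purely for illustration) that opens the cached .mcz and compares its version UUID against the one the server advertises before reusing it. The mcz layout assumed here (a zip archive whose "version" member records id 'uuid') matches Monticello's format as I understand it, but verify against a real .mcz before relying on it:

```python
import re
import zipfile


def cached_version_uuid(mcz_path):
    """Return the UUID recorded in the mcz's 'version' member, or None."""
    with zipfile.ZipFile(mcz_path) as z:
        text = z.read('version').decode('utf-8', errors='replace')
    match = re.search(r"id\s+'([0-9a-fA-F-]+)'", text)
    return match.group(1) if match else None


def can_reuse_cached(mcz_path, server_uuid):
    """True only when a cached file exists and its UUID matches the server's;
    any missing, unreadable, or mismatched file falls through to a download."""
    try:
        return cached_version_uuid(mcz_path) == server_uuid
    except (FileNotFoundError, KeyError, zipfile.BadZipFile):
        return False
```

With a check like this, two different versions that happen to share a name (the Chronology-ul.21 situation) would never be confused: the mismatched UUID simply forces a fresh download.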

I guess I don't know what you mean -- I see only one Chronology-ul.21
in the ancestry currently anyway..

> >>> we need project subdirectories under package-cache to properly
> >>> simulate each cached Repository.  I had no idea we were neutering 90%
> >>> of the benefits of our package-cache because of this too, and just
> >>> sitting here, I can't help wonder whether this is why MCProxy doesn't
> >>> work properly either!
> >>>
> >>> The primary purpose of a cache is to *check it first* to speed up
> >>> access to something, right?  What you say about package-cache sounds
> >>
> >> I don't know. It wasn't me who designed it. :)
> >
> > I meant ANY "cache".
> >
> >   https://en.wikipedia.org/wiki/Cache_(computing)
>
> It still depends on the purpose of the cache. It's possible that
> package-cache is just a misnomer or it was just a plan to use it as a
> cache which hasn't happened yet.
>
> >
> > For Monticello, package-cache's other use-case is when an
> > authentication issue occurs when trying to save to a HTTP repository.
> > At that point the Version object with the new ancestry was already
> > constructed in memory, so rather than worry about trying to "undo" all
> > that, it was simpler and better to save it to a package-cache, persist
> > it safely so the client can simply move forward from there (get access
> > to the HTTP and copy it or whatever).
>
> The package-cache is also handy as a default repository and as an offline
> storage.

I'm sure you would agree it's better for client images to check their
local package-cache first before hitting nginx.


 - Chris

