[SqueakDBX] Fwd: Increasing the performances of a Seaside application

Tue Jul 5 21:09:09 UTC 2011

---------- Forwarded message ----------
From: Henrik Sperre Johansen <henrik.s.johansen at veloxit.no>
Date: Fri, Jun 3, 2011 at 8:35 AM
Subject: Re: Increasing the performances of a Seaside application
To: Mariano Martinez Peck <marianopeck at gmail.com>

On 01.06.2011 20:54, Mariano Martinez Peck wrote:

On Tue, May 31, 2011 at 11:30 PM, Henrik Sperre Johansen <
henrik.s.johansen at veloxit.no> wrote:

>   Thanks.  I was not clear. What we actually do is:
>>>
>>>   (code = OpenDBX resultTimeout) ifTrue: [ (Delay forMilliseconds:
>>> (aQuerySettings timeout asMiliseconds)) wait  ].
>>>
>>> Is that better?  Even if it lets just run processes of the same priority,
>>> this is good anyway because what we want is at least be able to process
>>> other queries. Probably, those other processes are being done from other
>>> Process.
>>>
>>
>>  It's a bit better. There's no starvation if the timeout is greater than
>> zero, but it's still a form of busy waiting, and it limits the number of
>> queries per second per connection to at most 1000 (actually 1000 / timeout).
>> To compare this with our native implementation - PostgresV3 - I measured 6k+
>> queries per second per connection and it's still not optimized for Cog
>> (#perform: is slow on Cog).
>>
>>
> Thanks Levente. Unfortunately I guess that's all we can do with a blocking
> FFI :(
>
> Not really :)
>

Thanks Henrik.

Before analyzing your suggestions, let me tell you something stupid we did
in DBX that I have just realized.  There are TWO different timeouts.

Yes :)

1) OpenDBX timeout: the one sent as a parameter to the OpenDBX function
http://www.linuxnetworks.de/doc/index.php/OpenDBX/C_API/odbx_result
which determines how long the OpenDBX C function waits for the result.

nextResultSet: aConnection querySettings: aQuerySettings onReturn: aBlock
    "Returns the next resultSet from the last resultSet. When there are no
    more resultSets, the block is evaluated."
    | handle err handleArray |
    handleArray := WordArray with: 0.

    err := OpenDBX current
                apiQueryResult: aConnection handle
                handle: handleArray
                timeout: aQuerySettings timeoutAsDBXTimeSpec
                chunk: aQuerySettings pageSize.
......

2) SqueakDBX timeout: the time we wait on the IMAGE side once we get a
timeout from OpenDBX.
This is what I showed you:

    (code = OpenDBX resultTimeout) ifTrue: [
        (Delay forMilliseconds: aQuerySettings timeout asMiliseconds) wait ].

So... as you can see, we are using the same value for both things. This is
not necessary and maybe stupid.

Yes :)

The default timeout now is 10 milliseconds:

>> defaultTimeout
    "10 miliseconds"
    ^DBXQueryTimeout seconds: 0 microseconds: 10.

So... if I follow your a), you smartly recommend using 100ms. In this case
you are talking about the OpenDBX timeout, and only for the first call.
This way most queries will be caught on the first try, and even if they are
not, we return fast. Then, for later calls of the same query (only if there
was a timeout), we use a really short timeout, for example 1ms. The idea is
to wait as much as possible on the image side (Delay) rather than in C.

This isn't milliseconds, this is microseconds (when you use it for the C
call) :)
I'm not sure about the DBX implementation, whether it returns immediately
when the result is ready or blocks for the entire period.
That should be easily testable by setting a really long timeout, like
1 million (1 second), and repeating a query you know only takes a couple of
milliseconds to complete, say 10 times.
If the test takes 10 seconds wall time, you know it waits the entire period.
If not, then it's safe to set the timeout to the maximum amount of time you
feel it's acceptable to block the image. (That is for the first call;
subsequent calls should still block for the minimum amount in C and rather
use the Delay.)
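
A rough workspace sketch of that check. The quickQuery block is a
hypothetical stand-in, not SqueakDBX API: replace its body with whatever
executes your fast query with the C-side timeout set to 1000000.

```smalltalk
"Sketch only. quickQuery is a placeholder block; put the real query call
 (with the OpenDBX timeout set to 1000000 microseconds) inside it."
| quickQuery elapsedMs |
quickQuery := [ "execute the known-fast query here" ].

"Run the query 10 times and measure total wall time in milliseconds."
elapsedMs := Time millisecondsToRun: [ 10 timesRepeat: [ quickQuery value ] ].
Transcript show: 'Elapsed: ', elapsedMs printString, ' ms'; cr.

"About 10 seconds means every call blocked for the full 1 second timeout."
elapsedMs >= 10000
    ifTrue: [ Transcript show: 'The C call blocks for the whole timeout'; cr ]
    ifFalse: [ Transcript show: 'The C call returns as soon as the result is ready'; cr ].
```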

At the same time, with b) you recommend using an incremental SqueakDBX
timeout (the Delay). So we can start with 1 ms and then grow: 2 4 8 16 32 64
128 256 512 up to 1024. And if we reach 1024, do we continue using that
value? But isn't 1ms too small? This value will only be used if a timeout
happened (the result took more than 100ms), so it would be quite weird for
the result to be ready just 1ms later. No?

So... did I understand correctly?

The timeout in C call is in microseconds, thus 100 means 1/10th of a
millisecond, not 100 milliseconds.
Thus starting at 1ms delay makes more sense. Other than that, you understood
perfectly.
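
A minimal sketch of the growing Delay from b), runnable in a workspace. The
resultReady block is a made-up stand-in for the short C-side poll (here it
simply reports success on the 5th try); the 1 ms start and 1024 ms cap are
the values discussed above, not anything taken from the DBX code.

```smalltalk
| delayMs maxDelayMs attempts resultReady |
delayMs := 1.          "first image-side wait, in milliseconds"
maxDelayMs := 1024.    "never wait longer than this between polls"
attempts := 0.

"Hypothetical poll: stands in for the short, near non-blocking C call.
 It answers true on the 5th attempt so the loop terminates."
resultReady := [ attempts := attempts + 1. attempts >= 5 ].

[ resultReady value ] whileFalse: [
    (Delay forMilliseconds: delayMs) wait.       "yield to other processes"
    delayMs := (delayMs * 2) min: maxDelayMs ].  "1 2 4 8 ... capped at 1024"
```

With this stand-in the loop waits 1, 2, 4 and 8 ms before the 5th poll
succeeds. Once the delay reaches 1024 ms it stays there, so a slow query is
polled about once per second.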

>
> You could
> a) Use a default timeout for the first call which means it actually
> completes more queries on the first try yet still returns fast, say 100ms
> rather than 1ms.
> (For later calls just to check if it is possibly finished, you probably
> want to block for as short a time as possible though)
> b) Use an exponentially growing value for the Delay rather than a constant
> one, starting at 1ms and max some other value
> 1 2 4 8 16 32 64 128 256 512 1024 for instance, polling once per second
> shouldn't hurt other processes at all, yet give ok responsiveness for
> queries > 1 seconds.
>
> This way, you (in the cases where potential is 9k queries /sec) will have a
> hard cap at 10k queries (due to the 100ms block time), and hurt those above
> that as little as possible using Delays. What you don't have though, is a
> cap of around 1k, due to calls never completing in 1ms, and having to wait
> (at least, I don't know the default value of aQuerySettings timeout) 1ms for
> each due to minimum delay wait time resolution.
>
>
>
>
> Btw, unless the microseconds and seconds are switched, this could be
> simpler (as well as misspelled :) ):
> DBXQueryTimeout >> asMiliseconds
>     ^ (self seconds * 1000) + (((self microseconds / 1000) asFloat)
> integerPart asInteger )
>     ^ (self seconds * 1000) + (self microseconds // 1000)
>
>
Thanks :)

Of course, this rounds down. If you want it rounded UP to the closest
millisecond, you can do:
  ^ (self seconds * 1000) + (999 + self microseconds // 1000)
Or if you want to round to nearest:
  ^ (self seconds * 1000) + (500 + self microseconds // 1000)
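
To see the three variants side by side, here is a small workspace sketch
using plain integers in place of "self microseconds" (the sample values are
just picked for illustration). Binary operators evaluate left to right in
Smalltalk, so 999 + us // 1000 means (999 + us) // 1000.

```smalltalk
"Compare the three roundings for a few microsecond values."
#(10 1499 1500) do: [ :us |
    Transcript
        show: us printString, ' us -> down: ', (us // 1000) printString;
        show: ', up: ', (999 + us // 1000) printString;
        show: ', nearest: ', (500 + us // 1000) printString;
        cr ].
```

This prints 0/1/0 for 10, 1/2/1 for 1499 and 1/2/2 for 1500. Note that with
the rounded-down version the current default of 10 microseconds becomes a
0 ms Delay.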

>  DBXTimeSpec also has a field called nseconds which contains microseconds,
> rather confusing :)
>

Yes, I know. The problem was that OpenDBX/C uses that structure, but from
the image side it was nicer to use microseconds hehehehe

The C struct contains microseconds, and that's what it's being used as on
the image side as well :)
I.e. it should be named microSeconds or something instead (mseconds is too
ambiguous).

TL;DR: You understood perfectly; using the same timeout both for blocking in
C and for waiting in the image between C calls is not the best idea.
Using a longer initial C timeout ensures you get -better- than Delay
resolution response times in the cases where that is possible.

Cheers,
Henry

PS. Is there a lock somewhere?
What happens if you do two queries, how do you handle waiting for both at
the same time, and getting the correct result set to the correct sender?

-- 
Mariano
http://marianopeck.wordpress.com