Socket Problems

Lukas Renggli renggli at hotmail.com
Fri Sep 12 19:11:10 UTC 2003


Hi Andreas, Ian,

> My best guess (from afar) is that the connection is closed 
> immediately after the reply has been sent and that something 
> is going wrong in the handling of this situation.

Right, that seems to be the problem. 

The result of the posted code on the production server (Debian Linux, 
Squeak Unix VM 3.6-beta6) is the following (with -notimer already turned 
on):

	a) client statusString.    --> 'connected'
	b) client getData.         --> 'Response'
	c) client statusString.    --> 'connected'
	d) client getData.         --> 'getData timeout' exception is raised
	e) client statusString.    --> 'otherEndClosedButNotThisEnd'

Is this OK? I've run it several times, and the result is always the 
same. Should we try it in a loop? Should we write a test that runs it 
in parallel with different sockets? 

> Yes they are. Unless you attempt to use the same socket 
> simultaneously from different threads there is no problem. 
> If you do the latter then you have the "common" issues of
> synchronizing access to a shared resource from different 
> threads.

OK, let's check that on the production machine: 

	PGConnection allInstances collect: [ :each | each instVarAt: 4 ]
	--> #(a Socket[connected] a Socket[connected] a Socket[connected] ...

	PGConnection allInstances collect: [ :each | (each instVarAt: 4) oopString ]
	--> #('1810' '2601' '1928' '1596' '2804' '2495' '3428' '194' '3276' ...

All sockets used by the PGConnection instances seem to be different, 
right? So this should not be the problem.
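
A quicker way to check the same thing than comparing the oop strings by 
eye, as a sketch: put the sockets into an IdentitySet and compare the 
sizes. This assumes, as above, that instance variable 4 of PGConnection 
holds the socket.

	| sockets |
	"assumption: instVarAt: 4 is the socket of each PGConnection"
	sockets _ PGConnection allInstances collect: [ :each | each instVarAt: 4 ].
	(IdentitySet withAll: sockets) size = sockets size

This answers true when no two connections hold the same socket object. 
Connections whose socket is nil would all count as one entry, so a 
false result would need a closer look.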

> First of all, run the above example and see what this gets
> you. It may be that the VM is merely reporting that the 
> "otherEndClosedButNotThisEnd" and that your code deduces from
> this that there is no readable data on the socket (e.g., 
> #isConnected will return false which in turn may screw 
> #getData and friends). In this situation you may still be able
> to read data from the socket (can't say without having tried
> it; but see what #primSocket:receiveDataInto:startingAt:count:
> reports) and simply work around this problem.

So that would mean rewriting the method that reads the data from the 
Postgres server? For clarity I removed the log statements; it basically 
looks like this:

	PGConnection>>next
		readIndex >= lastReadIndex ifTrue: [
			(socket waitForDataUntil: Socket standardDeadline)
				ifFalse: [self error: 'timed out getting data'].
			[(lastReadIndex _ socket receiveDataInto: readBuffer) = 0
				ifTrue: [(Delay forMilliseconds: 100) wait].
			lastReadIndex = 0] whileTrue.
			readIndex _ 0].
		readIndex _ readIndex + 1.
		^ readBuffer at: readIndex
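
One way I could read the suggestion, purely as a guess and untested: 
try to receive data first, and only look at the connection status or 
wait when nothing came back. Whether #receiveDataInto: still delivers 
buffered data once the socket reports 'otherEndClosedButNotThisEnd' is 
exactly the open question, so this is a sketch under that assumption. 
The error text 'connection closed by server' is made up for 
illustration.

	PGConnection>>next
		readIndex >= lastReadIndex ifTrue: [
			"try to read before consulting the socket status"
			[(lastReadIndex _ socket receiveDataInto: readBuffer) = 0] whileTrue: [
				"nothing received: give up if the connection is gone"
				socket isConnected
					ifFalse: [self error: 'connection closed by server'].
				"still connected: wait for data or time out, then retry"
				(socket waitForDataUntil: Socket standardDeadline)
					ifFalse: [self error: 'timed out getting data']].
			readIndex _ 0].
		readIndex _ readIndex + 1.
		^ readBuffer at: readIndex

Compared to the version above, this never waits when data is already 
buffered, and it replaces the 100 ms polling delay with the deadline 
wait.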

I do not understand what you actually suggest to change.

Thanks a lot for your help,
Lukas

-- 
Lukas Renggli
http://renggli.freezope.org


