Bug 1691 - Timeout connecting to bridge registry
Status: RESOLVED FIXED
Product: Virtual Venues Client Software
Component: Client UI
Version: 3.1 beta 1
Platform: PC Linux
Importance: P2 normal
Reported: 2007-08-08 06:27 by
Modified: 2008-07-16 15:30 (History)


Attachments
VenueClient Logs (719.79 KB, application/x-gzip-compressed)
2007-08-08 06:29, Antonio Bandeira

Description From 2007-08-08 06:27:05
This is about the registry problem that I had yesterday (07-08-2007) at about 
13h30 (Lisbon time - GMT+1).

Using FC7 and AG3.1b1.

I was trying to update the bridge list and, after hitting ‘Find Additional 
Bridges’ and/or ‘Purge Bridge Cache’ under Preferences, the list went empty.
I tried a few times to repopulate the list, and also restarted the client 
and even my box, but no luck.
When I came back from lunch, things were back to normal. At VenueClient startup 
the splash screen said something like ‘updating bridge information’ and, when 
it was done, the bridge list was back.

The VenueClient logs can be found here: 
http://streamer.fe.up.pt/Members/bandeira/VenueClient-log.tgz/view
------- Comment #1 From 2007-08-08 06:29:04 -------
Created an attachment (id=208) [details]
VenueClient Logs
------- Comment #2 From 2007-08-10 14:36:39 -------
The error from the log is shown below.  We'll consider adding some logging to the
registry peer (server) to understand what is happening with these connections.
The only concern with doing this now is that we need a lighter-weight mechanism
for importing logging support; it currently depends on application initialization,
which brings in much code unrelated to registries and logging.

08/07/2007 02:11:26 PM -1208940864 RegistryClient     RegistryClient.py:66 ERROR Failed to connect to registry vv3.mcs.anl.gov:8030
Traceback (most recent call last):
  File "/usr/lib/python2.5/site-packages/AccessGrid3/AccessGrid/Registry/RegistryClient.py", line 61, in _connectToRegistry
    if self.PingRegistryPeer(tmpServerProxy) > -1:
  File "/usr/lib/python2.5/site-packages/AccessGrid3/AccessGrid/Registry/RegistryClient.py", line 78, in PingRegistryPeer
    startTime = serverProxy.Ping(time.time())
  File "/usr/lib/python2.5/xmlrpclib.py", line 1147, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.5/xmlrpclib.py", line 1437, in __request
    verbose=self.__verbose
  File "/usr/lib/python2.5/xmlrpclib.py", line 1185, in request
    errcode, errmsg, headers = h.getreply()
  File "/usr/lib/python2.5/httplib.py", line 1195, in getreply
    response = self._conn.getresponse()
  File "/usr/lib/python2.5/httplib.py", line 924, in getresponse
    response.begin()
  File "/usr/lib/python2.5/httplib.py", line 385, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.5/httplib.py", line 343, in _read_status
    line = self.fp.readline()
  File "/usr/lib/python2.5/socket.py", line 330, in readline
    data = recv(1)
timeout: timed out
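
For anyone trying to reproduce this outside the client: the failing call in the traceback above is an XML-RPC Ping that times out while waiting for the HTTP status line. Below is a minimal sketch of such a ping with an explicit socket timeout, written for Python 3 (xmlrpc.client) rather than the Python 2.5 xmlrpclib shown in the log. The names TimeoutTransport and ping_registry are hypothetical and are not taken from RegistryClient.py; only the Ping(time.time()) call and the "-1 on failure" convention come from the traceback.

```python
import http.client
import socket
import time
import xmlrpc.client


class TimeoutTransport(xmlrpc.client.Transport):
    """XML-RPC transport that applies a socket timeout to its connection.

    Hypothetical helper for illustration; not part of AccessGrid.
    """

    def __init__(self, timeout):
        super().__init__()
        self._timeout = timeout

    def make_connection(self, host):
        conn = super().make_connection(host)
        # HTTPConnection reads this attribute when it opens the socket.
        conn.timeout = self._timeout
        return conn


def ping_registry(url, timeout=5.0):
    """Return the round-trip time in seconds, or -1 if the ping fails."""
    transport = TimeoutTransport(timeout)
    proxy = xmlrpc.client.ServerProxy(url, transport=transport)
    start = time.time()
    try:
        proxy.Ping(start)  # same remote method name as in the traceback
        return time.time() - start
    except (OSError, http.client.HTTPException, xmlrpc.client.Error):
        # socket.timeout is a subclass of OSError in Python 3.
        return -1
    finally:
        # Release the underlying socket whether or not the call succeeded.
        transport.close()
```

Called against a peer that accepts the connection but never replies, ping_registry returns -1 once the timeout expires instead of blocking indefinitely.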
------- Comment #3 From 2007-08-16 14:48:55 -------
More information:

Exceptions of this type appear numerous times in the output of RegistryPeer.py:

Traceback (most recent call last):
  File "/usr/lib/python2.3/SocketServer.py", line 463, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/lib/python2.3/SocketServer.py", line 254, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/lib/python2.3/SocketServer.py", line 521, in __init__
    self.handle()
  File "/usr/lib/python2.3/BaseHTTPServer.py", line 324, in handle
    self.handle_one_request()
  File "/usr/lib/python2.3/BaseHTTPServer.py", line 307, in handle_one_request
    self.raw_requestline = self.rfile.readline()
  File "/usr/lib/python2.3/socket.py", line 338, in readline
    data = self._sock.recv(self._rbufsize)
timeout: timed out
------- Comment #4 From 2008-04-08 14:13:57 -------
Updated title only.
------- Comment #5 From 2008-07-16 15:30:27 -------
This problem was caused by the eventual exhaustion of available file handles.  When
a connection timed out, we weren't freeing the associated file handle, so leaked
handles accumulated until none were left.  This has been resolved.
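
To illustrate the kind of leak described above (this is not the actual patch, which is not shown in this report), here is a minimal Python sketch of a connection handler that closes its socket on every exit path, including a timeout. The function name handle_connection and the 4096-byte read size are assumptions made for the example.

```python
import socket


def handle_connection(conn, timeout=5.0):
    """Read one chunk from an accepted socket, then always close it.

    Hypothetical handler for illustration; returns the bytes read,
    or None if the peer sent nothing before the timeout expired.
    """
    try:
        conn.settimeout(timeout)
        return conn.recv(4096)
    except socket.timeout:
        # The peer stayed silent; report that without leaking the socket.
        return None
    finally:
        # Runs on success, timeout, and any other error, so the file
        # handle is released on every path.
        conn.close()
```

Without the finally clause, each timed-out connection would keep its descriptor open, and a long-running server would eventually hit the per-process limit on open files.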