<?xml version="1.0" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "http://bugzilla.mcs.anl.gov/accessgrid/bugzilla.dtd">

<bugzilla version="3.2.3"
          urlbase="http://bugzilla.mcs.anl.gov/accessgrid/"
          maintainer="webmaster@mcs.anl.gov"
>

    <bug>
          <bug_id>77</bug_id>
          
          <creation_ts>2003-02-19 15:48</creation_ts>
          <short_desc>Venue Server caused VenueClient and Nodemanager to hang</short_desc>
          <delta_ts>2003-08-14 17:30:12</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>Virtual Venue Server Software</product>
          <component>Virtual Venue Server</component>
          <version>2.0-beta1</version>
          <rep_platform>PC</rep_platform>
          <op_sys>Windows XP</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          
          
          <priority>P2</priority>
          <bug_severity>critical</bug_severity>
          <target_milestone>---</target_milestone>
          
          <blocked>79</blocked>
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Shawn Davis">wdavis@ncsa.uiuc.edu</reporter>
          <assigned_to name="Robert Olson">olson@mcs.anl.gov</assigned_to>
          <cc>lefvert@mcs.anl.gov</cc>

      

      
          <long_desc isprivate="0">
            <who name="Shawn Davis">wdavis@ncsa.uiuc.edu</who>
            <bug_when>2003-02-19 15:48:34</bug_when>
            <thetext>VenueClient and NodeManagement hung when trying to change venues; the same 
happened for another user connected to the same venue.  VenueServer continued 
to checkpoint properly.
When I restarted the venueserver, the venueclient and nodemanagement came back 
to life and entered the venue properly.</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Ivan R. Judson">judson@mcs.anl.gov</who>
            <bug_when>2003-02-22 10:42:54</bug_when>
            <thetext>Shawn,

Is this still happening with Alpha 3?

--Ivan</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Ivan R. Judson">judson@mcs.anl.gov</who>
            <bug_when>2003-02-24 15:18:26</bug_when>
            <thetext>And was it perhaps an expired credential? If so, this is another one of yours that we&apos;re 
working on :-)</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Shawn Davis">wdavis@ncsa.uiuc.edu</who>
            <bug_when>2003-02-24 15:51:26</bug_when>
            <thetext>No, wasn&apos;t related to the expired credentials.  

As to whether or not I&apos;m seeing it in Alpha 3, I have not yet encountered it, 
but will notify you if it happens again.</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Shawn Davis">wdavis@ncsa.uiuc.edu</who>
            <bug_when>2003-02-27 13:51:47</bug_when>
            <thetext>I am encountering similar behavior trying to connect to the transitional venue 
server right now.  So maybe it is still present in alpha 3?</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Ivan R. Judson">judson@mcs.anl.gov</who>
            <bug_when>2003-02-27 18:47:20</bug_when>
            <thetext>If I read your timestamp right, you were seeing it at about 1:51PM CST.  That 
should have been before I restarted it with the latest code base which fixes a 
few bugs.

Let me know if you see it again and/or if you can figure out how to 
deterministically cause this behavior.

--Ivan</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Ivan R. Judson">judson@mcs.anl.gov</who>
            <bug_when>2003-03-12 02:00:28</bug_when>
            <thetext>Can you confirm this is still broken?  I&apos;m confounded as to how to track this 
down :-)
</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Shawn Davis">wdavis@ncsa.uiuc.edu</who>
            <bug_when>2003-03-19 12:22:06</bug_when>
            <thetext>Yes, this bug still exists. Note that any applications that are 
connected to the venueserver at the time the lockup occurs also hang.

The applications can be unfrozen by manually breaking one point of the 
communication that is causing the deadlock.  So, if the hang was caused by 
communications between a venueserver and a venueclient, killing one of the two 
processes resumes normal operation of the rest of the processes involved.  

I just had this lockup occur, and this time, I chose to kill the venueclient 
instead of the server.  The following output was sent to the venueserver 
console.
----------------------------------------
Exception happened during processing of request from (&apos;141.142.66.181&apos;, 2369)
Traceback (most recent call last):
  File &quot;c:\python22\lib\SocketServer.py&quot;, line 221, in handle_request
    self.process_request(request, client_address)
  File &quot;c:\python22\lib\SocketServer.py&quot;, line 240, in process_request
    self.finish_request(request, client_address)
  File &quot;c:\python22\lib\SocketServer.py&quot;, line 253, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File &quot;c:\python22\lib\SocketServer.py&quot;, line 514, in __init__
    self.handle()
  File &quot;c:\python22\lib\BaseHTTPServer.py&quot;, line 266, in handle
    method()
  File &quot;C:\Python22\lib\site-packages\AccessGrid\hosting\pyGlobus\AGGSISOAP.py&quot;, line 3928, in do_POST
    self.send_response(status)
  File &quot;c:\python22\lib\BaseHTTPServer.py&quot;, line 313, in send_response
    self.wfile.write(&quot;%s %s %s\r\n&quot; %
  File &quot;C:\Python22\lib\site-packages\pyGlobus\io.py&quot;, line 364, in write
    self.sock.write(str, len(str))
  File &quot;C:\Python22\lib\site-packages\pyGlobus\io.py&quot;, line 253, in write
    raise ex
IOBaseException: a system call failed (Invalid argument)
----------------------------------------

None of the logs contain any output that indicates an error.

The venue management tool that was also frozen came back to life as soon as 
the venueclient was killed.
I was then able to reconnect my venueclient to the same instance of the 
venueserver and it operated as expected.</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Ivan R. Judson">judson@mcs.anl.gov</who>
            <bug_when>2003-04-21 15:58:59</bug_when>
            <thetext>Can you try to make this happen in beta 3?</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Ivan R. Judson">judson@mcs.anl.gov</who>
            <bug_when>2003-04-29 09:19:18</bug_when>
            <thetext>*** Bug 174 has been marked as a duplicate of this bug. ***</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Ivan R. Judson">judson@mcs.anl.gov</who>
            <bug_when>2003-05-29 18:50:57</bug_when>
            <thetext>I&apos;m going to reclassify this as the &quot;general stability bug&quot; and reassign it to 
Bob who&apos;s actively working on this.</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Ivan R. Judson">judson@mcs.anl.gov</who>
            <bug_when>2003-05-29 18:52:56</bug_when>
            <thetext>*** Bug 295 has been marked as a duplicate of this bug. ***</thetext>
          </long_desc>
          <long_desc isprivate="0">
            <who name="Robert Olson">olson@mcs.anl.gov</who>
            <bug_when>2003-08-14 17:30:12</bug_when>
            <thetext>Resolving. Much of the code underlying the instability we saw has been replaced.
If we see more problems, please refile.</thetext>
          </long_desc>
      
      

    </bug>

</bugzilla>