Bug 4643 - race condition for requests with 2 directory creation
Status: RESOLVED FIXED
: RFT
RFT
: 4.0.2
: All All
: P3 major
: 4.0.3
: 4506
Reported: 2006-08-04 16:28
Modified: 2006-11-13 12:17


Attachments
Patch on globus_4_0_branch to fix this (9.41 KB, patch)
2006-08-04 16:53, Ravi Madduri
Patch on globus_4_0_branch to fix concurrency (4.75 KB, patch)
2006-08-04 18:09, Ravi Madduri
Patch on globus_4_0_branch to fix concurrency (11.97 KB, patch)
2006-08-07 11:43, Ravi Madduri




Description From 2006-08-04 16:28:11
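Here is an example request taken from campaign bug 4506.
This is what condor-g does for all jobs. The first two transfers each copy the
same empty source directory to a directory destination, and the second
destination is nested inside the first.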

<fileStageIn>
  <maxAttempts>5</maxAttempts>
  <transferCredentialEndpoint>
      ...
  </transferCredentialEndpoint>

  <transfer>
      <sourceUrl>
        gsiftp://osg-test2.unl.edu:2811/tmp/condor_g_scratch.0x9fa5438.30776\
                 /empty_dir_u1465/
      </sourceUrl>
      <destinationUrl>
        gsiftp://osg-test1.unl.edu:2811/home/gpn/.globus/scratch
      </destinationUrl>
  </transfer>

  <transfer>
      <sourceUrl>
        gsiftp://osg-test2.unl.edu:2811/tmp/condor_g_scratch.0x9fa5438.30776\
                 /empty_dir_u1465/</sourceUrl>
      <destinationUrl>
        gsiftp://osg-test1.unl.edu:2811/home/gpn/.globus/scratch/\
                 job_8ff6c880-0cc4-11db-b248-a9807d8bba43/
      </destinationUrl>
  </transfer>

  <transfer>
      <sourceUrl>
        gsiftp://osg-test2.unl.edu:2811/home/feller/myTests/\
                  3500_jobs_2006_07_06_Mxm1024M/mysleep
      </sourceUrl>
      <destinationUrl>
        gsiftp://osg-test1.unl.edu:2811/home/gpn/.globus/scratch/\
                 job_8ff6c880-0cc4-11db-b248-a9807d8bba43/mysleep
      </destinationUrl>
  </transfer>

  <transfer>
      <sourceUrl>
        gsiftp://osg-test2.unl.edu:2811/home/feller/myTests/\
                 3500_jobs_2006_07_06_Mxm1024M/test_input
      </sourceUrl>
      <destinationUrl>
        gsiftp://osg-test1.unl.edu:2811/home/gpn/.globus/scratch/\
                  job_8ff6c880-0cc4-11db-b248-a9807d8bba43/test_input
      </destinationUrl>
  </transfer>

</fileStageIn>
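
The ordering dependency between the transfers in the request above can be made explicit with a short sketch. This is not RFT code; the function names are hypothetical and the only input is the list of destination URLs from the request, with the line continuations joined. The sketch reports which transfers write into a directory that another transfer in the same request is expected to create. If such transfers are processed concurrently, the later one may be attempted before the directory exists.

```python
import posixpath
from urllib.parse import urlparse

# Destination URLs from the request, in request order (continuations joined).
BASE = "gsiftp://osg-test1.unl.edu:2811/home/gpn/.globus/scratch"
JOB = BASE + "/job_8ff6c880-0cc4-11db-b248-a9807d8bba43"
DESTINATIONS = [BASE, JOB + "/", JOB + "/mysleep", JOB + "/test_input"]


def ancestor_dirs(path):
    """Yield every ancestor directory of path, nearest first, excluding '/'."""
    path = path.rstrip("/")
    while True:
        path = posixpath.dirname(path)
        if path in ("", "/"):
            return
        yield path


def ordering_constraints(destinations):
    """Return sorted (earlier, later) index pairs.

    A pair (j, i) means the destination of transfer j is an ancestor
    directory of the destination of transfer i, so j should finish first.
    """
    paths = [urlparse(url).path.rstrip("/") for url in destinations]
    index_of = {path: i for i, path in enumerate(paths)}
    pairs = []
    for i, path in enumerate(paths):
        for ancestor in ancestor_dirs(path):
            j = index_of.get(ancestor)
            if j is not None and j != i:
                pairs.append((j, i))
    return sorted(pairs)


if __name__ == "__main__":
    for earlier, later in ordering_constraints(DESTINATIONS):
        print(f"transfer {earlier} must complete before transfer {later}")
```

For this request, every later transfer depends on one or both of the two directory-creating transfers, which is the situation named in the bug title.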
------- Comment #1 From 2006-08-04 16:51:40 -------
Fixed in globus_4_0_branch. I will attach a patch for 4.0.2 releases to this
bug and merge the fix to trunk.
------- Comment #2 From 2006-08-04 16:53:28 -------
Created an attachment (id=1024) [details]
Patch on globus_4_0_branch to fix this
------- Comment #3 From 2006-08-04 18:09:51 -------
Created an attachment (id=1025) [details]
Patch on globus_4_0_branch to fix concurrency
------- Comment #4 From 2006-08-07 03:55:37 -------
I found 2 errors during one of the most recent tests. They did not affect the
success of the jobs.
I saw the same errors in much larger numbers in a test where the gridftp-server
on osg-test2 could not be reached (probably due to network problems).


1 time(s):
-----------------
ERROR service.TransferClient [WorkThread-N,normalNonExtendedTransfer:871]
Exception in transfer
org.globus.ftp.exception.ServerException: Reply wait timeout. (error code 4)
at org.globus.ftp.vanilla.FTPControlChannel.waitFor(FTPControlChannel.java:213)
at org.globus.ftp.vanilla.TransferMonitor.run(TransferMonitor.java:125)
at java.lang.Thread.run(Thread.java:534)

1 time(s):
-----------------
ERROR service.TransferWork [WorkThread-N,run:726] Transient transfer error
Transient transfer error
Reply wait timeout. (error code 4) [Caused by: Reply wait timeout. (error
code 4)]
Transient transfer error
Reply wait timeout. (error code 4)
. Caused by
org.globus.ftp.exception.ServerException: Reply wait timeout. (error code 4)
at org.globus.ftp.vanilla.FTPControlChannel.waitFor(FTPControlChannel.java:213)
at org.globus.ftp.vanilla.TransferMonitor.run(TransferMonitor.java:125)
at java.lang.Thread.run(Thread.java:534)
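
The log above labels the timeout a transient transfer error, and the example request sets maxAttempts to 5. That is consistent with the jobs still succeeding if the failed transfer was retried, although the log does not show whether a retry happened in this run. A minimal sketch of a bounded retry, assuming a hypothetical exception type and helper name (neither is taken from RFT):

```python
import time


class TransientTransferError(Exception):
    """Hypothetical stand-in for a recoverable failure such as a reply timeout."""


def run_with_retries(transfer, max_attempts=5, delay_seconds=0.0):
    """Call transfer() until it succeeds or max_attempts is exhausted.

    Returns the number of attempts used. Re-raises the last
    TransientTransferError once every attempt has failed.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            transfer()
            return attempt
        except TransientTransferError:
            if attempt == max_attempts:
                raise
            time.sleep(delay_seconds)


if __name__ == "__main__":
    calls = []

    def flaky_transfer():
        # Fails once, then succeeds, to imitate a single timeout.
        calls.append(1)
        if len(calls) == 1:
            raise TransientTransferError("Reply wait timeout. (error code 4)")

    print("attempts used:", run_with_retries(flaky_transfer))
```

With this shape, a single timeout costs one extra attempt, while an unreachable server fails every attempt and surfaces the error to the caller.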
------- Comment #5 From 2006-08-07 11:43:28 -------
Created an attachment (id=1026) [details]
Patch on globus_4_0_branch to fix concurrency
------- Comment #6 From 2006-08-07 15:27:49 -------
Fix merged to trunk.