Bugzilla – Bug 4643
Race condition for requests with 2 directory creations
Last modified: 2006-11-13 12:17:26
Here is an example request I lifted from campaign bug 4506. This is what condor-g does for all jobs.

<fileStageIn>
  <maxAttempts>5</maxAttempts>
  <transferCredentialEndpoint>
    ...
  </transferCredentialEndpoint>
  <transfer>
    <sourceUrl>
      gsiftp://osg-test2.unl.edu:2811/tmp/condor_g_scratch.0x9fa5438.30776\
      /empty_dir_u1465/
    </sourceUrl>
    <destinationUrl>
      gsiftp://osg-test1.unl.edu:2811/home/gpn/.globus/scratch
    </destinationUrl>
  </transfer>
  <transfer>
    <sourceUrl>
      gsiftp://osg-test2.unl.edu:2811/tmp/condor_g_scratch.0x9fa5438.30776\
      /empty_dir_u1465/
    </sourceUrl>
    <destinationUrl>
      gsiftp://osg-test1.unl.edu:2811/home/gpn/.globus/scratch/\
      job_8ff6c880-0cc4-11db-b248-a9807d8bba43/
    </destinationUrl>
  </transfer>
  <transfer>
    <sourceUrl>
      gsiftp://osg-test2.unl.edu:2811/home/feller/myTests/\
      3500_jobs_2006_07_06_Mxm1024M/mysleep
    </sourceUrl>
    <destinationUrl>
      gsiftp://osg-test1.unl.edu:2811/home/gpn/.globus/scratch/\
      job_8ff6c880-0cc4-11db-b248-a9807d8bba43/mysleep
    </destinationUrl>
  </transfer>
  <transfer>
    <sourceUrl>
      gsiftp://osg-test2.unl.edu:2811/home/feller/myTests/\
      3500_jobs_2006_07_06_Mxm1024M/test_input
    </sourceUrl>
    <destinationUrl>
      gsiftp://osg-test1.unl.edu:2811/home/gpn/.globus/scratch/\
      job_8ff6c880-0cc4-11db-b248-a9807d8bba43/test_input
    </destinationUrl>
  </transfer>
</fileStageIn>
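To illustrate why a request like this can race: the first two transfers appear to create directories (their source is an empty directory ending in "/"), and the later file transfers write into the second of those directories. If the transfers are worked on concurrently, a file transfer may start before its parent directory exists. The following is only a minimal sketch of one way to avoid that, by ordering directory creations (shallowest first) ahead of file transfers. It is not the patch attached to this bug. The class StageInOrdering, the Transfer holder, and the rule "source URL ends in '/' means directory creation" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of RFT or of the attached patches.
class StageInOrdering {

    /** Hypothetical holder for one <transfer> element of the request. */
    public static final class Transfer {
        public final String sourceUrl;
        public final String destinationUrl;

        public Transfer(String sourceUrl, String destinationUrl) {
            this.sourceUrl = sourceUrl;
            this.destinationUrl = destinationUrl;
        }
    }

    /** Assumption: a source URL ending in "/" marks a directory creation. */
    public static boolean createsDirectory(Transfer t) {
        return t.sourceUrl.endsWith("/");
    }

    /** Counts "/" characters in the URL, ignoring one trailing slash. */
    public static int depth(String url) {
        String trimmed = url.endsWith("/") ? url.substring(0, url.length() - 1) : url;
        int count = 0;
        for (int i = 0; i < trimmed.length(); i++) {
            if (trimmed.charAt(i) == '/') {
                count++;
            }
        }
        return count;
    }

    /**
     * Returns directory creations first, parents before children,
     * followed by the file transfers in their original request order.
     */
    public static List<Transfer> order(List<Transfer> request) {
        List<Transfer> dirs = new ArrayList<>();
        List<Transfer> files = new ArrayList<>();
        for (Transfer t : request) {
            if (createsDirectory(t)) {
                dirs.add(t);
            } else {
                files.add(t);
            }
        }
        // List.sort is stable, so directories of equal depth keep request order.
        dirs.sort(Comparator.comparingInt((Transfer t) -> depth(t.destinationUrl)));
        List<Transfer> ordered = new ArrayList<>(dirs);
        ordered.addAll(files);
        return ordered;
    }
}
```

With an ordering like this, a caller would finish the directory entries one at a time before handing the file entries to worker threads, so that scratch exists before scratch/job_.../ and both exist before any file is written.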
Fixed in globus_4_0_branch. I will attach a patch to this bug for 4.0.2 releases and merge the fix to trunk.
Created an attachment (id=1024) [details] Patch on globus_4_0_branch to fix this
Created an attachment (id=1025) [details] Patch on globus_4_0_branch to fix concurrency
I found 2 errors during one of the last tests. They did not affect the success of the jobs. I saw many more of these errors in a test where the gridftp-server on osg-test2 could not be reached (probably due to network problems).

1 time(s):
-----------------
ERROR service.TransferClient [WorkThread-N,normalNonExtendedTransfer:871] Exception in transfer
org.globus.ftp.exception.ServerException: Reply wait timeout. (error code 4)
    at org.globus.ftp.vanilla.FTPControlChannel.waitFor(FTPControlChannel.java:213)
    at org.globus.ftp.vanilla.TransferMonitor.run(TransferMonitor.java:125)
    at java.lang.Thread.run(Thread.java:534)

1 time(s):
-----------------
ERROR service.TransferWork [WorkThread-N,run:726] Transient transfer error
Transient transfer error Reply wait timeout. (error code 4) [Caused by: Reply wait timeout. (error code 4)]
Transient transfer error Reply wait timeout. (error code 4) . Caused by
org.globus.ftp.exception.ServerException: Reply wait timeout. (error code 4)
    at org.globus.ftp.vanilla.FTPControlChannel.waitFor(FTPControlChannel.java:213)
    at org.globus.ftp.vanilla.TransferMonitor.run(TransferMonitor.java:125)
    at java.lang.Thread.run(Thread.java:534)
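One plausible reading of why these errors did not affect job success is that they are logged as transient and the transfer is attempted again, up to the maxAttempts value in the request. That is a guess from the log text, not something confirmed in this bug. The sketch below only illustrates that general retry pattern. TransientRetry, TransientTransferException, and runWithRetries are hypothetical names and do not come from the RFT code.

```java
import java.util.function.Supplier;

// Hypothetical illustration of "retry transient errors up to maxAttempts".
class TransientRetry {

    /** Hypothetical marker for errors worth retrying, e.g. a reply wait timeout. */
    public static class TransientTransferException extends RuntimeException {
        public TransientTransferException(String message) {
            super(message);
        }
    }

    /**
     * Runs the attempt until it succeeds or maxAttempts is exhausted.
     * Only TransientTransferException is retried; any other exception
     * propagates immediately.
     */
    public static <T> T runWithRetries(Supplier<T> attempt, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        TransientTransferException last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.get();
            } catch (TransientTransferException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // every attempt failed with a transient error
    }
}
```

A caller would wrap a single transfer in the Supplier and pass the request's maxAttempts (5 in the example request). A real implementation would likely also wait between attempts.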
Created an attachment (id=1026) [details] Patch on globus_4_0_branch to fix concurrency
Fix merged to trunk.