Bugzilla – Bug 4802
4.0.x GRAM4 Performance Profiling and Improvements
Last modified: 2008-05-06 16:26:43
Title: 4.0.x GRAM4 Performance Profiling and Improvements

Definition:
==========================================
The next phase in 4.0.x GRAM4 performance analysis is to look for performance improvements with the help of a profiler, such as YourKit. Analyzing service log files and debug messages has helped identify obvious problem areas, which have been fixed or improved. However, there could still be improvements to be made at the core or in common routines that execute quickly but are called frequently enough to slow down performance. We need both sequential and concurrent job scenarios profiled.

software
-----------
service: GRAM4 service, latest from the globus_4_0_branch
concurrent client: condor-g version ??
sequential client: globusrun-ws

Test scenarios to run and gather service profile information for each:
----------------------------
1) execute XX globusrun-ws jobs sequentially that do not do any staging and do not do delegation.
2) execute XX globusrun-ws jobs sequentially that stage In and Out a 10KB file, do delegation, create a unique job dir, and do file cleanup
3) execute XX condor-g jobs concurrently that do not do any staging and do not do delegation.
4) execute XX condor-g jobs concurrently that stage In and Out a 10KB file, do delegation, create a unique job dir, and do file cleanup

XX should be as many jobs as necessary in order to gather useful results. Sequential jobs might require 10 or more? For concurrent jobs, I'd think at least 50 or 100 would need to be submitted.

Test/profile results will be collected at - http://www-unix.mcs.anl.gov/~gawor/GRAM/profiling/4-0-3

Deliverables:
==========================================
1) Collected test profile results for above tests
2) Analysis report for each of the profile results
   Include number of job submits in the scenarios
3) Modified code committed to CVS for any improvements made

Tasks:
==========================================
1) Mark beginning sequential job performance results for code base.
   Use results from http://www-unix.mcs.anl.gov/~feller/GRAM/perf/4-0/
2) Execute test scenario 1
   a. collect profile report
   b. analyze profile report
   c. Make code changes to improve as necessary
   d. repeat until no more obvious improvements can be identified
3) Execute test scenario 2
   a. collect profile report
   b. analyze profile report
   c. Make code changes to improve as necessary
   d. repeat until no more obvious improvements can be identified
4) Mark ending sequential job performance results for code base.
   Use results from http://www-unix.mcs.anl.gov/~feller/GRAM/perf/4-0/
5) Mark beginning concurrent job performance results for code base.
   Use results from http://www-unix.mcs.anl.gov/~feller/GRAM/perf/4-0/
6) Execute test scenario 3
   a. collect profile report
   b. analyze profile report
   c. Make code changes to improve as necessary
   d. repeat until no more obvious improvements can be identified
7) Execute test scenario 4
   a. collect profile report
   b. analyze profile report
   c. Make code changes to improve as necessary
   d. repeat until no more obvious improvements can be identified
8) Mark ending concurrent job performance results for code base.
   Use results from http://www-unix.mcs.anl.gov/~feller/GRAM/perf/4-0/
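The sequential and concurrent scenarios above could be driven by a small timing harness while the profiler is attached to the service. The following is only a sketch under stated assumptions: the function names are hypothetical, the command passed in is a placeholder for the real client invocation (which this report does not spell out), and the thread pool is a generic stand-in for concurrent submission, not a model of how condor-g submits jobs.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor


def run_job(cmd):
    """Run one submission command and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def run_sequential(cmd, count):
    """Submit `count` jobs one after another (shape of scenarios 1 and 2)."""
    return [run_job(cmd) for _ in range(count)]


def run_concurrent(cmd, count, workers):
    """Submit `count` jobs with up to `workers` in flight at once
    (a generic stand-in for scenarios 3 and 4)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_job, [cmd] * count))


def summarize(label, durations):
    """Return a one-line summary that includes the number of job submits."""
    mean = sum(durations) / len(durations)
    return (f"{label}: jobs={len(durations)} "
            f"mean={mean:.3f}s max={max(durations):.3f}s")


if __name__ == "__main__":
    # Placeholder command (assumption): replace with the real client call.
    placeholder = [sys.executable, "-c", "pass"]
    print(summarize("sequential", run_sequential(placeholder, 10)))
    print(summarize("concurrent", run_concurrent(placeholder, 50, workers=10)))
```

The counts 10 and 50 mirror the rough job numbers suggested above. Reporting the job count in each summary line matches the deliverable that asks for the number of job submits per scenario.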
Created an attachment (id=1124) [details] GRAM/RFT optimizations This document lists some possible optimizations that can be done in GRAM and RFT. Some are simple and some might require bigger changes.
Created an attachment (id=1125) [details] GRAM optimizations patch (against globus_4_0_branch)

The patch contains a few optimizations:
1) Removed unnecessary reflection code in the StateMachine and the Resource class.
2) Reduced the number of times the resource state is persisted to disk.
3) Changed the runScript calls so that they do not start an unnecessary thread.
4) Reduced the number of calls to get the FileSystemMapping class.

This patch should be reviewed by the GRAM team.
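To illustrate the general idea behind item 2 (fewer persists to disk), here is a minimal sketch of a dirty-flag pattern that batches all changes of one state transition into a single write. It is not taken from the attached patch: `JobResource`, `process_transition`, and the JSON file format are hypothetical and only show the technique.

```python
import json
import os
import tempfile


class JobResource:
    """Hypothetical stand-in for a job resource whose state is persisted."""

    def __init__(self, path):
        self.path = path
        self.state = {}
        self.dirty = False
        self.writes = 0  # counts disk writes so the saving is visible

    def set(self, key, value):
        # Record the change in memory only; do not touch the disk yet.
        if self.state.get(key) != value:
            self.state[key] = value
            self.dirty = True

    def persist(self):
        # Write only if something changed since the last persist.
        if not self.dirty:
            return
        with open(self.path, "w") as f:
            json.dump(self.state, f)
        self.writes += 1
        self.dirty = False


def process_transition(resource, updates):
    """Apply all updates of one state transition, then persist once."""
    for key, value in updates.items():
        resource.set(key, value)
    resource.persist()
```

With this shape, a transition that changes three properties costs one write instead of three, and a transition that changes nothing costs none. The trade-off is that changes made between persists are lost if the process dies before the next write.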
Jarek, thanks for the analysis and patch. Peter, assigning this over to you.
These changes were not applied since there were bigger issues to resolve first.
Double-checked this. We should get at least items 3 and 4 from Comment #2 into 4.0.6 and 4.2. They seem quite easy to do and would remove some unnecessary overhead.
We decided not to change item 4 for now, since the current solution enables an admin to change the file system mapping without restarting the container. Item 3 is fixed now. Removing the target milestone 4.0.6.
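For reference, the trade-off around item 4 (fewer lookups versus picking up admin changes without a restart) can be sketched with a cache that re-reads the mapping only when the file's modification time changes. This is an illustration only, not something proposed or implemented in this bug: `ReloadingMapping` and the `key=value` file format are hypothetical and do not reflect the real FileSystemMapping configuration.

```python
import os
import tempfile


class ReloadingMapping:
    """Hypothetical cache around a mapping file; reloads when the file changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None
        self._entries = {}
        self.loads = 0  # counts how often the file was actually parsed

    def get(self):
        # One cheap stat call per lookup; parse only on a changed mtime.
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            self._entries = self._parse()
            self._mtime = mtime
            self.loads += 1
        return self._entries

    def _parse(self):
        # Assumed format: one "key=value" pair per line.
        entries = {}
        with open(self.path) as f:
            for line in f:
                line = line.strip()
                if line and "=" in line:
                    key, value = line.split("=", 1)
                    entries[key.strip()] = value.strip()
        return entries
```

Each lookup still costs a `stat` call, so this reduces parsing work but does not remove per-call overhead entirely.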