Bug 4802 - 4.0.x GRAM4 Performance Profiling and Improvements
Status: RESOLVED FIXED
Product: GRAM
Component: Campaign
Version: 4.0.3
Platform: Macintosh All
Importance: P3 normal
Target Milestone: ---
Assigned To:
:
:
: 5760
: 4050
Reported: 2006-10-20 16:58 by
Modified: 2008-05-06 16:26 (History)


Attachments
GRAM/RFT optimizations (27.50 KB, application/msword)
2006-11-09 12:40, Jarek Gawor
GRAM optimizations patch (against globus_4_0_branch) (28.26 KB, patch)
2006-11-09 12:55, Jarek Gawor



Description From 2006-10-20 16:58:54
Title: 4.0.x GRAM4 Performance Profiling and Improvements

Definition:
==========================================

The next phase in 4.0.x GRAM4 performance analysis is to look for
performance improvements with the help of a profiler, such as YourKit.
Analyzing service log files and debug messages has helped identify obvious
problem areas, which have been fixed or improved. However, there could still
be improvements to be made at the core or in common routines that execute
quickly but are called frequently enough to slow down performance. We need both
sequential and concurrent job scenarios profiled.


software
-----------
service: GRAM4 service, latest from the globus_4_0_branch
concurrent client: condor-g version ??
sequential client: globusrun-ws


Test scenarios to run and gather service profile information for each:
----------------------------
1) execute XX globusrun-ws jobs sequentially that do not do any staging and do
not do delegation.

2) execute XX globusrun-ws jobs sequentially that stage In and Out a 10KB file,
do delegation, create a unique job dir, and do file cleanup


3) execute XX condor-g jobs concurrently that do not do any staging and do not
do delegation.

4) execute XX condor-g jobs concurrently that stage In and Out a 10KB file, do
delegation, create a unique job dir, and do file cleanup

XX should be as many jobs as necessary in order to gather useful results.
Sequential jobs might require 10 or more?  For concurrent jobs, I'd think at
least 50 or 100 would need to be submitted.
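
The difference between the sequential and concurrent scenarios can be
sketched with a small timing harness. This is only an illustration, not part
of the GRAM4 test tooling: the function names are hypothetical, and
placeholder_job stands in for a real globusrun-ws or condor-g submission,
whose command lines are deliberately not reproduced here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed(job):
    """Run one job and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    job()
    return time.perf_counter() - start


def run_sequential(job, count):
    """Submit `count` jobs one after another; return per-job durations."""
    return [timed(job) for _ in range(count)]


def run_concurrent(job, count, workers):
    """Submit `count` jobs through a pool of `workers` threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda _: timed(job), range(count)))


def summarize(durations):
    """Return (job count, mean seconds, max seconds) for one run."""
    return len(durations), sum(durations) / len(durations), max(durations)


def placeholder_job():
    # Hypothetical stand-in for a real job submission (assumption).
    time.sleep(0.01)


if __name__ == "__main__":
    print("sequential:", summarize(run_sequential(placeholder_job, 10)))
    print("concurrent:", summarize(run_concurrent(placeholder_job, 50, workers=50)))
```

The job counts in the usage lines follow the rough figures suggested above
(10 sequential, 50 concurrent) and would be adjusted to whatever yields
useful results.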

Test/profile results will be collected at -
http://www-unix.mcs.anl.gov/~gawor/GRAM/profiling/4-0-3


Deliverables:
==========================================

1) Collected test profile results for above tests
2) Analysis report for each of the profile results
    Include the number of jobs submitted in each scenario
3) Modified code committed to CVS for any improvements made


Tasks:
==========================================

1) Mark beginning sequential job performance results for code base. Use results
from http://www-unix.mcs.anl.gov/~feller/GRAM/perf/4-0/

2) Execute test scenario 1
    a. collect profile report
    b. analyze profile report
    c. Make code changes to improve as necessary
    d. repeat until no more obvious improvements can be identified

3) Execute test scenario 2
    a. collect profile report
    b. analyze profile report
    c. Make code changes to improve as necessary
    d. repeat until no more obvious improvements can be identified

4) Mark ending sequential job performance results for code base. Use results
from http://www-unix.mcs.anl.gov/~feller/GRAM/perf/4-0/

5) Mark beginning concurrent job performance results for code base. Use results
from http://www-unix.mcs.anl.gov/~feller/GRAM/perf/4-0/

6) Execute test scenario 3
    a. collect profile report
    b. analyze profile report
    c. Make code changes to improve as necessary
    d. repeat until no more obvious improvements can be identified

7) Execute test scenario 4
    a. collect profile report
    b. analyze profile report
    c. Make code changes to improve as necessary
    d. repeat until no more obvious improvements can be identified

8) Mark ending concurrent job performance results for code base. Use results
from http://www-unix.mcs.anl.gov/~feller/GRAM/perf/4-0/
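
Marking beginning and ending results is useful mainly for comparing the two.
A minimal sketch of that comparison, assuming per-job durations in seconds
have been recorded for both runs; the helper name and the sample numbers are
made up for illustration and are not measured GRAM4 results.

```python
from statistics import mean


def percent_change(baseline, ending):
    """Percent change in mean job time; negative means the ending run is faster."""
    base_mean = mean(baseline)
    return (mean(ending) - base_mean) / base_mean * 100.0


if __name__ == "__main__":
    # Made-up per-job durations in seconds, for illustration only.
    baseline_seconds = [2.0, 2.0, 2.0]
    ending_seconds = [1.0, 1.0, 1.0]
    change = percent_change(baseline_seconds, ending_seconds)
    print(f"{change:+.1f}% change in mean job time")
```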
------- Comment #1 From 2006-11-09 12:40:22 -------
Created an attachment (id=1124) [details]
GRAM/RFT optimizations

This document lists some possible optimizations that can be done in GRAM and
RFT. Some are simple and some might require bigger changes.
------- Comment #2 From 2006-11-09 12:55:03 -------
Created an attachment (id=1125) [details]
GRAM optimizations patch (against globus_4_0_branch) 

The patch contains a few optimizations:
1) Removed unnecessary reflection code in the StateMachine and the Resource
class.
2) Reduced the number of times the resource state is persisted to disk.
3) Changed the runScript calls so that they do not start an unnecessary thread.
4) Reduced the number of calls to get the FileSystemMapping class.

This patch should be reviewed by the GRAM team.
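
As an illustration of the idea behind item 2 only (this is not the code in
the attached patch, and the class and method names are hypothetical), a
resource can track whether its state changed and skip the disk write when
nothing is dirty:

```python
import json
import os
import tempfile


class PersistentResource:
    """Hypothetical resource that saves its state to a JSON file."""

    def __init__(self, path):
        self.path = path
        self.state = {}
        self.dirty = False
        self.writes = 0  # counts disk writes, for illustration only

    def set(self, key, value):
        # Mark the resource dirty only when a value actually changes.
        if key not in self.state or self.state[key] != value:
            self.state[key] = value
            self.dirty = True

    def store(self):
        """Write state to disk if it changed; return True when a write happened."""
        if not self.dirty:
            return False
        directory = os.path.dirname(self.path) or "."
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as handle:
            json.dump(self.state, handle)
        os.replace(tmp_path, self.path)  # rename over the previous file
        self.dirty = False
        self.writes += 1
        return True
```

With this shape, several state changes followed by one store() cost a single
write, and a store() with no intervening change costs none.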
------- Comment #3 From 2006-11-21 17:42:22 -------
Jarek,

Thanks for the analysis and patch.  Peter assigning this over to you.
------- Comment #4 From 2007-11-30 09:42:05 -------
These changes were not applied since there were bigger issues to resolve first.
------- Comment #5 From 2007-12-01 01:02:41 -------
Double-checked this. We should get at least items 3 and 4
from Comment #2 into 4.0.6 and 4.2. They seem quite easy
to do and remove some really unnecessary issues.
------- Comment #6 From 2008-01-07 16:36:18 -------
We decided not to change item 4 for now, since
the current solution enables an admin to change
the file system mapping without restarting the container.
Item 3 is fixed now. Removing the target milestone 4.0.6.
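
For reference, one generic way to cut repeated lookups while still picking
up an admin's edits without a restart is to cache the parsed mapping and
reload it only when the file's modification time changes. This is a sketch
under assumptions, not how GRAM4 handles the file system mapping; the class
name, the file layout, and the parse callback are all hypothetical.

```python
import os


class ReloadingMapping:
    """Hypothetical cache that re-reads a file only after it has been modified."""

    def __init__(self, path, parse):
        self.path = path
        self.parse = parse  # callable turning file text into a mapping object
        self._mtime = None
        self._value = None
        self.loads = 0  # counts real file reads, for illustration only

    def get(self):
        # A stat call is much cheaper than re-reading and re-parsing the file.
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            with open(self.path) as handle:
                self._value = self.parse(handle.read())
            self._mtime = mtime
            self.loads += 1
        return self._value
```

The trade-off is that every get() still pays for one stat call, and an edit
that leaves the modification time unchanged would go unnoticed.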