Bug 5820 - Improve Condor Logfile Processing in GRAM
Version: 4.0.6
Platform: Macintosh All
Importance: P3 normal
Target Milestone: 5.0.2
Reported: 2008-01-30 14:55
Modified: 2010-07-06 09:35




Description From 2008-01-30 14:55:43
CAMPAIGN: Improve Condor Logfile Processing in GRAM

Technologies: GRAM2, GRAM4, Condor LRM


The Condor LRM implementation has problems on high-activity systems because
all job state changes are stored in a single log file, which tends to grow
extremely large.

GRAM2 parses the entire log file each time a job is polled, causing
performance problems. GRAM4's SEG relies on the Condor log remaining a
continuous stream, so system administrators cannot safely rotate it while
the Globus Container is running. Also, users can insert any job information
they like into the log because of its liberal file permissions.
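
To illustrate the permissions concern, here is a minimal sketch that reports
whether a log file can be written by users other than its owner. The function
name is hypothetical and this is not part of GRAM; it only uses the standard
POSIX mode bits.

```python
import os
import stat


def is_writable_by_others(path):
    """Return True if group or other users may write to the file at `path`.

    Hypothetical helper for illustration; it inspects only the POSIX mode
    bits and ignores ACLs and directory permissions.
    """
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IWGRP | stat.S_IWOTH))
```

A log for which this returns True can be appended to by accounts other than
its owner, which is the situation described above.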

This campaign aims to modify the SEG / Condor interaction so that per-job log
files can be used by GRAM2 and GRAM4, and old Condor log files can be safely
removed. The goal is a less costly implementation of GRAM's interfaces when
used with Condor.

- Develop an algorithm for using multiple logfiles within the SEG framework
  that can safely recover from abnormal ends.
- Modify the Condor LRM module and setup to write to per-job logs instead of a
  common log.
- Modify the SEG protocol to allow the Job Manager to signal to the SEG when
  recovery state is updated.
- Modify the Condor SEG to implement the multiple logfile algorithm.
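
The general idea behind the tasks above can be sketched as follows. This is
not the algorithm referenced in this bug, only a minimal illustration under
stated assumptions: each job writes to its own file in one directory, a reader
remembers a byte offset per file, and offsets are saved to a checkpoint file
only after the caller has handled the events. The class name, method names,
and JSON checkpoint format are all hypothetical, and the sketch returns raw
lines rather than parsing real Condor log events.

```python
import json
import os


class PerJobLogReader:
    """Hypothetical reader that tracks one read offset per per-job log."""

    def __init__(self, log_dir, checkpoint_path):
        # Assumption: checkpoint_path lies outside log_dir, so the
        # checkpoint is never mistaken for a job log.
        self.log_dir = log_dir
        self.checkpoint_path = checkpoint_path
        self.offsets = self._load_checkpoint()

    def _load_checkpoint(self):
        try:
            with open(self.checkpoint_path) as f:
                return json.load(f)
        except FileNotFoundError:
            return {}  # first start: read every log from the beginning

    def commit(self):
        """Persist offsets atomically via a temp file and rename."""
        tmp = self.checkpoint_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.offsets, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.checkpoint_path)

    def poll(self):
        """Return new complete lines as (log_name, line) pairs."""
        events = []
        for name in sorted(os.listdir(self.log_dir)):
            path = os.path.join(self.log_dir, name)
            if not os.path.isfile(path):
                continue
            offset = self.offsets.get(name, 0)
            with open(path, "rb") as f:
                f.seek(offset)
                data = f.read()
            # Consume only through the last newline so that a partially
            # written line is picked up on a later poll.
            end = data.rfind(b"\n") + 1
            for line in data[:end].splitlines():
                events.append((name, line.decode("utf-8", "replace")))
            self.offsets[name] = offset + end
        return events

    def forget(self, name):
        """Remove a finished job's log, then drop its saved offset."""
        try:
            os.remove(os.path.join(self.log_dir, name))
        except FileNotFoundError:
            pass
        self.offsets.pop(name, None)
        self.commit()
```

Because `commit()` is separate from `poll()`, a reader that dies before
committing re-reads the same lines after restart, so events are delivered at
least once rather than lost. Each poll reads only the bytes appended since the
last offset, and `forget()` lets a finished job's log be deleted without
disturbing the others.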
------- Comment #1 From 2008-01-30 15:15:21 -------
Algorithm description is at:
------- Comment #2 From 2008-01-30 15:19:40 -------
Patches implementing the algorithm:
------- Comment #3 From 2008-02-05 10:36:23 -------
*** Bug 5731 has been marked as a duplicate of this bug. ***
------- Comment #4 From 2010-07-06 09:35:05 -------
GRAM5 in 5.0.2 will process each job in a separate log file. See
http://jira.globus.org/browse/GRAM-130 for details.