Bug 4410 - GT4 WS-GRAM Auditing
: GT4 WS-GRAM Auditing
: development
: Macintosh All
: P3 normal
: 4.0.5
Assigned To:
: Auditing
: 4409
Reported: 2006-05-17 11:13 by
Modified: 2007-06-22 17:46 (History)




Description From 2006-05-17 11:13:52
Title: GT4 WS-GRAM Auditing

Technologies:    Globus Resource Allocation Manager (GRAM)


Production grids need a way to produce audit trails for job submissions.  Today
the method is to write info to a container-wide log file.  This is a simple
method that all services can easily use.  However, it is not an effective way to
programmatically find or query for entries, nor is it an efficient way to
store a large number of audit records.

A WS GRAM audit record per job, indexed by a grid job id, can be useful to
provide access to audit or other local job related information.  For example,
it could be used to join to a grid's local accounting database.  Often, these
accounting databases do not have knowledge of the service's grid job id (the
service that provided access to the local compute resource).

To accommodate this need, it is proposed that GRAM be altered to generate audit
records in a database. The initial implementation will take the form of a
subclass of log4j's JDBCAppender, modified to parse the data items and format
the audit records in the desired format.  The schema below will support
multiple WS GRAM services using the same gram audit table, or later
consolidation of multiple gram audit tables into a single table.  The
job_grid_id contains the information required to uniquely identify the service
that generated a given record.

Initially, the following PostgreSQL database schema will be used:

create table gram_audit_table (
     "job_grid_id" varchar(256) primary key,
     "local_job_id" varchar(512) not null,
     "submission_job_id" varchar(512) not null,
     "subject_name" varchar(256) not null,
     "username" varchar(16) not null,
     "creation_time" timestamp not null,
     "queued_time" timestamp not null,
     "stage_in_grid_id" varchar(256),
     "stage_out_grid_id" varchar(256),
     "clean_up_grid_id" varchar(256),
     "globus_toolkit_version" varchar(16) not null,
     "resource_manager_type" varchar(16) not null,
     "job_description" text not null,
     "success_flag" boolean not null
);

The format of the string that is logged to the database appender is a
comma-separated list of double-quoted strings. The items in the list MUST be in
the same order as listed in the above schema. Any double quotes within the
strings MUST be converted to the substring "&quot;".  Any value that is allowed
to be null by the schema and is indeed null MUST be indicated with the string
"NULL". Such strings will be converted to actual null values in the database.
Here is an example:

G Lane 364243","lane","Wed May 03 16:06:44 MDT 2006","Wed May 03 16:06:45 MDT
2006","NULL","NULL","NULL","4.0 Community","Fork","<ns1:serviceLevelAgreement
  <ns1:job xsi:type=&quot;ns1:JobDescriptionType&quot;>
    <ns2:ReferenceProperties xsi:type=&quot;ns2:ReferencePropertiesType&quot;>
     <ns5:ResourceID ns04:type=&quot;ns05:string&quot;
    <ns2:ReferenceParameters xsi:type=&quot;ns2:ReferenceParametersType&quot;/>
   <ns1:executable xmlns:xsd=&quot;http://www.w3.org/2001/XMLSchema&quot;
   <ns1:directory xmlns:xsd=&quot;http://www.w3.org/2001/XMLSchema&quot;
   <ns1:stdout xmlns:xsd=&quot;http://www.w3.org/2001/XMLSchema&quot;
   <ns1:stderr xmlns:xsd=&quot;http://www.w3.org/2001/XMLSchema&quot;
   <ns1:count xmlns:xsd=&quot;http://www.w3.org/2001/XMLSchema&quot;
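
The encoding rules above can be illustrated with a minimal Java sketch. The
class and method names (AuditRecordFormatter, encodeField, format) are
hypothetical and are not taken from the GRAM code base. The sketch assumes the
caller supplies the values in schema order and only passes null for columns
that the schema allows to be null; it does not check either condition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of GRAM: encodes one audit record as a
// comma-separated list of double-quoted strings.
final class AuditRecordFormatter {

    // Encodes a single field: null becomes the literal string NULL, any
    // embedded double quote becomes &quot;, and the result is wrapped in quotes.
    static String encodeField(String value) {
        String text = (value == null) ? "NULL" : value.replace("\"", "&quot;");
        return "\"" + text + "\"";
    }

    // Joins the encoded fields with commas, preserving the order given.
    static String format(List<String> fieldsInSchemaOrder) {
        List<String> encoded = new ArrayList<>();
        for (String field : fieldsInSchemaOrder) {
            encoded.add(encodeField(field));
        }
        return String.join(",", encoded);
    }

    public static void main(String[] args) {
        // Made-up values; a real record has one entry per schema column.
        System.out.println(format(Arrays.asList("lane", null, "<job name=\"x\"/>")));
    }
}
```

With the three made-up values in main, the sketch prints
"lane","NULL","<job name=&quot;x&quot;/>".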


Deliverables:
1) Audit table schema (see above).
2) New subclass of JDBCAppender that handles generation of audit records from
logging statements.
3) Modified code to use the JDBCAppender subclass to actually log the audit
records.


Tasks:
1) Create PostgreSQL 8.0 schema for the audit table.
2) Create AuditDatabaseAppender which extends JDBCAppender
3) Determine correct container-log4j.properties statements to add for enabling
the logging to the new appender.
4) Create separate Log instance in StateMachine intended solely for logging
audit data.
5) Add queuedTime to the resource data.
6) Add code to save the queuedTime after submission of the local job to the
resource manager succeeds.
7) Change the resource data to have an EPR datum for each staging operation as
opposed to one that just gets nulled out after a staging operation is
complete.
8) Alter the StateMachine code to use the separate staging EPR resource data.
9) Add code to processDoneState() and processFailedState() in StateMachine to
format and log the audit data using the audit Log instance.
10) Modify the MEJR ID to be service-generated instead of using the
client-generated submission_job_id.
11) Test with PostgreSQL 8.0.
12) Run throughput tests to determine the effects/cost when auditing to a DB is
turned on;  write followup campaign as needed depending on findings

Time Estimate: 5 days
------- Comment #1 From 2006-06-12 15:59:52 -------
This campaign is done except for the last task of running throughput tests.
------- Comment #2 From 2006-06-15 11:43:19 -------
> Any double quotes within the strings MUST be converted to the
> substring "&quot;"

I thought using setString on a PreparedStatement should take care of this for
you, escaping special characters in DB specific ways.
------- Comment #3 From 2006-06-15 11:54:42 -------
That's not why they're escaped. They're escaped so that the code can
differentiate the quoted substrings from any other quotes in the substrings. It
needs to break up one long string into the individual data items by parsing on
quote boundaries.
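
As an illustration of parsing on quote boundaries, here is a minimal Java
sketch. AuditRecordParser and parse are hypothetical names, not the actual
appender code. It relies on the rule that embedded quotes arrive as &quot;,
so every literal double quote in the record is a field boundary, and it maps
the string NULL back to a null value.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of GRAM: splits one encoded audit record
// into its individual data items.
final class AuditRecordParser {

    static List<String> parse(String record) {
        List<String> fields = new ArrayList<>();
        int pos = 0;
        while (pos < record.length()) {
            // Find the opening quote of the next field.
            int open = record.indexOf('"', pos);
            if (open < 0) {
                break; // no more fields
            }
            // Embedded quotes are escaped, so the next quote closes the field.
            int close = record.indexOf('"', open + 1);
            if (close < 0) {
                throw new IllegalArgumentException("Unterminated quoted field");
            }
            String raw = record.substring(open + 1, close);
            // NULL marks a database null; otherwise restore embedded quotes.
            fields.add(raw.equals("NULL") ? null : raw.replace("&quot;", "\""));
            pos = close + 1;
        }
        return fields;
    }

    public static void main(String[] args) {
        // Made-up record with a comma and escaped quotes inside the last field.
        System.out.println(parse("\"lane\",\"NULL\",\"a, b &quot;c&quot;\""));
    }
}
```

Commas inside a field are harmless in this scheme because only the quotes are
treated as delimiters.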
------- Comment #4 From 2006-10-02 12:03:45 -------
Although not explicitly in the task list, this code is in the community branch
and should be merged to HEAD so that it makes it into 4.2.
------- Comment #5 From 2006-10-17 14:20:59 -------
Two outstanding tasks: 1. merge with trunk 2. generate patch for VDT

I have created a branch off trunk (bug_4410_branch_1) to start merging the
code. This was committed as part of the community branch and is interspersed
with other non-audit commits, so I kicked off a manual merge, but there is a
lot of context change from branch to trunk. I have sent diffs from both for
Peter to review.
------- Comment #6 From 2006-10-17 14:40:11 -------
I'm reassigning this back to myself until I can get bug_4410_branch_1 (the
trunk merge result) to work properly.
------- Comment #7 From 2006-10-17 17:55:02 -------
I got everything to compile but I haven't tested anything yet. I'll do that
first thing tomorrow.
------- Comment #8 From 2006-10-20 14:33:48 -------
I've finished the merge to the trunk. Here are the log4j settings that are
needed to activate audit logging:

log4j.category.org.globus.exec.service.exec.StateMachine.audit=INFO, AUDIT


The second one is only needed if gatekeeper audit logging is going to be used.

I've added a second Derby database since trunk GRAM was already setting one up
for resource state. The audit logging stuff has been updated to use that DB by
default. The audit db configuration is also now consistent with the resource
state DB by using a DataSource and not extending the JDBCAppender class.

Since the JDBCAppender class could not be used anymore, I went ahead and
reworked the AuditDatabaseAppender to take an AuditData class as the logging
message and pull data from that with accessor methods instead of requiring that
the logging message be encoded as a single comma-separated string. The
command-line client that the gatekeeper relies on now parses the data into this
class before logging it.
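
For readers unfamiliar with this pattern, the following Java sketch shows the
general idea of a typed message object read through accessors. The real
AuditData class is not shown in this report, so every field name, accessor
name, and the AuditAppenderSketch helper below are assumptions loosely based
on the schema columns, and only a subset of the columns is included for
brevity.

```java
import java.time.Instant;

// Hypothetical sketch: the field and accessor names are assumptions, not the
// actual AuditData class from the GRAM source.
final class AuditData {
    private final String jobGridId;
    private final String localJobId;
    private final String subjectName;
    private final String username;
    private final Instant creationTime;
    private final Instant queuedTime;
    private final boolean success;

    AuditData(String jobGridId, String localJobId, String subjectName,
              String username, Instant creationTime, Instant queuedTime,
              boolean success) {
        this.jobGridId = jobGridId;
        this.localJobId = localJobId;
        this.subjectName = subjectName;
        this.username = username;
        this.creationTime = creationTime;
        this.queuedTime = queuedTime;
        this.success = success;
    }

    String getJobGridId() { return jobGridId; }
    String getLocalJobId() { return localJobId; }
    String getSubjectName() { return subjectName; }
    String getUsername() { return username; }
    Instant getCreationTime() { return creationTime; }
    Instant getQueuedTime() { return queuedTime; }
    boolean isSuccess() { return success; }
}

final class AuditAppenderSketch {

    // Reads each value through an accessor, in column order, as an appender
    // might do before binding the values to a prepared INSERT statement.
    static Object[] toColumnValues(AuditData data) {
        return new Object[] {
            data.getJobGridId(), data.getLocalJobId(), data.getSubjectName(),
            data.getUsername(), data.getCreationTime(), data.getQueuedTime(),
            data.isSuccess()
        };
    }

    public static void main(String[] args) {
        // Made-up values for illustration only.
        AuditData data = new AuditData("grid-id-1", "364243",
                "/O=Grid/CN=Example User", "lane",
                Instant.parse("2006-05-03T22:06:44Z"),
                Instant.parse("2006-05-03T22:06:45Z"), true);
        System.out.println(toColumnValues(data).length);
    }
}
```

Compared with the comma-separated string, a typed object needs no quote
escaping or NULL marker, because each value keeps its own type and can simply
be null.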

I'm handing this back to Rachana for whatever else needs to be done.
------- Comment #9 From 2006-10-26 11:16:05 -------
The outstanding task here is to generate patches from community branch for VDT.
Plan to create diff against the latest VDT source, have contacted them to get
------- Comment #10 From 2006-11-06 11:13:30 -------
While testing this using the API provided to convert an EPR to a string, an
issue with EPRs written to files or serialized differently was found. A simpler
algorithm that extracts the resource key value and the To address to generate
the digest has been committed. Standalone testing of GRAM audit has been
completed. New patches will need to be generated.
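
The idea of digesting only the To address and the resource key value, rather
than a full serialized EPR, can be sketched in Java as follows. This is not
the committed algorithm: the class name EprDigestSketch, the choice of
SHA-256, the zero-byte separator, and the hex output are all assumptions made
for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch, not the committed GRAM code: derives a stable
// identifier from two EPR components instead of the serialized EPR text.
final class EprDigestSketch {

    static String digest(String toAddress, String resourceKey) {
        try {
            // SHA-256 is an assumed algorithm choice for this example.
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(toAddress.getBytes(StandardCharsets.UTF_8));
            // Separator byte so ("ab", "c") and ("a", "bc") hash differently.
            md.update((byte) 0);
            md.update(resourceKey.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up address and key for illustration only.
        System.out.println(digest("https://host.example.org:8443/service", "key-123"));
    }
}
```

Because the input is just two extracted values, differences in whitespace,
namespace prefixes, or element order between two serializations of the same
EPR do not affect the result.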
------- Comment #11 From 2006-11-09 13:27:40 -------
It has been decided that a component release with updates will be done. So VDT
patches are not required. Closing campaign.