Bugzilla – Bug 5395
Locally invoked services seem to get initialized twice with local invocations in a recovery situation
Last modified: 2007-11-30 10:00:58
You need to log in before you can comment on or make changes to this bug.
In recovery situations with local invocations from WS-GRAM to RFT ReliableFileTransferImpl (RFT service) seems to get initialized/loaded two times: first by the container and another time by the LocalInvocationHelper. This does not happen in a non-recovery situation: Here the Provider who loads the service object in LocalInvocationHelper seems to know that a service instance had already been created and does not initialize it twice. See attached container.log. The statements INFO service.ReliableFileTransferImpl [main,<init>:316] 1 INFO service.ReliableFileTransferImpl [main,<init>:318] 2 INFO service.ReliableFileTransferImpl [main,<init>:320] 3 INFO service.ReliableFileTransferImpl [main,<init>:323] 4 INFO service.ReliableFileTransferImpl [main,<init>:324] java:comp/env//... INFO service.ReliableFileTransferImpl [main,<init>:327] 5 happen in a static block of ReliableFileTransferImpl and thus during intialization. (The scenario was created by submitting 100 jobs with staging using condor-g to the container. At some point where about 30 transfer resources had been persisted to disk the container was shut down and restarted)
ah, attachments are disabled, so here's the log inline: 2007-06-18 12:19:58,946 INFO exec.ManagedExecutableJobHome [Thread-4,recover:200] Recovered resource with ID 096f7ce0-1dc0-11dc-b346-c0e3db988a21. 2007-06-18 12:19:58,999 INFO exec.ManagedExecutableJobHome [Thread-4,recover:200] Recovered resource with ID 09aa14e0-1dc0-11dc-b346-c0e3db988a21. 2007-06-18 12:19:59,052 INFO exec.ManagedExecutableJobHome [Thread-4,recover:200] Recovered resource with ID 0e8d7470-1dc0-11dc-b34c-c0e3db988a21. 2007-06-18 12:19:59,082 INFO service.ReliableFileTransferImpl [main,<init>:316] 1 2007-06-18 12:19:59,083 INFO service.ReliableFileTransferImpl [main,<init>:318] 2 2007-06-18 12:19:59,084 INFO service.ReliableFileTransferImpl [main,<init>:320] 3 2007-06-18 12:19:59,085 INFO service.ReliableFileTransferImpl [main,<init>:323] 4 2007-06-18 12:19:59,086 INFO service.ReliableFileTransferImpl [main,<init>:324] java:comp/env//services/ReliableFileTransferService/dbConfiguration 2007-06-18 12:19:59,094 INFO service.ReliableFileTransferImpl [main,<init>:327] 5 2007-06-18 12:19:59,144 INFO exec.ManagedExecutableJobHome [Thread-4,recover:200] Recovered resource with ID ff266e60-1dbf-11dc-b328-c0e3db988a21. 2007-06-18 12:19:59,167 INFO utils.LocalStagingHelper [RunQueueThread_6,poll:249] polling for status of RFT resource` 2007-06-18 12:19:59,178 INFO utils.LocalInvocationHelper [RunQueueThread_6,getServiceObject:320] CREATE new service object ReliableFileTransferService 2007-06-18 12:19:59,184 INFO service.ReliableFileTransferImpl [RunQueueThread_6,<init>:316] 1 2007-06-18 12:19:59,185 INFO exec.ManagedExecutableJobHome [Thread-4,recover:200] Recovered resource with ID 09b8e1f0-1dc0-11dc-b346-c0e3db988a21. 2007-06-18 12:19:59,185 INFO service.ReliableFileTransferImpl [RunQueueThread_6,<init>:318] 2 2007-06-18 12:19:59,189 INFO service.ReliableFileTransferImpl [RunQueueThread_6,<init>:320] 3 2007-06-18 12:19:59,189 INFO service.ReliableFileTransferImpl [RunQueueThread_6,<init>:323] 4 2007-06-18 12:19:59,190 INFO service.ReliableFileTransferImpl [RunQueueThread_6,<init>:324] java:comp/env//services/ReliableFileTransferService/dbConfiguration 2007-06-18 12:19:59,190 INFO service.ReliableFileTransferImpl [RunQueueThread_6,<init>:327] 5 2007-06-18 12:19:59,225 INFO exec.ManagedExecutableJobHome [Thread-4,recover:200] Recovered resource with ID 0d746350-1dc0-11dc-b34c-c0e3db988a21. 2007-06-18 12:19:59,263 INFO exec.ManagedExecutableJobHome [Thread-4,recover:200] Recovered resource with ID ff1c0e20-1dbf-11dc-b328-c0e3db988a21. 2007-06-18 12:19:59,281 INFO utils.LocalStagingHelper [RunQueueThread_12,poll:249] polling for status of RFT resource` 2007-06-18 12:19:59,299 INFO exec.ManagedExecutableJobHome [Thread-4,recover:200] Recovered resource with ID 0d9754a0-1dc0-11dc-b34c-c0e3db988a21. 2007-06-18 12:19:59,326 INFO exec.ManagedExecutableJobHome [Thread-4,recover:200] Recovered resource with ID 09dae8e0-1dc0-11dc-b348-c0e3db988a21. 2007-06-18 12:19:59,354 INFO exec.ManagedExecutableJobHome [Thread-4,recover:200] Recovered resource with ID fee43540-1dbf-11dc-b327-c0e3db988a21. 2007-06-18 12:19:59,369 INFO utils.LocalStagingHelper [RunQueueThread_2,poll:249] polling for status of RFT resource` 2007-06-18 12:19:59,388 INFO exec.ManagedExecutableJobHome [Thread-4,recover:200] Recovered resource with ID fe9244b0-1dbf-11dc-b327-c0e3db988a21. 2007-06-18 12:19:59,396 INFO utils.LocalStagingHelper [RunQueueThread_1,poll:249] polling for status of RFT resource` 2007-06-18 12:19:59,416 INFO exec.ManagedExecutableJobHome [Thread-4,recover:200] Recovered resource with ID fe5b0810-1dbf-11dc-b327-c0e3db988a21. 2007-06-18 12:19:59,421 INFO utils.LocalStagingHelper [RunQueueThread_4,poll:249] polling for status of RFT resource` 2007-06-18 12:19:59,622 INFO exec.ManagedExecutableJobHome [Thread-4,recover:200] Recovered resource with ID fe97c2f0-1dbf-11dc-b327-c0e3db988a21. 2007-06-18 12:19:59,628 INFO utils.LocalStagingHelper [RunQueueThread_7,poll:249] polling for status of RFT resource` 2007-06-18 12:19:59,648 INFO exec.ManagedExecutableJobHome [Thread-4,recover:200] Recovered resource with ID 0887dc00-1dc0-11dc-b346-c0e3db988a21.
Finally we found that in a recovery situation two objects of RFT had been created: one by a WS-GRAM thread in recovery and one by a container thread during initialization of the services. With the help of Mike we probably fixed it. First tests look good: RFT's constructor is now called only once. Before a service can be loaded in the LocalInvocationHelper ServiceManager.initializeService(msgCtx) has to be called for the service in question. Also the MessageContext has to be populated with more Axis related properties prior to initializing the service.
I guess this should not be closed, until we really know why we solved it with initialization. I'll make it a non-blocker in the meantime and set milestone to 4.0.6
This was fixed in 4.0.5. But additional investigation should be done to verify that this method is the proper way to fix it.