Bugzilla – Bug 6516
network database for the service
Last modified: 2009-12-30 16:36:44
Test network database connection (starting with derby) and make any adjustments necessary. This has many benefits including taking advantage of a good database installation (not on the service node, backed up, highly available, etc) and allowing simultaneous queries (uvic work with Nagios, etc).
As a second step, what about using postgres as the backend, since this is already required to be set up for RFT? I'm not sure whether any other globus components use postgres. Thoughts?
Update on this. I made scripts for starting Derby in network mode and a mechanism to pass a DB client password etc. in through the service. But I did not commit this to the released service yet because it is rife with problems in failure modes. All seems to work fine if the DB is up and running. But if it is killed, the JDBC drivers will log "connection refused" etc., yet the Nimbus operations relying on those DB operations succeeding will in some cases carry on as if nothing bad happened and will not move things to a corrupted state. For example, the user may terminate a VM and the service will return 'terminated' to the remote client, but the failed DB operation causes the service to not fully terminate. I was not able to narrow down all the cases where this could happen. But it introduces inconsistencies between what the client sees and what the service thinks, so it is unacceptable to me to include yet without more thorough testing and fixing.
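The failure mode described above comes down to reporting success before the persistence step is known to have succeeded. A minimal Python sketch of the safer ordering follows. All names in it (`PersistenceError`, `FlakyStore`, `persist_state`, `terminate_vm`) are hypothetical and are not taken from the Nimbus code base.

```python
class PersistenceError(Exception):
    """Raised when a state change could not be written to the database."""


class FlakyStore:
    """Hypothetical stand-in for the DB layer; not a Nimbus class."""

    def __init__(self, available=True):
        self.available = available
        self.states = {}

    def persist_state(self, vm_id, state):
        if not self.available:
            # Mirrors the JDBC driver reporting "connection refused".
            raise PersistenceError("connection refused")
        self.states[vm_id] = state


def terminate_vm(store, vm_id):
    """Return 'terminated' only after the new state has been recorded."""
    try:
        store.persist_state(vm_id, "terminated")
    except PersistenceError as exc:
        # Surface the failure to the caller instead of carrying on.
        return "error: " + str(exc)
    return "terminated"
```

With this ordering the remote client cannot see 'terminated' while the service still holds the old state, which is the inconsistency the comment objects to.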
Is there a branch available? It should be possible to run the network server separately so that the Nimbus service still uses the embedded driver to talk directly to the database and the network server handles connections from external derby clients.
(In reply to comment #3)
> Is there a branch available?

Committed to nimbus-netderby-branch for the "workspace/vm" directory in CVS.

To activate:

1. Deploy and configure as normal.
2. Stop the container.
3. cd $GLOBUS_LOCATION/etc/nimbus/workspace-service/other/
4. cp db-netderby.xml db-ACTIVE.xml
5. sh $GLOBUS_LOCATION/share/nimbus/netderby/netderby-start.sh
6. Restart the container.

The username/password are set in "db-ACTIVE.xml" (when it is a copy of db-netderby.xml) and in $GLOBUS_LOCATION/share/nimbus/netderby/environment

Since the DBs are created as embedded first (via the build scripts), the username is set to "APP" by derby, so this has been made the default username. Otherwise you will get "schema $USERNAME does not exist" with your given USERNAME value.
After many hours of messing around with the code to get an embedded network server up and running, I've found a very simple solution.

Under the derby system directory (default is $GLOBUS_LOCATION/var), create a file named 'derby.properties' and add the line 'derby.drda.startNetworkServer=true' to the file. Add derbynet.jar (get the correct one for your derby version) to the '$GLOBUS_LOCATION/lib' directory. Restart the Globus container and the derby network server will be started for you in a new thread. No other modifications are needed. Nimbus will continue to access the derby DBs with the embedded drivers.

You can check '$GLOBUS_LOCATION/var/derby.log' to see that the server started correctly. The top line should be something along the lines of:

'Apache Derby Network Server - 10.4.2.0 - (689064) started and ready to accept connections on port 1527 at 2009-02-04 22:44:39.870 GMT'

The derby DBs are accessible on:

'localhost:1527/nimbus/WorkspacePersistenceDB'
and
'localhost:1527/nimbus/WorkspaceAccountingDB'

This, however, provides no access control to the databases, leading to obvious problems. This may be OK, because the server seems to refuse any connection not from localhost. I will continue to look into the security issue. Any thoughts on this solution?

A link with more info on the derby.properties file:
http://db.apache.org/derby/docs/10.4/adminguide/tadminconfigsettingnetwrokserverproperties.html#tadminconfigsettingnetwrokserverproperties

Cheers,
Matt Vliet
mvliet@uvic.ca
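The derby.log check described above could be automated, for example from a monitoring plug-in. The sketch below parses a startup line of the form quoted in the comment. It assumes the line keeps that exact wording, which may differ between Derby versions, and the names `STARTED` and `parse_startup_line` are made up for this example.

```python
import re

# Pattern modelled on the startup line quoted in the comment above.
STARTED = re.compile(
    r"Apache Derby Network Server - (?P<version>\S+) - \(\d+\) "
    r"started and ready to accept connections on port (?P<port>\d+)"
)


def parse_startup_line(line):
    """Return (version, port) if the line reports a started server, else None."""
    match = STARTED.search(line)
    if match is None:
        return None
    return match.group("version"), int(match.group("port"))
```

A caller would read the first line of '$GLOBUS_LOCATION/var/derby.log' and pass it to `parse_startup_line`; a `None` result means the expected startup message was not found.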
Tim, I would like to hear your thoughts on the questions raised by Matt Vliet in comment #5 on this bug. We've got our Nagios plug-ins updated to work with the network DB, but I'd like to know if you like Matt's solution before we commit.
I think it's important that the network DB in that situation is not just restricted to localhost but also authenticated/authorized by either local unix user or username/password. Maybe we could look in the derby code where this "derby.drda.startNetworkServer" property is analyzed and see how they start the database thread... and then copy that + add whatever is needed for security.
Tim that's a great idea. Matt can you look at that this week?
(In reply to comment #7)
> I think it's important that the network DB in that situation is not just
> localhost but authentication/authorized by either local unix user or
> username/password.

Ok, here is a patch to enable the network server and ensure user/pass authentication:
https://particle.phys.uvic.ca/~mvliet/netserver.patch

Here is the derby.properties file for $GLOBUS_LOCATION/var:
https://particle.phys.uvic.ca/~mvliet/derby.properties

And here is a short wiki entry on what is going on:
https://wiki.gridx1.ca/twiki/bin/view/Main/NimbusDerbyNetworkServer

This is my first time creating patch files for source trees, so please feel free to let me know if I've done anything wrong.
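The contents of the linked derby.properties file are not reproduced in this thread. As a rough illustration only, the sketch below renders a properties file that combines the network server switch from comment #5 with Derby's documented built-in user/password authentication properties. The property selection is an assumption and the actual patch may differ; the function name `render_derby_properties` and the example password are hypothetical.

```python
def render_derby_properties(users, start_network_server=True):
    """Build derby.properties text that requires user/password authentication.

    `users` maps a user name to its password; the values are placeholders.
    """
    lines = [
        "derby.drda.startNetworkServer="
        + ("true" if start_network_server else "false"),
        # Documented Derby properties for built-in authentication.
        "derby.connection.requireAuthentication=true",
        "derby.authentication.provider=BUILTIN",
    ]
    for name in sorted(users):
        lines.append("derby.user.%s=%s" % (name, users[name]))
    return "\n".join(lines) + "\n"
```

For example, `render_derby_properties({"APP": "changeme"})` yields a file whose last line is `derby.user.APP=changeme`, matching the default "APP" user mentioned earlier in the thread.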
I read this over and it looks great, thanks!
Tim could we get this in the code base?
I think so, especially if the port is not listening by default
Tim, I was wondering if you had considered adding this according to the patch that Matt Vliet made. I'm thinking about getting ready for the GSoC, and the publishing tools we were planning are going to be dependent on this network server working. Would you want us to add it?
Sure, please do add it.
Ok great. Will do it this week. I was also thinking that it would be good to modify the Ant script such that the read/write password for the regular user is auto-generated, thus avoiding the extra config step. We will look into that too.
(In reply to comment #15)
> thinking that it would be good to
> modify the Ant script such that the read/write password for the regular user is
> auto generated thus avoiding the extra config step. We will look into that too.

Cool, maybe call this script which is already in every Nimbus install?

./etc/nimbus/workspace-service/other/shared-secret-suggestion.py

Would be pretty easy to duplicate in Java (and easily call from an Ant target). Obviously, that would keep the Python dependency out...
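The password auto-generation idea can be sketched in a few lines. This is not a reproduction of shared-secret-suggestion.py, whose contents are not shown in this thread. The alphabet, the default length and the name `generate_password` are assumptions, and Python is used for illustration only, since the comment above suggests a Java port for the Ant build.

```python
import secrets
import string

# Letters and digits only, so the value needs no escaping in a properties file.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length=32):
    """Return a random password drawn from a cryptographically secure source."""
    if length < 1:
        raise ValueError("length must be positive")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```

The generated value would then be written wherever the DB client password is configured, removing the manual step.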
This has been resolved in this commit: http://github.com/nimbusproject/nimbus/commit/d197b32f8a2eef1fc940e704adca5f2744701034