Bug 6516 - network database for the service
Status: RESOLVED FIXED
Product: Nimbus
Component: Workspace service
Version: TP2.1
Platform: PC Linux
Importance: P3 enhancement
Target Milestone: TP2.1.1
Related bug: 6533
Reported: 2008-11-04 07:45
Modified: 2009-12-30 16:36


Description From 2008-11-04 07:45:12
Test a network database connection (starting with Derby) and make any
necessary adjustments.  This has many benefits, including taking advantage of
a good database installation (not on the service node, backed up, highly
available, etc.) and allowing simultaneous queries (UVic work with Nagios,
etc.).
------- Comment #1 From 2008-11-04 17:27:19 -------
As a second step, what about using Postgres as the backend, since it already
has to be set up for RFT? I'm not sure whether any other Globus components use
Postgres. Thoughts?
------- Comment #2 From 2009-01-15 10:07:06 -------
Update on this.  I made scripts for starting Derby in network mode and a
mechanism to get a db client password etc. in through the service.  But I did
not commit this to the released service yet because it is rife with problems in
failure modes.

All seems to work fine if the DB is up and running.  But if it is killed, the
JDBC drivers will log "connection refused" etc., yet the Nimbus operations
relying on those DB operations succeeding will in some cases carry on as if
nothing bad happened instead of moving things to a corrupted state.  For
example, the user may terminate a VM and the service will return 'terminated'
to the remote client, but the failed DB operation keeps the service from fully
terminating it.

I was not able to narrow down all the cases where this could happen.  But it
introduces inconsistencies between what the client sees and what the service
thinks, so it is unacceptable to me to include yet without more thorough
testing and fixing.
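
The failure mode described above can be illustrated with a minimal Java
sketch.  It is not Nimbus code: the StateStore interface, the terminate()
helper and the state strings are hypothetical names chosen for illustration.
The point is only that the state reported to the client is derived from
whether the DB write actually succeeded.

```java
import java.sql.SQLException;

/** Hypothetical sketch: only report 'terminated' if the DB write succeeded. */
class TerminateSketch {

    /** Stand-in for the service's persistence layer (assumed name). */
    interface StateStore {
        void saveState(String vmId, String state) throws SQLException;
    }

    /**
     * Persists the new state first, then returns the state that is safe to
     * report to the remote client.
     */
    static String terminate(String vmId, StateStore store) {
        try {
            store.saveState(vmId, "terminated");
            return "terminated";
        } catch (SQLException e) {
            // DB unreachable: surface the failure instead of carrying on.
            return "corrupted";
        }
    }

    public static void main(String[] args) {
        StateStore healthy = (id, state) -> { /* pretend the write succeeded */ };
        StateStore down = (id, state) -> {
            throw new SQLException("connection refused");
        };
        System.out.println(terminate("vm-1", healthy));
        System.out.println(terminate("vm-1", down));
    }
}
```

With the DB down, the client sees the same state the service ends up with,
which is the consistency property the comment is asking for.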
------- Comment #3 From 2009-01-16 16:50:12 -------
Is there a branch available?
It should be possible to run the network server separately, so that the Nimbus
service still uses the embedded driver to talk directly to the database while
the network server handles connections from external Derby clients.
------- Comment #4 From 2009-01-21 16:02:21 -------
(In reply to comment #3)
> Is there a branch available?

Committed to nimbus-netderby-branch for "workspace/vm" directory in CVS.

To activate:

1. Deploy and configure as normal.
2. Stop the container.
3. cd $GLOBUS_LOCATION/etc/nimbus/workspace-service/other/
4. cp db-netderby.xml db-ACTIVE.xml
5. sh $GLOBUS_LOCATION/share/nimbus/netderby/netderby-start.sh
6. Restart the container.


username/password are set in "db-ACTIVE.xml" (when it is a copy of
db-netderby.xml) and in $GLOBUS_LOCATION/share/nimbus/netderby/environment

Since the DBs are created as embedded first (via the build scripts), the
username is set to "APP" by derby, so this has been made the default username. 
Otherwise you will get "schema $USERNAME does not exist" with your given
USERNAME value.
------- Comment #5 From 2009-02-05 12:01:55 -------
After many hours of messing around with the code to get an embedded network
server up and running, I've found a very simple solution.

Under the Derby system directory (default is $GLOBUS_LOCATION/var), create a
file named 'derby.properties' and add the line
'derby.drda.startNetworkServer=true' to the file.

Add the derbynet.jar (get the correct one for your Derby version) to the
'$GLOBUS_LOCATION/lib' directory.

Restart the Globus container and the Derby network server will be started for
you in a new thread.  No other modifications are needed.  Nimbus will continue
to access the Derby DBs with the embedded drivers.

You can check '$GLOBUS_LOCATION/var/derby.log' to see that the server started
correctly; the top line should be something along the lines of:
'Apache Derby Network Server - 10.4.2.0 - (689064) started and ready to accept
connections on port 1527 at 2009-02-04 22:44:39.870 GMT'
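
The log check above could be automated along these lines.  This is a sketch
only: the DerbyLogCheck class and its regular expression are assumptions based
on the single sample line quoted here, and other Derby versions may word the
message differently.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: find the listening port in a derby.log startup line. */
class DerbyLogCheck {

    // Pattern derived from the sample line quoted in this comment (assumption).
    private static final Pattern STARTED = Pattern.compile(
        "Apache Derby Network Server .* started and ready to accept "
        + "connections on port (\\d+)");

    /** Returns the port if the line reports a started server, else empty. */
    static OptionalInt listeningPort(String logLine) {
        Matcher m = STARTED.matcher(logLine);
        return m.find()
            ? OptionalInt.of(Integer.parseInt(m.group(1)))
            : OptionalInt.empty();
    }

    public static void main(String[] args) {
        String sample = "Apache Derby Network Server - 10.4.2.0 - (689064) "
            + "started and ready to accept connections on port 1527 "
            + "at 2009-02-04 22:44:39.870 GMT";
        System.out.println(listeningPort(sample).getAsInt());
    }
}
```

In practice the input would be the first line of
$GLOBUS_LOCATION/var/derby.log rather than a hard-coded sample.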

The Derby DBs are accessible at:
'localhost:1527/nimbus/WorkspacePersistenceDB' and,
'localhost:1527/nimbus/WorkspaceAccountingDB'
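
For an external client, these locations translate into Derby network client
JDBC URLs of the form jdbc:derby://host:port/database.  The small sketch below
only builds the URL strings; the DerbyClientUrl class is a hypothetical helper,
and actually connecting would additionally need derbyclient.jar on the client's
classpath.

```java
/** Hypothetical helper: build Derby network client URLs for the Nimbus DBs. */
class DerbyClientUrl {

    /** Derby network client URLs have the form jdbc:derby://host:port/db. */
    static String url(String host, int port, String dbPath) {
        return "jdbc:derby://" + host + ":" + port + "/" + dbPath;
    }

    public static void main(String[] args) {
        // Host, port and database paths are the ones listed in this comment.
        System.out.println(url("localhost", 1527, "nimbus/WorkspacePersistenceDB"));
        System.out.println(url("localhost", 1527, "nimbus/WorkspaceAccountingDB"));
    }
}
```

The resulting string can be passed to java.sql.DriverManager.getConnection(url),
or to the (url, user, password) overload once authentication is enabled.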

This setup, however, has no access control on the databases, which leads to
obvious problems.  It may be OK, because the server seems to refuse any
connection that does not come from localhost.  I will continue to look into
the security issue.

Any thoughts on this solution?


A link with more info on the derby.properties file:
http://db.apache.org/derby/docs/10.4/adminguide/tadminconfigsettingnetwrokserverproperties.html#tadminconfigsettingnetwrokserverproperties

Cheers,
Matt Vliet
mvliet@uvic.ca
------- Comment #6 From 2009-02-17 11:09:52 -------
Tim, I would like to hear your thoughts on the questions raised by Matt Vliet
in comment #5 on this bug. We've got our Nagios plug-ins updated to work with
the network DB, but I'd like to know if you like Matt's solution before we
commit.
------- Comment #7 From 2009-02-17 11:26:47 -------
I think it's important that the network DB in that situation is not just
restricted to localhost but also authenticated/authorized by either local Unix
user or username/password.

Maybe we could look in the derby code where this
"derby.drda.startNetworkServer" property is analyzed and see how they start the
database thread... and then copy that + add whatever is needed for security.
------- Comment #8 From 2009-02-17 11:32:13 -------
Tim, that's a great idea. Matt, can you look at that this week?
------- Comment #9 From 2009-02-27 17:35:32 -------
(In reply to comment #7)
> I think it's important that the network DB in that situation is not just
> localhost but authentication/authorized by either local unix user or
> username/password.
> 

Ok, here is a patch to enable the network server, and ensure user/pass
authentication.
https://particle.phys.uvic.ca/~mvliet/netserver.patch

Here is the derby.properties file for $GLOBUS_LOCATION/var
https://particle.phys.uvic.ca/~mvliet/derby.properties

And here is a short wiki entry on what is going on.
https://wiki.gridx1.ca/twiki/bin/view/Main/NimbusDerbyNetworkServer


This is my first time creating patch files for source trees, so please feel
free to let me know if I've done anything wrong.
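
For readers without access to the linked files, the sketch below shows what a
derby.properties combining the network server with username/password
authentication might look like.  It is an assumption, not the contents of the
linked file or patch: it uses Derby's documented BUILTIN authentication
properties, and the DerbyPropertiesSketch class, the "APP" user and the
placeholder password are illustrative only.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Hypothetical sketch: derby.properties with network server + BUILTIN auth. */
class DerbyPropertiesSketch {

    static Properties networkServerProperties(String user, String password) {
        Properties p = new Properties();
        // Start the network server inside the JVM that boots Derby.
        p.setProperty("derby.drda.startNetworkServer", "true");
        // Require a username/password on every connection.
        p.setProperty("derby.connection.requireAuthentication", "true");
        p.setProperty("derby.authentication.provider", "BUILTIN");
        // BUILTIN users are declared as derby.user.<name>=<password>.
        p.setProperty("derby.user." + user, password);
        return p;
    }

    /** Writes derby.properties into the given Derby system directory. */
    static void write(Properties p, Path systemDir) throws IOException {
        Path file = systemDir.resolve("derby.properties");
        try (Writer out = Files.newBufferedWriter(file)) {
            p.store(out, "Derby network server settings (illustrative)");
        }
    }

    public static void main(String[] args) throws IOException {
        // A temp directory stands in for $GLOBUS_LOCATION/var here.
        Path dir = Files.createTempDirectory("derby-sketch");
        write(networkServerProperties("APP", "change-me"), dir);
        Files.readAllLines(dir.resolve("derby.properties"))
             .forEach(System.out::println);
    }
}
```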
------- Comment #10 From 2009-03-02 13:04:51 -------
I read this over and it looks great, thanks!
------- Comment #11 From 2009-03-02 17:43:35 -------
Tim, could we get this into the code base?
------- Comment #12 From 2009-03-03 15:00:08 -------
I think so, especially if the port is not listening by default.
------- Comment #13 From 2009-04-21 14:15:39 -------
Tim, I was wondering if you had considered adding this according to the patch
that Matt Vliet made. I'm thinking about getting ready for the GSoC, and the
publishing tools we were planning are going to depend on this network server
working. Would you want us to add it?
------- Comment #14 From 2009-04-21 14:21:22 -------
Sure, please do add it.
------- Comment #15 From 2009-04-21 14:30:42 -------
Ok great. Will do it this week. I was also thinking that it would be good to
modify the Ant script such that the read/write password for the regular user is
auto generated thus avoiding the extra config step. We will look into that too.
------- Comment #16 From 2009-04-21 15:18:21 -------
(In reply to comment #15)
> thinking that it would be good to
> modify the Ant script such that the read/write password for the regular user is
> auto generated thus avoiding the extra config step. We will look into that too.

Cool, maybe call this script which is already in every Nimbus install?

./etc/nimbus/workspace-service/other/shared-secret-suggestion.py

It would obviously be pretty easy to duplicate in Java (and easy to call from
an Ant target), which would keep the Python dependency out...
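
A Java equivalent could be as small as the sketch below.  It is not a port of
shared-secret-suggestion.py, whose exact behaviour is not shown in this thread;
the SecretSuggestion class, the alphabet and the length of 20 are assumptions.

```java
import java.security.SecureRandom;

/** Hypothetical sketch: suggest a random password for the DB user. */
class SecretSuggestion {

    // Alphanumeric alphabet (assumption; avoids characters needing escaping).
    private static final String ALPHABET =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";

    private static final SecureRandom RANDOM = new SecureRandom();

    /** Returns a random string of the given length drawn from ALPHABET. */
    static String suggest(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(suggest(20));
    }
}
```

An Ant target could run such a class with the java task and capture its output
into a property for the generated configuration.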
------- Comment #17 From 2009-12-30 16:36:44 -------
This has been resolved in this commit:
http://github.com/nimbusproject/nimbus/commit/d197b32f8a2eef1fc940e704adca5f2744701034