Keeping Grid Engine configuration information in Subversion

Something I’ve been doing for the past couple of weeks on the BASS is storing the Grid Engine configuration in Subversion. This allows me to do cool stuff like diff’ing configurations and rolling back to previous ones.

The Script

#!/bin/bash

# Configuration values
SGE_ROOT=/usr/share/gridengine
SGE_CELL=default
SVN_URL=file:///home/user/subversion
SVN_IMPORT_DIR=gridconfig
SVN_LOAD=/usr/share/doc/subversion-1.4.2/svn_load_dirs.pl

# The real work
TMPD=$(mktemp -d)
source "$SGE_ROOT/$SGE_CELL/common/settings.sh"
"$SGE_ROOT/util/upgrade_modules/save_sge_config.sh" "$TMPD"
find "$TMPD" -name accounting -exec rm -v {} \;
sed -i "/^load_values/d" "$TMPD"/execution/*

"$SVN_LOAD" -no_user_input "$SVN_URL" "$SVN_IMPORT_DIR" "$TMPD"

rm -fr "$TMPD"

Run this regularly from cron to enable the real power that I’ll discuss below.
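
As a rough sketch of the cron side, the snippet below builds a crontab entry that runs the script every night at 00:10, which is the time the backup_date values in the diff output further down suggest. The path /usr/local/sbin/gridconfig-to-svn.sh is a hypothetical install location; substitute wherever you saved the script.

```bash
#!/bin/bash

# Hypothetical location of the script above; adjust to wherever you saved it.
SCRIPT=/usr/local/sbin/gridconfig-to-svn.sh

# Crontab fields: minute hour day-of-month month day-of-week command.
# Standard output is discarded; stderr is left alone so cron can report it.
CRON_LINE="10 0 * * * $SCRIPT >/dev/null"
echo "$CRON_LINE"

# To install it for the current user, append it to the existing crontab:
#   ( crontab -l 2>/dev/null; echo "$CRON_LINE" ) | crontab -
```

Running it once a day keeps the history readable: one revision per day, unless nothing changed apart from backup_date.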

Explanation

Lines 4-8 define configuration variables used later on. The SGE_* variables might already be defined for your shell. If not, you know what they should be. SVN_URL is the URL of the Subversion repository you’d like to store the configuration in. It should have already been created using svnadmin create <dir>. SVN_IMPORT_DIR is the name of the directory in your repository that will store the configuration information. The real magic of this script comes from svn_load_dirs.pl, which comes with Subversion. Provide the path to it in the SVN_LOAD variable.
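
The one-time setup could look something like the sketch below. The helper names repo_path_from_url and ensure_repo are made up for illustration, and the sketch assumes a plain file:/// URL with no host part, as in the script. Only svnadmin create itself comes from Subversion.

```bash
#!/bin/bash

# Print the filesystem path behind a file:// URL, or fail for other schemes.
repo_path_from_url() {
    case "$1" in
        file://*) printf '%s\n' "${1#file://}" ;;
        *)        return 1 ;;
    esac
}

# Create the repository unless one already exists at that path.
# A Subversion repository keeps a file named "format" at its top level.
ensure_repo() {
    local path
    path=$(repo_path_from_url "$1") || {
        echo "not a file:// URL: $1" >&2
        return 1
    }
    if [ -f "$path/format" ]; then
        echo "repository already present at $path"
    else
        svnadmin create "$path"
    fi
}

# Example (commented out so that sourcing this file has no side effects):
#   ensure_repo file:///home/user/subversion
#   [ -x /usr/share/doc/subversion-1.4.2/svn_load_dirs.pl ] ||
#       echo "svn_load_dirs.pl is missing or not executable"
```

The check for svn_load_dirs.pl is worth doing by hand once, since its location varies with the Subversion version and distribution.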

Lines 11-13 set up the SGE environment and dump the current configuration to a temporary directory. The save_sge_config.sh script comes with Grid Engine and is a gem. It writes the grid configuration to a directory structure. It’s worthwhile to run it on your own and browse the output.

Lines 14 and 15 do some post-processing on the dumped configuration. In particular, the accounting file is deleted because in our case it’s huge and dynamic, so I don’t want to store multiple copies of it in the repository. Another piece of dynamic information is the load_values for each execution host. These change each time you run the script and don’t provide any useful historical information that you can’t get in better form from Ganglia, so I get rid of them. One last piece of dynamic information is the backup_date file, which contains the date and time that save_sge_config.sh was run. I like to keep this around because it provides some context, but you could also delete that file here.
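
To see what those two clean-up commands do without touching a real dump, here is a small self-contained demonstration on a mock directory. The host file contents, the node01 name and the mock_subdir directory are invented for the example; only the execution directory, the accounting file name and backup_date come from the description above. It assumes GNU sed for the -i option, as the script does.

```bash
#!/bin/bash

# Build a tiny stand-in for the dump directory (contents are invented).
DEMO=$(mktemp -d)
mkdir -p "$DEMO/execution" "$DEMO/mock_subdir"
printf 'hostname node01\nload_values arch=lx24-amd64,num_proc=4\nprocessors 4\n' \
    > "$DEMO/execution/node01"
echo "some accounting records" > "$DEMO/mock_subdir/accounting"
echo "2009-10-02_00:10:01" > "$DEMO/backup_date"

# The same two clean-up commands as in the script.
find "$DEMO" -name accounting -exec rm -v {} \;
sed -i "/^load_values/d" "$DEMO"/execution/*

# Record what is left, print it, then remove the mock directory.
REMAINING=$(cd "$DEMO" && find . -type f | sort)
HOST_FILE=$(cat "$DEMO/execution/node01")
echo "$REMAINING"
echo "$HOST_FILE"
rm -fr "$DEMO"
```

After the run, only backup_date and the host file remain, and the host file no longer has a load_values line.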

With the configuration gathered, line 17 simply calls svn_load_dirs.pl to load it into Subversion. A commit message in the form of ‘Load /tmp/sakfjaskfj into gridconfig.’ is attached to the import.

Cool stuff

Now that all the configuration information is stored in Subversion, the world is our oyster. We can use any of the normal Subversion tools to learn more about the configuration.

Finding out what’s changed

By using svn diff, you can find out what’s changed between different dates. For example, here are the changes that I made between October 2nd and 3rd:

user@host ~$ svn diff file:///home/user/subversion/gridconfig -r {2009-10-02}:{2009-10-03}
Index: backup_date
============================================
--- backup_date (revision 177)
+++ backup_date (revision 181)
@@ -1 +1 @@
-2009-10-01_00:10:01
+2009-10-02_00:10:01
Index: usersets/superusers
============================================
--- usersets/superusers  (revision 177)
+++ usersets/superusers  (revision 181)
@@ -2,4 +2,4 @@
 type    ACL
 fshare  0
 oticket 0
-entries johnny
+entries johnny,billy

You’ll notice the backup_date is there, as I mentioned previously. Additionally, you’ll notice I added billy to the superusers userset. Good to know.
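
If you look at single days often, a small helper can build the date range for you. This is only a sketch: day_range is a made-up function name, and it relies on GNU date’s -d option to compute the following day.

```bash
#!/bin/bash

# Print a Subversion date-revision range covering one day, e.g.
# {2009-10-02}:{2009-10-03}. Relies on GNU date's -d option.
day_range() {
    local start=$1 end
    end=$(TZ=UTC date -d "$start +1 day" +%F) || return 1
    printf '{%s}:{%s}\n' "$start" "$end"
}

# Usage (needs svn and the repository from the script):
#   svn diff file:///home/user/subversion/gridconfig -r "$(day_range 2009-10-02)"
#   svn log -v file:///home/user/subversion/gridconfig -r "$(day_range 2009-10-02)"
```

The svn log -v variant lists which files changed in each revision in that range, which is a quick way to spot the interesting days before running a full diff.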

WebSVN

WebSVN is an online Subversion repository browser. Set it up correctly to point at your new gridconfig repository and you can get the same diff information as before through your browser.

Roll back your grid configuration

Did something go horribly wrong with your configuration? Roll back to a previous version using load_sge_config.sh, which also comes with Grid Engine and is located in $SGE_ROOT/util/upgrade_modules. Simply check out the version of the repository you want to load and pass it to load_sge_config.sh. If you do this successfully, let me know. So far we haven’t had any catastrophic configuration changes.
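
Since I haven’t done this for real, treat the following as an untested sketch rather than a recipe. It defaults to a dry run that only prints the commands. The run and rollback_to helpers are made up for illustration, and passing the directory as the only argument to load_sge_config.sh is an assumption based on the description above. It uses svn export instead of a checkout so that no .svn directories end up in the tree handed to Grid Engine. Keep in mind that the stored dump no longer contains the accounting file or the load_values lines.

```bash
#!/bin/bash

SGE_ROOT=${SGE_ROOT:-/usr/share/gridengine}
SVN_URL=file:///home/user/subversion
SVN_IMPORT_DIR=gridconfig
DRY_RUN=${DRY_RUN:-1}    # 1 = only print the commands; 0 = really run them

# Either echo a command or execute it, depending on DRY_RUN.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Export revision $1 into directory $2 (which must not exist yet) and
# hand the result to load_sge_config.sh.
rollback_to() {
    local rev=$1 dest=$2
    run svn export -r "$rev" "$SVN_URL/$SVN_IMPORT_DIR" "$dest" &&
        run "$SGE_ROOT/util/upgrade_modules/load_sge_config.sh" "$dest"
}

# Usage: preview a rollback to revision 177 from the diff above.
#   rollback_to 177 "$(mktemp -d)/gridconfig"
```

Set DRY_RUN=0 only once the printed commands look right for your cell.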

Something else?

If you manage to do something else interesting with your versioned configuration information, leave a comment below or send me a message.

Update: Fixed a typo in the code with the $TMPD variable.

Tagged with bash, gridengine, sge, subversion.


2 Responses

  1. JoeSchuler says

    what’s the difference between $TMPDIR vs. $TMPD

  2. Edward Dale says

    Typo! I’ve just fixed it. Thanks for the heads up.