Something I’ve been doing for the past couple of weeks on the BASS is storing the Grid Engine configuration in Subversion. This lets me do cool stuff like diffing configurations and rolling back to previous ones.
The Script
#!/bin/bash
# Configuration values
SGE_ROOT=/usr/share/gridengine
SGE_CELL=default
SVN_URL=file:///home/user/subversion
SVN_IMPORT_DIR=gridconfig
SVN_LOAD=/usr/share/doc/subversion-1.4.2/svn_load_dirs.pl
# The real work
TMPD=$(mktemp -d)
source "$SGE_ROOT/$SGE_CELL/common/settings.sh"
"$SGE_ROOT/util/upgrade_modules/save_sge_config.sh" "$TMPD"
find "$TMPD" -name accounting -exec rm -v {} \;
sed -i "/^load_values/d" "$TMPD"/execution/*
"$SVN_LOAD" -no_user_input "$SVN_URL" "$SVN_IMPORT_DIR" "$TMPD"
rm -fr "$TMPD"
Run this regularly from cron to enable the real power that I’ll discuss below.
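As a rough sketch of the cron part, the snippet below builds a crontab entry for a nightly run. The script path is hypothetical (I haven’t said where the script lives), and the 00:10 schedule is only an assumption based on the backup_date timestamps that show up in the diff output later in this post.

```bash
#!/bin/bash
# Hypothetical location of the script above; adjust to wherever you saved it.
GRIDCONFIG_SCRIPT=/home/user/bin/gridconfig-to-svn.sh

# Fields: minute hour day-of-month month day-of-week command.
# This one fires every night at 00:10 (an assumed schedule).
CRON_LINE="10 0 * * * $GRIDCONFIG_SCRIPT"

# Print the entry so it can be reviewed before it is installed.
echo "$CRON_LINE"
```

Paste the printed line into crontab -e for an account that can already run the script by hand.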
Explanation
The lines under “Configuration values” define the variables used later on. The SGE_* variables might already be defined for your shell. If not, you know what they should be. SVN_URL is the URL of the Subversion repository you’d like to store the configuration in. The repository should already have been created using svnadmin create <dir>. SVN_IMPORT_DIR is the name of the directory in your repository that will store the configuration information. The real magic of this script comes from svn_load_dirs.pl, which comes with Subversion. Provide the path to it in the SVN_LOAD variable.
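If the repository doesn’t exist yet, something along these lines should create it. This is a minimal sketch: the function name create_gridconfig_repo is made up for illustration, and REPO_DIR is assumed to be the local path behind the SVN_URL used in the script.

```bash
#!/bin/bash
# Assumed repository path, matching SVN_URL=file:///home/user/subversion.
REPO_DIR=/home/user/subversion

# Hypothetical helper: create the repository once and print its file:// URL.
create_gridconfig_repo() {
    local repo_dir=$1
    # Refuse to touch a path that already exists.
    if [ -e "$repo_dir" ]; then
        echo "$repo_dir already exists, leaving it alone" >&2
        return 1
    fi
    svnadmin create "$repo_dir" && echo "file://$repo_dir"
}
```

Call it once as create_gridconfig_repo "$REPO_DIR" and copy the printed URL into SVN_URL.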
The first three lines under “The real work” create a temporary directory, set up the SGE environment, and dump the current configuration into that directory. The save_sge_config.sh script comes with Grid Engine and is a gem. It writes the grid configuration to a directory structure. It’s worthwhile to run it on your own and browse the output.
The find and sed lines do some post-processing on the dumped configuration. The accounting file is deleted because, in our case, it’s huge and dynamic, so I don’t want to store multiple copies of it in the repository. Another piece of dynamic information is the load_values line for each execution host. These change each time you run the script and don’t provide any useful historical information that you can’t get in better form from Ganglia, so I get rid of them. One last piece of dynamic information is the backup_date file, which contains the date and time that save_sge_config.sh was run. I like to keep this around because it provides some context, but you could also delete that file here.
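To see what the clean-up does without touching a real grid, here is a small self-contained sketch. The directory layout, host name, and file contents are invented stand-ins for a real dump; only the find and sed commands come from the script. It assumes GNU sed, as the script does.

```bash
#!/bin/bash
# Build a tiny stand-in for the dump directory. File names and contents
# are made up for this demo; a real dump contains many more files.
DEMO=$(mktemp -d)
mkdir -p "$DEMO/execution" "$DEMO/cell"
echo "placeholder accounting record" > "$DEMO/cell/accounting"
printf 'hostname node01\nload_values load_avg=0.42\nprocessors 8\n' \
    > "$DEMO/execution/node01"
echo "2009-10-02_00:10:01" > "$DEMO/backup_date"

# The same two clean-up steps as in the script.
find "$DEMO" -name accounting -exec rm {} \;
sed -i "/^load_values/d" "$DEMO"/execution/*

# Optional third step: drop the timestamp file as well.
rm -f "$DEMO/backup_date"

# Show which files are left and what the host file now contains.
REMAINING=$(cd "$DEMO" && find . -type f | sort)
echo "$REMAINING"
cat "$DEMO/execution/node01"
```

Leave out the rm -f line if you want to keep backup_date, as I do. Remove the demo directory with rm -fr "$DEMO" when you’re done looking at it.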
With the configuration gathered, the next line simply calls svn_load_dirs.pl to load it into Subversion. A commit message in the form of ‘Load /tmp/sakfjaskfj into gridconfig.’ is attached to the import. The last line removes the temporary directory.
Cool stuff
Now that all the configuration information is stored in Subversion, the world is our oyster. We can use any of the normal Subversion tools to learn more about the configuration.
Finding out what’s changed
By using svn diff, you can find out what’s changed between different dates. For example, here are the changes that I made between October 2nd and 3rd:
user@host ~$ svn diff file:///home/user/subversion/gridconfig -r {2009-10-02}:{2009-10-03}
Index: backup_date
============================================
--- backup_date (revision 177)
+++ backup_date (revision 181)
@@ -1 +1 @@
-2009-10-01_00:10:01
+2009-10-02_00:10:01
Index: usersets/superusers
============================================
--- usersets/superusers (revision 177)
+++ usersets/superusers (revision 181)
@@ -2,4 +2,4 @@
type ACL
fshare 0
oticket 0
-entries johnny
+entries johnny,billy
You’ll notice the backup_date is there, as I mentioned previously. Additionally, you’ll notice I added billy to the superusers userset. Good to know.
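If you run this kind of comparison often, a couple of small wrappers can save some typing. This is only a sketch: the function names are made up, and GRIDCONFIG_URL assumes the repository URL and import directory from the script.

```bash
#!/bin/bash
# Assumed location: SVN_URL plus SVN_IMPORT_DIR from the script.
GRIDCONFIG_URL=file:///home/user/subversion/gridconfig

# Build the {start}:{end} date range that svn accepts for -r.
date_range() {
    printf '{%s}:{%s}' "$1" "$2"
}

# Show what changed between two dates (YYYY-MM-DD).
gridconfig_diff() {
    svn diff "$GRIDCONFIG_URL" -r "$(date_range "$1" "$2")"
}

# List the commits in the same window, with the files each one touched.
gridconfig_log() {
    svn log -v "$GRIDCONFIG_URL" -r "$(date_range "$1" "$2")"
}
```

With these defined, gridconfig_diff 2009-10-02 2009-10-03 runs the same command as above.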
WebSVN
WebSVN is an online Subversion repository browser. Set it up correctly to point at your new gridconfig repository and you can get the same diff information as before through your browser.
Roll back your grid configuration
Did something go horribly wrong with your configuration? Roll back to a previous version using load_sge_config.sh, which also comes with Grid Engine and is located in $SGE_ROOT/util/upgrade_modules. Simply check out the version of the repository you want to load and pass it to load_sge_config.sh. If you do this successfully, let me know. So far we haven’t had any catastrophic configuration changes.
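Here is an untested sketch of what that could look like. A few assumptions: it uses svn export rather than a checkout so the directory handed to load_sge_config.sh contains no .svn metadata, it assumes load_sge_config.sh takes the dump directory as its first argument, and the function name and default destination path are made up. By default it only prints the commands.

```bash
#!/bin/bash
SGE_ROOT=${SGE_ROOT:-/usr/share/gridengine}
SVN_URL=file:///home/user/subversion
SVN_IMPORT_DIR=gridconfig

# Hypothetical helper: print (default) or run the rollback steps for one
# revision. Set DRY_RUN=0 to actually execute the commands.
rollback_gridconfig() {
    local rev=$1
    # Destination must not exist yet, or svn export will refuse to write.
    local dest=${2:-/tmp/gridconfig-r$rev}
    local run=echo
    [ "${DRY_RUN:-1}" = "0" ] && run=
    # Export a clean copy of that revision, then hand it to Grid Engine.
    $run svn export -r "$rev" "$SVN_URL/$SVN_IMPORT_DIR" "$dest" &&
    $run "$SGE_ROOT/util/upgrade_modules/load_sge_config.sh" "$dest"
}
```

Running rollback_gridconfig 177 prints the two commands for revision 177 so you can check them first; DRY_RUN=0 rollback_gridconfig 177 would run them for real.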
Something else?
If you manage to do something else interesting with your versioned configuration information, leave a comment below or send me a message.
Update: Fixed a typo in the code with the $TMPD variable.

what’s the difference between $TMPDIR vs. $TMPD
Typo! I’ve just fixed it. Thanks for the heads up.