<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:series="http://unfoldingneurons.com/"
	>

<channel>
	<title>scompt.com &#187; sge</title>
	<atom:link href="http://scompt.com/blog/archives/tag/sge/feed/" rel="self" type="application/rss+xml" />
	<link>http://scompt.com</link>
	<description>The website of Edward Dale</description>
	<lastBuildDate>Sun, 04 Sep 2011 15:54:07 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
		<item>
		<title>Creating a maintenance window in GridEngine using JSVs</title>
		<link>http://scompt.com/blog/archives/2010/01/23/creating-a-maintenance-window-in-gridengine-using-jsvs</link>
		<comments>http://scompt.com/blog/archives/2010/01/23/creating-a-maintenance-window-in-gridengine-using-jsvs#comments</comments>
		<pubDate>Sat, 23 Jan 2010 15:15:16 +0000</pubDate>
		<dc:creator>Edward Dale</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[gridengine]]></category>
		<category><![CDATA[jsv]]></category>
		<category><![CDATA[sge]]></category>

		<guid isPermaLink="false">http://scompt.com/?p=351</guid>
		<description><![CDATA[One thing that&#8217;s always been a pain on the BASS is getting the cluster clear when we have a maintenance window, which usually happens monthly. Our initial solution was to send out an email notifying users of the upcoming outage and then simply kill all of the jobs at the appointed time. This obviously has [...]]]></description>
			<content:encoded><![CDATA[<p>One thing that&#8217;s always been a pain on the <a href="http://www.cs.unc.edu/bass"><acronym title="Biomedical Analysis and Simulation Supercomputer">BASS</acronym></a> is getting the cluster clear when we have a maintenance window, which usually happens monthly.  Our initial solution was to send out an email notifying users of the upcoming outage and then simply kill all of the jobs at the appointed time.  This obviously has its downsides.  The solution that is currently implemented uses a <acronym title="Job Submission Verifier">JSV</acronym> to modify the hard runtime limit on the job and notify the user of the change.<br />
<span id="more-351"></span><br />
On the BASS, the default runtime for a job is 2 days (set in the cluster-wide sge_request file).  Therefore, to be most effective, the maintenance must be scheduled at least two days in advance to make sure all incoming jobs have their runtime limited.  A couple of users require longer runtime limits, so we have to make sure they&#8217;re covered too.</p>
<p>The JSV below calculates the time at which the incoming job will reach its hard runtime limit.  If that time falls after the configured start of the maintenance window, the JSV shortens the hard runtime limit and notifies the user.  This ensures that all jobs running on the cluster will end before the maintenance window begins and that the users know that.</p>
<p>Without further ado, here&#8217;s the code.  It looks for a file named <code>maintenance</code> in the <code>$SGE_ROOT/$SGE_CELL/common</code> directory that contains the Unix timestamp at which the maintenance window begins.  It&#8217;s executed client-side from the cluster-wide <code>sge_request</code> file in the same directory.  The script requires the <code>Date::Format</code> and <code>Time::Duration</code> modules for some date math that it does.  I&#8217;m not a Perl programmer, so bear with me.</p>
<pre class="brush: perl; title: ; notranslate">
#!/usr/bin/perl

use strict;
use warnings;
no warnings qw/uninitialized/;

use Env qw(SGE_ROOT SGE_CELL);
use lib &quot;$SGE_ROOT/$SGE_CELL/common/perl/lib/perl5/site_perl/5.8.8/&quot;;
use Date::Format;
use Time::Duration;
use lib &quot;$SGE_ROOT/util/resources/jsv&quot;;
use JSV qw( :DEFAULT jsv_sub_is_param jsv_sub_add_param jsv_sub_get_param jsv_send_env jsv_log_info jsv_is_param jsv_get_param );

sub hms2s {
	my $input = shift;
	if( $input =~ m/(\d*):(\d*):(\d*)/ ) {
		my $h = $1 || 0;
		my $m = $2 || 0;
		my $s = $3 || 0;
		return $h*3600+$m*60+$s
	} elsif( $input =~ m/(\d+)/) {
		return $1;
	} else {
		return 0;
	}
}

jsv_on_start(sub {
	jsv_send_env();
});

jsv_on_verify(sub {

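	my $data_file=&quot;$SGE_ROOT/$SGE_CELL/common/maintenance&quot;;

	# A missing or unreadable file means no maintenance window is scheduled.
	my $success = open(my $dat, '&lt;', $data_file);

	if( !$success ) {
		jsv_accept(&quot;No maintenance window scheduled.&quot;);
		return;
	}

	# The file holds a single unix timestamp; strip the trailing newline.
	my $maintenance_begin = &lt;$dat&gt;;
	close($dat);
	chomp($maintenance_begin);
	if( !$maintenance_begin ) {
		jsv_accept(&quot;No maintenance window scheduled.&quot;);
		return;
	}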

	# Allow a 5 minute window for jobs to die before the maintenance officially starts.
	my $delta = 300;
	my $now = time();

	if( $maintenance_begin-$delta &lt; $now ) {
		jsv_log_info('*'x81);
		jsv_log_info('* Maintenance is currently in progress');
		jsv_log_info('* For more information, see http://blahblah');
		jsv_log_info('*'x81);
		jsv_reject('Maintenance is currently in progress');
		return;
	}

	my $requested_rt = hms2s(jsv_sub_get_param('l_hard', 'h_rt'));
	if( $now + $requested_rt &gt; $maintenance_begin - $delta ) {
		# The job would still be running when the window opens, so shorten h_rt.
		my $time_to_run = $maintenance_begin - $delta - $now;
		jsv_sub_add_param('l_hard','h_rt',$time_to_run);
		jsv_log_info('*'x81);
		jsv_log_info('* A maintenance window is scheduled for '.time2str('%m/%d/%Y %H:%M:%S', $maintenance_begin, 'EST'));
		jsv_log_info('* Your job will be allowed to run for '.duration($time_to_run));
		jsv_log_info('* For more information, see http://blahblah');
		jsv_log_info('*'x81);
		jsv_correct('Job runtime was limited');
		return;
	}
	# Nothing was modified, so accept the job as submitted.
	jsv_accept('Job is accepted');
	return;
}); 

jsv_main();
</pre>
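<p>For the admin side, here is a minimal shell sketch of how the <code>maintenance</code> file could be written and how the runtime left before a window can be previewed.  The function names <code>window_epoch</code> and <code>allowed_runtime</code> are made up for illustration, and GNU <code>date</code> is assumed for the <code>-d</code> option.  The 300 second margin mirrors the <code>$delta</code> in the JSV above.</p>

```shell
# Convert a human-readable UTC date into a unix timestamp.
# Assumes GNU date; the -d option is not portable to every platform.
window_epoch() {
    date -u -d "$1" +%s
}

# Print how many seconds a job submitted at time $1 may run before the
# window that begins at time $2, keeping the 5 minute margin used by the JSV.
# Prints 0 when the margin has already been reached.
allowed_runtime() {
    local now=$1 begin=$2 delta=300
    local remaining=$(( begin - delta - now ))
    if [ "$remaining" -lt 0 ]; then
        remaining=0
    fi
    echo "$remaining"
}

# Hypothetical usage: schedule a window and write the file the JSV reads.
# begin=$(window_epoch "2010-02-01 08:00:00")
# echo "$begin" | tee "$SGE_ROOT/$SGE_CELL/common/maintenance"
# allowed_runtime "$(date +%s)" "$begin"
```

<p>Removing or emptying the file afterwards puts the JSV back into its &quot;No maintenance window scheduled&quot; path.</p>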
]]></content:encoded>
			<wfw:commentRss>http://scompt.com/blog/archives/2010/01/23/creating-a-maintenance-window-in-gridengine-using-jsvs/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Keeping Grid Engine configuration information in Subversion</title>
		<link>http://scompt.com/blog/archives/2009/10/13/versioned-grid-engine-configuration</link>
		<comments>http://scompt.com/blog/archives/2009/10/13/versioned-grid-engine-configuration#comments</comments>
		<pubDate>Tue, 13 Oct 2009 11:46:17 +0000</pubDate>
		<dc:creator>Edward Dale</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[bash]]></category>
		<category><![CDATA[gridengine]]></category>
		<category><![CDATA[sge]]></category>
		<category><![CDATA[subversion]]></category>

		<guid isPermaLink="false">http://scompt.com/?p=305</guid>
		<description><![CDATA[Something I&#8217;ve been doing for the past couple weeks on the BASS is storing the Grid Engine configuration in Subversion. This allows me to do cool stuff like diff&#8217;ing and rolling back to previous configurations. The Script Run this regularly from cron to enable the real power that I&#8217;ll discuss below. Explanation Lines 4-8 define [...]]]></description>
			<content:encoded><![CDATA[<p>Something I&#8217;ve been doing for the past couple weeks on the <a href="http://www.cs.unc.edu/bass" title="Biomedical Analysis and Simulation Supercomputer">BASS</a> is storing the <a href="http://gridengine.sunsource.net/">Grid Engine</a> configuration in Subversion.  This allows me to do cool stuff like diff&#8217;ing and rolling back to previous configurations.</p>
<p><span id="more-305"></span></p>
<h2>The Script</h2>
<pre class="brush: bash; title: ; notranslate">
#!/bin/bash

# Configuration values
SGE_ROOT=/usr/share/gridengine
SGE_CELL=default
SVN_URL=file:///home/user/subversion
SVN_IMPORT_DIR=gridconfig
SVN_LOAD=/usr/share/doc/subversion-1.4.2/svn_load_dirs.pl

# The real work
TMPD=$(mktemp -d)
source &quot;$SGE_ROOT/$SGE_CELL/common/settings.sh&quot;
&quot;$SGE_ROOT/util/upgrade_modules/save_sge_config.sh&quot; &quot;$TMPD&quot;
find &quot;$TMPD&quot; -name accounting -exec rm -v {} \;
sed -i &quot;/^load_values/d&quot; &quot;$TMPD&quot;/execution/*

&quot;$SVN_LOAD&quot; -no_user_input &quot;$SVN_URL&quot; &quot;$SVN_IMPORT_DIR&quot; &quot;$TMPD&quot;

rm -fr &quot;$TMPD&quot;
</pre>
<p>Run this regularly from cron to enable the real power that I&#8217;ll discuss below.</p>
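<p>As a rough sketch, a nightly cron entry could be generated like this.  The script path is hypothetical, and the 00:10 schedule is only a guess based on the <code>backup_date</code> values that appear in the diff output in this post.</p>

```shell
# Hypothetical install location for the script above.
SCRIPT=/home/user/bin/save_gridconfig.sh

# Print a crontab line that runs the script every night at 00:10.
cron_line() {
    echo "10 0 * * * $SCRIPT"
}

# To install it, append the line to the current crontab:
# ( crontab -l; cron_line ) | crontab -
```

<p>Running it more often than daily only adds revisions when something actually changed, apart from the <code>backup_date</code> file.</p>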
<h2>Explanation</h2>
<p>Lines 4-8 define configuration variables used later on.  The <code>SGE_*</code> variables might already be defined for your shell.  If not, you know what they should be.  <code>SVN_URL</code> is the URL of the Subversion repository you&#8217;d like to store the configuration in.  It should have already been created using <code>svnadmin create &lt;dir&gt;</code>.  <code>SVN_IMPORT_DIR</code> is the name of the directory in your repository that will store the configuration information.  The real magic of this script comes from <code>svn_load_dirs.pl</code>, which comes with Subversion.  Provide the path to it in the <code>SVN_LOAD</code> variable.</p>
<p>Lines 11-13 set up the SGE environment and dump the current configuration to a temporary directory.  The <code>save_sge_config.sh</code> script comes with Grid Engine and is a gem.  It writes the grid configuration to a directory structure.  It&#8217;s worthwhile to run it on your own and browse the output.</p>
<p>Lines 14 and 15 do some post-processing on the dumped configuration.  In particular, the <code>accounting</code> file is deleted because in our case, it&#8217;s huge and dynamic, so I don&#8217;t want to store multiple copies of it in the repository.  Another piece of dynamic information is the <code>load_values</code> for each execution host.  These will change each time you run the script and don&#8217;t provide any useful historical information that you can&#8217;t get in better form from <a href="http://ganglia.sourceforge.net/">Ganglia</a>, so I get rid of them.  One last piece of dynamic information is the <code>backup_date</code> file, which contains the date and time that <code>save_sge_config.sh</code> was run.  I like to keep this around because it provides some context, but you could also delete that file here.</p>
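<p>If you would rather drop <code>backup_date</code> as well, the post-processing step could look something like the sketch below.  The <code>prune_dump</code> function name is invented for this example, and <code>sed -i</code> assumes GNU sed, as in the script above.</p>

```shell
# Remove the dynamic pieces from a dump directory written by save_sge_config.sh.
# The dump directory is passed as the first argument.
prune_dump() {
    local dump_dir=$1
    # Same cleanup as the script above, plus the backup_date file.
    find "$dump_dir" -name accounting -exec rm {} \;
    find "$dump_dir" -name backup_date -exec rm {} \;
    # Only strip load_values if the execution host directory exists.
    if [ -d "$dump_dir/execution" ]; then
        sed -i "/^load_values/d" "$dump_dir"/execution/*
    fi
}
```

<p>With <code>backup_date</code> gone, a run with no configuration changes should leave nothing new to commit.</p>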
<p>With the configuration gathered, line 17 simply calls <code>svn_load_dirs.pl</code> to load it into Subversion.  A commit message in the form of &#8216;Load /tmp/sakfjaskfj into gridconfig.&#8217; is attached to the import.</p>
<h2>Cool stuff</h2>
<p>Now that all the configuration information is stored in Subversion, the world is our oyster.  We can use any of the normal Subversion tools to learn more about the configuration.</p>
<h3>Finding out what&#8217;s changed</h3>
<p>By using <code>svn diff</code>, you can find out what&#8217;s changed between different dates.  For example, here are the changes that I made between October 2nd and 3rd:</p>
<pre class="brush: bash; title: ; notranslate">
user@host ~$ svn diff file:///home/user/subversion/gridconfig -r {2009-10-02}:{2009-10-03}
Index: backup_date
============================================
--- backup_date (revision 177)
+++ backup_date (revision 181)
@@ -1 +1 @@
-2009-10-01_00:10:01
+2009-10-02_00:10:01
Index: usersets/superusers
============================================
--- usersets/superusers  (revision 177)
+++ usersets/superusers  (revision 181)
@@ -2,4 +2,4 @@
 type    ACL
 fshare  0
 oticket 0
-entries johnny
+entries johnny,billy
</pre>
<p>You&#8217;ll notice the <code>backup_date</code> is there, as I mentioned previously.  Additionally, you&#8217;ll notice I added billy to the superusers userset.  Good to know.</p>
<p><a href="http://scompt.com/wordpress/wp-content/uploads/2009/10/websvn.png"><img src="http://scompt.com/wordpress/wp-content/uploads/2009/10/websvn-150x150.png" alt="WebSVN Example" title="WebSVN Example" width="150" height="150" class="alignright size-thumbnail wp-image-312" /></a></p>
<h3>WebSVN</h3>
<p><a href="http://www.websvn.info/">WebSVN</a> is an online Subversion repository browser.  Set it up correctly to point at your new <code>gridconfig</code> repository and you can get the same diff information as before through your browser.</p>
<h3 style="clear:both">Roll back your grid configuration</h3>
<p>Did something go horribly wrong with your configuration?  Roll back to a previous version using the <code>load_sge_config.sh</code> that also comes with Grid Engine and is located in <code>$SGE_ROOT/util/upgrade_modules</code>.  Simply check out the version of the repository you want to load and pass it to <code>load_sge_config.sh</code>.  If you do this successfully, let me know.  So far we haven&#8217;t had any catastrophic configuration changes.</p>
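<p>An untested sketch of what that might look like is below.  It only prints the commands so they can be reviewed first.  The revision number and restore directory are placeholders, <code>svn export</code> is used here instead of a checkout to avoid <code>.svn</code> directories in the restored tree, and passing the directory as the only argument to <code>load_sge_config.sh</code> is an assumption, so check the script&#8217;s usage message before running anything.</p>

```shell
# Print the commands for restoring the configuration as of a given revision.
# Nothing is executed; review the output and run the commands by hand.
rollback_commands() {
    local rev=$1 restore_dir=$2
    # Values taken from the configuration section of the script above.
    local svn_url=file:///home/user/subversion
    local sge_root=/usr/share/gridengine
    echo "svn export -r $rev $svn_url/gridconfig $restore_dir"
    echo "$sge_root/util/upgrade_modules/load_sge_config.sh $restore_dir"
}

# Placeholder revision and directory; the directory should not exist yet.
rollback_commands 177 /tmp/gridconfig-r177
```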
<h2>Something else?</h2>
<p>If you manage to do something else interesting with your versioned configuration information, leave a comment below or send me a <a href="/contact/">message</a>.</p>
<p><strong>Update: </strong>Fixed a typo in the code with the <code>$TMPD</code> variable.</p>
]]></content:encoded>
			<wfw:commentRss>http://scompt.com/blog/archives/2009/10/13/versioned-grid-engine-configuration/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

