<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:dc="https://purl.org/dc/elements/1.1/" xmlns:content="https://purl.org/rss/1.0/modules/content/" xmlns:foaf="https://xmlns.com/foaf/0.1/" xmlns:og="https://ogp.me/ns#" xmlns:rdfs="https://www.w3.org/2000/01/rdf-schema#" xmlns:schema="https://schema.org/" xmlns:sioc="https://rdfs.org/sioc/ns#" xmlns:sioct="https://rdfs.org/sioc/types#" xmlns:skos="https://www.w3.org/2004/02/skos/core#" xmlns:xsd="https://www.w3.org/2001/XMLSchema#" version="2.0" xml:base="https://www.linuxjournal.com/tag/slurm">
  <channel>
    <title>Slurm</title>
    <link>https://www.linuxjournal.com/tag/slurm</link>
    <description/>
    <language>en</language>
    
    <item>
  <title>Using MySQL for Load Balancing and Job Control under Slurm</title>
  <link>https://www.linuxjournal.com/content/using-mysql-load-balancing-and-job-control-under-slurm</link>
  <description>  &lt;div data-history-node-id="1338863" class="layout layout--onecol"&gt;
    &lt;div class="layout__region layout__region--content"&gt;
      
            &lt;div class="field field--name-field-node-image field--type-image field--label-hidden field--item"&gt;  &lt;img src="https://www.linuxjournal.com/sites/default/files/nodeimage/story/slurm_logo.png" width="436" height="400" alt="" typeof="foaf:Image" class="img-responsive" /&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-node-author field--type-ds field--label-hidden field--item"&gt;by &lt;a title="View user profile." href="https://www.linuxjournal.com/users/steven-buczkowski" lang="" about="https://www.linuxjournal.com/users/steven-buczkowski" typeof="schema:Person" property="schema:name" datatype="" xml:lang=""&gt;Steven Buczkowski&lt;/a&gt;&lt;/div&gt;
      
            &lt;div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"&gt;&lt;p&gt;
Like most things these days, modern atmospheric science is all about
big data. Whether it's an instrument flying in an aircraft taking
sets of images several times a second and producing three-quarters
of a terabyte of data per flight day over a two-week campaign or a
satellite instrument producing hundreds of gigabytes of spectral data
daily over a 10–15-year lifetime, the data volume is enormous. Simply
analyzing a day's worth of data to keep track of basic instrument
stability is CPU-intensive. Fully processing a day to retrieve the
state of the atmosphere or looking at trends across a decade's worth
of data is far more so.
&lt;/p&gt;
&lt;p&gt;
High-performance parallel cluster
computing is the name of the game. For years I've done this on a
very basic level by kicking off a handful of copies of my processing
scripts on a couple of computers around the lab, but after a recent
move into a new lab, I got my first chance to work on a real cluster
system, processing data from a satellite-borne hyperspectral sounder
called AIRS (see Resources). AIRS is one of the instruments onboard NASA's Aqua
satellite, which was launched in May 2002 and has been in continuous
operation since. Data from AIRS and similar instruments is used to
map out vertical profiles of atmospheric temperature and
trace gases globally, but we have to be able to process it first.
&lt;/p&gt;

&lt;p&gt;
The cluster computing game here is strictly to get a whole lot of
computers doing the same thing to a whole lot of data so that we can
process it faster than we collect it (much faster would be
preferable). Since I was new to this game just a few months ago,
I've had much to learn about cluster computing and how to design
algorithms and processing software to take advantage of multiple CPUs for
processing. This was the first time I had
hundreds of CPUs at my disposal, and it really has changed how I
process data in general. In this article, I describe how I was
shown to parallelize this type of data processing, along with a method I put
together that makes the process much cleaner.
&lt;/p&gt;

&lt;h3&gt;
Basic Slurm&lt;/h3&gt;

&lt;p&gt;
The cluster system here consists of 240 compute nodes, each with
dual 8-core processors and 64GB of main memory, running Red Hat
Enterprise Linux. Cluster jobs are scheduled to run through the
Slurm workload manager (see Resources). In a nutshell, Slurm is a suite of programs
that allocates computing resources among users and compute
jobs and enforces sharing rules so that everyone gets a chance
to get their work in. The two most important programs in the suite
for actually working on the system are &lt;code&gt;sbatch&lt;/code&gt; and
&lt;code&gt;srun&lt;/code&gt;.
&lt;/p&gt;

&lt;p&gt;
&lt;code&gt;sbatch&lt;/code&gt; is the entry point to the Slurm scheduler. It reads a
high-level Bash control script that specifies job parameters (number
of nodes needed, memory per process, expected run times and so on); that
script, in turn, starts the requested number of identical tasks via calls to &lt;code&gt;srun&lt;/code&gt;.
&lt;/p&gt;&lt;/div&gt;
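&lt;p&gt;
As a rough illustration of the kind of control script described above, here is a
minimal sketch. It is not the script used on this cluster: the job name, the
resource values and the &lt;code&gt;launch&lt;/code&gt; helper are hypothetical, chosen only to
show where the node count, per-process memory and expected run time are declared
and how &lt;code&gt;srun&lt;/code&gt; starts the identical tasks. The helper falls back to a
single local copy when &lt;code&gt;srun&lt;/code&gt; is not installed, so the script can be
tried off-cluster.
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;
```shell
#!/bin/bash
# Hypothetical job parameters; sbatch reads these "#SBATCH" comment lines.
# Job name shown in the queue:
#SBATCH --job-name=airs_demo
# Number of nodes needed:
#SBATCH --nodes=2
# Total number of identical tasks to start:
#SBATCH --ntasks=4
# Memory per process:
#SBATCH --mem-per-cpu=2G
# Expected run time (HH:MM:SS):
#SBATCH --time=00:30:00

# Hypothetical per-task command. Inside a job step, Slurm sets SLURM_PROCID
# (0 .. ntasks-1) and SLURM_NTASKS for every task; the defaults after ":-"
# are only used when the command runs outside Slurm.
TASK_CMD='echo "task ${SLURM_PROCID:-0} of ${SLURM_NTASKS:-1}"'

# Illustrative helper (an assumption, not part of Slurm): use srun when it
# is available, otherwise run one local copy as a dry run.
launch() {
    if command -v srun >/dev/null; then
        srun "$@"
    else
        "$@"
    fi
}

# Inside the allocation, srun starts one copy of the command per task.
launch bash -c "$TASK_CMD"
```
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;
Submitted with &lt;code&gt;sbatch&lt;/code&gt; followed by the script's filename, each of the
four tasks should report its own &lt;code&gt;SLURM_PROCID&lt;/code&gt;, which is the usual hook
for handing each copy a different slice of the data.
&lt;/p&gt;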
      
            &lt;div class="field field--name-node-link field--type-ds field--label-hidden field--item"&gt;  &lt;a href="https://www.linuxjournal.com/content/using-mysql-load-balancing-and-job-control-under-slurm" hreflang="und"&gt;Go to Full Article&lt;/a&gt;
&lt;/div&gt;
      
    &lt;/div&gt;
  &lt;/div&gt;

</description>
  <pubDate>Mon, 19 Oct 2015 18:39:35 +0000</pubDate>
    <dc:creator>Steven Buczkowski</dc:creator>
    <guid isPermaLink="false">1338863 at https://www.linuxjournal.com</guid>
    </item>

  </channel>
</rss>
