[gridengine users] suspend_threshold depending on job I/O

Dave Love d.love at liverpool.ac.uk
Wed Nov 14 22:20:37 UTC 2012


Txema Heredia Genestar <txema.heredia at upf.edu> writes:

> Hi all,
>
> we have a 300-core cluster with a ~150 TB shared directory (GPFS). Our
> users run genomic analyses that use huge files which usually cannot
> fit on the 500 GB internal HDDs of the nodes. As you can imagine,
> sometimes things get pretty intense and all the Nagios disk alarms
> start going off (the disk "works", but we get 10+ second timeouts).

Hmm...  GPFS was responsible for most of the trouble on the one cluster
I've had it on.  On the other hand, we've had Gaussian (ugh) jobs
reliably wedge our Solaris NFS server in a very odd way, but only those
jobs, and only as they start up.  Anyhow, it's definitely a significant
issue generally.

> Knowing that I cannot trust our users to request any "disk_intensive"
> parameter/flag, I was pondering setting a suspend_threshold in the
> queues, watching the shared disk status (e.g. timing an ls on the
> shared disk) and starting to suspend jobs when the disk has, say, a
> 3-second delay. This would be a nice fix for our issue, but it has
> some problems: when there are both "IO-intensive" and "normal" jobs
> and the suspend_threshold kicks in, SGE will start suspending jobs
> without any particular criterion (I don't know this part), and lots
> of innocent "normal" jobs will be suspended across all the nodes
> before the disk load stabilizes.
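[The load-sensor side of the idea above could be sketched roughly like
this -- a hypothetical script, not anything from the thread: the
"gpfs_delay" complex name and the /gpfs mount point are assumptions, and
the complex would first need defining with `qconf -mc`:]

```shell
#!/bin/sh
# Sketch of an SGE load sensor (names are assumptions, not from the
# thread): report how long an `ls` of the shared directory takes, as a
# load value "gpfs_delay" that a queue suspend_threshold could test.

measure_delay() {
    # time a directory listing of the given mount point, in whole seconds
    start=$(date +%s)
    ls "$1" > /dev/null 2>&1
    echo $(( $(date +%s) - start ))
}

SHARED=${SHARED:-/gpfs}   # assumed GPFS mount point
HOST=$(hostname)

# standard load-sensor protocol: emit begin/value/end records each time
# the execd polls, and stop when it sends "quit"
while read line; do
    [ "$line" = quit ] && break
    echo begin
    echo "$HOST:gpfs_delay:$(measure_delay "$SHARED")"
    echo end
done
```

[It would be registered through the load_sensor parameter in the
cluster/host configuration (`qconf -mconf`), with something like
`suspend_thresholds gpfs_delay=3` on the queue -- which still leaves the
problem described above of *which* jobs get suspended.]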

The right way to do this is presumably to restrict the I/O similarly to
other resources, but there's no built-in Grid Engine way to do that, and
it will be heavily OS-specific.  I was looking at traffic shaping under
GNU/Linux in our case, but the need went away, fortunately.  Some
versions of Linux cgroups (e.g. the blkio controller) might help with
this sort of thing, but again that's highly OS-specific and would need
non-trivial implementation effort.  I assume it would be possible to
restrict the I/O more generally by interposing on library/system calls
in the usual way, but that's probably tricky.
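[For the traffic-shaping route, since GPFS I/O from the nodes goes over
the network, a minimal sketch with tc might look like the following.
Everything concrete here is an assumption (interface eth0, a 500 Mbit/s
cap, GPFS servers on 10.0.0.0/24), it needs root, and it caps all nodes
uniformly rather than per-job:]

```shell
# Sketch: cap outbound traffic towards the assumed GPFS server subnet
# using an htb qdisc and a u32 filter; unmatched traffic goes to the
# default class 1:20 and is not restricted.
tc qdisc add dev eth0 root handle 1: htb default 20
tc class add dev eth0 parent 1: classid 1:10 htb rate 500mbit ceil 500mbit
tc class add dev eth0 parent 1: classid 1:20 htb rate 1000mbit
tc filter add dev eth0 parent 1: protocol ip u32 \
    match ip dst 10.0.0.0/24 flowid 1:10
```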

-- 
Community Grid Engine:  http://arc.liv.ac.uk/SGE/


