[gridengine users] Control tmpdir usage on SGE

Mark Dixon m.c.dixon at leeds.ac.uk
Thu Oct 6 11:47:49 UTC 2016

On Wed, 5 Oct 2016, William Hay wrote:
> Our prolog and epilog (parallel) ssh into the slave nodes and do the 
> equivalent of run-parts on directories full of scripts some of which 
> check if they are running on the head node of the job before doing 
> anything. If we did want the epilog to save TMPDIRS from slave nodes 
> we'd just have to decide how to name them I guess.

Presumably this would work for you capture-wise because you're creating 
your own TMPDIRs rather than using the ones provided by the execd. (As 
Reuti pointed out, the execd TMPDIRs on slave nodes are ephemeral.)

It'd be a pity to switch to doing it that way: the execd TMPDIR can be 
paired with an xfs project quota scheme, which is nice and tidy. I imagine 
that deleting TMPDIRs via an epilog has a greater number of failure modes, 
and some of them, like intermittent network problems, can't be avoided 
just by purging old directories at boot. How has that worked for you in 
practice?
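
For illustration, the boot-time purge mentioned above might look something 
like the sketch below. It is only a rough outline under stated assumptions: 
the function names and the scratch root passed in are hypothetical, the 
boot-time helper is Linux-only (it reads /proc/uptime), and it treats any 
directory last modified before boot as stale, which only holds if it runs 
before the execd starts new jobs.

```python
import shutil
import time
from pathlib import Path


def boot_time():
    """Approximate boot time in epoch seconds (Linux-only: reads /proc/uptime)."""
    with open("/proc/uptime") as f:
        uptime_seconds = float(f.read().split()[0])
    return time.time() - uptime_seconds


def purge_stale_tmpdirs(root, cutoff):
    """Remove subdirectories of `root` last modified before `cutoff`.

    `root` is a hypothetical scratch directory holding per-job TMPDIRs;
    `cutoff` is in epoch seconds. Returns the sorted names that were removed.
    """
    removed = []
    for entry in Path(root).iterdir():
        # Leave symlinks and plain files alone; only per-job directories go.
        if entry.is_symlink() or not entry.is_dir():
            continue
        if entry.stat().st_mtime < cutoff:
            shutil.rmtree(entry, ignore_errors=True)
            removed.append(entry.name)
    return sorted(removed)
```

At boot this would be called as purge_stale_tmpdirs(scratch_root, 
boot_time()). It does nothing for directories left behind by a failed 
epilog while the node stays up, which is the gap described above.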

Also, it has been useful to avoid passwordless ssh between compute nodes. 
It's not only the principle of least privilege: it also helps identify 
applications that aren't tightly integrated.

Maybe our users can live with just the master node's TMPDIR.



