[gridengine users] Control tmpdir usage on SGE
m.c.dixon at leeds.ac.uk
Thu Oct 6 11:47:49 UTC 2016
On Wed, 5 Oct 2016, William Hay wrote:
> Our prolog and epilog (parallel) ssh into the slave nodes and do the
> equivalent of run-parts on directories full of scripts some of which
> check if they are running on the head node of the job before doing
> anything. If we did want the epilog to save TMPDIRS from slave nodes
> we'd just have to decide how to name them I guess.
Presumably this would work for you capture-wise because you're creating
your own TMPDIRs rather than using the ones provided by the execd. (As
Reuti pointed out, the execd TMPDIRs on slave nodes are ephemeral.)
It'd be a pity to switch to doing it that way: the execd TMPDIR can be
paired with an xfs project quota scheme, which is nice and tidy. I imagine
that deleting TMPDIRs via an epilog has a greater number of failure modes,
and not all of them (intermittent network problems, for example) can be
avoided by purging old directories at boot. How has that worked for you
in practice?
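For concreteness, the boot-time purge could look roughly like the sketch below. It is only an illustration: the function name, the scratch root and the age threshold are hypothetical, and it assumes each job's TMPDIR is a directory directly under a single local scratch root.

```python
import os
import shutil
import time


def purge_stale_tmpdirs(scratch_root, max_age_seconds, now=None):
    """Remove directories directly under scratch_root whose mtime is
    older than max_age_seconds. Returns the sorted list of removed paths.

    scratch_root and max_age_seconds are illustrative parameters, not
    anything Grid Engine defines.
    """
    now = time.time() if now is None else now
    removed = []
    for entry in os.scandir(scratch_root):
        # Only consider real directories; skip plain files and symlinks.
        if not entry.is_dir(follow_symlinks=False):
            continue
        age = now - entry.stat(follow_symlinks=False).st_mtime
        if age > max_age_seconds:
            # Best effort: ignore errors so one bad directory does not
            # stop the rest of the sweep.
            shutil.rmtree(entry.path, ignore_errors=True)
            removed.append(entry.path)
    return sorted(removed)
```

At boot no jobs are running, so a threshold of zero would arguably do. Note that this only catches what is left behind on a node that reboots; it does nothing for an epilog that failed to reach a node that stayed up.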
Also, we've found it useful to avoid passwordless ssh between compute
nodes. It's not only the principle of least privilege: it also helps
identify applications that aren't tightly integrated.
Maybe our users can live with just the master node's TMPDIR.