[gridengine dev] [DRAFT PATCH] Enhancement: exempt certain programs from execd control
Mark Dixon
m.c.dixon at leeds.ac.uk
Fri Nov 11 10:29:23 UTC 2011
On Fri, 11 Nov 2011, William Hay wrote:
...
> While I'm normally in favour of fully general facilities I wonder if
> this is a little too general. If the purpose of the patch is to
> exclude SGE infrastructure from being counted as part of the job then
> could the same effect not be achieved by having qrsh drop the group
> assigned to the job by the execd. Obvious downsides to my approach:
>
> i)I'm only guessing it will work as I've neither read the relevant
> code nor tested it.
> ii)As an obvious cause/consequence of the above my idea is pure vaporware.
> iii)Requires that qrsh start with some privileges, check the sgeconfig
> to get the gid_range, drop groups and then drop privileges. Errors in
> said code could represent a security problem.
Hi William, thanks for the interest :)
It may well be too general a feature. I admit I cannot think of another
good example of where it would come in useful (perhaps for starting an
ancillary process which uses lots of h_vmem e.g. due to mmap, but very
little real memory?). Even if this doesn't go anywhere, it has been a
useful exercise in becoming familiar with the GE code base!
Also, dropping the appropriate group with a setgroups call does work -
I've got a proof of concept SUID wrapper to qrsh that does this.
However, I really don't like the idea of using SUID root binaries where
they're not necessary: they're a pain to make sure they do the right
thing, a pain to install correctly, and they make people suspicious.
I may well be wrong, but SUID root here looks like overkill to me.
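
For readers following along, the drop-the-group approach discussed above can be sketched roughly as follows. This is not the proof-of-concept wrapper mentioned in this message, which is not shown here. It is a minimal illustration, assuming the gid_range value has already been read from the SGE configuration as a "low-high" string. The function names are hypothetical, and a real wrapper would be a compiled SUID binary rather than a Python script.

```python
import os


def parse_gid_range(text):
    """Parse a 'low-high' string (assumed format) into an inclusive (low, high) pair."""
    low, high = (int(part) for part in text.strip().split("-", 1))
    if low > high:
        raise ValueError(f"empty gid range: {text!r}")
    return low, high


def groups_without_job_gid(groups, gid_range):
    """Return the supplementary groups with any gid inside gid_range removed."""
    low, high = gid_range
    return [gid for gid in groups if not (low <= gid <= high)]


def drop_job_group_and_privileges(gid_range):
    """Drop the job's additional group, then give up root for good.

    Must start with root privileges (for example as an SUID root wrapper).
    Order matters: setgroups() needs privilege, so it has to run before
    setgid()/setuid() discard that privilege.
    """
    kept = groups_without_job_gid(os.getgroups(), gid_range)
    os.setgroups(kept)       # fails with PermissionError unless privileged
    os.setgid(os.getgid())   # reset effective/saved gid to the real gid
    os.setuid(os.getuid())   # reset effective/saved uid to the real uid
```

The filtering step is kept separate from the privileged calls so it can be checked without root. For example, groups_without_job_gid([100, 20005, 500], parse_gid_range("20000-20100")) returns [100, 500].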
> If you don't trust your users then pointing SGE_ROOT and SGE_CELL at a
> private SGE config could allow arbitrary programs to escape SGE's
> clutches.
...
Are you talking about the SUID solution here, or the suggested
"exempt" feature?
If SUID: probably ok - we're only talking about the qrsh/qsh/qlogin client
dropping the group here. The client will escape resource limit accounting,
but the daemon side won't. The qrsh/qsh/qlogin processes might also escape
being killed, but only if you've enabled (and are relying on) the
ENABLE_ADDGRP_KILL feature and the daemon side doesn't terminate cleanly.
If the feature: the bit that "exempts" the list of programs from resource
counting runs within the root-owned execd, and the execd gets that list from
the root-owned qmaster, so I'm not sure how a user would arrange for this to
happen. Job processes would still be killed as normal (my subject heading
was inaccurate).
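
To illustrate the distinction between counting and killing, here is a rough sketch. It is not the draft patch itself, and every name in it is hypothetical. It assumes that a job's processes are identified by the additional group id the execd assigns, and that the exempt list is a set of program names supplied by the administrator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    pid: int
    name: str          # program name, e.g. "qrsh"
    groups: frozenset  # supplementary gids of the process
    vmem: int          # virtual memory in bytes


def job_vmem(procs, job_gid, exempt_names):
    """Sum vmem over the job's processes, skipping exempt programs."""
    return sum(
        p.vmem
        for p in procs
        if job_gid in p.groups and p.name not in exempt_names
    )


def job_pids_to_kill(procs, job_gid):
    """Every process carrying the job's gid is still a kill target."""
    return [p.pid for p in procs if job_gid in p.groups]


procs = [
    Proc(101, "solver", frozenset({100, 20005}), 4_000_000),
    Proc(102, "qrsh", frozenset({100, 20005}), 9_000_000),
    Proc(103, "bash", frozenset({100}), 1_000_000),  # not part of the job
]
```

With this data, job_vmem(procs, 20005, {"qrsh"}) counts only the solver process, while job_pids_to_kill(procs, 20005) still includes both 101 and 102.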
Cheers,
Mark
--
-----------------------------------------------------------------
Mark Dixon                       Email    : m.c.dixon at leeds.ac.uk
HPC/Grid Systems Support         Tel (int): 35429
Information Systems Services     Tel (ext): +44(0)113 343 5429
University of Leeds, LS2 9JT, UK
-----------------------------------------------------------------