[gridengine dev] [DRAFT PATCH] Enhancement: exempt certain programs from execd control

Mark Dixon m.c.dixon at leeds.ac.uk
Fri Nov 11 10:29:23 UTC 2011


On Fri, 11 Nov 2011, William Hay wrote:
...
> While I'm normally in favour of fully general facilities I wonder if
> this is a little too general.  If the purpose of the patch is to
> exclude SGE infrastructure from being counted as part of the job then
> could the same effect not be achieved by having qrsh drop the group
> assigned to the job by the execd.  Obvious downsides to my approach:
>
> i) I'm only guessing it will work as I've neither read the relevant
> code nor tested it.
> ii) As an obvious cause/consequence of the above my idea is pure vaporware.
> iii) Requires that qrsh start with some privileges, check the sgeconfig
> to get the gid_range, drop groups and then drop privileges.  Errors in
> said code could represent a security problem.

Hi William, thanks for the interest :)

It may well be too general a feature. I admit I cannot think of another
good example of where it would be useful (perhaps for starting an
ancillary process which counts heavily against h_vmem, e.g. due to mmap,
but uses very little real memory?). Even if this doesn't go anywhere, it
has been a useful exercise in becoming familiar with the GE code base!

Also, dropping the appropriate group with a setgroups call does work - 
I've got a proof of concept SUID wrapper to qrsh that does this.
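
For illustration only, the call sequence such a wrapper needs can be
sketched as below. This is not the proof of concept itself: the function
names and the gid_min/gid_max arguments are hypothetical stand-ins for
the cluster's gid_range, and reading that range from the SGE
configuration is left out. A real SUID wrapper would normally be a small
compiled program, as setuid scripts are generally not honoured.

```python
import os


def strip_job_gids(groups, gid_min, gid_max):
    """Return groups with every GID inside [gid_min, gid_max] removed."""
    return [g for g in groups if not (gid_min <= g <= gid_max)]


def drop_job_group_and_exec(argv, gid_min, gid_max):
    """Drop the job's additional group, give up root, then exec argv.

    Assumes the process starts with root privileges (e.g. SUID root),
    because setgroups() is a privileged call.
    """
    # Remove any supplementary group that falls in the (assumed) gid_range.
    os.setgroups(strip_job_gids(os.getgroups(), gid_min, gid_max))
    # Permanently drop privileges: group first, then user.
    os.setgid(os.getgid())
    os.setuid(os.getuid())
    # Replace this process with the real client, e.g. ["qrsh", ...].
    os.execvp(argv[0], argv)
```

The order matters: setgroups() has to run while the process is still
privileged, and the privileges have to be gone before the exec.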

However, I really don't like the idea of using SUID root binaries where
they're not necessary: it's a pain to make sure they do the right
thing, a pain to install them correctly, and they make people suspicious.

I may well be wrong, but SUID root here looks like overkill to me.

> If you don't trust your users then pointing SGE_ROOT and SGE_CELL at a
> private SGE config could allow arbitrary programs to escape SGE's
> clutches.
...

Are you talking about the SUID solution here, or the suggested 
"exempt" feature?

If SUID: probably ok - we're just talking about the qrsh/qsh/qlogin client
dropping the group here. The client will escape resource limit counting,
but the daemon side won't. The qrsh/qsh/qlogin processes might escape being
killed, but only if you've enabled (and are relying on) the
ENABLE_ADDGRP_KILL feature and the daemon side doesn't terminate cleanly.

If the feature: the code that "exempts" the listed programs from resource
counting runs within the root-owned execd, which gets the list from the
root-owned qmaster, so I'm not sure how a user could arrange for this to
happen. Job processes would be killed as normal (my subject heading was
inaccurate).
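
As a rough illustration of the idea only (this is not the patch's code;
the function name and the process record layout are made up), exempting
programs from resource counting amounts to something like:

```python
def job_vmem_usage(processes, exempt_programs):
    """Sum vmem over a job's processes, skipping exempt program names.

    processes: list of dicts with hypothetical "name" and "vmem" keys.
    exempt_programs: program names configured centrally, not by the user.
    """
    exempt = set(exempt_programs)
    return sum(p["vmem"] for p in processes if p["name"] not in exempt)
```

The exempt processes are only left out of the sum; they still belong to
the job and so are still killed with it.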

Cheers,

Mark
-- 
-----------------------------------------------------------------
Mark Dixon                       Email    : m.c.dixon at leeds.ac.uk
HPC/Grid Systems Support         Tel (int): 35429
Information Systems Services     Tel (ext): +44(0)113 343 5429
University of Leeds, LS2 9JT, UK
-----------------------------------------------------------------

