[gridengine users] how create and monitor a new consumable?

William Hay w.hay at ucl.ac.uk
Fri Apr 10 10:36:55 UTC 2015


On Fri, 10 Apr 2015 04:05:11 +0000
Marlies Hankel <m.hankel at uq.edu.au> wrote:

> Dear all,
> 
> I am using OGS/Grid Engine 2011.11 as installed under ROCKS 6.1.1.
> 
> We have a global storage which hosts all home directories and which
> also has a large space which I would like to use as scratch space for
> jobs. As the storage is faster and so much bigger than the small
> local disk I would like $TMPDIR to default to /scratch. It will also
> help with some of our MPI applications that need global scratch space.
> 
> I know I can set the tmp directory to /scratch but I would also like
> to have the space as a consumable complex that can be requested by
> the users if scratch space is needed. I also would like jobs to be
> killed if they go over the requested amount.
> 
> I can set up a complex named scratch and make this consumable but how
> do I make sure that jobs do not go over the requested amount? Or is
> there already a complex I could use that would do this?

I suspect this will depend quite a lot on which cluster filesystem you
are using, if any, and also on whether your users are actively trying to
get around the limit.  The Unix filesystem model doesn't easily support
per-directory quotas, because a file can have hard links in more than
one directory, so there is no single directory to charge its space to.
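
To illustrate the hard-link point, here is a minimal Python sketch.  The
directory names job_a and job_b and the function name hard_link_demo are
made up for the example; it only assumes a temporary directory on a
filesystem that supports hard links.

```python
import os
import tempfile


def hard_link_demo():
    """Create one file that is visible from two directories via a hard link.

    Returns (same_inode, link_count) for the two names.
    """
    with tempfile.TemporaryDirectory() as root:
        # Two hypothetical per-job scratch directories.
        dir_a = os.path.join(root, "job_a")
        dir_b = os.path.join(root, "job_b")
        os.mkdir(dir_a)
        os.mkdir(dir_b)

        # Write 4 KiB of data under the first directory.
        original = os.path.join(dir_a, "data.bin")
        with open(original, "wb") as fh:
            fh.write(b"x" * 4096)

        # Give the same inode a second name under the other directory.
        alias = os.path.join(dir_b, "data.bin")
        os.link(original, alias)

        st_a = os.stat(original)
        st_b = os.stat(alias)
        return st_a.st_ino == st_b.st_ino, st_a.st_nlink


if __name__ == "__main__":
    same_inode, link_count = hard_link_demo()
    print("same inode:", same_inode, "link count:", link_count)
```

Both names refer to the same blocks on disk, so a quota keyed on the
directory has no obvious owner for them, whereas a quota keyed on the
file's user or group does.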

One possibility: set a group quota for the additional group that SGE
assigns to each job.  You would need to ensure that different hosts
normally use different group ranges (gid_range), but fiddle around in
the per-host spool to ensure that the group assigned on the master host
is also used for any slave tasks.  This wouldn't automatically kill the
job, but the job would probably die of its own accord once it started
getting errors on file I/O.  You could add a monitoring script that
kills jobs that have hit their quota.
WARNING: the above idea is untested and random stuff may break.
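
A rough sketch of what such a monitoring script might look like in
Python, with the same caveat that it is untested against a real
cluster.  It assumes you already have, from some site-specific source,
the per-group usage, the per-group limit, and a mapping from each job's
additional group ID to its job ID; how you collect those depends on
your filesystem and quota tools.  The function names jobs_over_quota
and kill_jobs are hypothetical; the only Grid Engine command used is
qdel.

```python
import subprocess


def jobs_over_quota(usage_kb, limit_kb, job_for_gid):
    """Return the sorted job IDs whose additional group has hit its quota.

    usage_kb    -- dict: group ID -> space currently used (KiB)
    limit_kb    -- dict: group ID -> quota for that group (KiB)
    job_for_gid -- dict: group ID -> job ID that was assigned the group
    """
    over = []
    for gid, used in usage_kb.items():
        limit = limit_kb.get(gid)
        job_id = job_for_gid.get(gid)
        if limit is None or job_id is None:
            # No quota set, or the group does not belong to a running job.
            continue
        if used >= limit:
            over.append(job_id)
    return sorted(over)


def kill_jobs(job_ids, dry_run=True):
    """Delete the given jobs with qdel; only print the commands if dry_run."""
    for job_id in job_ids:
        cmd = ["qdel", str(job_id)]
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            # check=False: the job may already have exited on its own.
            subprocess.run(cmd, check=False)


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    usage = {20001: 500, 20002: 1200}
    limits = {20001: 1000, 20002: 1000}
    jobs = {20001: 101, 20002: 102}
    kill_jobs(jobs_over_quota(usage, limits, jobs), dry_run=True)
```

Run from cron or a load sensor with dry_run=True first, so you can see
which jobs it would delete before letting it call qdel for real.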

William
 