[gridengine users] resource types -- changing BOOL to INT but keeping qsub unchanged

William Hay w.hay at ucl.ac.uk
Tue Jan 2 09:11:57 UTC 2018


On Fri, Dec 22, 2017 at 05:55:26PM -0500, bergman at merctech.com wrote:
> True, but even with that info, there doesn't seem to be any universal
> way to tell an arbitrary GPU job which GPU to use -- they all default
> to device 0.

With Nvidia GPUs we use a prolog script that manipulates lock files
to select a GPU, then chgrps the selected /dev/nvidia? file so that its
group is the group associated with the job. An epilog script undoes all
of this. The permissions on the /dev/nvidia? files are set so that they
are inaccessible to anyone other than the owner (root) and the group.
However, you have to pass a magic option to the kernel to prevent the
permissions from being reset whenever anyone tries to access the device.

This seems to be a fairly bulletproof way of restricting jobs to
their assigned GPU.
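
For anyone wanting to try something similar, here is a rough Python
sketch of the prolog/epilog idea described above. It is not our actual
script. The function names, the lock file naming (gpuN.lock), the 0660
mode and the default group of 0 are all assumptions made for
illustration, and how the job's group id reaches the prolog is left to
the caller. It also does not cover the kernel option mentioned above.

```python
import os


def claim_gpu(lock_dir, dev_dir, n_gpus, job_gid):
    """Prolog side: lock a free GPU and give its device file to the job's group.

    Returns the GPU index, or None if every GPU is already locked.
    All names and paths here are hypothetical.
    """
    for idx in range(n_gpus):
        lock_path = os.path.join(lock_dir, f"gpu{idx}.lock")
        try:
            # O_CREAT | O_EXCL is atomic, so only one prolog can win the lock.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        except FileExistsError:
            continue  # this GPU is already claimed by another job
        with os.fdopen(fd, "w") as lock:
            lock.write(str(job_gid))
        dev = os.path.join(dev_dir, f"nvidia{idx}")
        try:
            os.chown(dev, -1, job_gid)  # chgrp: keep the owner, change the group
            os.chmod(dev, 0o660)        # owner and group only (assumed mode)
        except OSError:
            os.remove(lock_path)        # do not leave a stale lock behind
            raise
        return idx
    return None


def release_gpu(lock_dir, dev_dir, idx, default_gid=0):
    """Epilog side: return the device to a default group, then drop the lock."""
    dev = os.path.join(dev_dir, f"nvidia{idx}")
    os.chown(dev, -1, default_gid)  # default_gid=0 (root's group) is an assumption
    os.chmod(dev, 0o660)
    os.remove(os.path.join(lock_dir, f"gpu{idx}.lock"))
```

A prolog would call claim_gpu() with dev_dir set to /dev and the job's
group id, and the epilog would call release_gpu() with the index that
was claimed. Both need to run with enough privilege to chgrp the
device files.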


William