[gridengine users] problem with PE / complex setup - not sure, this might even be a bug?

Reuti reuti at staff.uni-marburg.de
Tue Jul 16 18:51:27 UTC 2013


Hi,

On 16.07.2013 at 19:10, Tina Friedrich wrote:

> I have just noticed some (to my mind) weird behaviour.
> 
> I've recently upgraded from SGE 6.2u5 to SGE 8.1.3 (using the SGE installer's upgrade functionality to import my configuration etc).
> 
> We have a consumable complex - called gpus - set up on a couple of nodes that have (as the name suggests) GPUs attached to them.

consumable YES or consumable JOB?
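
The distinction matters because of how the request is debited: with consumable YES the requested amount is multiplied by the number of slots the job is granted, whereas with consumable JOB it is debited once per job regardless of the slot count. If gpus and matlab are defined as YES, that would match the numbers you report below. A minimal Python sketch of that arithmetic (the function names are made up for illustration and are not part of SGE):

```python
def debited(request: int, slots: int, consumable: str) -> int:
    """Amount charged against a host for one job (illustrative model only)."""
    if consumable == "YES":
        # per-slot consumable: the request is multiplied by the granted slots
        return request * slots
    if consumable == "JOB":
        # per-job consumable: the request is charged once, whatever the slot count
        return request
    raise ValueError(f"unexpected consumable setting: {consumable!r}")


def remaining(capacity: int, request: int, slots: int, consumable: str) -> int:
    """Capacity left on the host after the job has been charged."""
    return capacity - debited(request, slots, consumable)


# '-l gpus=1 -pe smp 2' on a node with 2 GPUs
print(remaining(2, 1, 2, "YES"))  # per-slot accounting
print(remaining(2, 1, 2, "JOB"))  # per-job accounting

# '-l matlab -pe smp 8' on a node with 1 matlab resource
print(remaining(1, 1, 8, "YES"))
```

The current setting is shown in the consumable column of `qconf -sc`.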

-- Reuti


> That works fine so far - qstat -F gpus=* shows them; I schedule a job with 'qsub -l gpus=1' and one gets used (according to qstat), ... all well.
> 
> We also have a (pretty standard I think?) smp environment - this, also, works (i.e. 'qsub -pe smp 4' gives me 4 slots).
> 
> Combining them fails, we just noticed. Well, mostly fails. Fails in most cases anyway.
> 
> What happens is, when I schedule a job requesting
> 
> 'qsub -l gpus=1 -pe smp X'
> 
> with X greater than 2 (i.e. asking for more than 2 slots) it never gets run.
> 
> With X<=2 (i.e. asking for two slots or less) it runs.
> 
> qstat -F gpus=*,slots=* explains why, I think. The nodes have 2 GPUs available (qstat -F confirms that). Now, when I submit asking for 1 GPU and 2 slots (i.e. '-l gpus=1 -pe smp 2'), I'd expect there to be 1 GPU resource and 2 slots used. However, qstat shows 2 GPU resources used (i.e. none left) and two slots used, as expected.
> 
> I just tried with another consumable (called matlab) I've introduced; the same sort of thing happens. The nodes have 1 available; scheduling as '-l matlab -pe smp 8' leaves the node showing "-7" matlab resources (I'd expect 0 left).
> 
> So somehow, it seems that the number of slots asked for (via the PE) also gets used as the number of GPUs requested?
> 
> Is that me doing something stupid in my definitions, or a bug? Pretty sure this worked "as expected" in SGE6.2u5.
> 
> Tina
> 
> -- 
> Tina Friedrich, Computer Systems Administrator, Diamond Light Source Ltd
> Diamond House, Harwell Science and Innovation Campus - 01235 77 8442