[gridengine users] display GPU with qstat

Reuti reuti at staff.uni-marburg.de
Fri Dec 1 10:41:11 UTC 2017


Hi,

> On 01.12.2017 at 08:51, Ariel Vives <ariel.vives at telecom-paristech.fr> wrote:
> 
> 
> 
> ----- Original message -----
>> From: "Reuti" <reuti at staff.Uni-Marburg.DE>
>> To: "Ariel Vives" <ariel.vives at telecom-paristech.fr>
>> Cc: "users" <users at gridengine.org>
>> Sent: Thursday, 30 November 2017 17:21:24
>> Subject: Re: [gridengine users] display GPU with qstat
> 
>> Hi:
>> 
>>> On 30.11.2017 at 16:16, Ariel Vives <ariel.vives at telecom-paristech.fr> wrote:
>>> 
>>> Hi all,
>>> 
>>> I'm running Debian 9 (Stretch).
>>> 
>>> Gridengine is 8.1.9.
>>> 
>>> Everything works fine.
>>> 
>>> I've got 1 or 2 GPUs on each node.
>>> 
>>> Users can use them when submitting jobs.
>>> 
>>> BUT :-)
>>> 
>>> I can't display GPU usage with qstat on the portal.
>>> 
>>> The GPUs don't appear.
>>> 
>>> Is there a way to do that?
>>> 
>>> With qconf, I've added a gpu complex:
>>> gpu   gpu   INT   <=    YES    YES        0        0
>> 
>> The -F option lists resources (all of them by default, or only the ones
>> you name):
>> 
>> qstat -F gpu
>> qhost -F gpu
>> 
> 
> Thanks !
> 
> And is it possible to see the load of the GPUs?

Sure, but you will have to set up a load_sensor specific to your GPUs. And it would only give feedback showing that a particular GPU is fully loaded.
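
For illustration, a load sensor along these lines could report how many GPUs on a host are busy. This is only a sketch under assumptions that are not stated in this thread: NVIDIA cards with nvidia-smi installed, a utilization threshold of 10 percent, and a complex named gpu_busy (a made-up name, which would have to be added with qconf -mc first). The begin/end report and the handling of "quit" follow the load sensor protocol of sge_execd.

```shell
#!/bin/sh
# Sketch of an SGE load sensor that reports the number of busy GPUs.
# Assumptions: NVIDIA GPUs, nvidia-smi in PATH, and a hypothetical
# complex called "gpu_busy" defined beforehand with qconf -mc.

# Count the GPUs whose utilization (one percentage per line on stdin)
# is above the threshold passed as $1.
count_busy() {
    awk -v limit="$1" '$1 + 0 > limit { n++ } END { print n + 0 }'
}

# Ask the driver for the utilization of each GPU, one number per line.
gpu_utilization() {
    nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits
}

# Print one load report in the format sge_execd expects:
# begin / host:complex:value / end
report() {
    echo "begin"
    echo "$1:gpu_busy:$2"
    echo "end"
}

# sge_execd writes a line to stdin at every load interval and the
# word "quit" when the sensor should terminate.
sensor_loop() {
    host=$(uname -n)
    while read -r line; do
        [ "$line" = "quit" ] && return 0
        report "$host" "$(gpu_utilization </dev/null | count_busy 10)"
    done
}

# Append a call to sensor_loop as the last line when installing this
# file as the load sensor itself.
```

To try it, end the file with a call to sensor_loop, make it executable and set its path as load_sensor in the execd configuration (qconf -mconf <hostname>). After a few load intervals the value should be visible with qhost -F gpu_busy.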


> It's what my researchers want, so that they can launch jobs on nodes where the GPUs are not in use.

Why do they want to select a node by hand? With the gpu complex you have set up, SGE is already aware of the resource and selects an exechost where it is free. Checking the actual load (as acquired by a load sensor) by visual inspection might give the impression that a node is free when the job on it is merely in a serial phase and GPU usage may start soon. The node would then become overloaded.

Having set up the consumable complex should allow SGE to schedule the jobs without manual interference.
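
As a sketch of what this looks like from the user's side, a job script can request the consumable through an embedded -l directive. The job name and the payload function are placeholders; only the gpu complex comes from your setup.

```shell
#!/bin/sh
# Embedded qsub options: job name (placeholder), one unit of the
# consumable complex "gpu", and start in the submission directory.
#$ -N gpu_job
#$ -l gpu=1
#$ -cwd

# Placeholder payload: report which job landed on which host.
# JOB_ID is set by SGE inside a running job; outside of one it is empty.
describe_job() {
    echo "job ${JOB_ID:-unknown} on $(uname -n)"
}

describe_job
```

Submitted with qsub, such a job is only dispatched to a host that still has a gpu unit left in its complex_values, and that unit is handed back when the job ends.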

A user requesting a GPU and then not using it can't be corrected by a load sensor, only by human negotiation. It's the same as someone requesting a PE with 8 slots and starting only a serial computation.

-- Reuti


> I've read many things about load_sensor, but I'm not sure it serves the same purpose.
> 
> 
>> -- Reuti
>> 
>>> 
>>> For each node I've added:
>>> complex_values        gpu=2 (if 2 gpus)
>>> or
>>> complex_values        gpu=1 (if 1 gpu)
>>> 
>>> If you need any information on my config to help...
>>> 
>>> Thanks
>>> 
>>> Ariel
>>> _______________________________________________
>>> users mailing list
>>> users at gridengine.org
>>> https://gridengine.org/mailman/listinfo/users
> 
> -- 
> Ariel Vives
> Division des Systèmes d'Information - Télécom ParisTech
> mail : ariel.vives at telecom-paristech.fr
> tél. : 01.45.81.71.86
> 




