[gridengine users] control_slaves on PE

Roberto Nunnari roberto.nunnari at supsi.ch
Wed Jan 14 10:50:37 UTC 2015

Il 14.01.2015 11:05, Reuti ha scritto:
> Hi,
> Am 14.01.2015 um 10:09 schrieb Roberto Nunnari:
>> Hi.
>> man sge_pe states:
>> control_slaves
>>   This parameter can be set to TRUE or FALSE (the default). It indicates whether Oracle Grid Engine is the creator of the slave tasks of a parallel application via sge_execd(8) and sge_shepherd(8) and thus has full control over all processes in a parallel application, which enables capabilities such as resource limitation and correct accounting. However, to gain control over the slave tasks of a parallel application, a sophisticated PE interface is required, which works closely together with Oracle Grid Engine facilities. Such PE interfaces are available through your local Oracle Grid Engine support office.
>> Does that mean that you need to buy some software from Oracle in order to take advantage of 'control_slaves TRUE' ?
> No.
> It mainly refers to the fact that it depends on the parallel application whether any preparation might be necessary by supplying scripts for start/stop_proc_args and set up or tuning the started application not to do nasty things like jumping out of the process tree.
> Technically its value must be set to TRUE to allow that a started job script is allowed to perform `qrsh --inherit ...` to reach other nodes without any `rsh`/`ssh` at all (in my clusters `ssh` is available for admin staff only).

Interesting.. I once tried to do the same, but a program stopped working.. 
so I implemented a (half) solution where ssh is restricted to admins on 
the master node and open to all users on the execution nodes.
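
For reference, this is roughly what one of my tight-integration PEs looks 
like (PE name and slot count here are illustrative, not the real ones):

```
# qconf -sp mpi_tight
pe_name            mpi_tight
slots              128
user_lists         NONE
xuser_lists        NONE
start_proc_args    NONE
stop_proc_args     NONE
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  FALSE
urgency_slots      min
```

With control_slaves TRUE and no start/stop_proc_args, the slave tasks are 
started via `qrsh -inherit` under sge_shepherd, so accounting and resource 
limits apply to them.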

> While these scripts were mandatory for many parallel applications in the past, MPICH and Open MPI (./configure --with-sge for the latter) in the actual versions support SGE out of the box.
> For Open MPI you can look for the value:
> $ ompi_info | grep grid
>                   MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.6.5)

Yes. It's like that, thank you. :-)

> whether it's set up in your version. Care must be taken with Open MPI 1.8 and newer, as by default they apply a core binding independent of SGE's and always start at socket/core 0/0, i.e. if more than one Open MPI job is running on a node it's necessary either to switch off Open MPI's core binding (and/or use SGE's) or to reformat the core list granted by SGE so that it can be used by Open MPI.

humm.. I see that on CentOS 6.6 they introduced openmpi 1.8.1..
# ompi_info | grep grid
    MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.8.1)

while on CentOS 6.4:
# ompi_info | grep grid
    MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.5.4)

..so does that mean that even though it's version 1.8.1, it doesn't use 
the default core binding that breaks SGE? Let me rephrase my question: if 
I upgrade my execution nodes from CentOS 6.4 (with openmpi 1.5.4) to 
CentOS 6.6 (with openmpi 1.8.1), will SGE PE jobs continue to work, or 
will they need some tweaks?

You mention 'switching off Open MPI's core binding and/or using SGE's 
one'.. how is that done? At build time or at run time? What's the 
command-line switch?
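
From the Open MPI 1.8 mpirun(1) man page it looks like binding can be 
switched off at run time; something like this, I suppose (untested here, 
and ./my_mpi_prog is just a placeholder):

```shell
# Disable Open MPI's own core binding and leave placement to SGE:
mpirun --bind-to none -np $NSLOTS ./my_mpi_prog

# Or the same via an MCA parameter exported in the job script:
export OMPI_MCA_hwloc_base_binding_policy=none
mpirun -np $NSLOTS ./my_mpi_prog
```

Is that the right knob, or is something needed at build time as well?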

Thank you and best regards.

> -- Reuti
>> In my production environment, I have four PEs: two are set 'control_slaves FALSE' and two 'control_slaves TRUE'.. and as far as I know, all of them behave as expected.. it has been like that for about 9 years, since I inherited the SGE cluster..
>> Can anybody cast some light on it, please?
>> my present environment:
>> - OGE 6.2u7
>> - on the execution nodes: openmpi 1.5.4
>> - on the master node: openmpi 1.4
>> Thank you and best regards.
>> Robi
>> _______________________________________________
>> users mailing list
>> users at gridengine.org
>> https://gridengine.org/mailman/listinfo/users
