[gridengine users] Tight integration problem with mvapich2 2.0

Götz Waschk goetz.waschk at gmail.com
Tue Jan 6 13:04:10 UTC 2015


On Tue, Jan 6, 2015 at 12:38 PM, William Hay <w.hay at ucl.ac.uk> wrote:

> While I don't know of a fix, the first thing I would check is
> whether your job is tightly integrated (that is starting slave processes
> via grid engine).  To check this log into a node running slave processes
> and check whether they are descended from an sge_shepherd.
>
Dear William,

yes, it is tightly integrated. This was working fine before the upgrade
and is still working fine with Open MPI. The pstree output looks like this
on a slave node:
     |-sge_execd-+-load-sensor
     |           |-sge_shepherd-+-mycoshepherd
     |           |              |-qrsh_starter---hydra_pmi_proxy---8*[mpitests-IMB-MP---{mpitests-IMB-M}]
     |           |              `-6*[{sge_shepherd}]
     |           `-4*[{sge_execd}]

So the MPI processes are all descendants of the execd, started through
sge_shepherd and qrsh_starter.
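
The same ancestry check can be scripted instead of read off pstree by eye.
The following is only a rough sketch, assuming a Linux or other POSIX node
where "ps -e -o pid= -o ppid= -o comm=" is available. The function names
(parse_ps, has_ancestor, untethered, snapshot) are made up for this
illustration and are not part of Grid Engine or MVAPICH2.

```python
import subprocess


def parse_ps(text):
    """Parse 'ps -e -o pid= -o ppid= -o comm=' output into {pid: (ppid, name)}."""
    table = {}
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3:
            table[int(parts[0])] = (int(parts[1]), parts[2].strip())
    return table


def has_ancestor(table, pid, ancestor="sge_shepherd"):
    """Return True if some ancestor of pid has a name starting with ancestor."""
    seen = set()
    ppid = table.get(pid, (0, ""))[0]
    # Walk up the parent chain until it leaves the table or loops.
    while ppid in table and ppid not in seen:
        seen.add(ppid)
        parent_ppid, name = table[ppid]
        if name.startswith(ancestor):
            return True
        ppid = parent_ppid
    return False


def untethered(table, target, ancestor="sge_shepherd"):
    """List PIDs named like target that have no ancestor process above them."""
    return [pid for pid, (_, name) in sorted(table.items())
            if name.startswith(target) and not has_ancestor(table, pid, ancestor)]


def snapshot():
    """Take a process table snapshot of the local node (assumes POSIX ps)."""
    out = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "ppid=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ps(out)
```

Run on a slave node, untethered(snapshot(), "mpitests-IMB") should return an
empty list when every MPI rank sits below an sge_shepherd, as in the tree
above. Matching uses a name prefix because the kernel truncates command names
(hence "mpitests-IMB-MP" in the pstree output).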

Regards, Götz