[gridengine users] Intermittent commlib errors with MPI jobs

Brendan Moloney moloney at ohsu.edu
Thu Nov 8 09:32:25 UTC 2012

>> Hello,
>> I have MPICH2 tightly
>Which version? It should work out-of-the-box with SGE.

The version is 1.4, and yes, it does have built-in SGE integration.

>> integrated with OGS 2011.11.  Everything is working great in general.  I have noticed when I submit a moderate number of small MPI jobs (e.g. 100 jobs each using two cores) that I will get intermittent commlib errors like:
>> commlib error: got select error (Broken pipe)
>> executing task of job 138060 failed: failed sending task to execd at node1.ohsu.edu: can't find connection
>This sounds like a network problem unrelated to SGE. Do you use a private network inside the cluster or can you outline the network configuration - do you have a dedicated switch for the cluster?

We use a dedicated switch. One node is elsewhere on the LAN, but I see this error come up between two nodes that are both on the dedicated switch. None of the nodes show packet errors.
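For reference, one generic way to check for interface errors on Linux (this is an illustrative check, not the exact command used above) is to read the per-interface counters from sysfs:

```shell
# Sum RX/TX error counters across all network interfaces via sysfs.
# Linux-specific; interface paths under /sys/class/net are standard,
# but the set of interfaces present is machine-dependent.
total=0
for f in /sys/class/net/*/statistics/rx_errors \
         /sys/class/net/*/statistics/tx_errors; do
    [ -r "$f" ] && total=$((total + $(cat "$f")))
done
echo "total interface errors: $total"
```

A nonzero total would point at a hardware or driver problem rather than SGE.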

>> Sometimes I get "Connection reset by peer"
>Which startup method for slave tasks do you use, i.e.:
>$ qconf -sconf
>qlogin_command               builtin
>qlogin_daemon                builtin
>rlogin_command               builtin
>rlogin_daemon                builtin
>rsh_command                  builtin
>rsh_daemon                   builtin
>Given the output you mentioned above, it sounds like an SSH problem, and your settings could be different.

I am indeed using SSH with a wrapper script for adding the group ID:

qlogin_command               /usr/global/bin/qlogin-wrapper
qlogin_daemon                /usr/global/bin/rshd-wrapper
rlogin_command               /usr/bin/ssh
rlogin_daemon                /usr/global/bin/rshd-wrapper
rsh_command                  /usr/bin/ssh
rsh_daemon                   /usr/global/bin/rshd-wrapper
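For context, a minimal rshd-wrapper along these lines might look like the sketch below. This is only an illustration of the idea, not the actual /usr/global/bin/rshd-wrapper (which isn't shown here); the sshd path and the dry-run echo are assumptions. The point of the wrapper is that SGE tags every process of a job with an extra group ID from gid_range so execd can track and kill remote tasks; a stock sshd drops that group, so tight SSH integration needs the wrapper (or a patched sshd / PAM setup) to re-establish it before the task runs.

```shell
#!/bin/sh
# Hypothetical rshd-wrapper sketch: start sshd in inetd mode on the
# socket inherited from execd. Group-ID handling is site-specific and
# omitted here; see the SGE tight-SSH-integration howto.

SSHD=/usr/sbin/sshd       # assumed sshd location
CMD="$SSHD -i"            # -i: inetd mode, talk over the inherited fd

echo "would exec: $CMD"   # shown for illustration; a real wrapper does:
# exec $CMD
```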

>> instead of "Broken pipe". I have the allocation rule set to round robin, so the second process is always spawned on a remote host.
>For small jobs I would configure it to run on only one machine - unless they create large scratch files.

Yes, but I would like to have a single MPI parallel environment, and in general round-robin is the best option for my setup.
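For anyone following along, a tightly integrated PE with round-robin allocation looks roughly like this (the PE name "mpi" and the slot count are placeholders, not my actual values):

```
$ qconf -sp mpi
pe_name            mpi
slots              999
allocation_rule    $round_robin
control_slaves     TRUE
job_is_first_task  TRUE
```

`control_slaves TRUE` is what makes execd start the slave tasks via the rsh_daemon configured above, and `$round_robin` is why the second process of each two-slot job lands on a remote host.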

