[gridengine users] trouble running MPI jobs through SGE

Reuti reuti at staff.uni-marburg.de
Fri Apr 10 10:12:28 UTC 2015

> Am 10.04.2015 um 04:51 schrieb Marlies Hankel <m.hankel at uq.edu.au>:
> Dear all,
> I have a ROCKS 6.1.1 install and I have also installed the SGE roll. So the base config was done via the ROCKS install. The only changes I have made are setting the h_vmem complex to consumable and setting up a scratch complex. I have also set the h_vmem for all hosts.

And the VASP job does work without h_vmem? We are using VASP too and have no problems with any set h_vmem.
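To isolate whether the consumable h_vmem is involved, it may help to submit the same job once with a generous per-slot request and once without any h_vmem request at all (the 4G value and the script name below are just examples, not from the original post):

```shell
# With an explicit, generous per-slot request:
qsub -pe orte 40 -l h_vmem=4G job.sh

# Without the consumable in play, to compare behavior:
qsub -pe orte 40 job.sh
```

If the segfault appears in both cases, h_vmem is unlikely to be the cause.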

> I can run single CPU jobs fine and can execute simple things like 
> mpirun -np 40 hostname 
> but I cannot run proper MPI programs. I get the following error. 
> mpirun noticed that process rank 0 with PID 27465 on node phi-0-3 exited on signal 11 (Segmentation fault).

Are you using the correct `mpiexec` during execution of the job as well, i.e. on the nodes? Maybe the interactive login has a different $PATH set than the one inside a job script.

And if it's from Open MPI: was the application compiled with the same version of Open MPI whose `mpiexec` is used later on, on all nodes?
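A quick way to compare the two environments is a throwaway job script that prints what the batch environment actually sees, to be compared with the same commands run at an interactive login (paths and output depend on the local install):

```shell
#!/bin/sh
#$ -cwd -j y
# Show the environment the SGE job really gets:
echo "PATH=$PATH"
which mpiexec
# ompi_info reports the Open MPI version this mpiexec belongs to:
ompi_info | head -n 2
```

If `which mpiexec` inside the job points at a different installation than at the interactive prompt, a version mismatch between compile time and run time is the likely culprit.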

> Basically the queues error logs on the head node and the execution nodes show nothing (/opt/gridengine/default/spool/../messages), also the .e, .o and .pe, .po also show nothing. The above error is in the standard output file of the program. I am trying VASP but have also tried a home grown MPI code. Both of these have been running out of the box via SGE for years on our old cluster (which was not ROCKS). I have tried the supplied orte PE (programs are compiled with openmpi 1.8.4

The easiest would be to stay with Open MPI 1.6.5 as long as possible. In the 1.8 series some defaults changed, which might hinder a proper run:

- Core binding is enabled by default in Open MPI 1.8. With two MPI jobs on a node, they may bind to the same cores and leave others idle. One can use "--bind-to none" and leave the binding done by SGE in effect (if any). The behavior differs in that SGE gives a job a set of cores and the Linux scheduler is free to move the processes around inside this set, while the native binding in Open MPI is per process (something SGE can't do, of course, as Open MPI forks additional processes after the initial startup of `orted`). (Sure, the set of cores granted by SGE could be rearranged into a list handed to Open MPI.)
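In a job script this might look as follows (the PE name "orte" and slot count match the original post; the binary name is a placeholder):

```shell
#!/bin/sh
#$ -cwd -j y
#$ -pe orte 40
# Disable Open MPI 1.8's default per-process core binding and leave
# any core allocation to SGE / the Linux scheduler:
mpirun --bind-to none ./vasp
```

Trying the same job with and without `--bind-to none` quickly shows whether the new binding default is part of the problem.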

- Open MPI may scan the network before the actual job starts to discover all possible routes between the nodes. Depending on the network setup this may take 1-2 minutes.
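If the startup delay (or a wrong route) is suspected, Open MPI can be restricted to a specific interface via its MCA parameters; the interface name eth0 below is only an example and must match the cluster's actual network:

```shell
# Limit the TCP BTL (and optionally the out-of-band channel) to one
# known-good interface instead of probing them all:
mpirun --mca btl_tcp_if_include eth0 \
       --mca oob_tcp_if_include eth0 \
       -np 40 ./vasp
```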

-- Reuti

>  compiled with intel and with --with-sge and --with-verbs) and have also tried one where I specify catch rsh and startmpi and stopmpi scripts but it made no difference. It seems as if the program does not even start. I am not even trying to run over several nodes yet. 
> Adding to that is that I can run the program (VASP) perfectly fine by ssh to a node and just running from the command line. And also over several nodes via a hostfile. So VASP itself is working fine. 
> I had a look at env and made sure ulimits are set OK (need ulimit -s unlimited for VASP to work) but all looks OK. 
> Has anyone seen this problem before? Or do you have any suggestion on what to do to get some info on where it actually goes wrong? 
> Thanks in advance 
> Marlies
> -- 
> ------------------
> Dr. Marlies Hankel
> Research Fellow, Theory and Computation Group
> Australian Institute for Bioengineering and Nanotechnology (Bldg 75)
> eResearch Analyst, Research Computing Centre and Queensland Cyber Infrastructure Foundation
> The University of Queensland
> Qld 4072, Brisbane, Australia
> Tel: +61 7 334 63996 | Fax: +61 7 334 63992 | mobile:0404262445
> Email: 
> m.hankel at uq.edu.au | www.theory-computation.uq.edu.au
> Notice: If you receive this e-mail by mistake, please notify me, 
> and do not make any use of its contents. I do not waive any 
> privilege, confidentiality or copyright associated with it. Unless 
> stated otherwise, this e-mail represents only the views of the 
> Sender and not the views of The University of Queensland.
> _______________________________________________
> users mailing list
> users at gridengine.org
> https://gridengine.org/mailman/listinfo/users
