[gridengine users] trouble running MPI jobs through SGE

Marlies Hankel m.hankel at uq.edu.au
Mon Apr 13 05:18:30 UTC 2015


And also answering my original problem: setting H_MEMORYLOCKED now has 
openmpi-1.8.4 working and VASP running. It seems that openmpi-1.6.5 was a 
bit more verbose in reporting the memlock error, while the newer version 
just crashed with a segfault.
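
For anyone hitting the same thing, a quick check that the raised limit really reaches jobs started through SGE might look like this (a minimal sketch; bash on the execution hosts is assumed):

# interactive check through SGE rather than a plain ssh login
qrsh bash -c 'ulimit -l'     # should now report "unlimited" instead of 64

# or add a line like this near the top of a batch job script
echo "memlock limit inside job: $(ulimit -l)"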

Thank you all for your help

Marlies

On 04/13/2015 02:54 PM, Marlies Hankel wrote:
> Answering my own question: setting
>
> execd_params         H_MEMORYLOCKED=unlimited
>
> in the global configuration (qconf -mconf) does the trick.
>
> Marlies
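
For reference, checking and making that change from the command line could look roughly like this (a sketch; qconf -mconf opens the global configuration in an editor, as used above):

# show the current global configuration and look at execd_params
qconf -sconf | grep execd_params

# then, in the editor opened by "qconf -mconf", add or extend the line
execd_params          H_MEMORYLOCKED=unlimited

New jobs started after the change should then see the raised memlock limit; already running jobs keep the old one.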
>
> On 04/13/2015 02:12 PM, Marlies Hankel wrote:
>> Dear all,
>>
>> I now have at least a simple hello-world MPI program running using 
>> openmpi-1.6.5 (thanks Reuti). But now I get a new error:
>>
>>  The OpenFabrics (openib) BTL failed to initialize while trying to
>> allocate some locked memory.  This typically can indicate that the
>> memlock limits are set too low.  For most HPC installations, the
>> memlock limits should be set to "unlimited".  The failure occured
>> here:
>>
>>   Local host:    cpu-1-5.local
>>   OMPI source:   btl_openib_component.c:1216
>>   Function:      ompi_free_list_init_ex_new()
>>   Device:        mlx4_0
>>   Memlock limit: 65536
>>
>> In my normal shell `ulimit -l` reports unlimited, but when I go through 
>> SGE it is set to 64k.
>>
>> How do I change this?
>>
>> Can I set S_MEMORYLOCKED and H_MEMORYLOCKED to unlimited, or is there 
>> another way to set these so that SGE applies them to the job's shell?
>>
>> Best wishes
>>
>> Marlies
>>
>>
>> On 04/11/2015 11:02 PM, Reuti wrote:
>>> On 11.04.2015 at 14:02, Marlies Hankel wrote:
>>>
>>>> Dear Reuti,
>>>>
>>>> No, I did not use ScaLAPACK for now.
>>> Aha, I asked as I never got the ScaLAPACK version of VASP running, 
>>> only the traditional parallelization.
>>>
>>>
>>>> We do not have intelMPI and at the moment I needed to get things 
>>>> going to get our new cluster up and usable.
>>>>
>>>> All our calculations are MPI-based, not just VASP, and my own 
>>>> home-grown code does not run through SGE either, so I hope I can find 
>>>> the problem soon....
>>> Does this happen to a simple mpihello application too?
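
For completeness, a minimal MPI hello test of the kind suggested here could be built and submitted roughly like this (a sketch; it assumes mpicc/mpirun from the Open MPI installation in question and the ROCKS-supplied "orte" PE):

# tiny MPI program that actually calls MPI_Init, unlike "mpirun hostname"
cat > hello_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>
int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF
mpicc hello_mpi.c -o hello_mpi

# job script using the same PE as the VASP jobs
cat > hello_mpi.sh <<'EOF'
#!/bin/bash
#$ -cwd
#$ -V
#$ -pe orte 8
mpirun -np $NSLOTS ./hello_mpi
EOF
qsub hello_mpi.sh

If this already segfaults, the problem is likely in the MPI/SGE setup rather than in VASP or the home-grown code.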
>>>
>>> -- Reuti
>>>
>>>
>>>> Best wishes
>>>>
>>>> Marlies
>>>>
>>>> On 04/11/2015 07:40 PM, Reuti wrote:
>>>>> On 11.04.2015 at 03:16, Marlies Hankel wrote:
>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> Yes, I checked the paths and they looked OK. Also, I made sure 
>>>>>> that it finds the right MPI version, the VASP path, etc.
>>>>>>
>>>>>> I do not think h_vmem is the problem, as I do not get any 
>>>>>> errors in the queue logs, for example. Also, in the end I changed 
>>>>>> h_vmem to be non-consumable, and I also requested a lot of memory, 
>>>>>> and that made no difference.
>>>>>>
>>>>>> I will try and use a 1.6.5 openMPI version and see if that makes 
>>>>>> any difference.
>>>>>>
>>>>>> Would the network scan cause SGE to abort the job?
>>>>> No. But there is a delay in startup.
>>>>>
>>>>> BTW: Are you using ScaLAPACK for VASP?
>>>>>
>>>>> -- Reuti
>>>>>
>>>>>
>>>>>> I do get some message about it finding two IB interfaces, but I 
>>>>>> also get that when I run interactively (ssh to the node, not via a 
>>>>>> qlogin). I have switched that off too, via an MCA parameter, to 
>>>>>> make sure this was not causing trouble.
>>>>>>
>>>>>> Best wishes
>>>>>>
>>>>>> Marlies
>>>>>>
>>>>>>
>>>>>> On 04/10/2015 08:12 PM, Reuti wrote:
>>>>>>>> On 10.04.2015 at 04:51, Marlies Hankel <m.hankel at uq.edu.au> wrote:
>>>>>>>>
>>>>>>>> Dear all,
>>>>>>>>
>>>>>>>> I have a ROCKS 6.1.1 install and I have also installed the SGE 
>>>>>>>> roll. So the base config was done via the ROCKS install. The 
>>>>>>>> only changes I have made are setting the h_vmem complex to 
>>>>>>>> consumable and setting up a scratch complex. I have also set 
>>>>>>>> the h_vmem for all hosts.
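
For comparison, the relevant pieces of such a setup would look roughly like this (a sketch, not the poster's actual configuration; the 64G capacity is a placeholder and cpu-1-5 is just a host name from the thread):

# h_vmem entry in "qconf -mc"
# name    shortcut  type    relop  requestable  consumable  default  urgency
h_vmem    h_vmem    MEMORY  <=     YES          YES         0        0

# per-host capacity, e.g. via "qconf -me cpu-1-5"
complex_values        h_vmem=64G
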
>>>>>>> And does the VASP job work without h_vmem? We are using VASP too 
>>>>>>> and have no problems with h_vmem set.
>>>>>>>
>>>>>>>
>>>>>>>> I can run single CPU jobs fine and can execute simple things like
>>>>>>>>
>>>>>>>> mpirun -np 40 hostname
>>>>>>>>
>>>>>>>> but I cannot run proper MPI programs. I get the following error.
>>>>>>>>
>>>>>>>> mpirun noticed that process rank 0 with PID 27465 on node 
>>>>>>>> phi-0-3 exited on signal 11 (Segmentation fault).
>>>>>>> Are you using the correct `mpiexec` during the execution of a 
>>>>>>> job too, i.e. between the nodes? Maybe the interactive login has 
>>>>>>> a different $PATH set than inside a job script.
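
A quick way to check this from inside the job itself (a sketch; phi-0-3 is just the node named in the error above):

# near the top of the job script, before the real mpirun call
echo "PATH inside job: $PATH"
which mpiexec
mpiexec --version

# compare with what an interactive login on the node sees
ssh phi-0-3 'which mpiexec; mpiexec --version'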
>>>>>>>
>>>>>>> And if it's from Open MPI: was the application compiled with the 
>>>>>>> same version of Open MPI whose `mpiexec` is later used on 
>>>>>>> all nodes?
>>>>>>>
>>>>>>>
>>>>>>>> Basically the queue error logs on the head node and the 
>>>>>>>> execution nodes show nothing 
>>>>>>>> (/opt/gridengine/default/spool/../messages), and the .e, .o 
>>>>>>>> and .pe, .po files also show nothing. The above error is in the 
>>>>>>>> standard output file of the program. I am trying VASP but have 
>>>>>>>> also tried a home-grown MPI code. Both of these have been 
>>>>>>>> running out of the box via SGE for years on our old cluster 
>>>>>>>> (which was not ROCKS). I have tried the supplied orte PE 
>>>>>>>> (programs are compiled with openmpi 1.8.4
>>>>>>> The easiest would be to stay with Open MPI 1.6.5 as long as 
>>>>>>> possible. In the 1.8 series they changed some things which might 
>>>>>>> hinder proper use:
>>>>>>>
>>>>>>> - Core binding is enabled by default in Open MPI 1.8. If two 
>>>>>>> MPI jobs land on a node, they may bind to the same cores and 
>>>>>>> leave others idle. One can use "--bind-to none" and leave the 
>>>>>>> binding of SGE in effect, if any (see the sketch after these two 
>>>>>>> points). The behavior differs in that SGE gives a job a set of 
>>>>>>> cores and the Linux scheduler is free to move the processes 
>>>>>>> around inside this set, while the native binding in Open MPI is 
>>>>>>> per process (something SGE can't do, of course, as Open MPI forks 
>>>>>>> additional processes after the initial startup of `orted`). 
>>>>>>> (Sure, the set of cores granted by SGE could be rearranged into a 
>>>>>>> list for Open MPI.)
>>>>>>>
>>>>>>> - Open MPI may scan the network before the actual job starts, to 
>>>>>>> discover all possible routes between the nodes. Depending on the 
>>>>>>> network setup this may take 1-2 minutes.
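
A job script using "--bind-to none" could look roughly like this (a sketch; the PE name, slot count and VASP binary name are assumptions for the local setup):

#!/bin/bash
#$ -cwd
#$ -V
#$ -pe orte 16
# leave any core binding to SGE rather than Open MPI 1.8's per-process default
mpirun --bind-to none -np $NSLOTS ./vasp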
>>>>>>>
>>>>>>> -- Reuti
>>>>>>>
>>>>>>>
>>>>>>>>   compiled with Intel and with --with-sge and --with-verbs) and 
>>>>>>>> have also tried a PE where I specify catch rsh and the startmpi 
>>>>>>>> and stopmpi scripts, but it made no difference. It seems as if 
>>>>>>>> the program does not even start. I am not even trying to run 
>>>>>>>> over several nodes yet.
>>>>>>>>
>>>>>>>> Adding to that, I can run the program (VASP) perfectly fine by 
>>>>>>>> ssh-ing to a node and just running it from the command line, and 
>>>>>>>> also over several nodes via a hostfile. So VASP itself is 
>>>>>>>> working fine.
>>>>>>>>
>>>>>>>> I had a look at env and checked the ulimits (VASP needs 
>>>>>>>> ulimit -s unlimited to work), and all looks OK.
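
One way to make sure the stack limit is raised inside the job as well (a sketch; whether the queue's own limits also need changing depends on the local configuration):

# near the top of the job script, before mpirun
ulimit -s unlimited

# the queue limits may cap this again; check s_stack / h_stack in
# "qconf -sq <queue>" and set them to INFINITY if needed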
>>>>>>>>
>>>>>>>> Has anyone seen this problem before? Or do you have any 
>>>>>>>> suggestion on what to do to get some info on where it actually 
>>>>>>>> goes wrong?
>>>>>>>>
>>>>>>>> Thanks in advance
>>>>>>>>
>>>>>>>> Marlies
>>
>

-- 

------------------

Dr. Marlies Hankel
Research Fellow, Theory and Computation Group
Australian Institute for Bioengineering and Nanotechnology (Bldg 75)
eResearch Analyst, Research Computing Centre and Queensland Cyber Infrastructure Foundation
The University of Queensland
Qld 4072, Brisbane, Australia
Tel: +61 7 334 63996 | Fax: +61 7 334 63992 | mobile:0404262445
Email: m.hankel at uq.edu.au | www.theory-computation.uq.edu.au


Notice: If you receive this e-mail by mistake, please notify me,
and do not make any use of its contents. I do not waive any
privilege, confidentiality or copyright associated with it. Unless
stated otherwise, this e-mail represents only the views of the
Sender and not the views of The University of Queensland.




