[gridengine users] How can I make gridengine not use ssh?

Reuti reuti at staff.uni-marburg.de
Thu Aug 11 10:16:23 UTC 2016


I just compiled openmpi-2.0.0 myself, and it looks like a regression: it uses `ssh` although it's running under SGE. Also, with `mpicc` it was necessary to supply "-ldl" for the compilation to succeed; this wasn't necessary in former versions.

I'll look into it.

For now I think it's best to stay with 1.10.3.
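
If several Open MPI versions are installed side by side, it may be worth confirming that the one found first in PATH really contains the gridengine component. A minimal sketch, assuming `ompi_info` from the intended installation is in PATH; the function name is made up for illustration:

```shell
#!/bin/sh
# Report whether the Open MPI found first in PATH was built with SGE support.
# check_sge_support is a hypothetical helper name, not part of Open MPI or SGE.
check_sge_support() {
    if ! command -v ompi_info >/dev/null 2>&1; then
        echo "ompi_info not found in PATH"
        return 2
    fi
    # A build with SGE support lists a "gridengine" MCA component.
    if ompi_info | grep -q gridengine; then
        echo "gridengine component present"
    else
        echo "gridengine component missing"
        return 1
    fi
}

check_sge_support || true
```

A missing component would point to a build or PATH problem rather than to the regression mentioned above.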

Note that releases after 1.6.5 perform core binding (bad if several Open MPI jobs are running on one and the same node, as all of them will use the cores from core 0 upwards) and check the network topology. If the network is set up with dead routes/interfaces (which normally won't matter), the startup of the parallel job may be delayed by one to two minutes (until a timeout is hit).
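
Both effects can usually be worked around on the `mpiexec` command line. A rough sketch, assuming the option syntax of Open MPI 1.8 or later, a program `./teste`, and that 192.168.117.0/24 (guessed from the addresses in the error output quoted below) is the network the nodes should talk over:

```shell
#!/bin/sh
# Sketch: start the job without core binding and limit TCP traffic to one subnet.
# SUBNET is an assumption derived from the IP addresses in the quoted error output.
SUBNET="192.168.117.0/24"
NSLOTS="${NSLOTS:-2}"   # SGE sets NSLOTS inside a job; the fallback is for dry runs only

# Build the command line as positional parameters so it can be inspected first.
set -- mpiexec --bind-to none \
    --mca btl_tcp_if_include "$SUBNET" \
    --mca oob_tcp_if_include "$SUBNET" \
    -n "$NSLOTS" ./teste

echo "$@"    # show the command line
# "$@"       # uncomment inside the job script to actually run it
```

`--bind-to none` switches the binding off entirely; whether that or a binding provided by SGE is the better choice depends on how the nodes are shared.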

-- Reuti


> On 10.08.2016 at 21:15, Ulrich Hiller <hiller at mpia-hd.mpg.de> wrote:
> 
> Hello,
> 
> My problem: How can I make gridengine not use ssh?
> 
> Installed:
> openmpi-2.0.0 - configured with sge support.
> gridengine (son of gridengine) 8.1.9-1
> 
> I have a simple openmpi program 'teste' which only gives "hello world"
> output.
> I start it with:
> qsub -pe orte 160 -V -j yes -cwd -S /bin/bash <<< "mpiexec -n 160 teste >> /home/ljohndoe/out.dat"
> on the master node.
> I get back the error:
> 
> Host key verification failed.
> Host key verification failed.
> Permission denied, please try again.
> Permission denied, please try again.
> Received disconnect from 192.168.117.6: 2: Too many authentication
> failures for johndoe
> Permission denied, please try again.
> Permission denied, please try again.
> Received disconnect from 192.168.117.5: 2: Too many authentication
> failures for johndoe
> [...]
> 
> When I configure a passwordless ssh login to the execute nodes
> (exchanging the ssh key from master with 'ssh-copy-id'), it works like
> a charm. So it obviously uses an ssh connection to the execute nodes.
> 
> The output of 'qconf -sconf' contains:
> 
> login_shells                 sh,bash,ksh,csh,tcsh
> qlogin_command               builtin
> qlogin_daemon                builtin
> rlogin_command               builtin
> rlogin_daemon                builtin
> rsh_command                  builtin
> rsh_daemon                   builtin
> 
> (As far as I read, this was the problem in a thread some time ago on this
> list, but I seem to have the correct values.)
> 
> So everything should be fine, or not?
> Also, with
> qlogin -l 'h=exec01'
> and
> qrsh -l 'h=exec01'
> I can get to the first node (called exec01) without problems, and I can
> log in to all other execute nodes as well.
> 
> Is there another 'switch' anywhere with which I can make qsub _not_ run over ssh?
> 
> If it is of interest, the output of 'qconf -sp orte' is:
> pe_name            orte
> slots              9999999
> user_lists         NONE
> xuser_lists        NONE
> start_proc_args    NONE
> stop_proc_args     NONE
> allocation_rule    $round_robin
> control_slaves     FALSE
> job_is_first_task  TRUE
> urgency_slots      min
> accounting_summary FALSE
> qsort_args         NONE
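
For comparison, the example PE in the Open MPI FAQ sets control_slaves to TRUE and job_is_first_task to FALSE; with control_slaves FALSE, `qrsh -inherit` to the slave nodes is not permitted, so this may be worth checking independently of the 2.0.0 issue. A small sketch for pulling out the two settings, where the function name is made up for illustration:

```shell
#!/bin/sh
# Print the two PE settings relevant for a tight integration.
# Reads "qconf -sp <pe>" style output on stdin; pe_tight_settings is a hypothetical helper.
pe_tight_settings() {
    awk '$1 == "control_slaves" || $1 == "job_is_first_task" { print $1 "=" $2 }'
}

# On the cluster:  qconf -sp orte | pe_tight_settings
# Dry run with the values quoted above:
printf 'control_slaves     FALSE\njob_is_first_task  TRUE\n' | pe_tight_settings
```

The values themselves can be changed with `qconf -mp orte`, which opens the PE definition in an editor.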
> 
> Also, I do not have any ssh lines in ~/.profile or ~/.bashrc.
> 
> 
> Kind regards, ulrich