[gridengine users] SGE + MPICH2 + /etc/ssh/sshd_config
Gowtham
g at mtu.edu
Thu Sep 1 14:13:41 UTC 2011
Please see my in-line reply below.
On Thu, 1 Sep 2011, Reuti wrote:
| Am 01.09.2011 um 15:28 schrieb Gowtham:
|
| > This belongs to "keeping our users from directly
| > SSHing into compute nodes" category.
| >
| >
| > On a test cluster (pauli), I have the following set up:
| >
| > 0. 1 Front end and 2 compute nodes
| >
| > Each compute node has 4 cpu cores
| >
| >
| > 1. Rocks 5.4 (service pack 2) - all rolls except
| > bio, condor and xen; runs SGE queuing system
| > 6.2u5
| >
| > [root@pauli ~]# rocks list roll
| > NAME          VERSION ARCH   ENABLED
| > area51:       5.4     x86_64 yes
| > base:         5.4     x86_64 yes
| > ganglia:      5.4     x86_64 yes
| > hpc:          5.4     x86_64 yes
| > kernel:       5.4     x86_64 yes
| > os:           5.4     x86_64 yes
| > sge:          5.4     x86_64 yes
| > web-server:   5.4     x86_64 yes
| > service-pack: 5.4.2   x86_64 yes
| >
| >
| > 2. MPICH2 (1.4), compiled with GCC 4.1.2, is in
| >
| > /share/apps/mpich2/1.4/gcc/4.1.2
| >
| > Configure & make/make install commands were as
| > follows
| >
| > export CC="/usr/bin/gcc"
| > export CXX="/usr/bin/g++"
| > export FC="/usr/bin/gfortran"
| > export F77="/usr/bin/gfortran"
| >
| > ./configure --prefix=/share/apps/mpich2/1.4/gcc/4.1.2
| > make
| > make install
| >
| > I compiled a simple 'hello, world' C program
| >
| > mpicc -g -Wall hello_world.c -o hello_world.x
| >
| > and 'hello_world.x' runs fine.
| >
| >
| > 3. There are two groups on this cluster
| >
| > pauli-users : all users belong to this group
| > pauli-admins : only administrators belong to this one,
| > in addition to being part of pauli-users
| >
| > I created 3 user accounts (all belonging to
| > pauli-users) and one more account that belongs to
| > pauli-users & pauli-admins
| >
| > These groups & users were created before any compute
| > node was added to the cluster
| >
| >
| > 4. The extend-compute.xml had the following lines in
| > <post> section
| >
| > <file name="/etc/ssh/sshd_config" mode="append">
| >
| > # Block non-root, non-pauli-admins users from directly
| > # accessing this compute node
| > AllowGroups root pauli-admins
| > </file>
| >
| > xmllint -noout extend-compute.xml was run and
| > no errors were found.
| >
| > rocks distribution was rebuilt and the compute
| > nodes were added via the usual insert-ethers
| >
| > I ran 'rocks sync users'
| >
| > When I check the '/etc/ssh/sshd_config' file
| > in compute nodes, I do see the line
| >
| > AllowGroups root pauli-admins
| >
| > The '/etc/group' file on each compute node has lines
| > corresponding to 'pauli-users' and 'pauli-admins'
| >
| > pauli-users:x:500:
| > pauli-admins:x:501:john
| >
| >
| > 5. SSH attempts by 'john' into the compute nodes get through,
| > while those by 'greg' (just a pauli-users member) are blocked
| >
| >
| > 6. Now comes SGE
| >
| > I run 'hello_world.x' on 8 processors (spanning
| > both compute nodes) via an SGE script, sge_test.sh:
| >
| >
| > #! /bin/bash
| > #
| > #$ -cwd
| > #$ -j y
| > #$ -S /bin/bash
| > #$ -pe mpich 8
| > #
| > # Run 'Hello, World!'
| > /share/apps/mpich2/1.4/gcc/4.1.2/bin/mpirun -n $NSLOTS \
| > -f $TMP/machines /share/apps/bin/hello_world.x
| >
| >
| > It produces the desired output when I run this as 'john'
| > (a pauli-admins user)
| >
| > When run as a non-admin user (one in pauli-users only), the job
| > hangs in the 'r' state. 'sge_test.sh.po12' contains
| >
| >
| > -catch_rsh /opt/gridengine/default/spool/compute-0-0/active_jobs/12.1/pe_hostfile
| > compute-0-0
| > compute-0-0
| > compute-0-0
| > compute-0-0
| > compute-0-1
| > compute-0-1
| > compute-0-1
| > compute-0-1
| >
| >
| > 'sge_test.sh.o12' contains
| >
| >
| > Permission denied, please try again.
| > Permission denied, please try again.
| > Permission denied (publickey,gssapi-with-mic,password).
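
The "Permission denied" lines are consistent with mpirun trying to start remote processes over SSH as the job owner, which the AllowGroups line would reject for anyone outside pauli-admins. A minimal sketch of that decision, assuming exact group-name matching (sshd also accepts patterns, which this ignores); the helper name ssh_group_allowed is hypothetical and the memberships are the ones described in steps 3 to 5:

```shell
#!/bin/bash
# Hypothetical helper: return 0 if any of the user's groups appears in the
# AllowGroups list, 1 otherwise. Simplified: exact names only, no patterns.
ssh_group_allowed() {
    local user_groups="$1" allow_groups="$2" ug ag
    for ug in $user_groups; do
        for ag in $allow_groups; do
            [ "$ug" = "$ag" ] && return 0
        done
    done
    return 1
}

# Value of AllowGroups from step 4 and group memberships from step 3.
allow="root pauli-admins"
for entry in "john:pauli-users pauli-admins" "greg:pauli-users"; do
    user="${entry%%:*}"
    memberships="${entry#*:}"
    if ssh_group_allowed "$memberships" "$allow"; then
        echo "$user: allowed"
    else
        echo "$user: denied"
    fi
done
```

Under these assumptions 'john' is allowed and 'greg' is denied, which matches both the interactive SSH behaviour and the difference between the two job runs.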
|
| What is the setting of:
|
| qrsh_command
| qrsh_daemon
|
| in `qconf -sconf`?
On both the front end and the compute nodes, I get nothing:
[root@pauli ~]# qconf -sconf | grep "qrsh"
[root@pauli ~]#
[root@compute-0-0 ~]# qconf -sconf | grep qrsh
[root@compute-0-0 ~]#
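
The empty result may only mean that no parameter name contains the string "qrsh": in sge_conf(5) the remote startup settings are named rsh_command, rsh_daemon, rlogin_command, rlogin_daemon, qlogin_command and qlogin_daemon. A minimal sketch of a wider filter follows; the helper name remote_startup_params is hypothetical, and the sample input uses assumed values for illustration, not output from pauli:

```shell
#!/bin/bash
# Hypothetical helper: keep only the remote-startup parameters from
# `qconf -sconf` style output read on stdin.
remote_startup_params() {
    grep -E '^(qlogin|rlogin|rsh)_(command|daemon)[[:space:]]'
}

# On the cluster this would be: qconf -sconf | remote_startup_params
# Illustrative sample input (assumed values, not taken from pauli):
sample='execd_spool_dir /opt/gridengine/default/spool
rsh_command builtin
rsh_daemon builtin
qlogin_command builtin'

printf '%s\n' "$sample" | remote_startup_params
```

If these parameters still do not show up in the global configuration, it may be worth checking the per-host configuration with `qconf -sconf <hostname>` as well.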
|
| -- Reuti
|
|
| > Can someone please help me if I am doing something wrong or missing something?
| >
| > Thanks,
| > g
| >
| > --
| > Gowtham
| > Advanced IT Research Support
| > Michigan Technological University
| >
| > (906) 487/3593
| >
|
|