[gridengine users] qrsh commlib error with separate submit host
john.kloss at gmail.com
Tue Sep 10 15:55:40 UTC 2013
>> error: commlib error: local host name error (IP based host name resolving "Levi-Montalcini01" doesn't match client host name from connect message "Levi-Montalcini86")
Is your submit host multi-homed? I have had issues where I had a
multi-homed submit host, say, hostA, which connects to two networks
hostA-int -> "grid network"
hostA-ext -> "gateway network"
Where "gateway network" and "grid network" do not route because
they're isolated from each other.
And the hostname used by hostA to contact a compute node is hostA-ext.
The compute node can't reach hostA-ext; it can only reach hostA-int.
I had to change the hostname for hostA to hostA-int (under
/etc/hostname or /etc/sysconfig/network or /etc/node, etc.) so that
IP/hostname resolution matched for the "grid network".
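
A rough way to check this from the submit host is to compare the local host name with the name the resolver returns for its address. The sketch below is only an illustration in Python: the function name check_name_consistency and its short_names option are made up for this example, and it uses the system resolver through the standard socket module, which may not behave exactly like Grid Engine's own resolving.

```python
import socket


def check_name_consistency(local_name=None,
                           forward=socket.gethostbyname,
                           reverse=socket.gethostbyaddr,
                           short_names=False):
    """Resolve name -> IP -> name and report whether the names agree.

    Hypothetical helper for illustration; not part of Grid Engine.
    """
    if local_name is None:
        local_name = socket.gethostname()

    ip = forward(local_name)          # forward lookup: name -> IPv4 address
    resolved_name = reverse(ip)[0]    # reverse lookup: address -> primary name

    a, b = local_name.lower(), resolved_name.lower()
    if short_names:
        # Assumption: compare only the part before the first dot.
        a, b = a.split(".")[0], b.split(".")[0]

    return {
        "local_name": local_name,
        "ip": ip,
        "resolved_name": resolved_name,
        "match": a == b,
    }


if __name__ == "__main__":
    try:
        print(check_name_consistency())
    except OSError as exc:  # socket.gaierror and socket.herror are OSError subclasses
        print("lookup failed:", exc)
```

If "match" comes back False, the name the host reports for itself and the name its address resolves to disagree, which is the kind of mismatch the commlib error above complains about.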
Or, perhaps your submit host's local hostname does not match what your
name lookup mechanism (DNS, NIS, etc.) returns. That is, your submit host
thinks its name is hostA.localhost and DNS thinks it's
What do you get when you type from the submit host