[gridengine users] qrsh commlib error with separate submit host

Burian, John John.Burian at nationwidechildrens.org
Thu Sep 12 14:05:29 UTC 2013


On 9/10/13 4:27 PM, "Reuti" <reuti at staff.uni-marburg.de> wrote:

>It should be set up to run on the internal network only.
>
>(In case you want to use desktop machines which are only connected to the
>external network as submit hosts and need it for this purpose: do they
>have the /home mounted like the exechosts too?)


Yes, the desktop systems have access to the shared filesystems, including
the OGS spool directory, at the same mount points. No problem there.


>> The compute nodes on
>> the 'internal' network only communicate to the external network through
>> the queue master, which runs an IP Masquerade iptables rule.
>
>What is the reason for the nodes to communicate with the outside world?


Nothing to do directly with OGS. The compute nodes deliver email directly
to the mail exchanger, access yum repositories, etc.


>> Now that I've explained it, I see what the problem is: The submit host is
>> communicating with the queue master over the external network; the queue
>> master starts the interactive job on the compute node, which tries to
>> contact the submit host at its external address and is getting routed
>> through the queue master and the iptables rule. Qrsh sees a connection
>> that claims to be from node 87, but which has the queue master's IP
>> address.
>
>Yep, you can easily change this by running the submit host's SGE access
>also on the internal network (and only there).
>
>Please have a look at `man host_aliases` and
>http://arc.liv.ac.uk/SGE/howto/multi_intrfcs.html
>

Thanks for that.
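
For anyone finding this in the archive, here is a rough toy model of the
mismatch described above. It is only a sketch: the host names and addresses
are made up, and the comparison is a simplification of the idea, not how
commlib actually validates a peer. It shows why a connection from node 87
looks wrong when it is masqueraded, and why it looks fine once the submit
host is also reachable on the internal network.

```python
import ipaddress

# Hypothetical addresses, for illustration only (not from the real cluster).
INTERNAL_NET = ipaddress.ip_network("10.1.0.0/16")
HOSTS = {
    "node87": ipaddress.ip_address("10.1.0.87"),       # compute node, internal
    "qmaster": ipaddress.ip_address("192.0.2.1"),      # qmaster, external side
    "desktop": ipaddress.ip_address("192.0.2.50"),     # submit host, external
    "desktop-int": ipaddress.ip_address("10.1.1.50"),  # submit host, internal
}


def source_seen_by(dest, sender):
    """Return the source address that `dest` observes for `sender`."""
    src = HOSTS[sender]
    dst = HOSTS[dest]
    # Toy masquerade rule: traffic leaving the internal network is
    # rewritten to carry the qmaster's external address.
    if src in INTERNAL_NET and dst not in INTERNAL_NET:
        return HOSTS["qmaster"]
    return src


def peer_matches_claim(claimed, observed):
    """Simplified check: does the observed address belong to the claimed host?"""
    return HOSTS[claimed] == observed


# Submit host on the external network: the connection is masqueraded.
external = source_seen_by("desktop", "node87")
print(external, peer_matches_claim("node87", external))

# Submit host on the internal network: the source address is untouched.
internal = source_seen_by("desktop-int", "node87")
print(internal, peer_matches_claim("node87", internal))
```

In the first case the claimed host and the observed address disagree; in the
second they agree, which is the effect of moving the submit host's SGE
traffic onto the internal network.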

John




