[gridengine users] Lost qrsh jobs
reuti at staff.uni-marburg.de
Wed Nov 21 15:55:19 UTC 2012
On 21.11.2012 at 16:10, François-Michel L'Heureux wrote:
> I have an issue where some jobs I submit with the qrsh command never appear in the queue. If I run the command "ps -ef | grep qrsh" I can see them. My setup
Ok, but did it ever start on any node?
> is as follows:
> • I just have one process calling the grid engine via qrsh. This process resides on the master node.
> • I don't use nfs, I use sshfs instead.
> • I run on a dynamic cluster, which means that nodes can be added or removed at any time.
> Does anyone have an idea of what could cause this issue? I can work around it by looking at the process list when the queue is empty and killing/rescheduling the processes running a qrsh command, but I would rather prevent it.
What do you mean by "dynamic cluster"? SGE needs fixed addresses per node.
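
For reference, the workaround described in the quoted text (finding leftover qrsh processes while the queue is empty) could be sketched roughly as follows. This is only an illustration under stated assumptions, not something taken from the thread: the function names are hypothetical, the parsing assumes the usual eight-column Linux "ps -ef" layout, and the emptiness check assumes that qstat prints nothing when no jobs are pending or running.

```python
import subprocess


def find_qrsh_pids(ps_output):
    """Return the PIDs of processes whose executable is qrsh.

    Expects text in the `ps -ef` layout: UID PID PPID C STIME TTY TIME CMD.
    Matching on the executable name (not a substring) skips `grep qrsh` itself.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 7)
        # Skip the header row and any line that is too short to parse.
        if len(fields) < 8 or not fields[1].isdigit():
            continue
        command = fields[7].split()
        if command and command[0].rsplit("/", 1)[-1] == "qrsh":
            pids.append(int(fields[1]))
    return pids


def queue_is_empty(qstat_output):
    """Assumption: qstat prints no lines at all when there are no jobs."""
    return not qstat_output.strip()


def list_local_qrsh_pids():
    """Run `ps -ef` on this host and return the qrsh PIDs found."""
    result = subprocess.run(
        ["ps", "-ef"], capture_output=True, text=True, check=True
    )
    return find_qrsh_pids(result.stdout)
```

A caller would feed the output of qstat to queue_is_empty() and, only if it returns True, treat the PIDs from list_local_qrsh_pids() as candidates for killing and resubmitting. That treats the symptom only; it does not explain why the jobs never reach the queue.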