[gridengine users] Node refuse to run job

Jerome jerome at ibt.unam.mx
Fri Feb 10 00:07:28 UTC 2012


Dear Hung-Sheng

Thanks for your quick reply.

I've checked CELL/spool/ on the node, and the jobs directory is empty.

On the master node, the jobs directory contains only as many files as 
there are jobs running or waiting to be run.

Should I check in a specific directory? Could you be more precise, please?
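
For anyone repeating this check, here is a minimal sketch that lists every file below a spool directory, so the qmaster and node spools can be compared. The layout it defaults to ($SGE_ROOT/$SGE_CELL/spool/qmaster, with /opt/gridengine and "default" as fallbacks) is an assumption about a classic-spooling install, and list_spool_entries is a hypothetical helper name, not part of SGE.

```python
import os
import sys
from pathlib import Path


def list_spool_entries(spool_dir):
    """Return the sorted relative paths of all files below spool_dir.

    A missing directory is reported as an empty list rather than an error.
    """
    root = Path(spool_dir)
    if not root.is_dir():
        return []
    return sorted(str(p.relative_to(root)) for p in root.rglob("*") if p.is_file())


if __name__ == "__main__":
    # Assumed default location; pass the real spool directories as arguments.
    sge_root = os.environ.get("SGE_ROOT", "/opt/gridengine")
    sge_cell = os.environ.get("SGE_CELL", "default")
    default_dir = os.path.join(sge_root, sge_cell, "spool", "qmaster")
    for spool_dir in sys.argv[1:] or [default_dir]:
        entries = list_spool_entries(spool_dir)
        print(f"{spool_dir}: {len(entries)} file(s)")
        for entry in entries:
            print(f"  {entry}")
```

Run it once on the master and once on the node, passing each spool directory as an argument, and compare the two listings.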

Thank you

On 09/02/2012 11:59, "Hung-Sheng Tsao (Lao Tsao 老曹) Ph.D." wrote:
> check the CELL/spool/ directory of the qmaster and nodes
>
>
> On 2/9/2012 12:51 PM, Jerome wrote:
>> Dear all
>>
>> I am running SGE version GE 6.2u2_1 on a Rocks cluster.
>> For the past few days, a node has refused to run jobs. Using
>> "qstat -j jid", I notice this line at the end of the output:
>>
>> cannot run on host "compute-2-15.local" until clean up of an previous
>> run has finished
>>
>> I checked on node 2-15, but the jobs directory is totally empty. To
>> be sure, I reinstalled the node from scratch, and the problem
>> persists.
>> It seems to be the master that is causing this issue. Can someone help
>> me find the file holding the bad information that I have to modify so
>> that my node can run the job?
>>
>> Best regards.
>
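
The scheduler message quoted above can be picked out of saved "qstat -j jid" output with a short script. This is only a sketch: it assumes the message wording is exactly as quoted in this thread (including "an previous"), and hosts_waiting_for_cleanup is a hypothetical helper name.

```python
import re

# Wording taken from the message quoted in this thread.
CLEANUP_RE = re.compile(
    r'cannot run on host "([^"]+)" until clean up of an previous run has finished'
)


def hosts_waiting_for_cleanup(qstat_text):
    """Return the sorted, de-duplicated hosts named in clean-up messages."""
    # Collapse whitespace first so a message wrapped over two lines still matches.
    flat = " ".join(qstat_text.split())
    return sorted(set(CLEANUP_RE.findall(flat)))
```

The input is the text of the qstat output, for example read from a file it was redirected to.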


-- 
-- Jérôme
One cannot help growing older,
but one can help becoming old.
     (Matisse)

