[gridengine users] [SGE-discuss] spool, no information, loss of jobs

Dave Love d.love at liverpool.ac.uk
Fri Jun 17 12:39:50 UTC 2011


Reuti <reuti at staff.uni-marburg.de> writes:

>> The qmaster messages file contains information about a missing file/folder at the time the job ended:
>> ----------------
>> 6/16/2011 10:06:30|schedu|sged2|E|can't find parallel task 50993.1 task past_usage for update in function pe_task_update_master_list_usage
>> 06/16/2011 10:06:30|schedu|sged2|E|callback function for event "3941466. EVENT JOB 50993.1 task past_usage USAGE" failed
>> 06/16/2011 10:07:10|worker|sged2|E|unlink(jobs/00/0005/0993/common) failed: No such file or directory
>> 06/16/2011 10:07:10|worker|sged2|E|can not remove file job spool file: jobs/00/0005/0993/common
>
> The "common" is strange here. What I saw in the past was just a plain file like 0993 containing binary information of the job.

See libs/uti/sge_spool.h, though it doesn't explain the confusingly
different variants.
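
For reference, the path in the log above looks consistent with the job
number being zero-padded to ten digits and split 2/4/4 into directory
components. The following is only a sketch of that mapping, inferred from
the log line for job 50993 rather than taken from the source; the function
name job_spool_dir and the "jobs" root argument are made up for
illustration.

```python
def job_spool_dir(job_id: int, root: str = "jobs") -> str:
    """Hypothetical helper: map a job number to its classic spool path.

    Assumes the layout seen in the log: the job number is zero-padded
    to 10 digits and split into 2 + 4 + 4 digit path components.
    """
    digits = f"{job_id:010d}"  # e.g. 50993 -> "0000050993"
    return "/".join([root, digits[0:2], digits[2:6], digits[6:10]])


if __name__ == "__main__":
    # Job 50993 from the messages file above.
    print(job_spool_dir(50993))
```

Under that assumption, "common" would be an entry below
jobs/00/0005/0993, i.e. the last component is a directory here rather than
the plain file Reuti describes.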

-- 
Excuse the typping -- I have a broken wrist


