[gridengine users] [SGE-discuss] spool, no information, loss of jobs
Dave Love
d.love at liverpool.ac.uk
Fri Jun 17 12:39:50 UTC 2011
Reuti <reuti at staff.uni-marburg.de> writes:
>> The qmaster messages file contains entries about a missing file/folder at the time the job ended:
>> ----------------
>> 6/16/2011 10:06:30|schedu|sged2|E|can't find parallel task 50993.1 task past_usage for update in function pe_task_update_master_list_usage
>> 06/16/2011 10:06:30|schedu|sged2|E|callback function for event "3941466. EVENT JOB 50993.1 task past_usage USAGE" failed
>> 06/16/2011 10:07:10|worker|sged2|E|unlink(jobs/00/0005/0993/common) failed: No such file or directory
>> 06/16/2011 10:07:10|worker|sged2|E|can not remove file job spool file: jobs/00/0005/0993/common
>
> The "common" is strange here. What I saw in the past was just a plain file like 0993 containing binary information of the job.
See libs/uti/sge_spool.h, though it doesn't explain the confusingly different
variants.
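
For what it's worth, the path in the quoted log looks consistent with the
job number being zero-padded to ten digits and split 2/4/4 into nested
directories (50993 -> 00/0005/0993). A minimal sketch of that mapping,
inferred only from the log line above and not taken from sge_spool.h; the
function name and the "jobs" default are made up for illustration:

```python
def job_spool_dir(job_id: int, base: str = "jobs") -> str:
    """Guess at the spool directory for a job id.

    Assumption (inferred from the quoted log, not from the SGE source):
    the id is zero-padded to 10 digits and split into 2/4/4 components.
    """
    padded = f"{job_id:010d}"
    return "/".join([base, padded[:2], padded[2:6], padded[6:]])


# The job from the quoted qmaster messages:
print(job_spool_dir(50993) + "/common")
```

Under that assumption, "common" would be an entry inside a per-job
directory rather than the single plain file 0993 that Reuti describes.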
--
Excuse the typping -- I have a broken wrist