[gridengine users] Job exit_status =1 , but failed=0 according to qacct.. what causes that?
Reuti
reuti at staff.uni-marburg.de
Mon Jun 27 19:01:28 UTC 2011
Hi,
On 27.06.2011 at 20:47, William Deegan wrote:
> Greetings,
>
> I have a short running job, which doesn't exceed memory or runtime limits specified.
> Here's the qacct output:
> qname all.q
> hostname a13.company.com
> group contr
> owner user123
> project NONE
> department defaultdepartment
> jobname veri_MM_11.1_82
> jobnumber 18323
> taskid undefined
> account sge
> priority 0
> qsub_time Mon Jun 27 11:30:58 2011
> start_time Mon Jun 27 11:31:48 2011
> end_time Mon Jun 27 11:33:05 2011
> granted_pe NONE
> slots 1
> failed 0
> exit_status 1
It's the exit status of your job script itself. Something like:
#!/bin/sh
exit 1
will produce it. From SGE's point of view the job ran successfully, hence failed=0. SGE can't decide whether the application hit an error because of wrong input data or anything else; it only records the exit status the job script returned.
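To tell the two cases apart in the accounting record, the two fields can be pulled out and compared. This is only a minimal sketch: it assumes the qacct output arrives on standard input in the format quoted above, and the function name classify_job is made up for illustration.

```sh
#!/bin/sh
# classify_job (hypothetical helper): reads `qacct -j <jobid>` output on
# stdin and reports whether SGE or the job script signalled a problem.
classify_job() {
    awk '
        $1 == "failed"      { failed = $2 }   # first token after the field name
        $1 == "exit_status" { status = $2 }
        END {
            if (failed != 0)      print "SGE reported a failure (failed=" failed ")"
            else if (status != 0) print "job ran, script exited with " status
            else                  print "job ran, script exited with 0"
        }'
}
```

Usage would be along the lines of `qacct -j 18323 | classify_job`.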
NB: There are two special exit codes which the user can use to trigger special behavior in SGE: a job exiting with 99 will be rescheduled, and a job exiting with 100 will be put into the error state.
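A job script could use these two codes by wrapping the real application and translating its exit status. The following is a sketch under assumptions, not a recommended policy: the function name sge_status is hypothetical, and treating 75 as a "temporary failure, try again" code is purely an example choice for the wrapped application.

```sh
#!/bin/sh
# Sketch of an SGE job wrapper. "$@" stands for the real application
# command line (hypothetical; substitute your own program).

# sge_status: map the application's exit status to the one SGE should see.
sge_status() {
    case "$1" in
        0)  echo 0 ;;     # success
        75) echo 99 ;;    # assumed "temporary failure": ask SGE to reschedule
        *)  echo 100 ;;   # anything else: put the job into the error state
    esac
}

# Run the wrapped command only when one was given.
if [ "$#" -gt 0 ]; then
    "$@"
    status=$?
    exit "$(sge_status "$status")"
fi
```

With such a wrapper, an application that exits with 1 would leave the job in the error state instead of finishing with exit_status 1.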
-- Reuti
> ru_wallclock 77
> ru_utime 65.078
> ru_stime 3.581
> ru_maxrss 0
> ru_ixrss 0
> ru_ismrss 0
> ru_idrss 0
> ru_isrss 0
> ru_minflt 896944
> ru_majflt 0
> ru_nswap 0
> ru_inblock 0
> ru_oublock 0
> ru_msgsnd 0
> ru_msgrcv 0
> ru_nsignals 0
> ru_nvcsw 14382
> ru_nivcsw 1306
> cpu 68.660
> mem 23.400
> io 0.101
> iow 0.000
> maxvmem 1.338G
> arid undefined
>
>
> Why does the exit status=1 if the job didn't fail?
>
> -Bill
> _______________________________________________
> users mailing list
> users at gridengine.org
> https://gridengine.org/mailman/listinfo/users