[gridengine users] Fwd: dispatching sge task from an sge task - is that a reasonable practice?

Skylar Thompson skylar2 at u.washington.edu
Thu Feb 25 22:20:30 UTC 2016


On Thu, Feb 25, 2016 at 05:09:41PM -0500, bergman at merctech.com wrote:
> So, the lockfile contains the jobID, or something similar?
> 
> Some jobs use a file as a flag (i.e. checking its existence or content), but
> we've largely avoided POSIX file locking (the mixture of NFS, GPFS, & CIFS
> here should work... but it gets complicated quickly).

The folks who have used this technique most successfully track job status
in a Postgres database, but having the job ID in a file would be OK too.
Note that if you run lots of jobs, older Grid Engine versions wrap job IDs
around at 10 million, so an ID by itself may not stay unique forever. Our
newer UGE clusters don't have this limitation, but I'm unsure when this
changed.
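
As a rough illustration of the job-ID-in-a-file approach (a sketch only, not
what any of our labs actually run), the Python below writes a small status
file by write-then-rename rather than POSIX locking. The file layout, the
function names, and the idea of storing a timestamp next to the ID to cope
with wraparound are all assumptions for the example; JOB_ID is the
environment variable Grid Engine sets inside a running job. How well rename
behaves on a given network filesystem is something to check locally.

```python
import json
import os
import tempfile
import time


def write_status(path, state, job_id=None):
    """Record job state in `path` via write-then-rename instead of file locking."""
    record = {
        # JOB_ID is set by Grid Engine inside a running job; "unknown" is a fallback.
        "job_id": job_id or os.environ.get("JOB_ID", "unknown"),
        "state": state,
        # Assumption: a timestamp helps tell apart jobs that reuse an ID after wraparound.
        "updated": time.time(),
    }
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, then rename it into place.
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".status-")
    with os.fdopen(fd, "w") as handle:
        json.dump(record, handle)
    os.replace(tmp, path)
    return record


def read_status(path):
    """Return the recorded status, or None if no status file exists yet."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except FileNotFoundError:
        return None
```

A prerequisite job would call write_status("step1.status", "done") as its
last step, and the dependent job would check read_status("step1.status")
before doing any work (the file name is hypothetical).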
 
> Our better 'chained' jobs use the SGE "hold" feature, some are launched as
> array jobs, and some (the really ugly ones) loop within a shell script,
> checking for files that indicate that a prerequisite job finished,
> checking for errors in the prereq, or looping over 'qstat' to check whether
> a specific jobid has completed.

Yep, we recommend folks use -hold_jid whenever possible. Some of our labs
also use DRMAA, but that obviously introduces even more complexity.
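
For anyone who wants to script the -hold_jid approach, here is a minimal
Python sketch that shells out to qsub. The helper names (build_qsub,
parse_job_id, submit) are made up for the example. It relies on qsub's
-terse option, which prints only the job ID, and on -hold_jid accepting a
comma-separated list of job IDs.

```python
import subprocess


def build_qsub(script, hold_on=()):
    """Build a qsub command; `hold_on` lists job IDs that must finish first."""
    cmd = ["qsub", "-terse"]  # -terse makes qsub print only the job ID
    if hold_on:
        cmd += ["-hold_jid", ",".join(str(j) for j in hold_on)]
    cmd.append(script)
    return cmd


def parse_job_id(terse_output):
    """Extract the numeric job ID; array jobs print e.g. '12345.1-10:1'."""
    return terse_output.strip().split(".")[0]


def submit(script, hold_on=()):
    """Submit `script` and return its job ID (requires qsub on PATH)."""
    result = subprocess.run(build_qsub(script, hold_on),
                            check=True, capture_output=True, text=True)
    return parse_job_id(result.stdout)
```

A two-step chain would then be first = submit("prep.sh") followed by
submit("analyze.sh", hold_on=[first]), where both script names are
hypothetical. The scheduler holds the second job until the first leaves the
queue, so nothing has to poll qstat.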

-- 
-- Skylar Thompson (skylar2 at u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


