[gridengine users] A couple of questions...

Jesse Becker beckerjes at mail.nih.gov
Wed Jun 29 15:30:42 UTC 2011


On Wed, Jun 29, 2011 at 11:08:04AM -0400, Vic wrote:
>
>>>That's the situation I'm trying to avoid.
>>
>> You've run out of resources.  What do you expect?
>
>What I need is to get a limited number of jobs into the running state *per
>invocation*. This needs not to be ticket-dependent; for assorted other
>reasons that's not going to work for us just yet. Additionally, that limit
>needs to be somewhat lower than the queue capacity so that several runs
>can be started simultaneously without queuing all their jobs.

Fundamentally, it seems there is a disconnect between a Quartus
"invocation" and SGE.  Each invocation spawns multiple SGE jobs.  So far
as SGE is concerned, these are all completely independent of each other.
SGE doesn't care that they came from the same invocation.

You additionally have a *business* requirement that they "start
quickly", but I'm not sure that is possible using only SGE
functionality, and I don't expect Quartus can do it either.


How about creating a wrapper for qsub that will create a new Project *on
the fly*, then create a new RQS (resource quota set, also on the fly)
that limits that project, then submit the jobs under that new project?
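
A minimal sketch of the first idea, in Python. It only builds the text
and command lines such a wrapper would need; nothing is executed. The
project name "inv_1234", the "limit_" prefix and the helper names are
made up for illustration. The project file follows the layout printed
by "qconf -sprj", and loading either file needs SGE manager rights.
Note that the RQS caps slots, which equals running jobs only when
every job uses a single slot.

```python
from typing import List


def project_definition(project: str) -> str:
    """Project file text, in the layout printed by `qconf -sprj`."""
    return (
        f"name {project}\n"
        "oticket 0\n"
        "fshare 0\n"
        "acl NONE\n"
        "xacl NONE\n"
    )


def rqs_definition(project: str, max_slots: int) -> str:
    """Resource quota set that caps the slots used by one project."""
    return (
        "{\n"
        f"   name         limit_{project}\n"
        f"   description  \"slot cap for {project}\"\n"
        "   enabled      TRUE\n"
        f"   limit        projects {project} to slots={max_slots}\n"
        "}\n"
    )


def setup_commands(project_file: str, rqs_file: str) -> List[List[str]]:
    """qconf calls that load the two files (manager privileges needed)."""
    return [
        ["qconf", "-Aprj", project_file],
        ["qconf", "-Arqs", rqs_file],
    ]


def qsub_args(project: str, job_script: str) -> List[str]:
    """Submit one job under the per-invocation project."""
    return ["qsub", "-P", project, job_script]


if __name__ == "__main__":
    # Hypothetical invocation: one project per Quartus run, capped at 10 slots.
    print(project_definition("inv_1234"))
    print(rqs_definition("inv_1234", 10))
    print(qsub_args("inv_1234", "job.sh"))
```

The wrapper would also have to clean up after the run ("qconf -drqs"
and "qconf -dprj"), or projects and quota sets will pile up.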

Or:  your wrapper could try to "count" the number of running jobs, and
place SGE job dependencies on them internally to try to throttle the
number that run at a given time.
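
One way to sketch the dependency idea, again in Python: with a limit
of N, job i is held on job i-N using "-hold_jid", so the jobs form N
chains and at most N can run at once. This is an assumption about how
the wrapper might work, not something SGE or Quartus provides. The
function names are invented, and the submit step is passed in so the
logic can be tried without a cluster. It relies on "qsub -terse"
printing only the job id.

```python
import subprocess
from typing import Callable, List, Optional, Sequence


def build_qsub_args(script: str, hold_jid: Optional[str] = None) -> List[str]:
    """qsub command line; -terse makes qsub print only the job id."""
    args = ["qsub", "-terse"]
    if hold_jid is not None:
        # The job stays on hold until the named job has finished.
        args += ["-hold_jid", hold_jid]
    args.append(script)
    return args


def submit_throttled(
    scripts: Sequence[str],
    limit: int,
    submit: Callable[[List[str]], str],
) -> List[str]:
    """Submit every script; job i waits for job i - limit to finish."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    job_ids: List[str] = []
    for i, script in enumerate(scripts):
        hold = job_ids[i - limit] if i >= limit else None
        job_ids.append(submit(build_qsub_args(script, hold)))
    return job_ids


def real_submit(args: List[str]) -> str:
    """Run qsub and return the job id it prints (needs a working SGE)."""
    result = subprocess.run(args, check=True, capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__":
    # Dry run with a stand-in submitter that just hands out fake ids.
    counter = iter(range(1, 1000))
    fake = lambda args: str(next(counter))
    print(submit_throttled(["a.sh", "b.sh", "c.sh"], 2, fake))
```

The trade-off is that each chain stalls on its own predecessor, so a
slow job can leave fewer than N running even when slots are free.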

Unfortunately, an RQS can't apply to a job_name pattern, or that might
be an option.

Even if you "reserve" a few slots to "get new jobs 'started'," that will
only last for so long, since those slots can fill up as well.  After
that, jobs will get queued again.


>Trying to change the parameters of the problem doesn't work - I can't just
>tell them they need to buy more computers, and I can't just tell them they
>need to change their expectations of when the jobs will run; either of
>these approaches will just lead to the abandonment of the exercise.

It may be that what they are asking for is not directly possible, in
which case you can propose reasonable alternatives.


-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)


