[gridengine users] A couple of questions...

Jesse Becker beckerjes at mail.nih.gov
Wed Jun 29 14:53:29 UTC 2011


On Wed, Jun 29, 2011 at 10:13:10AM -0400, Vic wrote:
>
>> Thus, you can run as many Quartus jobs as you want, but when someone
>> wants to run something else, it will go to the Pint queue, and suspend
>> the Quartus jobs if it needs to.
>
>OK, I think I've misunderstood something.

Or I have--that's more likely. :)

Let's say we have 10 compute nodes, each with 10 cores, for 100 total
cores.  Both queues, Quart and Gallon, have 1 slot per CPU core on each
exec host (so 100 slots per queue, 200 total slots between both queues).
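
For concreteness, the two queue definitions might look roughly like this
(a sketch, not your actual config; I'm guessing at queue names
quart.q/gallon.q and an @allhosts hostgroup):

   # qconf -sq quart.q      (abbreviated)
   qname      quart.q
   hostlist   @allhosts
   slots      10

   # qconf -sq gallon.q     (abbreviated)
   qname      gallon.q
   hostlist   @allhosts
   slots      10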

>Users A and B submit runs to the Quart queue. The queue is, to all intents
>and purposes, full.

User A submits a Quartus job that takes 50 slots.
User B submits two Quartus jobs that take 30 slots each (60 total).

The Quart queue is now full, with 10 jobs waiting for resources.  Qstat
should report something like this:

JobID JobName State User
----- ------- ----- ----
1     Alpha     r   A
2     Alpha     r   A
3     Alpha     r   A
<etc>
51    Bravo     r   B
52    Bravo     r   B
<...>
80    Bravo     r   B
81    Charlie   r   B
<...>
99    Charlie   r   B
100   Charlie   r   B
101   Charlie   qw  B
102   Charlie   qw  B
<...>
110   Charlie   qw  B


>User C could submit a different job preempting both, but user C is on
>holiday today, so we can ignore him.

He'll come back from holiday in a few minutes...

>User A has to perform another run. He submits another batch of jobs to the
>Quart queue.

Sure.  At this point, you've not given enough information to determine
what will actually happen, but there are a few basic options:

1)  There's no ticketing configuration, so SGE will operate as a FIFO.
Thus, User A's second batch of jobs will have to wait for User B's jobs
to finish before they can run (there's a quick way to check for this;
see the note after the listing).  Qstat would show something like this
(for the queued jobs only):

JobID JobName State User
----- ------- ----- ----
101   Charlie   qw  B
102   Charlie   qw  B
<...>
110   Charlie   qw  B
111   Delta     qw  A
112   Delta     qw  A
113   Delta     qw  A
<etc>
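
You can tell you're in this mode by looking at the scheduler config; with
the ticket weights at zero (a common stock default), pending jobs are
ordered essentially by submission time.  A quick check, as a sketch:

   # qconf -ssconf | grep weight_tickets
   weight_tickets_functional    0
   weight_tickets_share         0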



2)  There *is* some sort of fairshare or share tree configuration, which
will try to reorder the *pending* jobs such that the job distribution is
"balanced".  If you allocate tickets evenly (say 100 tickets to each
user), then the second batch from User A will probably run before the
remnants of User B's "Charlie" jobs, because User B is using more than
50% of the active resources (a configuration sketch follows the listing):

JobID JobName State User
----- ------- ----- ----
111   Delta     qw  A
112   Delta     qw  A
113   Delta     qw  A
101   Charlie   qw  B
102   Charlie   qw  B
<...>
110   Charlie   qw  B
<etc>
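
The usual way to get that behaviour is the functional ticket policy.  A
minimal sketch, using the 100-tickets-each example from above (the
attribute names are real sched_conf/user fields, the numbers are made
up):

   # turn on the functional policy
   qconf -msconf
       weight_tickets_functional   10000

   # give each user an equal functional share
   qconf -muser A
       fshare   100
   qconf -muser B
       fshare   100

With equal shares the scheduler tries to keep each active user at
roughly half of the busy slots, which is what pushes A's Delta jobs
ahead of B's remaining Charlie jobs in the pending list.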

Other, more complicated scheduling options are possible, of course.



Now let's say that User C comes back from holiday, and runs a
NON-Quartus job.  That will go into the Gallon queue, and possibly
suspend one or more of the Quartus jobs.  When it is done, the Quartus
job(s) will resume where they left off.
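
The suspend/resume behaviour comes from queue subordination.  As a
sketch, again assuming the queue names above and a threshold of 1 (so
any Gallon job on a host triggers the suspension):

   # qconf -sq gallon.q     (abbreviated)
   qname             gallon.q
   slots             10
   subordinate_list  quart.q=1

When a slot in gallon.q is in use on a host, the quart.q instance on
that host is suspended (SIGSTOP to the jobs by default), and it resumes
once the Gallon job finishes.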


>Those jobs will be queued pending the completion of the previous ones, no?
>That's the situation I'm trying to avoid.

You've run out of resources.  What do you expect? 

You could try to delay this by oversubscribing the Quart queue, such
that you have more slots than CPUs (say 2 slots for each CPU).  That
would certainly let you "run" more jobs at the same time, but they will
all take slightly more than twice as long to complete.
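
Mechanically, that's just raising the slot count on the queue; a sketch
using the hypothetical quart.q from above:

   # qconf -mq quart.q
       slots   20        # 2 slots per core on a 10-core host

The extra jobs start immediately, but they're all time-slicing the same
10 cores.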

>
>> Or you could buy more compute nodes.
>
>No, I couldn't. I'm just a contractor.

You can recommend that they buy more compute nodes then.  If what you've
described is a common occurrence, then they are running at close to
capacity, and this is preventing users from getting work done.

If you don't already have usage metrics for your cluster, I suggest that
you start collecting them.  There are lots of good tools out there for
this, including Ganglia, Cacti, munin, or even SGE's built-in
reporting functions.  Ganglia is pretty straightforward to install, and
makes lots of pretty pictures you can use as a justification for more
hardware.
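
For the SGE side specifically, the accounting log will already tell you
how busy things are.  For example (real qacct flags; the 30-day window
is just an example):

   # per-user CPU/wallclock usage over the last 30 days
   qacct -o -d 30

If that regularly shows every slot accounted for, that's the graph-free
version of the justification.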

>> Sure, we've the same issue here as well.  Not all jobs are "short," but
>> we try.
>
>Well, we have a slightly different situation in that jobs are only ever
>short if they fail spectacularly; long jobs are the norm. And this isn't
>something we can change.

Understood.


-- 
Jesse Becker
NHGRI Linux support (Digicon Contractor)


