[gridengine users] SGE-6.2u5: Problem with scheduling of array tasks of multiple jobs.

Erik Soyez e.soyez at science-computing.de
Tue Jun 23 15:48:57 UTC 2015

Hello William,

many thanks for your quick reply!  Okay, let me be more specific:

From the scheduler's point of view the jobs are identical.  They have no
"-l" resource requirements.  And yes, "load" is the only criterion that
restricts access to a node.  No nodes are in an alarm state.  Any other ideas?
I'm almost sure it has to do with "job_load_adjustments", I just cannot
prove it yet...  :-)
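In case it helps: the load-adjustment knobs live in the scheduler
configuration and can be read with qconf (parameter names per
sched_conf(5); the values shown in the comments are the usual SGE
defaults, not necessarily ours):

```shell
# Dump the scheduler configuration and pick out the load-adjustment settings.
qconf -ssconf | egrep 'job_load_adjustments|load_adjustment_decay_time'
# job_load_adjustments         np_load_avg=0.50
# load_adjustment_decay_time   0:7:30
```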

Best regards

Erik Soyez.

On Tue, 23 Jun 2015, William Hay wrote:

> On Tue, 23 Jun 2015 12:33:27 +0000 Erik Soyez <e.soyez at science-computing.de> wrote:
>> Hello,
>> we have a very peculiar array job scheduling problem.
>> Cluster:  Very heterogeneous SGE cluster with different types of
>> workstations.
>> Problem:  One array job can use more job slots than two array jobs
>> together.
>> Example:  User A submitted an array job, 64 array tasks were running
>> at the same time.  Then user B submitted a similar array job and
>> after a while each of the jobs was running with approx 28 array
>> tasks.  According to the users the problem gets worse with each array
>> job.  Terminating one of the jobs immediately leads to the normal
>> usage again.
>> Question:  Does anybody know if the "job_load_adjustments" of array
>> jobs depend only on the number of tasks, or if the number of jobs is
>> taken into account as well?
> It should be just array tasks.  Is load the only criterion that
> restricts access to a node in your cluster?
> Is there any clue from looking at the actual nodes?  Do certain nodes
> go into alarm with fewer jobs?
>> From my point of view limits, quotas, etc. cannot be the cause of the
>> problem because then the total of job slots being used by two jobs
>> could not be less than the job slots being used by one single job.
>> The scheduler configuration though was not made for short array jobs
>> but for longer running parallel jobs on interactively used
>> workstations.
> You say the jobs are similar, but that isn't the same as identical,
> and this might explain the difference.  With identical jobs grid engine
> will obtain optimal packing of jobs onto nodes simply by fitting as many
> jobs onto each node as it can.  For jobs with differing requirements
> and nodes with different resources there is no simple strategy for
> doing this (indeed it smells like an NP-hard problem to me: it is
> essentially multi-dimensional bin packing).
> Imagine two nodes with consumables as shown:
> nodeA
> h_vmem=4G diskspace=30G
> nodeB
> h_vmem=6G diskspace=20G
> Array job 1 requests 2G h_vmem and 15G of diskspace
> Array job 2 requests 3G of h_vmem and 10G of diskspace
> If a task from array job 1 is scheduled to nodeB there isn't room for
> another task of either job on that node (diskspace shortage).  If a task
> from array job 2 is scheduled to nodeA there isn't room for another
> task of either job on that node (h_vmem shortage).  On the other hand you
> can fit two tasks of array job 1 on nodeA and two tasks of array job 2
> on nodeB.
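The packing arithmetic in the example above can be checked with a short
Python sketch (the resource numbers are taken from the example; the
helper functions are mine, not anything Grid Engine provides):

```python
# Check the packing example: two nodes, two array jobs, consumables in GB.
nodes = {
    "nodeA": {"h_vmem": 4, "diskspace": 30},
    "nodeB": {"h_vmem": 6, "diskspace": 20},
}
jobs = {
    "job1": {"h_vmem": 2, "diskspace": 15},
    "job2": {"h_vmem": 3, "diskspace": 10},
}

def fits(free, req):
    """True if a task with requirements `req` fits into `free` resources."""
    return all(free[r] >= req[r] for r in req)

def after(node, job):
    """Resources left on `node` after placing one task of `job`."""
    return {r: nodes[node][r] - jobs[job][r] for r in nodes[node]}

# Cross placement: job1 task on nodeB, job2 task on nodeA -> 2 tasks total.
b_left = after("nodeB", "job1")   # diskspace drops to 5G
a_left = after("nodeA", "job2")   # h_vmem drops to 1G
cross_blocked = (not fits(b_left, jobs["job1"]) and not fits(b_left, jobs["job2"])
                 and not fits(a_left, jobs["job1"]) and not fits(a_left, jobs["job2"]))
print("cross placement blocks further tasks:", cross_blocked)   # True

# Matched placement: two job1 tasks on nodeA, two job2 tasks on nodeB -> 4 tasks.
a_two = {r: nodes["nodeA"][r] - 2 * jobs["job1"][r] for r in nodes["nodeA"]}
b_two = {r: nodes["nodeB"][r] - 2 * jobs["job2"][r] for r in nodes["nodeB"]}
print("two job1 tasks fit on nodeA:", all(v >= 0 for v in a_two.values()))  # True
print("two job2 tasks fit on nodeB:", all(v >= 0 for v in b_two.values()))  # True
```

So the interleaved (cross) placement runs only two tasks while the
matched placement runs four, which mirrors the behaviour the users see.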


