[gridengine users] Job reservations not being applied (-R y)

Stuart Barkley stuartb at 4gh.net
Fri Feb 28 17:14:18 UTC 2020

We have a variety of job types running on our system.  Some are short
single-core jobs and others are long multicore jobs.  To keep the
large jobs from being starved we use job reservations.  All jobs are
submitted with appropriate h_rt values.

While trying to diagnose some apparent scheduling issues I turned on
schedule monitoring (params MONITOR=1), but I'm not seeing
reservations being applied for many of the jobs.

    % qconf -ssconf
    algorithm                         default
    schedule_interval                 0:00:45
    maxujobs                          0
    queue_sort_method                 seqno
    job_load_adjustments              NONE
    load_adjustment_decay_time        0:7:30
    load_formula                      m_core-slots
    schedd_job_info                   true
    flush_submit_sec                  5
    flush_finish_sec                  30
    params                            MONITOR=1
    reprioritize_interval             0:0:0
    halftime                          168
    usage_weight_list                 cpu=0.500000,mem=0.500000,io=0.000000
    compensation_factor               5.000000
    weight_user                       0.250000
    weight_project                    0.250000
    weight_department                 0.250000
    weight_job                        0.000000
    weight_tickets_functional         10000000
    weight_tickets_share              0
    share_override_tickets            TRUE
    share_functional_shares           TRUE
    max_functional_jobs_to_schedule   200
    report_pjob_tickets               TRUE
    max_pending_tasks_per_job         50
    halflife_decay_list               none
    policy_hierarchy                  OF
    weight_ticket                     2.000000
    weight_waiting_time               0.000050
    weight_deadline                   3600000.000000
    weight_urgency                    0.000000
    weight_priority                   10.000000
    max_reservation                   50
    default_duration                  168:00:00

Submitted jobs:

    % qsub -R y -p 500 -l hostname=bc130 do-sleep
    Your job 7684269 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc131 do-sleep
    Your job 7684270 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc132 do-sleep
    Your job 7684271 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc133 do-sleep
    Your job 7684272 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc134 -pe thread 24 do-sleep
    Your job 7684280 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc135 -pe thread 24 do-sleep
    Your job 7684281 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc136 -pe thread 24 do-sleep
    Your job 7684282 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc137 -pe thread 24 do-sleep
    Your job 7684283 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc138 -pe thread 12 do-sleep
    Your job 7684286 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc139 -pe thread 12 do-sleep
    Your job 7684287 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc140 -pe thread 8 do-sleep
    Your job 7684289 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc141 -pe thread 4 do-sleep
    Your job 7684290 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc142 -pe thread 2 do-sleep
    Your job 7684292 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc143 -pe thread 1 do-sleep
    Your job 7684293 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc144 do-sleep
    Your job 7684294 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc145 -pe orte 1 do-sleep
    Your job 7684439 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc144 do-sleep
    Your job 7684589 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc144 -l exclusive do-sleep
    Your job 7684590 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc144 -l exclusive=0 do-sleep
    Your job 7684591 ("SLEEPER") has been submitted
    % qsub -R y -p 500 -l hostname=bc168 -pe thread 40 do-sleep
    Your job 7684595 ("SLEEPER") has been submitted

These submissions all request reservations with "-R y" and bump the
priority up to ensure they sit at the top of the pending list.  They
request specific hosts that are already running the large user jobs.
Job 7684595 requests all cores on a different node type that is
running many of the smaller jobs.
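
For reference, the single-core submissions above all follow one
pattern and could be generated with a small loop (a sketch; echo is
used here so the qsub commands are only printed, not submitted):

```shell
# Sketch: generate the single-core test submissions.  Drop the
# leading "echo" to actually submit on the live system.
for h in bc130 bc131 bc132 bc133; do
    echo qsub -R y -p 500 -l "hostname=$h" do-sleep
done
```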

The do-sleep script sleeps for a random interval and begins with the
following directives:

    #$ -S /bin/sh
    #$ -cwd
    #$ -j y
    #$ -m n
    #$ -N SLEEPER
    #$ -o LOGS
    #$ -p -500                      # lowest priority
    #$ -l h_rt=1:00:00              # time limit
    #$ -l h_vmem=0.5G               # memory limit
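
The actual body of the script isn't shown; a minimal sketch consistent
with "a random sleep" and the h_rt limit above (hypothetical, the real
script differs) might be:

```shell
# Hypothetical do-sleep body: pick a random duration below the
# h_rt=1:00:00 limit (awk is used because $RANDOM is not POSIX sh).
# The real script would then run: sleep "$DURATION"
DURATION=$(awk 'BEGIN { srand(); print int(rand() * 3000) + 60 }')
echo "would sleep for ${DURATION}s"
```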

Looking at /opt/sge_root/betsy/common/schedule I see:

    % cat /opt/sge_root/betsy/common/schedule | grep ':RESERVING:' | sort | uniq -c
         93 7684269:1:RESERVING:1583062196:3660:H:bc130.fda.gov:h_vmem:536870912.000000
         93 7684269:1:RESERVING:1583062196:3660:H:bc130.fda.gov:ram:536870912.000000
         93 7684269:1:RESERVING:1583062196:3660:H:bc130.fda.gov:slots:1.000000
         93 7684269:1:RESERVING:1583062196:3660:Q:short@bc130.fda.gov:exclusive:1.000000
         93 7684269:1:RESERVING:1583062196:3660:Q:short@bc130.fda.gov:slots:1.000000
         93 7684270:1:RESERVING:1583018960:3660:H:bc131.fda.gov:h_vmem:536870912.000000
         93 7684270:1:RESERVING:1583018960:3660:H:bc131.fda.gov:ram:536870912.000000
         93 7684270:1:RESERVING:1583018960:3660:H:bc131.fda.gov:slots:1.000000
         93 7684270:1:RESERVING:1583018960:3660:Q:short@bc131.fda.gov:exclusive:1.000000
         93 7684270:1:RESERVING:1583018960:3660:Q:short@bc131.fda.gov:slots:1.000000
         93 7684271:1:RESERVING:1583063682:3660:H:bc132.fda.gov:h_vmem:536870912.000000
         93 7684271:1:RESERVING:1583063682:3660:H:bc132.fda.gov:ram:536870912.000000
         93 7684271:1:RESERVING:1583063682:3660:H:bc132.fda.gov:slots:1.000000
         93 7684271:1:RESERVING:1583063682:3660:Q:short@bc132.fda.gov:exclusive:1.000000
         93 7684271:1:RESERVING:1583063682:3660:Q:short@bc132.fda.gov:slots:1.000000
         93 7684272:1:RESERVING:1583076042:3660:H:bc133.fda.gov:h_vmem:536870912.000000
         93 7684272:1:RESERVING:1583076042:3660:H:bc133.fda.gov:ram:536870912.000000
         93 7684272:1:RESERVING:1583076042:3660:H:bc133.fda.gov:slots:1.000000
         93 7684272:1:RESERVING:1583076042:3660:Q:short@bc133.fda.gov:exclusive:1.000000
         93 7684272:1:RESERVING:1583076042:3660:Q:short@bc133.fda.gov:slots:1.000000
         73 7684294:1:RESERVING:1582975134:3660:H:bc144.fda.gov:h_vmem:536870912.000000
         73 7684294:1:RESERVING:1582975134:3660:H:bc144.fda.gov:ram:536870912.000000
         73 7684294:1:RESERVING:1582975134:3660:H:bc144.fda.gov:slots:1.000000
         73 7684294:1:RESERVING:1582975134:3660:Q:short@bc144.fda.gov:exclusive:1.000000
         73 7684294:1:RESERVING:1582975134:3660:Q:short@bc144.fda.gov:slots:1.000000
         31 7684589:1:RESERVING:1582975134:3660:H:bc144.fda.gov:h_vmem:536870912.000000
         31 7684589:1:RESERVING:1582975134:3660:H:bc144.fda.gov:ram:536870912.000000
         31 7684589:1:RESERVING:1582975134:3660:H:bc144.fda.gov:slots:1.000000
         31 7684589:1:RESERVING:1582975134:3660:Q:short@bc144.fda.gov:exclusive:1.000000
         31 7684589:1:RESERVING:1582975134:3660:Q:short@bc144.fda.gov:slots:1.000000
         30 7684590:1:RESERVING:1582978794:3660:H:bc144.fda.gov:h_vmem:536870912.000000
         30 7684590:1:RESERVING:1582978794:3660:H:bc144.fda.gov:ram:536870912.000000
         30 7684590:1:RESERVING:1582978794:3660:H:bc144.fda.gov:slots:1.000000
         30 7684590:1:RESERVING:1582978794:3660:Q:short@bc144.fda.gov:exclusive:1.000000
         30 7684590:1:RESERVING:1582978794:3660:Q:short@bc144.fda.gov:slots:1.000000
         25 7684591:1:RESERVING:1582975134:3660:H:bc144.fda.gov:h_vmem:536870912.000000
         25 7684591:1:RESERVING:1582975134:3660:H:bc144.fda.gov:ram:536870912.000000
         25 7684591:1:RESERVING:1582975134:3660:H:bc144.fda.gov:slots:1.000000
         25 7684591:1:RESERVING:1582975134:3660:Q:short@bc144.fda.gov:exclusive:1.000000
         25 7684591:1:RESERVING:1582975134:3660:Q:short@bc144.fda.gov:slots:1.000000
          2 7684595:1:RESERVING:1585427949:3660:H:bc168.fda.gov:h_vmem:21474836480.000000
          2 7684595:1:RESERVING:1585427949:3660:H:bc168.fda.gov:ram:21474836480.000000
          2 7684595:1:RESERVING:1585427949:3660:H:bc168.fda.gov:slots:40.000000
          2 7684595:1:RESERVING:1585427949:3660:P:thread:slots:40.000000
          2 7684595:1:RESERVING:1585427949:3660:Q:long@bc168.fda.gov:exclusive:40.000000
          2 7684595:1:RESERVING:1585427949:3660:Q:long@bc168.fda.gov:slots:40.000000
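
The schedule file lines appear to use the layout
job:task:state:start_time:duration:level:object:resource:amount (the
3660-second durations look like the 1:00:00 h_rt plus the scheduler's
default 60-second duration offset).  Assuming that layout, a quick
per-job summary of host-level slot reservations can be pulled out with
awk; this is a sketch run against a few sample lines from above, and
on the live system you would pipe the schedule file in instead:

```shell
# Summarize host-level slot reservations, assuming the schedule-file
# layout job:task:state:start:duration:level:object:resource:amount.
OUT=$(awk -F: '$3 == "RESERVING" && $6 == "H" && $8 == "slots" {
    print $1, "reserves", $9, "slot(s) on", $7
}' <<'EOF'
7684269:1:RESERVING:1583062196:3660:H:bc130.fda.gov:slots:1.000000
7684595:1:RESERVING:1585427949:3660:H:bc168.fda.gov:slots:40.000000
7678355:1:RUNNING:1582831417:172860:H:bc135.fda.gov:slots:24.000000
EOF
)
echo "$OUT"
```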

Among others, there should have been a reservation for job 7684281 on
node bc135.  The scheduling information for bc135 shows:

    % cat /opt/sge_root/betsy/common/schedule | grep 'bc135' | sort | uniq -c
        123 7678355:1:RUNNING:1582831417:172860:H:bc135.fda.gov:h_vmem:51539607552.000000
        123 7678355:1:RUNNING:1582831417:172860:H:bc135.fda.gov:ram:51539607552.000000
        123 7678355:1:RUNNING:1582831417:172860:H:bc135.fda.gov:slots:24.000000
        123 7678355:1:RUNNING:1582831417:172860:Q:short@bc135.fda.gov:exclusive:24.000000
        123 7678355:1:RUNNING:1582831417:172860:Q:short@bc135.fda.gov:slots:24.000000
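
The epoch timestamps in those lines decode to show how long bc135
stays occupied (GNU date shown; the 172860-second duration looks like
a 48-hour h_rt plus the scheduler's 60-second default duration
offset):

```shell
# Decode the start time and duration of job 7678355, which holds all
# 24 slots on bc135 (fields 4 and 5 of the schedule line above).
START=1582831417
DURATION=172860                # 48:00:00 h_rt + 60s scheduler offset?
END=$((START + DURATION))
date -u -d "@$START" '+starts: %Y-%m-%d %H:%M UTC'
date -u -d "@$END"   '+ends:   %Y-%m-%d %H:%M UTC'
```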

It looks like most of the jobs requesting a parallel environment are
not getting reservations on nodes already running jobs in similar
parallel environments.  However, job 7684595 did get a reservation
for its parallel environment.

There are almost 400 pending jobs from other users in the queue.  Most
are from a single user requesting reservations with "-R y -pe thread
24", but they do not seem to be getting any reservations either.  My
jobs are at the top of the queue due to the priority bump.
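
Since max_reservation is 50 above, only up to 50 reservation-holding
jobs are scheduled per cycle, so the number of pending jobs that
request "-R y" seems worth counting.  A sketch of such a count,
assuming qstat -j prints a "reserve:" line as it does here on 8.1.8
(the demo below runs against sample fragments rather than live
output):

```shell
# Count "reserve: y" lines in qstat -j output; on the live system
# something like  qstat -j '*' | count_reservers  would feed it.
count_reservers() { grep -c '^reserve: *y'; }

# demo on sample qstat -j fragments:
printf 'reserve:  y\nreserve:  n\nreserve:  y\n' | count_reservers
```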

The relevant complex variables are:

    % qconf -sc | egrep 'h_vmem|ram|slots|exclusive|relop|---'
    #name               shortcut       type        relop   requestable consumable default  urgency
    exclusive           excl           BOOL        EXCL    YES         YES        0        50
    h_vmem              h_vmem         MEMORY      <=      YES         YES        2G       0
    ram                 ram            MEMORY      <=      YES         YES        0        0
    slots               s              INT         <=      YES         YES        1        100
    # >#< starts a comment but comments are not saved across edits --------

The parallel environment is:

    % qconf -sp thread
    pe_name            thread
    slots              99999
    user_lists         NONE
    xuser_lists        NONE
    start_proc_args    NONE
    stop_proc_args     NONE
    allocation_rule    $pe_slots
    control_slaves     TRUE
    job_is_first_task  TRUE
    urgency_slots      min
    accounting_summary TRUE
    qsort_args         NONE

We are running Son of Grid Engine 8.1.8.  I see issues #1552 and #1553
fixed in 8.1.9 but those don't seem relevant.

Any thoughts on what might be happening?

Stuart Barkley
I've never been lost; I was once bewildered for three days, but never lost!
                                        --  Daniel Boone
