[gridengine users] RoundRobin scheduling among users

Reuti reuti at staff.uni-marburg.de
Mon Jan 25 21:17:16 UTC 2016


On 25.01.2016 at 20:34, Skylar Thompson wrote:

> Yep, we use functional tickets to accomplish this exact goal. Every user
> gets 1000 functional tickets via auto_user_fshare in sge_conf(5), though
> your exact number will depend on the number of tickets and weights you have
> elsewhere in your policy configuration.

Also, the weight of the waiting time should be set to 0, and the urgency should be given less importance (by default the complex configuration assigns an urgency of 1000 per slot, which means a job requesting more slots is treated as more important):

weight_user                       0.900000
weight_project                    0.000000
weight_department                 0.000000
weight_job                        0.100000
weight_tickets_functional         1000000
weight_tickets_share              0
share_override_tickets            TRUE
share_functional_shares           TRUE
max_functional_jobs_to_schedule   200
report_pjob_tickets               TRUE
max_pending_tasks_per_job         50
halflife_decay_list               none
policy_hierarchy                  F
weight_ticket                     1.000000
weight_waiting_time               0.000000
weight_deadline                   3600000.000000
weight_urgency                    0.100000
weight_priority                   1.000000
max_reservation                   32
default_duration                  8760:00:00
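
To illustrate the dispatch order this kind of setup is aiming for, here is a toy model in Python. It is only a sketch and not the actual Grid Engine ticket computation: the function name `fair_dispatch`, the rule "start the pending job whose owner has the fewest running jobs", and the assumption that all jobs take the same time are mine, chosen purely for illustration.

```python
from collections import Counter, deque


def fair_dispatch(submissions, capacity):
    """Return the order in which jobs start under a toy equal-share rule.

    submissions: list of (user, job_id) tuples in submission order.
    capacity:    number of jobs that can run at the same time.

    Assumption (not Grid Engine behaviour): every job takes the same
    time, so running jobs finish in the order they were started.
    """
    pending = list(submissions)
    running = deque()    # jobs currently running, oldest first
    active = Counter()   # number of running jobs per user
    started = []

    while pending:
        if len(running) == capacity:
            # The oldest running job finishes and frees its place.
            user, _ = running.popleft()
            active[user] -= 1
        # Pick the pending job whose owner has the fewest running jobs.
        # min() returns the first minimum, so ties go to the earlier
        # submission.
        job = min(pending, key=lambda j: active[j[0]])
        pending.remove(job)
        running.append(job)
        active[job[0]] += 1
        started.append(job)
    return started


if __name__ == "__main__":
    jobs = [("alice", i) for i in range(4)] + [("bob", i) for i in range(4)]
    for user, job_id in fair_dispatch(jobs, capacity=2):
        print(user, job_id)
```

With Alice's four jobs queued ahead of Bob's four and room for two jobs at a time, this model starts the jobs alternately for alice and bob, whereas plain FIFO would start all of Alice's jobs first.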

-- Reuti


> On Mon, Jan 25, 2016 at 11:25:53AM -0800, Christopher Heiny wrote:
>> 
>> Hi all,
>> 
>> We've been using GridEngine for several years now, currently OGS
>> 2011.11p1 on Fedora 20 installed from Fedora RPMs.  Our job mix is
>> mostly embarrassingly parallel - we use array jobs to dispatch up to 100
>> tasks, each of which might require 1, 16, 32, or 64 cores.  Each job
>> takes up a significant amount of our cluster, so that only a few jobs
>> are typically active at one time, though there may be a couple of dozen
>> pending.  Up until recently, FIFO scheduling has worked fine for us.  But now
>> one of the groups has come to me with a request for a refinement in the
>> scheduling.  Here's the scenario:
>> 
>> The team has started submitting batches of 10 to 20 such jobs at a time.
>> While the work isn't done until all 20 jobs complete, analysis of
>> results can start as soon as the first job completes.  With default
>> scheduling, if Alice qsubs her 10 jobs first, and Bob qsubs his jobs a
>> minute later, Bob still needs to wait a couple of hours to get the
>> results from his first job.
>> 
>> Since there's only so much time Bob can spend at the foosball table
>> before the Big Cheese starts thinking he's goofing off, the team has
>> requested that I implement scheduling such that Bob's first job(s)
>> will run as soon as possible (probably after one of Alice's early jobs
>> has completed), and the cluster resources will be (very roughly) split
>> between the two of them.  And if Carol comes along with a similar set of
>> jobs, that her first job(s) would run as soon as one of Alice's or Bob's
>> finishes, and then cluster resources would be split three ways (very
>> roughly) between them.  And similarly if David comes along and adds his
>> jobs to the mix, then resources get roughly split four ways.
>> 
>> Anyway, I *think* this can be achieved using user functional tickets.
>> Is that a reasonable assumption?  Or is job submission time going to
>> jump in there and mess things up in some way?
>> 
>> As a related question, is it possible to set a default number of tickets
>> per user?  Since all users are on a level playing field, this would
>> eliminate hassles of adding/editing new users as they join the
>> organization.
>> 
>> 					Thanks very much!
>> 						Chris
>> 
>> 
>> 
>> _______________________________________________
>> users mailing list
>> users at gridengine.org
>> https://gridengine.org/mailman/listinfo/users
> 
> -- 
> -- Skylar Thompson (skylar2 at u.washington.edu)
> -- Genome Sciences Department, System Administrator
> -- Foege Building S046, (206)-685-7354
> -- University of Washington School of Medicine




