[gridengine users] schedd dies and error messages
reuti at staff.uni-marburg.de
Tue May 13 19:19:35 UTC 2014
On 13.05.2014 at 14:32, Arnau Bria wrote:
>> Our schedd is dying unexpectedly since a user has sent 750000 jobs in
>> 3 job arrays.
>> Am I reaching some internal SGE limit?
> after holding 2 of those big job arrays the schedd is not dying anymore.
> So, could it be that 500000 jobs is too much even if they are submitted
> as job arrays?
Then you must have removed the "max_aj_instances" limit in SGE's configuration.
You can try "-tc" in `qsub` to limit the number of array tasks that can run at the same time, or set "max_pending_tasks_per_job" in the scheduler configuration to a lower value. Is it still set to 50?
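
As a rough sketch of the "-tc" suggestion: the snippet below only builds and prints the `qsub` command line, so it can be checked without a cluster. The task count, the concurrency cap and the script name `job.sh` are made-up values for illustration, not taken from the original setup. The `qconf` calls for inspecting the two limits are left as comments because they need a running qmaster.

```shell
#!/bin/sh
# Hypothetical values, adjust to the real array job.
TASKS=250000          # tasks in one array job (assumed: 750000 jobs / 3 arrays)
MAX_CONCURRENT=200    # assumed cap on simultaneously running tasks
JOB_SCRIPT=job.sh     # placeholder job script name

# -t defines the task range, -tc caps how many tasks run at the same time.
CMD="qsub -t 1-${TASKS} -tc ${MAX_CONCURRENT} ${JOB_SCRIPT}"

# Dry run: print the command instead of submitting it.
echo "$CMD"

# On the cluster itself, the current limits can be inspected with:
#   qconf -sconf  | grep max_aj_instances            # global configuration
#   qconf -ssconf | grep max_pending_tasks_per_job   # scheduler configuration
# and the scheduler setting can be edited with:
#   qconf -msconf
```

Once the printed command looks right, it can be run as is on a submit host.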
> Does anyone have experience running big job arrays?