[gridengine users] Q owner powers?
Harry Mangalam
harry.mangalam at uci.edu
Tue Feb 7 21:17:44 UTC 2012
Hi Reuti - Thanks for the quick reply - yes, the jobs do get suspended
if the Q instance gets suspended (I made a mistake in checking this),
but there seemed to be no way to kill them as a last resort.
I'll check the checkpointing link; that sounds like a way to handle
the problem more elegantly.
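
For the "kill as a last resort" case, the custom suspend method Reuti
suggests could be sketched roughly as below. This is only a sketch under
assumptions, not a tested site configuration: the script path and the
names terminate_job, send_signal and GRACE_SECONDS are made up for
illustration, and it assumes the queue's suspend_method is set to call
the script with the $job_pid pseudo-variable described in queue_conf(5),
and that the job's processes share that PID's process group.

```shell
#!/bin/sh
# Hypothetical suspend_method script: terminate a job instead of stopping it.
# Assumed queue setting (the path is invented for this example):
#   suspend_method   /opt/sge/local/kill_on_suspend.sh $job_pid
# SGE is assumed to replace $job_pid with the job's PID before the call.

# Seconds to wait after SIGTERM before escalating to SIGKILL (assumed default).
GRACE_SECONDS=${GRACE_SECONDS:-10}

# Send a signal to the job's process group; if no such group exists,
# fall back to signalling the single PID.
send_signal() {
    sig=$1
    pid=$2
    kill -s "$sig" -- "-$pid" 2>/dev/null || kill -s "$sig" "$pid" 2>/dev/null
}

# Ask the job to exit, wait up to GRACE_SECONDS, then force it.
terminate_job() {
    pid=$1
    send_signal TERM "$pid" || return 0   # nothing left to signal
    waited=0
    while kill -0 "$pid" 2>/dev/null && [ "$waited" -lt "$GRACE_SECONDS" ]; do
        sleep 1
        waited=$((waited + 1))
    done
    if kill -0 "$pid" 2>/dev/null; then
        send_signal KILL "$pid"
    fi
    return 0
}

# Run only when a PID is passed on the command line.
if [ "$#" -ge 1 ]; then
    terminate_job "$1"
fi
```

With something like this in place, the Q owner suspending the queue
instance (e.g. via qmod -sq) would presumably end the jobs running there
rather than stop them. Unlike the checkpointing route, the job is simply
terminated; whether it gets requeued would depend on how it was
submitted.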
hjm
On Tuesday 07 February 2012 12:47:36 Reuti wrote:
> On 07.02.2012 at 20:58, Harry Mangalam wrote:
> > I run a cluster that is a mostly peaceful mix of open
> > (universally available, under SGE 6.2) and condo nodes
> > (generally open, except when the owners want them under their
> > control and available only to them). I've assigned a node Q to
> > the owner who can disable/enable & suspend/resume the QUEUES
> > according to the docs.
>
> You mean the jobs are not suspended, even though the queue
> instance they are running in got suspended?
>
> You could use a custom suspend method to kill the jobs instead of
> suspending them. Or maybe better: attach a checkpointing
> environment in a JSV. This way the jobs would stay at the top of
> the queue, if the checkpointing environment is set up to reschedule
> on suspend. You would be using the checkpointing facility only for
> the removal of the jobs from the node, i.e. for a migration.
>
> http://arc.liv.ac.uk/SGE/howto/checkpointing.html
>
> -- Reuti
>
> > Is there a mechanism to allow the Q owner to suspend or even
> > kill running JOBS or is that forbidden to the Q owner and only
> > available to the admin?
> >
> > i.e. in the following extract, argardne is the owner of the Q on
> > node claw9, but he can't kill/suspend jobs running there - he
> > can only operate on Qs.
> >
> > $ qconf -sq claws
> > qname                 claws
> > hostlist              @execlaws
> > seq_no                0
> > load_thresholds       np_load_avg=1.1
> > suspend_thresholds    NONE
> > nsuspend              1
> > suspend_interval      00:05:00
> > priority              0
> > min_cpu_interval      00:05:00
> > processors            1-4,[claw1.bduc=1-2],[claw5.bduc=1-8],[claw7.bduc=1-8], \
> >                       [claw8.bduc=1-16],[claw9.bduc=1-48]
> > qtype                 BATCH,[claw1.bduc=BATCH],[claw5.bduc=BATCH], \
> >                       [claw9.bduc=BATCH],[claw8.bduc=BATCH],[claw7.bduc=BATCH]
> > ...
> > owner_list            NONE,[claw9.bduc=argardne]
> > user_lists            arusers,[claw9.bduc=arusers],[claw5.bduc=arusers], \
> >                       [claw7.bduc=arusers],[claw1.bduc=arusers]
--
Harry Mangalam - Research Computing, OIT, Rm 225 MSTB, UC Irvine
[ZOT 2225] / 92697 Google Voice Multiplexer: (949) 478-4487
415 South Circle View Dr, Irvine, CA, 92697 [shipping]
MSTB Lat/Long: (33.642025,-117.844414) (paste into Google Maps)
--
Citizens United: Democracy on meth
- Walter Egan