[gridengine users] -notify and killing jobs
reuti at staff.uni-marburg.de
Mon Mar 6 11:29:05 UTC 2017
> On 06.03.2017 at 10:36, Julien Nicoulaud <julien.nicoulaud at gmail.com> wrote:
> I run jobs with -notify and a long notify time of 30 minutes, as the jobs can have a very long cleanup.
> This works fine: when using "qdel", USR2 is sent and handled by my jobs.
> But in some cases, I would like to force kill the job immediately (by sending the KILL signal).
> I cannot find any way to do this. Any ideas?
Unfortunately, -notify has no y/n option, so its setting can't be changed with `qalter`. There are two similar ways to remove such jobs anyway:
1. Abuse a checkpointing interface to kill the job by rescheduling it (the checkpointing environment must be attached to the queue and requested at job submission).
$ qconf -sckpt killer
The running job can be checkpointed with `qmod -sj <job_id>`; this sends SIGKILL to the job and reschedules it. Once it is waiting again, you can use the usual `qdel` to remove it from the waiting list.
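As a rough sketch, the sequence for this first option could be wrapped as below. The environment name `killer` is taken from above; the script name and job ID passed in are placeholders, and the `run` helper with its `DRY_RUN` switch is purely illustrative (it prints the commands by default instead of executing them).

```shell
#!/bin/sh
# Sketch only: force-kill a -notify job via a checkpointing environment.
# Assumptions: the environment is called "killer" and is attached to the
# queue; the job script and job ID are placeholders supplied by the caller.
# DRY_RUN=1 (the default) prints the commands; DRY_RUN=0 executes them.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Submit the job with the checkpointing environment requested up front.
submit_killable() {
    run qsub -notify -ckpt killer "$1"
}

# Suspend the job (which kills and reschedules it), then delete it.
force_kill() {
    run qmod -sj "$1"
    run qdel "$1"
}
```

In a real run, check with `qstat` that the job is waiting again before issuing the `qdel`; otherwise it may still hit the running job and trigger the usual notify delay.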
(2. Not optimal: submit the jobs with "-r y" and reschedule them with `qmod -rj <job_id>`. Once a job is waiting again, you can use `qdel` on it. However, the processes will continue on the node although the job has vanished from the job list. There were earlier discussions on this list noting that it takes some time until they really cease operation.)
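For comparison, the second option might look like the following sketch. Again, the script name and job ID are placeholders, and the `run` helper with `DRY_RUN` is a hypothetical convenience that only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Sketch only: remove a -notify job by marking it rerunnable and
# rescheduling it. The job script and job ID are placeholders.
# DRY_RUN=1 (the default) prints the commands; DRY_RUN=0 executes them.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Submit the job as rerunnable so that it can be rescheduled later.
submit_rerunnable() {
    run qsub -notify -r y "$1"
}

# Reschedule the running job, then delete the (again) waiting job.
reschedule_and_delete() {
    run qmod -rj "$1"
    run qdel "$1"
}
```

As noted above, the processes may keep running on the node for a while after the job has left the list, so this route does not guarantee an immediate kill.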