[gridengine users] qalter not successful

Kevin Buckley kevin.buckley.ecs.vuw.ac.nz at gmail.com
Wed Jun 13 22:42:18 UTC 2012


On 14 June 2012 10:36, Reuti <reuti at staff.uni-marburg.de> wrote:
>> I am hoping that Reuti, or someone else, can enlarge upon the suggestion
>> above by asking what happens if you start up the execd again?

>> What communication between the node and the master is still left hanging around
>> if you "softstop" the execd?
>
> None any longer. The sgeexecd is gone; you could also send a SIGKILL to the sgeexecd directly.
>
>
>> Is it possible to alter the local execd configuration so that a new
>> instance could be started and have the node then accept other tasks,
>> whilst still retaining the original communication ports?
>
> Never tried it (so, no guarantee): before starting the execd again, you may
> need to change the location of the execd's spool directory (i.e. in the
> local host configuration: `qconf -mconf node17` or similar, then set "execd_spool_dir").
> Then it won't see the former jobs. After the long-running job has ended, it needs to
> be removed from the qmaster's list of jobs with `qdel -f 1234`.
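
For reference, the untested sequence described above could be sketched roughly as follows. This is a dry run that only prints the commands rather than executing them. The function name `plan_steps`, the fallback `/opt/sge` and `default` values, and the example spool path are hypothetical placeholders; only `qconf -mconf`, `execd_spool_dir` and `qdel -f` come from the message itself, and the `sgeexecd` script location assumes a standard `$SGE_ROOT/$SGE_CELL/common` layout.

```shell
#!/bin/sh
# Dry-run sketch: prints the steps, does NOT execute them.
# Assumption: the sgeexecd startup script lives under $SGE_ROOT/$SGE_CELL/common;
# the /opt/sge and "default" fallbacks are placeholders only.
SGEEXECD="${SGE_ROOT:-/opt/sge}/${SGE_CELL:-default}/common/sgeexecd"

# plan_steps NODE JOB_ID NEW_SPOOL_DIR  (hypothetical helper name)
plan_steps() {
    node=$1; job=$2; spool=$3
    cat <<EOF
# 1. On $node: stop the execd but leave the running job alone
$SGEEXECD softstop
# 2. Edit the local host configuration; in the editor set:
#      execd_spool_dir $spool
qconf -mconf $node
# 3. On $node: start a fresh execd, which should not see the former job
$SGEEXECD start
# 4. Once the long-running job has ended, clear it from the qmaster
qdel -f $job
EOF
}

# Example values taken from the message; the spool path is made up.
plan_steps node17 1234 /var/spool/sge-alt
```

Printing the plan first seems the safer choice here, given that the procedure comes with no guarantee and that `qconf -mconf` is interactive anyway.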

Righto, cheers.

Something to have a play around with, though, and there are a few hours until
the job in question gets the signal.

I'll let you know what happens,
Kevin
ECS, VUW, NZ

