[gridengine users] Resume a job after machine reboot

Reuti reuti at staff.uni-marburg.de
Tue Jul 23 12:01:37 UTC 2013


On 23.07.2013 at 13:46, Guillermo Marco Puche wrote:

> I need to restart my machine.

You mean an exechost - right?


> However, I have a long job that has already been running for some weeks through Grid Engine.
> Is there any way to pause the job, restart the machine and then resume the job so I don't lose the progress?

There is nothing inside SGE which provides such a facility.

Nevertheless, if your application already provides some form of checkpoint and restart capability outside of SGE, then SGE can be configured to support it by defining a checkpointing environment (see `man checkpoint` and `man sge_ckpt`).
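
To illustrate what "checkpointing provided by your application" could look like, here is a minimal sketch in Python. It is not part of SGE and is only an assumption about how such an application might be structured: the file name `state.json`, the functions `load_state`, `save_state` and `run`, and the `stop_after` parameter (which merely simulates an interruption such as a reboot) are all hypothetical.

```python
import json
import os
import tempfile

CKPT_FILE = "state.json"  # hypothetical checkpoint file name


def load_state(path):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as fh:
            return json.load(fh)
    return {"next_step": 0, "total": 0}


def save_state(path, state):
    """Write the checkpoint via a temporary file and an atomic rename,
    so an interruption cannot leave a half-written checkpoint behind."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump(state, fh)
        fh.flush()
        os.fsync(fh.fileno())
    os.replace(tmp, path)


def run(path, n_steps, stop_after=None):
    """Sum the integers 0..n_steps-1, saving a checkpoint after every step.

    stop_after (hypothetical, for demonstration only) ends this invocation
    after that many steps and returns None, as if the host had gone down.
    """
    state = load_state(path)
    done = 0
    while state["next_step"] < n_steps:
        if stop_after is not None and done >= stop_after:
            return None  # simulated interruption
        state["total"] += state["next_step"]
        state["next_step"] += 1
        save_state(path, state)
        done += 1
    return state["total"]


if __name__ == "__main__":
    print(run(CKPT_FILE, 1000))
```

Started a second time, such a program picks up at the step recorded in the checkpoint file instead of starting from zero. A job built like this could then be submitted with `qsub -ckpt <name>`, assuming a matching checkpointing environment was defined beforehand with `qconf -ackpt`. A job that is already running without any such capability cannot gain it afterwards.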

-- Reuti


> Thank you very much.
> 
> Best regards,
> Guillermo.




