[gridengine users] Suspend submission to a queue to reboot node

Gowtham sgowtham at mtu.edu
Thu Jul 4 16:34:59 UTC 2013


Just one correction to my note in the previous email. If the compute 
node in question is compute-0-0.local, then the qsub command would 
look like 

  qsub -p 1024 -pe mpi 16 -q all.q@compute-0-0.local cn-reboot.sh
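
If more than one node needs this treatment, the command can be generated
instead of typed by hand. The following is only a minimal sketch, assuming
bash and the usual queue@host form for -q; build_reboot_cmd is a
hypothetical helper name, not part of SGE. It only prints the command (a
dry run), so the output can be checked before it is run as root. The
priority, parallel environment, slot count and script name are the ones
used in this thread.

```shell
#!/bin/bash
# Hypothetical helper (not part of SGE): print the qsub command that
# would submit the reboot script to a single node's queue instance.
# Arguments: queue name, host name, slot count, path to the script.
build_reboot_cmd() {
  local queue="$1" host="$2" slots="$3" script="$4"
  printf 'qsub -p 1024 -pe mpi %s -q %s@%s %s\n' \
    "$slots" "$queue" "$host" "$script"
}

# Dry run for the node discussed above; nothing is submitted.
build_reboot_cmd all.q compute-0-0.local 16 cn-reboot.sh
```

Once the printed line looks right, it can be copied and run as root, or the
function's output can be passed to a shell by whoever is doing the
maintenance.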


Best regards,
g

--
Gowtham, PhD
Information Technology Services
Adj. Assistant Professor, Physics
Michigan Technological University

(906) 487/3593
http://it.mtu.edu


On Thu, 4 Jul 2013, Gowtham wrote:

| 
| Yes, I believe you can. Here's what I have done:
| 
|   1. Create a shell script with the following contents
|      and 755 permission:
| 
| #! /bin/bash
| #
| # cn-reboot.sh
| # Place this script in a location that is accessible
| # from all nodes.
| # BASH script to reboot a compute node via SGE.
| 
| /usr/bin/logger -p local0.alert "Rebooting via SGE at `date -R`"
| /sbin/init 6
| 
| 
|   2. Submit to the queue, as root, with the highest
|      possible priority. Suppose that your compute node
|      in question has 16 processors and belongs to all.q,
|      then the qsub would look like
| 
|      qsub -p 1024 -pe mpi 16 -q all.q cn-reboot.sh
| 
| 
| This will do the following:
| 
|   1. Submit the highest priority job on that node to the queue.
|   2. Let the currently running jobs on that node run to their 
|      completion.
|   3. Reboot that node.
|   4. Let the waiting jobs start running on that node as and
|      when each job's required resources become available.
| 
| 
| Best regards,
| g
| 
| 
| On Thu, 4 Jul 2013, Lionel SPINELLI wrote:
| 
| | Hello all,
| | 
| | I need to reboot one of the nodes of my grid. However, users continue to submit jobs and I can't ask them to stop. I have tried to suspend the queue on this node, but it seems the jobs running on it are suspended too. Is there a way to prevent new submissions to this host until the currently running jobs have finished and I have rebooted the host?
| | 
| | Thanks in advance
| | 
| | Regards
| | 
| | Lionel
| | 
| 

