[gridengine users] Spooling for a shadow master...
j.abbott at imperial.ac.uk
Wed Mar 30 14:19:18 UTC 2011
I'm in the process of reorganising our SGE setup, and would appreciate
some guidance on a couple of points re: spooling.
Our $SGE_ROOT is exported to all hosts from a separate clustered, highly
available NFS4 server. The qmaster is using classic spooling to
SGE_ROOT/spool/qmaster. I want to add a shadow master for improved
resilience, but have got a little confused over the 'best' way of
handling spooling for this setup.
I understood it was necessary to use a BDB spooling server if using
shadow masters, but see that this is essentially being deprecated due to
BDB RPC issues. I've also read that classic spooling is OK over NFS4
(hence our current setup...), but have since seen comments that this
only applies if the qmaster is running on Solaris (our SGE
infrastructure is all CentOS 5.5).
So, adding a BDB spooling server introduces RPC security weaknesses and
a single point of failure, and looks like it will not be an option in
future releases, so it doesn't seem an ideal solution. The NFS server
currently hosting SGE_ROOT is a clustered, highly available system, so
gives us resilience, but is using NFS4 for classic spooling really safe?
It seems to be working ok for us with a single qmaster, but that may be
due to luck...
Bottom line question: What is the optimal spooling setup for using
If we do decide to move to a BDB server, can I set up the spooling server
and simply shut down the qmaster, edit
$SGE_ROOT/$SGE_CELL/common/bootstrap and restart the qmaster, or is it
more involved than that?
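
For reference, a minimal sketch of how the current spooling settings could
be read out of the bootstrap file before changing anything. It assumes the
file consists of whitespace-separated "key value" lines with keys such as
spooling_method, spooling_lib and spooling_params. The sample content, the
paths in it and the function name parse_bootstrap are illustrative
assumptions, not copied from our installation.

```python
def parse_bootstrap(text):
    """Parse 'key value' lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace
        key = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        settings[key] = value
    return settings


# Hypothetical bootstrap content for a classic-spooling qmaster (assumed values).
SAMPLE = """\
# illustrative bootstrap file
admin_user        sgeadmin
spooling_method   classic
spooling_lib      libspoolc
spooling_params   /opt/sge/default/common;/opt/sge/default/spool/qmaster
"""

if __name__ == "__main__":
    cfg = parse_bootstrap(SAMPLE)
    for key in ("spooling_method", "spooling_lib", "spooling_params"):
        print(key, "=", cfg.get(key, "<not set>"))
```

On a real system the text would come from reading
$SGE_ROOT/$SGE_CELL/common/bootstrap rather than from the SAMPLE string.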
Dr. James Abbott
Bioinformatics Software Developer
Imperial College, London