[gridengine users] Having trouble installing SGE on a new execution host

Mun Johl mun at apeirondata.com
Mon Jan 2 00:44:29 UTC 2017


Thanks for your reply.
See my inline comments below.

On Sun, Jan 1, 2017 at 3:49 PM, Reuti <reuti at staff.uni-marburg.de> wrote:

> Hi,
> On 02.01.2017 at 00:05, Mun Johl wrote:
> > Hi,
> >
> > Someone had installed SGE on our servers over a year ago (that person is
> > now gone).  However, we now need to install SGE on a new execution host, so
> > I downloaded the ge2011.11.tar tarball.
> Was this the version which was installed on the other machines?

[Mun] Yes.

> >  After setting up the SGE_ROOT var, etc. I ran 'install_execd'.  When I
> > was queried about the Grid Engine cell, I selected [default] and the
> > following error was displayed:
> >
> >    Obviously there was no qmaster installation yet!
> >    Call >install_qmaster<
> >    on the machine which shall run the Grid Engine qmaster
> >
> > However, our qmaster _is_ installed and running.
> >
> > I noticed the ge2011.11.tar tarball did not include a 'default'
> > directory, which it seems the installation script is trying to access.
> > There was nothing in the instructions that I found indicating I am to set up
> > that directory ahead of time.  I had assumed the installation process would
> > set up that directory.
> >
> > How can I properly set up the 'default' directory so that I can correctly
> > install the execution host?
> In most cases there is no need to "install" anything on an additional
> exechost when you already have a working cluster.

[Mun] Really?  I was basically trying to follow the old "Sun N1 Grid
Engine 6.1 Installation Guide" instructions to install an Execution Host
from the following URL:

> - Prepare a proper /etc/hosts or NIS or similar on the new machine, so that
> all machines in the cluster are known to it (and the old machines
> should also be able to resolve the new one)
> - Mount /opt/sge or /usr/sge on the new exechost

[Mun] When SGE was initially installed, a common mount was not used.
SGE_ROOT is local to each host.  It doesn't "feel" right to copy
$SGE_ROOT/default from a working host to the new host; but I don't know how
to get that directory on the new host otherwise.
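
The qmaster error quoted earlier is consistent with install_execd not finding the cell's common directory on the new host. Below is a minimal sketch of a pre-flight check. It assumes act_qmaster and bootstrap are the files that matter: both exist under $SGE_ROOT/default/common on an installed host, but which files the installer actually tests is an assumption here. check_sge_cell is a hypothetical helper, not part of SGE.

```shell
#!/bin/sh
# Hypothetical helper: list files missing from <root>/<cell>/common.
# Assumption: act_qmaster and bootstrap are the files of interest.
check_sge_cell() {
    root=$1
    cell=${2:-default}
    common="$root/$cell/common"
    missing=0
    for f in act_qmaster bootstrap; do
        if [ ! -f "$common/$f" ]; then
            echo "missing: $common/$f"
            missing=1
        fi
    done
    # Returns 0 when both files exist, 1 otherwise.
    return "$missing"
}

# Example usage on the new host:
#   check_sge_cell "$SGE_ROOT" default && echo "cell directory looks populated"
```

Running it on a working host and on the new host shows what differs between the two cell directories before anything is copied or mounted.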

> - Copy $SGE_ROOT/default/common/sgeexecd to /etc/init.d
> Depending on the startup of services you need either:
> # /etc/init.d/sgeexecd start
> # chkconfig --add sgeexecd
> or
> # systemctl daemon-reload
> # systemctl start sgeexecd.service
> # systemctl enable sgeexecd.service
> BTW: Is tmpdir in the queue definition just /tmp, or do you also need an
> additional /scratch or similar on the new machine?

[Mun] I don't understand this question, sorry.  Are you referring to the
SGE queues?
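
For context, tmpdir is an attribute in a queue's configuration, and `qconf -sq <queue>` prints that configuration one attribute per line. Below is a minimal sketch for pulling the value out. queue_tmpdir is a hypothetical helper, and all.q is only an assumed queue name.

```shell
#!/bin/sh
# Hypothetical helper: read `qconf -sq <queue>` output on stdin and
# print the value of the tmpdir attribute.
queue_tmpdir() {
    awk '$1 == "tmpdir" { print $2 }'
}

# Example usage on a host with the SGE admin commands available
# (all.q is an assumed queue name):
#   qconf -sq all.q | queue_tmpdir
```

Whatever path this prints would need to exist on the new execution host as well.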



> -- Reuti