[gridengine users] Can a 6.2u5 node talk to a 6.2u3 cluster?

Reuti reuti at staff.uni-marburg.de
Mon Sep 23 15:00:26 UTC 2013


Hi,

On 23.09.2013 at 16:39, Alan McKay wrote:

> Hey folks,
> 
> I've got an existing RHEL5 / CentOS5 cluster running 6.2u3.
> 
> I have an Ubuntu box running 13.04, and it has 6.2u5 in the apt
> repositories.  I think I have it configured properly and added
> properly to the cluster but there is a communication error.  On the
> new Ubuntu node I do "qstat -f" and after about a minute it times out
> with:
> 
> error: failed receiving gdi request response for mid=1 (got syncron
> message receive timeout error).
> 
> So I google that and after a bit of sifting I find 2 main causes.
> (1) "some kind of NFS problem" (but I can't find details on just what)
> (2) "Grid Engine is incompatible between dot releases"

There is no need to install GridEngine on the new node at all. Usually you share /usr/sge or the like, so that the node has access to the binaries and, more importantly, to the shared configuration of the cell actually in use (like /usr/sge/default/common). That way the execd knows which qmaster it has to contact.
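
To check from the new node whether the shared cell configuration is visible, something like the following could be used. It is only a minimal sketch: it assumes the cell directory is /usr/sge/default/common as in the example path above, and that the qmaster host name is recorded in a file named act_qmaster inside it (the usual place in a 6.2 installation, but please verify on your own cluster). The function name read_act_qmaster is made up for this illustration.

```python
from pathlib import Path


def read_act_qmaster(sge_root="/usr/sge", cell="default"):
    """Return the qmaster host name recorded for a cell.

    Reads <sge_root>/<cell>/common/act_qmaster and returns its first
    whitespace-delimited token. Both default arguments are assumptions
    taken from the example path in this thread.
    """
    path = Path(sge_root) / cell / "common" / "act_qmaster"
    if not path.is_file():
        # Typically this means the shared directory is not mounted here.
        raise FileNotFoundError(
            f"{path} not found; is {sge_root} shared to this node?"
        )
    tokens = path.read_text().split()
    if not tokens:
        raise ValueError(f"{path} is empty")
    return tokens[0]


if __name__ == "__main__":
    try:
        print(read_act_qmaster())
    except (FileNotFoundError, ValueError) as exc:
        print(f"cannot determine qmaster: {exc}")
```

If this fails on the new node but works on the existing ones, the share is most likely missing, and the execd there has no way to find the qmaster.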

How is your setup right now, and how do the other nodes know about the current qmaster? Are all GridEngine binaries local, or are they already shared?

-- Reuti


> Technically 6.2u3 is the same dot release as 6.2u5
> 
> I'd rather not upgrade the 6.2u3 nodes to 6.2u5 at this point because
> the plan is to wipe them completely and install Ubuntu.  But I want to
> do that in a staged fashion and convert 1 node at a time, all the
> while keeping them all in the cluster.
> 
> Thoughts?
> 
> thanks,
> -Alan
> 
> -- 
> “Don't eat anything you've ever seen advertised on TV”
>         - Michael Pollan, author of "In Defense of Food"
> 
> _______________________________________________
> users mailing list
> users at gridengine.org
> https://gridengine.org/mailman/listinfo/users
> 




