[gridengine users] qmaster fails to start: "got NULL element for EH_name"

Reuti reuti at staff.uni-marburg.de
Tue Oct 25 15:44:39 UTC 2011


On 24.10.2011 at 10:41, Steffen Neumann wrote:

> I am currently facing a serious problem where qmaster 6.2u5 
> fails to start. The /var/spool/gridengine/spooldb/sge and sge_jobs 
> are not corrupted, and pass db4.8_verify fine.
> 
> The log shows 
> 
> 10/24/2011 10:18:24|  main|cumulus|W|local configuration cumulus not defined - using global configuration
> 10/24/2011 10:18:24|  main|cumulus|W|can't resolve host name "kroll": undefined commlib error code
> 10/24/2011 10:18:24|  main|cumulus|W|can't resolve host name "vsuse006": undefined commlib error code
> 10/24/2011 10:18:24|  main|cumulus|E|can't create queue "MSBI": host "vubuntu001" is not known

But are the hosts in question defined, and do they resolve correctly with gethostbyaddr/gethostbyname in $SGE_ROOT/utilbin/lx24-amd64?

-- Reuti
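
To make the check above less tedious, here is a minimal sketch that pulls the affected host names out of the qmaster messages and tries a forward lookup for each. It is only an illustration: the function names (extract_hosts, check_host, report) are made up for this example, and it uses the system resolver via Python's socket module, which may not agree with Grid Engine's own resolution (for instance if a host_aliases file is in use). The gethostbyname/gethostbyaddr binaries in utilbin remain the authoritative test.

```python
import re
import socket

# Patterns for the two kinds of qmaster messages quoted in this thread:
#   ...|W|can't resolve host name "kroll": undefined commlib error code
#   ...|E|can't create queue "MSBI": host "vubuntu001" is not known
HOST_PATTERNS = [
    re.compile(r'can\'t resolve host name "([^"]+)"'),
    re.compile(r'host "([^"]+)" is not known'),
]


def extract_hosts(log_lines):
    """Return host names named in resolution errors, in first-seen order."""
    seen = []
    for line in log_lines:
        for pattern in HOST_PATTERNS:
            for host in pattern.findall(line):
                if host not in seen:
                    seen.append(host)
    return seen


def check_host(host):
    """Forward-resolve a name with the system resolver; return the IP or None."""
    try:
        return socket.gethostbyname(host)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def report(messages_path):
    """Print one line per suspicious host found in the given messages file."""
    with open(messages_path) as fh:
        for host in extract_hosts(fh):
            ip = check_host(host)
            print(f"{host}: {ip if ip else 'NOT RESOLVABLE'}")
```

Calling report() with the path to the qmaster messages file lists each host together with its address, or flags it as not resolvable. Any host flagged here is worth rechecking with the utilbin tools on the qmaster machine itself.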


> 10/24/2011 10:18:24|  main|cumulus|I|read job database with 5 entries in 0 seconds
> 10/24/2011 10:18:24|  main|cumulus|C|!!!!!!!!!! got NULL element for EH_name !!!!!!!!!!
> 
> I guess the critical bit is the failure to create queue "MSBI". Could that be the cause?
> 
> The installation uses berkeley db4.8 spooling, 
> and I would like to find a way to a) remove vubuntu001 
> from the MSBI definition (it is/was part of a host group) 
> or b) delete MSBI altogether. Or c) something else altogether.
> 
> Any clues?
> 
> Yours,
> Steffen
> 
> 
> -- 
> IPB Halle                    AG Massenspektrometrie & Bioinformatik
> Dr. Steffen Neumann          http://www.IPB-Halle.DE
> Weinberg 3                   http://msbi.bic-gh.de
> 06120 Halle                  Tel. +49 (0) 345 5582 - 1470
>                                  +49 (0) 345 5582 - 0
> sneumann(at)IPB-Halle.DE     Fax. +49 (0) 345 5582 - 1409



