[gridengine users] commlib errors?

Michael Coffman michael.coffman at avagotech.com
Thu Jul 12 14:50:55 UTC 2012


I am intermittently seeing the following error on the command line when
running qrsh without any options:

error: error running IJS server: "can't create tty_to_commlib thread:
timeout while waiting for thread start"

In addition, I have started to see the following in the
spool/qmaster/messages file (unrelated?):

07/11/2012 13:06:02|listen|serverA|E|commlib error: got read error (closing
"hostA/qstat/29971")

These appear to be two separate problems: the first comes from qrsh, while
the second appears to involve qstat.
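
One way to check whether the qmaster-side errors really come only from qstat
clients would be to tally them by client host and program. The following is a
rough sketch, not a tested tool: the regular expression is inferred from the
single messages line quoted above (assuming the quoted part has the form
"host/program/id"), and the function name tally_commlib_errors is made up for
illustration.

```python
import re
from collections import Counter

# Pattern assumed from the one log line quoted in this post:
#   ...|E|commlib error: got read error (closing "hostA/qstat/29971")
# The quoted part is taken to be "client host/client program/id".
CLOSING_RE = re.compile(
    r'commlib error: .*\(closing "([^/"]+)/([^/"]+)/(\d+)"\)'
)


def tally_commlib_errors(lines):
    """Count matching commlib error lines per (client host, client program)."""
    counts = Counter()
    for line in lines:
        match = CLOSING_RE.search(line)
        if match:
            host, program, _client_id = match.groups()
            counts[(host, program)] += 1
    return counts


# Example use (the path is a placeholder for the local spool directory):
#   with open("spool/qmaster/messages") as fh:
#       for (host, program), n in tally_commlib_errors(fh).most_common():
#           print(n, host, program)
```

If entries for programs other than qstat show up, or the counts cluster on
the rhel6 clients, that would suggest the two problems are related after all.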

I am running SGE 6.2u5.
The qmaster is running on rhel5.
The clients are rhel5 and rhel6.

The qrsh issue seems to happen much more frequently on the rhel6 system.

Thanks for any help on how to troubleshoot this.
-- 
-MichaelC