[gridengine users] qhost and qsub working, qlogin and qrsh not working
fmlheureux at datacratic.com
Fri Aug 29 14:01:57 UTC 2014
Thanks for your reply.
I've made some tests and it seems that regarding the connection flow it's
the opposite. On the submit host, I activated debugging and launched qrsh.
9 4931 main starting commlib server
10 4931 main trying to create commlib handle
11 4931 main (*handle)->connect_port = 0
12 4931 main (*handle)->service_port = 46780
13 4931 main B E F O R E S E N D I N G! ! ! ! ! ! ! ! ! ! ! ! ! !
14 4931 main
15 4931 main sge_set_auth_info: username(uid) = root(0), groupname = root(0)
16 4931 main JSV client context
17 4931 main JSV list for current thread updated
18 4931 main job id is: 32
19 4931 main R E A D I N G J O B ! ! ! ! ! ! ! ! ! ! !
20 4931 main ============================================
21 4931 main random polling set to 3
22 4931 main waiting for connection
Then I pressed Ctrl+Z, ran "netstat -plnt", and got:
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 0.0.0.0:46780 0.0.0.0:* LISTEN
So it seems that it is in fact the submit host that opens a port and waits
for the exec host to connect.
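
To make that flow concrete, here is a minimal Python sketch of the same pattern. It is not SGE code, and all the names in it (open_callback_listener, connect_back, demo) are made up for illustration. It only assumes what the trace suggests: the submit side binds to port 0 so the OS picks a free port (compare connect_port = 0 and service_port = 46780 above), listens, and the other side connects back to whatever port was chosen.

```python
import socket
import threading


def open_callback_listener(host="127.0.0.1"):
    """Bind to port 0 so the OS assigns a free port, then listen on it.

    Hypothetical stand-in for the submit host's side of the exchange.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))          # port 0 = "pick any free port for me"
    srv.listen(1)
    port = srv.getsockname()[1]  # the port the OS actually assigned
    return srv, port


def connect_back(host, port, payload=b"hello from exec host"):
    """Hypothetical stand-in for the exec host connecting back."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(payload)


def demo():
    srv, port = open_callback_listener()
    peer = threading.Thread(target=connect_back, args=("127.0.0.1", port))
    peer.start()

    srv.settimeout(5)
    conn, _addr = srv.accept()   # "waiting for connection" in the trace
    data = b""
    with conn:
        while True:
            chunk = conn.recv(1024)
            if not chunk:        # peer closed the connection
                break
            data += chunk

    peer.join()
    srv.close()
    return port, data


if __name__ == "__main__":
    port, data = demo()
    print("listener was on port", port, "and received", data)
```

The point of the sketch is that the port number is not known until after bind() returns, which is why a setup that requires ports to be declared in advance cannot forward it.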
Since Docker doesn't currently allow binding ports on demand, I think I have
to declare my attempt a failure. (A very instructive one, though!) It may
work someday if Docker changes the way it handles ports, but in its current
state I don't see what can be done. I'll keep thinking about it...
Thanks a lot.
2014-08-29 6:27 GMT-04:00 Reuti <reuti at staff.uni-marburg.de>:
> On 28.08.2014 at 23:28, François-Michel L'Heureux wrote:
> > Ok, I'm pretty sure now my issue is related to port exposure.
> > Is there a known/configurable port range used by SGE? There seem to be
> SGE will instruct the shepherd on the selected exechost to start a daemon
> (either builtin/sshd/rshd) solely for your job and listen on a randomly
> selected port. On the other side, the submission host will get the
> information to connect to the exechost on exactly this port.
> I'm also not aware of any option to restrict this to a range of ports
> (besides doing it somewhere in the source).
> -- Reuti
> > 2014-08-28 16:57 GMT-04:00 François-Michel L'Heureux <
> fmlheureux at datacratic.com>:
> > Hello!
> > I'll start by admitting that I'm working on a somewhat complex setup:
> I'm trying to submit jobs to SGE from a docker environment.
> > So far, I've managed to mount the proper directory, run the installation
> and so on. qhost and qsub work perfectly well. qrsh and qlogin however
> don't. When I check /opt/sge6/default/spool/qmaster/messages I have
> > 08/28/2014 20:41:31|worker|master|W|job 23.1 failed on host master
> assumedly after job because: can't read usage file for job 23.1
> > for all of my qrsh/qlogin attempts.
> > Googling the error did not help much. Anyone ever encountered that?
> > Thanks a lot!
> > Mich