[gridengine users] gridengine overriding shell environment?

John Young John.E.Young at NASA.Gov
Fri Jun 3 13:46:48 UTC 2011


One of the engineers here is having problems with any job
that tries to use more than 1024 cores.  His csh script is
getting a 'Too many open files' error, so I tried raising
the descriptors limit in the shell from 1024 to 65535.
That seems to have worked for interactive logins, but not
for gridengine jobs.

If I ssh to one of the client nodes and issue a 'limit'
command, I get:

% ssh compute-1-6 limit
cputime      unlimited
filesize     unlimited
datasize     unlimited
stacksize    10240 kbytes
coredumpsize 0 kbytes
memoryuse    unlimited
vmemoryuse   unlimited
descriptors  65535
memorylocked 32 kbytes
maxproc      131072

but if I submit a script that contains:

#
limit
#
echo 'cat /proc/sys/fs/file-max'
cat /proc/sys/fs/file-max
#

I get (from the same client as above) in the logfile:

cputime      unlimited
filesize     unlimited
datasize     unlimited
stacksize    unlimited
coredumpsize 0 kbytes
memoryuse    unlimited
vmemoryuse   unlimited
descriptors  1024
memorylocked 32 kbytes
maxproc      524288
cat /proc/sys/fs/file-max
6448170

Note that 'descriptors' still shows 1024 instead of 65535, and
that stacksize and maxproc also differ from the ssh output.  Any
idea where those values are coming from?  Why is gridengine using
different limits than the ones I get when I just ssh into a node?
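
To narrow this down, it may help to print the soft and hard
descriptor limits separately from inside the job. Below is a minimal
sketch in POSIX sh rather than csh (in csh the same two values should
come from 'limit descriptors' and 'limit -h descriptors').  The
function name report_nofile is made up for illustration and is not
part of gridengine; the sketch also assumes the node's /bin/sh
accepts 'ulimit -n', as bash and dash do.

```shell
#!/bin/sh
# Print the soft and hard open-file limits seen by this process.
# report_nofile is a hypothetical helper name, not a gridengine command.
report_nofile() {
    soft=$(ulimit -S -n)   # limit currently in effect
    hard=$(ulimit -H -n)   # ceiling the soft limit can be raised to
    echo "host=$(uname -n) soft=$soft hard=$hard"
}

report_nofile
```

If the hard value comes back as 65535 with a soft value of 1024, the
job script could raise the limit itself.  If the hard value is also
1024, the limit is presumably inherited from whatever starts the job
rather than from my login shell settings.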

Any suggestions?

JY



