[gridengine users] Ulimit for max open files
lhuang at NYGENOME.ORG
Mon Jun 26 17:24:57 UTC 2017
To increase the maximum number of open files, we have set execd_params via qconf -mconf and also raised the limit at the OS level:
On our execution nodes, we can see that SGE sets a soft limit of 65535 even though we told it to use 262144.
[root@p2node01 ~]# cat /proc/104694/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            unlimited            unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             262144               262144               processes
Max open files            65535                262144               files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            262144               262144               locks
Max pending signals       15023                15023                signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
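The same soft and hard values can also be read from inside a job rather than from /proc on the node. The following is a minimal sketch, assuming Python is available on the exec host; the helper names open_file_limits and describe_limits are made up for illustration. NSLOTS is the slot count SGE exports into the job environment.

```python
import os
import resource


def open_file_limits():
    """Return (soft, hard) RLIMIT_NOFILE for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe_limits():
    """Format the slot count and open-file limits on one line."""
    soft, hard = open_file_limits()
    # NSLOTS is set by SGE inside a job; outside a job we report "n/a".
    slots = os.environ.get("NSLOTS", "n/a")
    return f"slots={slots} soft={soft} hard={hard}"


if __name__ == "__main__":
    print(describe_limits())
```

Submitting this with different slot requests should make it easy to see how the soft limit changes with the slot count.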
When running a PE smp job that requests 2 slots, the soft limit is set to 65535 * 2 = 131070. The slot count seems to act as a multiplier on the soft limit. If we request more than 4 slots, the product exceeds the hard limit and max open files is reset to the default of 1024. Our workaround is to set H_DESCRIPTORS=9362, because some of our exec nodes have 28 cores: 28 x 9362 = 262136, which stays just under the 262144 limit. I was wondering if there is a better way of doing this?
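The arithmetic behind the workaround can be sketched as below. This assumes the behaviour observed above (configured value multiplied by the number of slots), which is what we see on our nodes rather than documented behaviour; the function names are hypothetical.

```python
def per_slot_descriptors(hard_limit: int, max_slots: int) -> int:
    """Largest per-slot value whose product with max_slots stays within hard_limit."""
    return hard_limit // max_slots


def effective_limit(per_slot: int, slots: int) -> int:
    """Limit a job would get if the configured value is multiplied by its slots."""
    return per_slot * slots


HARD_LIMIT = 262144  # OS-level hard limit from the listing above
MAX_SLOTS = 28       # largest core count among our exec nodes

per_slot = per_slot_descriptors(HARD_LIMIT, MAX_SLOTS)
print(per_slot)                                  # 9362
print(effective_limit(per_slot, MAX_SLOTS))      # 262136
print(effective_limit(65535, 4) <= HARD_LIMIT)   # True: 262140 still fits
print(effective_limit(65535, 5) <= HARD_LIMIT)   # False: 327675 exceeds it
```

The downside of this approach is that a job with few slots gets a correspondingly small limit, e.g. only 9362 descriptors for a single-slot job.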
You might ask why we need 200k+ open files. Someone is using software that leaks file handles and does not fclose properly. Their workaround is a dirty hack where the job sshes into localhost to bypass the ulimit set by SGE.