[gridengine users] memory consumption while running the jobs in parallel environment

sudha.penmetsa at wipro.com sudha.penmetsa at wipro.com
Tue May 19 13:50:07 UTC 2015


When running a job in a parallel environment on the grid that uses 4 cores and consumes 40G of memory in total, we submit it as:

qrsh -V -cwd -q test.q -l mem_free=40G,h_vmem=10G -pe sharedmem 4 sleep 40

However, this assumes that each thread consumes at most 10G of memory; the total h_vmem consumed on the execution host is then 40G.

Our experiments have shown that when the job runs on a single core it requires the full 40G, but if we divide the 40G by four (running with "-pe sharedmem 4" and h_vmem=10G), the job crashes with an out-of-memory error.
One option is to run it like this:
qrsh -V -cwd -q test.q -l mem_free=40G,h_vmem=40G -pe sharedmem 4 sleep 40
However, we then end up consuming 160G of h_vmem on the execution host, since h_vmem is requested per slot and is multiplied by the four slots.
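The per-slot accounting described above can be illustrated with a small sketch (the multiplication by slot count is how Grid Engine charges per-slot consumables such as h_vmem):

```python
def total_charged(per_slot_gb: int, slots: int) -> int:
    """Total memory debited from the execution host when a per-slot
    consumable (like h_vmem) is requested with a parallel environment:
    the per-slot request is multiplied by the number of PE slots."""
    return per_slot_gb * slots

# h_vmem=40G with "-pe sharedmem 4" charges 4 x 40G = 160G from the host:
print(total_charged(40, 4))  # 160
# h_vmem=10G with 4 slots charges only 40G, but caps each thread at 10G:
print(total_charged(10, 4))  # 40
```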

So how can we ensure that:

- 4 CPU slots are reserved on the execution host

- only 40G of h_vmem in total is consumed, while any single thread can use up to the full 40G if needed

One option, of course, is to leave out the h_vmem request altogether:
qrsh -V -cwd -q test.q -l mem_free=40G -pe sharedmem 4 sleep 40
However, other users' jobs might then consume the host's memory and our run would crash again.
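One possibility, assuming a Grid Engine variant that supports per-job consumables (e.g. Univa Grid Engine or Son of Grid Engine 8.x; older SGE releases may not have this), is to redefine h_vmem as a JOB-level consumable so the requested amount is charged once for the whole job instead of being multiplied by the slot count. A hedged sketch of the cluster-side change and the resulting submission (exact complex-definition columns may differ by version):

```shell
# As the Grid Engine administrator, edit the complex definitions
# (opens the table in $EDITOR):
qconf -mc
# In the h_vmem row, change the "consumable" column from YES to JOB, e.g.:
#   #name    shortcut  type    relop  requestable  consumable  default  urgency
#   h_vmem   h_vmem    MEMORY  <=     YES          JOB         0        0
# With a JOB consumable, this request reserves 4 slots but charges
# 40G of h_vmem in total, not 40G per slot:
qrsh -V -cwd -q test.q -l mem_free=40G,h_vmem=40G -pe sharedmem 4 sleep 40
```

Note that the per-process rlimit behaviour of h_vmem may also change with this setting, so it would need testing against the actual job before relying on it.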


