[gridengine users] mem_free settings

Reuti reuti at staff.uni-marburg.de
Tue Mar 31 12:03:30 UTC 2015


Hi,

> On 31.03.2015 at 13:13, Gowtham <g at mtu.edu> wrote:
> 
> 
> On one of our clusters, which has homogeneous compute nodes (64 GB RAM each), I have set mem_free as a requestable and consumable resource. Following the mailing list archives, I ran
> 
>  for x in $(qconf -sel)
>  do
>    qconf -mattr exechost complex_values mem_free=60G "$x"
>  done
> 
> Every job submitted by any user includes the following line in its submission script:
> 
>  #$ -hard -l mem_free=2G
> 
> for single processor jobs, and
> 
>  #$ -hard -l mem_free=(2/NPROCS)G
> 
> for a parallel job using NPROCS processors.
> 
> 
> All single processor jobs run just fine, and so do many parallel jobs. But some parallel jobs, when the participating processors are spread across multiple compute nodes, stay in the waiting state.
> 
> When I inspect such a job with 'qstat -j JOB_ID', I see that it is looking for (2 * NPROCS)G of RAM on each compute node. How can I resolve this issue? If you need additional information from my end, please let me know.

Can you please post the output of `qstat -j JOB_ID` for such a job?
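
As background for reading that output: when a complex is defined with consumable YES, Grid Engine counts the -l request per slot, so a parallel job debits the requested amount times the number of slots it is granted on each host. Below is a minimal sketch of that arithmetic, assuming per-slot accounting applies to mem_free on this cluster. The function host_debit_gb is a hypothetical helper for illustration, not a Grid Engine command, and the slot counts are made-up example values.

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid Engine): memory debited on one host
# for a per-slot consumable, assuming the complex uses consumable YES.
#   $1 = per-slot request in GB (the value given to -l mem_free=...)
#   $2 = number of slots the job is granted on that host
host_debit_gb() {
    per_slot_gb=$1
    slots_on_host=$2
    echo $(( per_slot_gb * slots_on_host ))
}

# Illustrative values only: a request of mem_free=2G with 8 slots on one host
host_debit_gb 2 8
```

Comparing the result against the 60G set in complex_values on each host shows whether a given slot distribution can fit.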

-- Reuti


> 
> Thank you for your time and help.
> 
> Best regards,
> g
> 
> --
> Gowtham, PhD
> Director of Research Computing, IT
> Adj. Asst. Professor, Physics/ECE
> Michigan Technological University
> 
> (906) 487/3593
> http://it.mtu.edu
> http://hpc.mtu.edu
> 
> _______________________________________________
> users mailing list
> users at gridengine.org
> https://gridengine.org/mailman/listinfo/users




