[gridengine users] double seizure of processors

Ursula Winkler ursula.winkler at uni-graz.at
Wed Apr 11 14:24:22 UTC 2012


Dave Love wrote:
> Ursula Winkler <ursula.winkler at uni-graz.at> writes:
>
>   
>> Ursula Winkler wrote:
>>     
>>> Reuti wrote:
>>>   
>>>       
>>>> On 11.04.2012 at 11:15, Ursula Winkler wrote:
>>>>
>>>>> Reuti wrote:
>>>>>> This could also be a problem with the MPI implementation. Which one do you use? Are you using a plain mpiexec?
>>>>>>
>>>>>> -- Reuti
>>>>>>
>>>>> I found that setting "MV2_ENABLE_AFFINITY=0" could be an option. But this should be set by default. Is this right?
>>>>>           
>>>>>           
>>>> I wouldn't bet on it. I found some sites that suggest setting it to zero.
>>>>
>>>> http://www.osc.edu/supercomputing/faq.shtml
>>>>
>>>> -- Reuti
>>>>       
>>>>         
>>> Thanks. I told my users they should try it out. I hope it helps.
>>>   
>>>       
>> Unfortunately it did not help. So, any ideas?
>>     
>
> I don't know what's happening here in detail, but I can explain
> generally if it's not documented for mvapich.
>
> First of all, core binding is important for performance, particularly on
> NUMA systems, and you should _not_ leave it to the operating system.  It
> sounds as if that's not what's happening here though, and mvapich has
> just done the binding badly.
>
> What should happen for nodes which run multiple jobs is that gridengine
> should bind specific cores (see -binding for qsub, e.g.
> http://arc.liv.ac.uk/SGE/htmlman/htmlman1/submit.html for up-to-date
> doc).  As far as I know, you need SGE from the site in my sig to get the
> behaviour where you can have different numbers of cores bound on
> different hosts ("linear:slots") if that matters.  Also you need that
> one, or another version based on the hwloc library, for binding to work
> properly on recent hardware or non-Linux kernels.
>
> The gridengine binding (which gridengine keeps track of) separates jobs,
> and it should be noticed by the MPI, which should then bind the
> individual processes to the cores it's been given.  I don't know
> mvapich, but I know it uses hwloc, and should be able to do this
> properly like openmpi does (modulo issues with recent hardware, sigh).
> I thought mvapich would do the right thing automatically -- openmpi is
> often said to look bad performance-wise by not doing core binding by
> default.
>
> If your MPI jobs have exclusive access to the nodes it's simpler as the
> MPI system can do the binding itself without worrying about what else is
> running (e.g. the old paffinity_alone setting in openmpi).
>   

Finally it turned out that setting "MV2_ENABLE_AFFINITY=0" does its job.
My users had not applied the variable in the right way. I'm sorry.
I'll try out the -binding option to qsub too.
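
For reference, a minimal sketch of a submission that carries both settings,
assuming qsub's standard -v (export a variable to the job) and -binding
options. The helper name build_qsub_command, the parallel environment "mpi",
the script "job.sh" and the slot and core counts are hypothetical placeholders
and would need adapting to the local setup:

```python
import shlex


def build_qsub_command(script, pe_name, slots, bind_cores=None):
    """Build a qsub argument list that exports MV2_ENABLE_AFFINITY=0 to the job.

    pe_name, slots and bind_cores are site-specific; the values used below
    are placeholders, not taken from the cluster discussed in this thread.
    """
    cmd = ["qsub", "-pe", pe_name, str(slots)]
    # -v passes the variable into the job environment, so mvapich sees it
    # on the execution hosts and not only in the submitting shell.
    cmd += ["-v", "MV2_ENABLE_AFFINITY=0"]
    if bind_cores is not None:
        # Ask gridengine itself to bind the job to a number of cores.
        cmd += ["-binding", f"linear:{bind_cores}"]
    cmd.append(script)
    return cmd


if __name__ == "__main__":
    # Only prints the command line; nothing is submitted.
    print(shlex.join(build_qsub_command("job.sh", "mpi", 8, bind_cores=4)))
```

Whether the variable has to be exported with -v, set inside the job script, or
passed on the mpiexec command line may depend on how mvapich is started at a
given site, so this is one possible way rather than the only one.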

Thank you,
Ursula

