[gridengine users] Simplifying Parallel Environments

Brian Smith brs at usf.edu
Thu Feb 2 19:42:41 UTC 2012


On 02/02/2012 11:52 AM, Mark Dixon wrote:
> On Wed, 1 Feb 2012, Brian Smith wrote:
>
>> I've started a github page for some tools I've put together from various
>> bits of code, how-tos, etc. to simplify the setup of parallel
>> environments so that they work universally for all MPI implementations
>> (on x86_64 Linux) w/ tight-integration support (no support for ssh yet).
>>  The syntax for submitting parallel jobs becomes more similar to
>> LSF/PBS/Torque and provides for easy configuration of your task layout
>> (ppn,nodes,pcpus,pcpus_min,pcpus_max).  We use a JSV to make the magic
>> happen.  We create PEs tied to queues since our queues often delineate
>> changes in the underlying communication fabrics available.
> ...
>
> Hi Brian,
>
> I'm glad to see you've written it as an optional alternative to the 
> normal way of requesting resources ;)
>
> A few things you might want to consider for future developments:
>
> 1) Some sites encode things like interconnect topology in the PE name 
> as well as ppn. Perhaps you should read the requested PE and 
> append a suffix, instead of overwriting it?
>
At our site, it just so happens that the queues are related to the 
topology/fabric breakdown.  I'm planning on implementing this feature 
using a soft request (e.g. I would like my job to run on nodes living on 
the same edge switch).  PEs themselves have been a problem in our shop.  
I always get "this is complicated" or "they don't do it like this at 
teragrid", or "my last facility used PBS/LSF/etc".  Yes, I'm catering to 
that crowd.  They are the squeaky wheel and I've gotta give 'em the grease.
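
A minimal sketch of the "append a suffix" idea from point 1, to make the 
difference from overwriting concrete.  This is not the code from the 
github page: the function name build_pe_request and the "_ppn<N>" suffix 
convention are assumptions made up for illustration, and a real JSV would 
read and set these values through the JSV interface instead of plain 
function arguments.

```python
def build_pe_request(requested_pe, nodes, ppn):
    """Return (pe_name, slots) for a nodes/ppn style request.

    Hypothetical convention: the per-node layout is encoded as a
    '_ppn<N>' suffix appended to whatever PE the user asked for.
    """
    if nodes < 1 or ppn < 1:
        raise ValueError("nodes and ppn must be positive integers")

    suffix = "_ppn%d" % ppn

    # Append instead of overwriting, so anything the site already encodes
    # in the PE name (e.g. interconnect topology) is preserved.
    if requested_pe.endswith(suffix):
        pe_name = requested_pe
    else:
        pe_name = requested_pe + suffix

    # Total slots requested from the scheduler.
    return pe_name, nodes * ppn
```

For example, build_pe_request("ib_edge1", 4, 8) keeps the (made-up) 
topology part of the name and only adds the layout to it.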

> 2) Your implementation clearly works well with the "-l exclusive" 
> feature, giving users a simplified way to experiment and find the 
> optimum ppn for their code, or do mixed-mode parallel programming. 
> Unfortunately, AFAIK this doesn't get accounted for properly in usage 
> policy calculations. Until an execd_params "ACCT_EXCLUSIVE_USAGE" or 
> similar option appears in your favourite GE variant, you might want to 
> try the obvious sorts of ugly kludges around this.

Fortunately, -l exclusive is seldom used in our shop and we usually only 
allow its use with an AR.  We favor utilization of the resources so the 
higher-ups know they are being utilized.

>
> 3) Personally, I'm really not a fan of managing the machinefile and 
> rsh wrappers using the PE's start_proc_args / stop_proc_args. I find 
> that an mpirun wrapper script provides a much cleaner and more 
> powerful way to achieve this (and many other improvements).

Did that for a long time!  Back in the 5.x days, SGE shipped with a 
template PE that had a script called 'sge_mpirun'.  I still use that 
though it looks nothing like it used to.  The big problem has been 
implementing PPN while supporting myriad MPI implementations.  I also 
found that using a wrapper is great... until someone shows up with an 
application that can't be wrapped.  Happens a lot.
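
For readers following along, the PPN part of the problem can be sketched 
roughly as below.  It assumes the usual $PE_HOSTFILE layout (one line per 
host: hostname, slot count, queue, processor range) and emits one 
hostname per rank, a machinefile form that most MPI implementations 
accept, though each has its own preferred syntax.  The function name and 
the rule of capping at the granted slot count are assumptions for 
illustration, not taken from sge_mpirun or the github tools.

```python
def machinefile_lines(pe_hostfile_text, ppn=None):
    """Expand PE hostfile text into one hostname per MPI rank.

    If ppn is given, place at most ppn ranks on each host (assumption:
    never more than the slots the scheduler granted on that host).
    """
    lines = []
    for raw in pe_hostfile_text.splitlines():
        fields = raw.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        host, slots = fields[0], int(fields[1])
        count = slots if ppn is None else min(ppn, slots)
        lines.extend([host] * count)
    return lines


if __name__ == "__main__":
    # Made-up sample in the PE hostfile layout described above.
    sample = (
        "node01 8 all.q@node01 UNDEFINED\n"
        "node02 8 all.q@node02 UNDEFINED\n"
    )
    print("\n".join(machinefile_lines(sample, ppn=2)))
```

A PE start_proc_args script or an mpirun wrapper could both write such a 
list to a file; the sketch does not settle which of the two is the better 
home for it.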

Clearly, some of the rationale in my implementation is based on how our 
environment operates, which is why I'm glad other people are giving 
suggestions so I'm not working inside my own little bubble :)

>
> All the best,
>
> Mark


