[gridengine users] Simplifying Parallel Environments
Mark Dixon
m.c.dixon at leeds.ac.uk
Thu Feb 2 16:52:01 UTC 2012
On Wed, 1 Feb 2012, Brian Smith wrote:
> I've started a github page for some tools I've put together from various
> bits of code, how-tos, etc. to simplify the setup of parallel
> environments so that they work universally for all MPI implementations
> (on x86_64 Linux) w/ tight-integration support (no support for ssh yet).
> The syntax for submitting parallel jobs becomes more similar to
> LSF/PBS/Torque and provides for easy configuration of your task layout
> (ppn,nodes,pcpus,pcpus_min,pcpus_max). We use a JSV to make the magic
> happen. We create PEs tied to queues since our queues often delineate
> changes in the underlying communication fabrics available.
...
Hi Brian,
I'm glad to see you've written it as an optional alternative to the normal
way of requesting resources ;)
A few things you might want to consider for future developments:
1) Some sites encode things like interconnect topology in the PE name,
as well as ppn. Perhaps you should read the requested PE and
append a suffix, instead of overwriting it?
2) Your implementation clearly works well with the "-l exclusive" feature,
giving users a simplified way to experiment and find the optimum ppn for
their code, or do mixed-mode parallel programming. Unfortunately, AFAIK
exclusive use of a node doesn't get accounted for properly in usage
policy calculations.
Until an execd_params "ACCT_EXCLUSIVE_USAGE" or similar option appears in
your favourite GE variant, you might want to try the obvious sorts of ugly
kludges around this.
3) Personally, I'm really not a fan of managing the machinefile and rsh
wrappers using the PE's start_proc_args / stop_proc_args. I find that an
mpirun wrapper script provides a much cleaner and more powerful way to
achieve this (and many other improvements).
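
To make points 1 and 3 a bit more concrete, here is a minimal Python
sketch. The function names and the ".ppn<N>" suffix convention are
invented for illustration and are not taken from Brian's tools. The only
Grid Engine detail assumed is that each line of $PE_HOSTFILE starts with
a hostname followed by a slot count.

```python
def append_ppn_suffix(requested_pe, ppn, sep=".ppn"):
    """Point 1: keep the PE the user asked for and append a ppn suffix.

    The "<pe>.ppn<N>" naming scheme is a hypothetical convention.
    """
    if ppn < 1:
        raise ValueError("ppn must be a positive integer")
    suffix = f"{sep}{ppn}"
    # Do nothing if the suffix is already present, so the PE name is
    # never suffixed twice.
    if requested_pe.endswith(suffix):
        return requested_pe
    return requested_pe + suffix


def parse_pe_hostfile(text):
    """Point 3: read (hostname, slots) pairs from $PE_HOSTFILE contents.

    Only the first two whitespace-separated fields of each line are used.
    """
    hosts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        hosts.append((fields[0], int(fields[1])))
    return hosts


def machinefile_lines(hosts):
    """Repeat each hostname once per slot, one hostname per line."""
    return [host for host, slots in hosts for _ in range(slots)]
```

An mpirun wrapper could call parse_pe_hostfile() on the file named by
$PE_HOSTFILE, write the resulting lines to a temporary machinefile, and
then exec the real mpirun with it. How the machinefile is passed on
depends on the MPI implementation, so that part is left out here.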
All the best,
Mark
--
-----------------------------------------------------------------
Mark Dixon Email : m.c.dixon at leeds.ac.uk
HPC/Grid Systems Support Tel (int): 35429
Information Systems Services Tel (ext): +44(0)113 343 5429
University of Leeds, LS2 9JT, UK
-----------------------------------------------------------------