[gridengine users] Why define many PEs?
Reuti
reuti at staff.uni-marburg.de
Tue Nov 22 18:02:39 UTC 2011
Hi,
On 22.11.2011 at 12:52, mahbube rustaee wrote:
> I'm so thankful in advance for your time and your help.
>
> Suppose we have many users with different requests on "how to get slots for running MPI jobs":
> some want the fill-up allocation rule, some want round robin, some 2 slots per host, some 4 slots per host, ...
> The SGE admin then has to define many PEs (one for each kind of request)!
>
> Is there any simple way to grant slots to users without defining many PEs?
No. The idea behind SGE is that the admin defines the PEs for the users, so that all users can benefit from an already configured tight integration and a prepared machinefile, which they need for the specific parallel library. The differences may have become blurred nowadays, though, due to the built-in behavior of Open MPI and MPICH2 (recent versions, not Intel MPI):
- for plain SMP (i.e. OpenMP) jobs, you most often don't need any start_proc_args/stop_proc_args
- for Intel MPI (or the ancient LAM/MPI and old MPICH2 versions) you need to start some daemons
- MPICH(1) needs a reformatted machinefile
- Open MPI and MPICH2 issue a single `qrsh -inherit ...` per node and then start the remaining local processes on that node themselves
Therefore it's part of the queuing system. In Torque, for example, Open MPI and MPICH2 will work out of the box, but all other libraries need a setup by the user (i.e. each user on their own), which might lead to a non-tight integration.
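To illustrate what "a reformatted machinefile" for MPICH(1) means, here is a minimal Python sketch of the kind of conversion a start_proc_args script could perform. It assumes the usual $PE_HOSTFILE layout of one line per host with the columns hostname, slots, queue and processor range; the function name and the sample host names are made up for this example, and a real PE would normally do this in a shell script prepared by the admin.

```python
def pe_hostfile_to_machinefile(pe_hostfile_text):
    """Expand $PE_HOSTFILE-style lines into one hostname per granted slot.

    Assumed input line format: "<hostname> <slots> <queue> <processor range>".
    """
    hosts = []
    for raw in pe_hostfile_text.splitlines():
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        hostname, slots = fields[0], int(fields[1])
        hosts.extend([hostname] * slots)
    return "".join(host + "\n" for host in hosts)


# Hypothetical grant: 2 slots on node01, 1 slot on node02.
sample = (
    "node01 2 all.q@node01 UNDEFINED\n"
    "node02 1 all.q@node02 UNDEFINED\n"
)
machinefile = pe_hostfile_to_machinefile(sample)
print(machinefile, end="")
```

Since this is set up once in the PE, no user has to write such a conversion on their own.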
But it might be worth checking your applications to see whether the difference matters at large scale: if a user wants 4 cores per node and other users' jobs are running on the same nodes, the advantage of the intended distribution might become negligible compared to getting all slots from one and the same machine, or in a different distribution.
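To compare the distributions the different allocation rules would produce, a toy model in Python may help. This is only a simplified sketch, not SGE's scheduler: it assumes hosts are considered in the given order (the real order depends on load and scheduler settings), and the function name and the free-slot numbers are invented for the example.

```python
def allocate(free_slots, wanted, rule):
    """Toy model of a PE allocation_rule.

    free_slots: dict mapping host -> currently free slots, in scheduling order.
    rule: "$fill_up", "$round_robin", or a fixed number of slots per host.
    Returns a dict host -> granted slots, or None if the request cannot be met.
    """
    granted = {host: 0 for host in free_slots}
    remaining = wanted
    if rule == "$fill_up":
        # Take as many slots as possible from each host before moving on.
        for host, free in free_slots.items():
            take = min(free, remaining)
            granted[host] = take
            remaining -= take
    elif rule == "$round_robin":
        # Hand out one slot per host per pass until the request is satisfied.
        progress = True
        while remaining > 0 and progress:
            progress = False
            for host, free in free_slots.items():
                if remaining > 0 and granted[host] < free:
                    granted[host] += 1
                    remaining -= 1
                    progress = True
    else:
        # Fixed rule: exactly `rule` slots on every host that is used.
        per_host = int(rule)
        for host, free in free_slots.items():
            if remaining >= per_host and free >= per_host:
                granted[host] = per_host
                remaining -= per_host
    if remaining:
        return None
    return {host: n for host, n in granted.items() if n}


# Hypothetical cluster state: other jobs already occupy part of node02/node03.
free = {"node01": 8, "node02": 4, "node03": 4}
for rule in ("$fill_up", "$round_robin", 4):
    print(rule, allocate(free, 8, rule))
```

Whether one of these layouts really runs faster than another on nodes shared with other jobs is what would have to be measured with the actual application.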
-- Reuti