[gridengine users] define PE on special hosts

Reuti reuti at staff.uni-marburg.de
Mon Nov 21 16:19:09 UTC 2011


Am 21.11.2011 um 11:51 schrieb mahbube rustaee:

> Hi,
> a  question:
> I configured a queue amd.q such:
> pe_list               NONE,[@amd= mpi2 ]
> and another queue xeon.q such:
> pe_list               NONE,[@xeon= mpi2 ]
> 
> In this case, jobs get hosts just from @amd, as I wanted, and my problem is solved.

What you observed is coincidence. When you ask for mpi2, you can get slots from all queue instances to which mpi2 is attached.

Again: don't think in queues with SGE. This is the way Torque works, but not SGE. It's better to attach a boolean complex to each exechost, and then submit:

$ qsub -l xeon job.sh
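
As a sketch only (the complex name "xeon" matches the request above; "node01" is a hypothetical host name), the boolean complex would be added once and then set on each matching exechost:

$ qconf -mc
  # add one line to the complex configuration:
  # name  shortcut  type  relop  requestable  consumable  default  urgency
  xeon    xeon      BOOL  ==     YES          NO          0        0

$ qconf -me node01
  # in the host's configuration, set:
  complex_values        xeon=TRUE

Jobs submitted with -l xeon will then only be dispatched to hosts where this complex is TRUE.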

while having only one queue, and two separate PEs which could also reflect the desired CPU type:

$ qsub -pe mpi2xeon 16 job.sh

or

$ qsub -pe "mpi2*" 16 job.sh

if you don't care. Once a PE is selected by SGE, all slots will come from queue instances to which this PE is attached.
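
As a sketch (the PE names mpi2xeon/mpi2amd follow the commands above; the slot count is a placeholder), the wiring in a single queue could look like:

$ qconf -sp mpi2xeon
pe_name            mpi2xeon
slots              999
allocation_rule    2
...

$ qconf -sq all.q
...
pe_list            NONE,[@xeon=mpi2xeon],[@amd=mpi2amd]

Since each PE is attached to exactly one hostgroup, requesting -pe "mpi2*" still keeps all slots of a job inside a single hostgroup.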

-- Reuti


> my question is:
> what is the difference when the queue configuration for both amd.q and xeon.q is:
> (hostlists of amd.q is @amd and hostlists of xeon.q is @xeon)
> pe_list   mpi2
> 
> Thx so much
> 
> On Sun, Nov 20, 2011 at 12:11 PM, mahbube rustaee <rustaee at gmail.com> wrote:
> Thx ,
> the problem is solved. I appreciate your help.
> 
> 
> On Sat, Nov 19, 2011 at 9:24 PM, Reuti <reuti at staff.uni-marburg.de> wrote:
> Am 19.11.2011 um 17:50 schrieb mahbube rustaee:
> 
> >>> the number of slots per node is correct, but the host list is not the host
> >>> list of that queue.
> >> Which queue, the one it runs in or the one you wanted it to run in?
> >> I defined a queue on a hostgroup (such as @rack1) and attached to the queue a PE (such as mpi2 with allocation_rule 2). I expected that any job launched to that queue uses 2 slots of @rack1 hosts, but the hosts that run jobs are not members of @rack1.
> > jobs get 2 slots per host, but some hosts are not members of @rack1! Why?
> >
> > I want jobs launched to the mentioned queue to use 2 slots of @rack1 hosts.
> 
> Did you set up the wildcard solution I outlined below?
> 
> In SGE you don't think in queues, but resource requests. If you request a PE which is attached to several queues, you will get what you asked for.
> 
> http://www.gridengine.info/2006/02/14/grouping-jobs-to-nodes-via-wildcard-pes/
> 
> Furthermore:
> 
> Is the output of `qstat -g t` not the expected one, or are the nodes where your MPI job runs its tasks possibly outside of SGE's allocation?
> 
> -- Reuti
> 
> 
> > Thx so much
> >
> >
> > On 11/19/11, Reuti <reuti at staff.uni-marburg.de> wrote:
> >> Am 19.11.2011 um 13:00 schrieb William Hay:
> >>
> >>> On 19 November 2011 09:58, mahbube rustaee <rustaee at gmail.com> wrote:
> >>>>
> >>>>
> >>>> On Sat, Nov 19, 2011 at 12:05 PM, William Hay <w.hay at ucl.ac.uk> wrote:
> >>>>>
> >>>>> On 19 November 2011 04:53, mahbube rustaee <rustaee at gmail.com> wrote:
> >>>>>> Hi,
> >>>>>> I defined a queue on @node-grp (a group of nodes).
> >>>>>> I defined mpi2 parallel environment as:
> >>>>>> start_proc_args    /opt/gridengine/mpi/startmpi.sh $pe_hostfile
> >>>>>> stop_proc_args     /opt/gridengine/mpi/stopmpi.sh
> >>>>>> allocation_rule    2
> >>>>>> control_slaves     FALSE
> >>>>>> job_is_first_task  TRUE
> >>>>>> urgency_slots      min
> >>>>>> accounting_summary TRUE
> >>>>>>
> >>>>>> and attached this PE to that queue. I expected that parallel programs
> >>>>>> get slots from @node-grp on that queue, with 2 slots on any host.
> >>>>>> But that did not happen: slots are on other hosts that are not members
> >>>>>> of @node-grp.
> >>>>>>
> >>>>>> how can I define such parallel environment ?
> >>>>> 1. Were the parallel programs submitted to the mpi2 PE or possibly to a
> >>>>> different PE?
> >>>>
> >>>> Yes, the parallel programs were submitted to the mpi2 PE.
> >>>>>
> >>>>> 2. Is mpi2 defined in the pe_list of the queue in which the jobs actually
> >>>>> do run?
> >>>>
> >>>> yes,
> >>> Well, that would appear to be the problem. If you have multiple queues
> >>> with mpi2 defined, then the job can run in any or all of them.
> >>
> >> Correct. To limit it to one hostgroup you will have to attach different PEs
> >> to the hostgroups (this can still be done in one and the same queue) and
> >> request a PE with a wildcard. E.g. if they should stay inside one rack or
> >> so:
> >>
> >> $ qconf -sq all.q
> >> ...
> >> pe_list NONE,[@rack1=MPI2a PVMa],[@rack2=MPI2b PVMb]
> >>
> >>
> >> Then submitting:
> >>
> >> $ qsub -pe "MPI2*" 4 job.sh
> >>
> >> will give you machines only from one hostgroup. Of course MPI2a could also
> >> be requested directly this way.
> >>
> >> -- Reuti
> >>
> >>
> >>>> the number of slots per node is correct, but the host list is not the
> >>>> host list of that queue.
> >>> Which queue, the one it runs in or the one you wanted it to run in?
> >>>
> >>>>>
> >>>>> William
> >>>>>>
> >>>>>> Thx
> >>>>
> >>>>
> >>>
> >>> _______________________________________________
> >>> users mailing list
> >>> users at gridengine.org
> >>> https://gridengine.org/mailman/listinfo/users
> >>>
> >>
> >>
> 
> 
> 