[gridengine users] Split process between multiple nodes.

Guillermo Marco Puche guillermo.marco at sistemasgenomicos.com
Mon Nov 12 12:47:19 UTC 2012


Hello,

This must be the problem. I've checked that each compute node can only 
resolve its own IP address:

For example, on compute-0-0:

$ /opt/gridengine/utilbin/lx26-amd64/gethostbyaddr 10.4.0.2
Hostname: compute-0-0.local
Aliases:  compute-0-0
Host Address(es): 10.4.0.2

But for 10.4.0.3 (compute-0-1):

$ /opt/gridengine/utilbin/lx26-amd64/gethostbyaddr 10.4.0.3
error resolving ip "10.4.0.3": can't resolve ip address (h_errno = 
HOST_NOT_FOUND)

The inverse holds on compute-0-1: it can resolve 10.4.0.3 but not 10.4.0.2.
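
If the private network is resolved via /etc/hosts rather than DNS (an 
assumption here; how the hosts files are regenerated depends on the Rocks 
setup), a minimal sketch of the entries every node would need is:

# sketch of /etc/hosts entries required on both nodes (assumption: plain hosts-file resolution)
10.4.0.2   compute-0-0.local   compute-0-0
10.4.0.3   compute-0-1.local   compute-0-1

With both entries present on both nodes, the gethostbyaddr check above 
should succeed for either address from either node.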

Regards,
Guillermo.
On 12/11/2012 13:35, Guillermo Marco Puche wrote:
> Hello,
>
> OK, I've patched my nodes with the RPM fix for MPI and SGE (I forgot 
> to install it on the compute nodes).
>
> Removed -np 16 argument and got this new error:
>
> error: commlib error: access denied (client IP resolved to host name 
> "". This is not identical to clients host name "")
> error: executing task of job 97 failed: failed sending task to 
> execd at compute-0-1.local: can't find connection
> -------------------------------------------------------------------------- 
>
> A daemon (pid 3037) died unexpectedly with status 1 while attempting
> to launch so we are aborting.
>
> There may be more information reported by the environment (see above).
>
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have 
> the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> -------------------------------------------------------------------------- 
>
> -------------------------------------------------------------------------- 
>
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> -------------------------------------------------------------------------- 
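
(For what it's worth, the LD_LIBRARY_PATH hint in the mpirun output above 
translates to something like the following at the top of the job script. 
This is only a sketch; the install prefix is taken from Reuti's build 
example further down the thread and has to match the Open MPI build that 
is actually used:)

# assumption: a user-built Open MPI installed under $HOME/local/openmpi-1.6.2_gcc
export PATH=$HOME/local/openmpi-1.6.2_gcc/bin:$PATH
export LD_LIBRARY_PATH=$HOME/local/openmpi-1.6.2_gcc/lib:$LD_LIBRARY_PATH

(According to the mpirun message, LD_LIBRARY_PATH set this way is forwarded 
to the remote nodes; the "access denied" commlib error above, however, 
points at the host name resolution problem rather than at missing libraries.)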
>
>
>
> On 12/11/2012 13:11, Reuti wrote:
>> On 12.11.2012 at 12:18, Guillermo Marco Puche wrote:
>>
>>> Hello,
>>>
>>> I'm currently trying with the following job script, submitted with 
>>> qsub.
>>> I don't know why it only uses the CPUs of one of my two compute 
>>> nodes instead of both. (compute-0-2 is currently powered off.)
>>>
>>> #!/bin/bash
>>> #$ -S /bin/bash
>>> #$ -V
>>> ### name
>>> #$ -N aln_left
>>> ### work dir
>>> #$ -cwd
>>> ### outputs
>>> #$ -j y
>>> ### PE
>>> #$ -pe orte 16
>>> ### all.q
>>> #$ -q all.q
>>>
>>> mpirun -np 16 pBWA aln -f aln_left 
>>> /data_in/references/genomes/human/hg19/bwa_ref/hg19.fa 
>>> /data_in/data/rawdata/HapMap_1.fastq >
>> If compute-0-2 is powered off, it won't get any slots assigned by SGE.
>>
>> Are all 16 slots available on that single machine? Otherwise the job 
>> should be in the "qw" state. As Open MPI was compiled with tight 
>> integration, the argument "-np 16" isn't necessary: it will detect 
>> the number of granted slots and their location automatically.
>>
>> -- Reuti
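
(With the tight integration Reuti describes, the mpirun line in the script 
above can be reduced to the following sketch; mpirun picks up the granted 
slot count and host list from SGE:)

mpirun pBWA aln -f aln_left \
    /data_in/references/genomes/human/hg19/bwa_ref/hg19.fa \
    /data_in/data/rawdata/HapMap_1.fastq \
    > /data_out_2/tmp/05_11_12/mpi/HapMap_cloud.left.sai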
>>
>>
>>> /data_out_2/tmp/05_11_12/mpi/HapMap_cloud.left.sai
>>>
>>> Here's all.q config file:
>>>
>>> qname                 all.q
>>> hostlist              @allhosts
>>> seq_no                0
>>> load_thresholds       np_load_avg=1.75
>>> suspend_thresholds    NONE
>>> nsuspend              1
>>> suspend_interval      00:05:00
>>> priority              0
>>> min_cpu_interval      00:05:00
>>> processors            UNDEFINED
>>> qtype                 BATCH INTERACTIVE
>>> ckpt_list             NONE
>>> pe_list               make mpich mpi orte openmpi smp
>>> rerun                 FALSE
>>> slots                 0,[compute-0-0.local=8],[compute-0-1.local=8], \
>>>                       [compute-0-2.local.sg=8]
>>> tmpdir                /tmp
>>> shell                 /bin/csh
>>> prolog                NONE
>>> epilog                NONE
>>> shell_start_mode      posix_compliant
>>> starter_method        NONE
>>> suspend_method        NONE
>>> resume_method         NONE
>>> terminate_method      NONE
>>> notify                00:00:60
>>> owner_list            NONE
>>> user_lists            NONE
>>> xuser_lists           NONE
>>> subordinate_list      NONE
>>> complex_values        NONE
>>> projects              NONE
>>> xprojects             NONE
>>> calendar              NONE
>>> initial_state         default
>>> s_rt                  INFINITY
>>> h_rt                  INFINITY
>>> s_cpu                 INFINITY
>>> h_cpu                 INFINITY
>>> s_fsize               INFINITY
>>> h_fsize               INFINITY
>>> s_data                INFINITY
>>> h_data                INFINITY
>>> s_stack               INFINITY
>>> h_stack               INFINITY
>>> s_core                INFINITY
>>> h_core                INFINITY
>>> s_rss                 INFINITY
>>> h_rss                 INFINITY
>>> s_vmem                INFINITY
>>> h_vmem                INFINITY
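
(The queue configuration above looks fine; whether the 16 slots may span 
both nodes is decided by the parallel environment, not the queue. A rough 
sketch of what to check, with typical values that are assumptions here, 
not output from this cluster:)

$ qconf -sp orte
pe_name            orte
slots              9999
allocation_rule    $fill_up
control_slaves     TRUE
job_is_first_task  FALSE

(An allocation_rule of $fill_up or $round_robin lets a 16-slot request 
spread over two 8-slot hosts; $pe_slots would force all 16 slots onto a 
single host and leave the job queued.)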
>>>
>>> Best regards,
>>> Guillermo.
>>>
>>>
>>> On 05/11/2012 12:01, Reuti wrote:
>>>> Hi,
>>>>
>>>> On 05.11.2012 at 10:55, Guillermo Marco Puche wrote:
>>>>
>>>>> I've managed to compile Open MPI for Rocks:
>>>>> ompi_info | grep grid
>>>>>                   MCA ras: gridengine (MCA v2.0, API v2.0, 
>>>>> Component v1.4.3)
>>>>>
>>>>> Now I'm really confused about how I should run my pBWA program with 
>>>>> Open MPI.
>>>>> The program's website (http://pbwa.sourceforge.net/) suggests 
>>>>> something like:
>>>>>
>>>>> sqsub -q mpi -n 240 -r 1h --mpp 4G ./pBWA bla bla bla...
>>>> It seems to be a proprietary command local to Sharcnet, or at least a 
>>>> wrapper around some other queuing system.
>>>>
>>>>
>>>>> I don't have sqsub, only the qsub provided by SGE. The "-q" option 
>>>>> from that example doesn't apply, since in SGE -q is for queue selection.
>>>> Correct, the SGE paradigm is to request resources, and SGE will 
>>>> select an appropriate queue for your job which fulfils the 
>>>> requirements.
>>>>
>>>>
>>>>> Maybe the solution is to create a simple bash job script that 
>>>>> requests an SGE parallel environment and the number of slots 
>>>>> (since pBWA supports Open MPI internally).
>>>> What is the actual setup of your SGE? Most likely you will need to 
>>>> define a PE and request it during submission, as for any other 
>>>> Open MPI application:
>>>>
>>>> $ qsub -pe orte 240 -l h_rt=1:00:00,h_vmem=4G ./pBWA bla bla bla...
>>>>
>>>> Assuming "-n" gives the number of cores.
>>>> Assuming "-r 1h" means wallclock time: -l h_rt=1:00:00
>>>> Assuming "--mpp 4G" requests the memory per slot: -l h_vmem=4G
>>>>
>>>> Necessary setup:
>>>>
>>>> http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge
>>>>
>>>> -- Reuti
>>>>
>>>>
>>>>> Regards,
>>>>> Guillermo.
>>>>>
>>>>> On 26/10/2012 12:21, Reuti wrote:
>>>>>> On 26.10.2012 at 12:02, Guillermo Marco Puche wrote:
>>>>>>
>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> Like I said, I'm using Rocks cluster 5.4.3, and it comes with 
>>>>>>> mpirun (Open MPI) 1.4.3.
>>>>>>> But "$ ompi_info | grep gridengine" shows nothing.
>>>>>>>
>>>>>>> So I'm not sure whether I have to update and rebuild Open MPI to 
>>>>>>> the latest version.
>>>>>>>
>>>>>> You can also remove the supplied version 1.4.3 from your system 
>>>>>> and build it from source with SGE support. But I don't see the 
>>>>>> advantage of using an old version. Does ROCKS supply the source of 
>>>>>> the Open MPI version it ships?
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Or whether I can keep the current version of MPI and re-build it 
>>>>>>> (that would be the preferred option, to keep the cluster 
>>>>>>> stable).
>>>>>>>
>>>>>> If you compile and install only in your own $HOME (as a normal 
>>>>>> user, no root access necessary), then there is no impact on any 
>>>>>> system tool at all. You just have to take care which version you 
>>>>>> use, by setting the correct $PATH and $LD_LIBRARY_PATH during 
>>>>>> compilation of your application and during its execution. 
>>>>>> That is why I suggested including the name of the compiler and the 
>>>>>> Open MPI version in the installation's directory name.
>>>>>>
>>>>>> There was just a question on the MPICH2 mailing list about which 
>>>>>> version of `mpiexec` to use; maybe it provides additional info:
>>>>>>
>>>>>>
>>>>>> http://lists.mcs.anl.gov/pipermail/mpich-discuss/2012-October/013318.html 
>>>>>>
>>>>>>
>>>>>>
>>>>>> -- Reuti
>>>>>>
>>>>>>
>>>>>>
>>>>>>> Thanks !
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Guillermo.
>>>>>>>
>>>>>>> On 26/10/2012 11:59, Reuti wrote:
>>>>>>>
>>>>>>>> On 26.10.2012 at 09:40, Guillermo Marco Puche wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>>
>>>>>>>>> Thank you for the links Reuti !
>>>>>>>>>
>>>>>>>>> When they talk about:
>>>>>>>>>
>>>>>>>>> shell $ ./configure --with-sge
>>>>>>>>>
>>>>>>>>> Is that in a bash shell or in some other special shell?
>>>>>>>>>
>>>>>>>> There is no special shell required (please have a look at the 
>>>>>>>> INSTALL file in Open MPI's tar-archive).
>>>>>>>>
>>>>>>>>
>>>>>>>>> Do I have to be in a specific directory to execute that command?
>>>>>>>>>
>>>>>>>> Depends.
>>>>>>>>
>>>>>>>> As it's set up according to the GNU build system 
>>>>>>>> (http://en.wikipedia.org/wiki/GNU_build_system), you can either:
>>>>>>>>
>>>>>>>> $ tar -xf openmpi-1.6.2.tar.gz
>>>>>>>> $ cd openmpi-1.6.2
>>>>>>>> $ ./configure --prefix=$HOME/local/openmpi-1.6.2_gcc --with-sge
>>>>>>>> $ make
>>>>>>>> $ make install
>>>>>>>>
>>>>>>>> It's quite common to build inside the source tree. But if it is 
>>>>>>>> set up in the right way, it also supports building in different 
>>>>>>>> directories inside or outside the source tree which avoids a 
>>>>>>>> `make distclean` in case you want to generate different builds:
>>>>>>>>
>>>>>>>> $ tar -xf openmpi-1.6.2.tar.gz
>>>>>>>> $ mkdir openmpi-gcc
>>>>>>>> $ cd openmpi-gcc
>>>>>>>> $ ../openmpi-1.6.2/configure 
>>>>>>>> --prefix=$HOME/local/openmpi-1.6.2_gcc --with-sge
>>>>>>>> $ make
>>>>>>>> $ make install
>>>>>>>>
>>>>>>>> Meanwhile, in another window, you can execute:
>>>>>>>>
>>>>>>>> $ mkdir openmpi-intel
>>>>>>>> $ cd openmpi-intel
>>>>>>>> $ ../openmpi-1.6.2/configure 
>>>>>>>> --prefix=$HOME/local/openmpi-1.6.2_intel CC=icc CXX=icpc 
>>>>>>>> FC=ifort F77=ifort --disable-vt --with-sge
>>>>>>>> $ make
>>>>>>>> $ make install
>>>>>>>>
>>>>>>>> (Not to confuse anyone: there is a bug in the combination of the 
>>>>>>>> Intel compiler and GNU headers with the above version of Open MPI; 
>>>>>>>> disabling VampirTrace support helps.)
>>>>>>>>
>>>>>>>> -- Reuti
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Thank you !
>>>>>>>>> Sorry again for my ignorance.
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Guillermo.
>>>>>>>>>
>>>>>>>>> On 25/10/2012 19:50, Reuti wrote:
>>>>>>>>>
>>>>>>>>>> On 25.10.2012 at 19:36, Guillermo Marco Puche wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Hello,
>>>>>>>>>>>
>>>>>>>>>>> I've no idea who compiled the application. I just found on the 
>>>>>>>>>>> seqanswers forum that pBWA is a nice speed-up over the original 
>>>>>>>>>>> BWA, since it natively supports Open MPI.
>>>>>>>>>>>
>>>>>>>>>>> As you told me, I'll look further into how to compile Open MPI 
>>>>>>>>>>> with SGE support. A good introduction/tutorial for this would 
>>>>>>>>>>> be appreciated, if anyone knows one.
>>>>>>>>>>>
>>>>>>>>>> The Open MPI site has extensive documentation:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> http://www.open-mpi.org/faq/?category=building#build-rte-sge
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> http://www.open-mpi.org/faq/?category=running#run-n1ge-or-sge
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Be sure that during execution you pick the correct `mpiexec` 
>>>>>>>>>> and LD_LIBRARY_PATH from your own build. You can also adjust 
>>>>>>>>>> the install location of Open MPI with the usual --prefix; I used 
>>>>>>>>>> --prefix=$HOME/local/openmpi-1.6.2_shared_gcc, reflecting the 
>>>>>>>>>> version I built.
>>>>>>>>>>
>>>>>>>>>> -- Reuti
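
(A minimal way to check that the self-built version is the one being 
picked up, using the install prefix from the build example above; adjust 
the directory name to whatever was actually chosen:)

$ export PATH=$HOME/local/openmpi-1.6.2_gcc/bin:$PATH
$ export LD_LIBRARY_PATH=$HOME/local/openmpi-1.6.2_gcc/lib:$LD_LIBRARY_PATH
$ which mpiexec
$ mpiexec --version
$ ompi_info | grep grid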
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Then I'll try to run it with my current version of Open MPI 
>>>>>>>>>>> and update it if needed.
>>>>>>>>>>>
>>>>>>>>>>> Thanks.
>>>>>>>>>>>
>>>>>>>>>>> Best regards,
>>>>>>>>>>> Guillermo.
>>>>>>>>>>>
>>>>>>>>>>> On 25/10/2012 18:53, Reuti wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Please keep the list posted, so that others can participate 
>>>>>>>>>>>> in the discussion. I'm not familiar with this application, but 
>>>>>>>>>>>> maybe someone else on the list can be of broader help.
>>>>>>>>>>>>
>>>>>>>>>>>> Again: who compiled the application? I can see only the 
>>>>>>>>>>>> source at the site you posted.
>>>>>>>>>>>>
>>>>>>>>>>>> -- Reuti
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 25.10.2012 at 13:23, Guillermo Marco Puche wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> $ ompi_info | grep grid
>>>>>>>>>>>>>
>>>>>>>>>>>>> Returns nothing. Like I said, I'm a newbie to MPI; 
>>>>>>>>>>>>> I didn't know that I had to compile anything. I have an 
>>>>>>>>>>>>> out-of-the-box Rocks installation, so MPI is installed but 
>>>>>>>>>>>>> nothing more, I guess.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I've found an old thread on the Rocks discussion list:
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://lists.sdsc.edu/pipermail/npaci-rocks-discussion/2012-April/057303.html 
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> The user asking there is using this script:
>>>>>>>>>>>>>
>>>>>>>>>>>>> #$ -S /bin/bash
>>>>>>>>>>>>> #
>>>>>>>>>>>>> #
>>>>>>>>>>>>> # Export all environment variables
>>>>>>>>>>>>> #$ -V
>>>>>>>>>>>>> # specify the PE and core #
>>>>>>>>>>>>> #$ -pe mpi 128
>>>>>>>>>>>>> # Customize job name
>>>>>>>>>>>>> #$ -N job_hpl_2.0
>>>>>>>>>>>>> # Use current working directory
>>>>>>>>>>>>> #$ -cwd
>>>>>>>>>>>>> # Join stdout and stderr into one file
>>>>>>>>>>>>> #$ -j y
>>>>>>>>>>>>> # The mpirun command; note the lack of host names, as SGE
>>>>>>>>>>>>> # will provide them on-the-fly.
>>>>>>>>>>>>> mpirun -np $NSLOTS ./xhpl >> xhpl.out
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> But then I read this:
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the Rocks SGE PE setup, "mpi" is loosely integrated, while 
>>>>>>>>>>>>> "mpich" and "orte" are tightly integrated, so the required 
>>>>>>>>>>>>> qsub arguments differ between mpi/mpich and orte. mpi and 
>>>>>>>>>>>>> mpich need a machine file.
>>>>>>>>>>>>>
>>>>>>>>>>>>> By default, mpi and mpich are for MPICH2; orte is for Open MPI.
>>>>>>>>>>>>>
>>>>>>>>>>>>> regards
>>>>>>>>>>>>> -LT
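
(In practice the difference for the job script is roughly the following. 
$TMPDIR/machines is the machine file the loosely integrated Rocks PEs are 
commonly set up to generate; that path is an assumption here, not something 
verified on this cluster:)

# loosely integrated PE (mpi/mpich): pass slot count and machine file explicitly
mpirun -np $NSLOTS -machinefile $TMPDIR/machines ./xhpl

# tightly integrated PE (orte) with an SGE-aware Open MPI: no -np, no machine file
mpirun ./xhpl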
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> The program I need to run is pBWA:
>>>>>>>>>>>>>   http://pbwa.sourceforge.net/
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> It uses MPI.
>>>>>>>>>>>>>
>>>>>>>>>>>>> At this moment I'm rather confused about what the next step is.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I thought I could just run pBWA with multiple MPI processes 
>>>>>>>>>>>>> from a simple SGE job.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>> Guillermo.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 25/10/2012 13:17, Reuti wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 25.10.2012 at 13:11, Guillermo Marco Puche wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hello Reuti,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm stuck here: I've no idea which MPI library I've got. 
>>>>>>>>>>>>>>> I'm using Rocks Cluster (Viper) 5.4.3, which comes with 
>>>>>>>>>>>>>>> CentOS 5.6, SGE, SPM, Open MPI and MPI.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> How can I check which library I have installed?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I found this:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> $ mpirun -V
>>>>>>>>>>>>>>> mpirun (Open MPI) 1.4.3
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Report bugs to
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> http://www.open-mpi.org/community/help/
>>>>>>>>>>>>>> Good, and is this also the one you used to compile the application?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> To check whether Open MPI was built with SGE support:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> $ ompi_info | grep grid
>>>>>>>>>>>>>>                   MCA ras: gridengine (MCA v2.0, API 
>>>>>>>>>>>>>> v2.0, Component v1.6.2)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -- Reuti
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>> Guillermo.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 25/10/2012 13:05, Reuti wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 25.10.2012 at 10:37, Guillermo Marco Puche wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hello !
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I found a new version of my tool which supports 
>>>>>>>>>>>>>>>>> multi-threading, but also MPI or Open MPI for additional 
>>>>>>>>>>>>>>>>> processes.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm fairly new to MPI with SGE. What would be the right 
>>>>>>>>>>>>>>>>> qsub command, or configuration inside a job file, to ask 
>>>>>>>>>>>>>>>>> SGE to run the job with 2 MPI processes?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Will the following line work in an SGE job file?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> #$ -pe mpi 2
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> That's supposed to make the job run with 2 processes 
>>>>>>>>>>>>>>>>> instead of 1.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Not out of the box: it will grant 2 slots to the job 
>>>>>>>>>>>>>>>> according to the allocation rule of the PE. But how you 
>>>>>>>>>>>>>>>> start your application inside the granted allocation in 
>>>>>>>>>>>>>>>> the job script is up to you. Fortunately, the MPI 
>>>>>>>>>>>>>>>> libraries nowadays offer an (almost) automatic integration 
>>>>>>>>>>>>>>>> into queuing systems without further user 
>>>>>>>>>>>>>>>> intervention.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Which of the MPI libraries mentioned above do you use to 
>>>>>>>>>>>>>>>> compile your application?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -- Reuti
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>> Guillermo.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 22/10/2012 17:19, Reuti wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 22.10.2012 at 16:31, Guillermo Marco Puche wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I'm using a program where I can specify the number 
>>>>>>>>>>>>>>>>>>> of threads I want to use.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Only threads and not additional processes? Then you are 
>>>>>>>>>>>>>>>>>> limited to one node, unless you add something like 
>>>>>>>>>>>>>>>>>> Kerrighed (http://www.kerrighed.org/wiki/index.php/Main_Page) 
>>>>>>>>>>>>>>>>>> or ScaleMP (http://www.scalemp.com) to get a 
>>>>>>>>>>>>>>>>>> cluster-wide unified process and memory space.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -- Reuti
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I'm able to launch multiple instances of that tool 
>>>>>>>>>>>>>>>>>>> on separate nodes.
>>>>>>>>>>>>>>>>>>> For example: job_process_00 on compute-0-0, 
>>>>>>>>>>>>>>>>>>> job_process_01 on compute-1, etc. Each job calls the 
>>>>>>>>>>>>>>>>>>> program, which splits up into 8 threads 
>>>>>>>>>>>>>>>>>>> (each of my nodes has 8 CPUs).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> When I set 16 threads, I can't split them as 8 threads 
>>>>>>>>>>>>>>>>>>> per node, so I would like to split them between 2 
>>>>>>>>>>>>>>>>>>> compute nodes.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Currently I have 4 compute nodes, and I would like to 
>>>>>>>>>>>>>>>>>>> speed up the process by running my program with 16 
>>>>>>>>>>>>>>>>>>> threads split across more than one compute node. At 
>>>>>>>>>>>>>>>>>>> this moment I'm stuck using only 1 compute node per 
>>>>>>>>>>>>>>>>>>> process, with 8 threads.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thank you !
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>>>>>> Guillermo.
>
> _______________________________________________
> users mailing list
> users at gridengine.org
> https://gridengine.org/mailman/listinfo/users



