[gridengine users] Hadoop Integration HOWTO (was: Hadoop Integration - how's it going)

Ralph Castain rhc at open-mpi.org
Wed Jun 13 21:55:34 UTC 2012


Sent from the wrong account, so it bounced. Sorry for the resend.


On Jun 13, 2012, at 3:54 PM, Ralph Castain wrote:

> Here is my initial cut at the API.
> 
> typedef enum {
>    GRIDENGINE_RESOURCE_FILE,
>    GRIDENGINE_RESOURCE_PPN,
>    GRIDENGINE_RESOURCE_NUM_NODES
> } gridengine_resource_type_t;
> 
> typedef struct {
>    gridengine_resource_type_t type;
>    char *constraint;
> } gridengine_resources_t;
> 
> typedef void (*gridengine_allocation_cbfunc_t)(char *nodes);
> 
> int gridengine_allocate(gridengine_resources_t *array, int num_elements, gridengine_allocation_cbfunc_t cbfunc);
> 
> The function call is non-blocking, returning 0 if the request was successfully registered and < 0 if there was an error. The input params include an array of structs that contain the specification, an integer declaring how many elements are in the array, and the callback function to be called once the allocation is available.
> 
> The callback function delivers a comma-delimited string of node names (the receiver is responsible for releasing the string). The string is NULL if the allocation could not be done - e.g., if one of the files cannot be located, or we lack permission to access it.
> 
> We can extend supported constraints over time, but I could see three for now:
> 
> 1. ppn, indicating how many "slots" are required on each node. Default is 1.
> 
> 2. file, specifying a filename to be accessible by the application. The filename will be given in URI form - I can see a couple of variants we need supported for now:
> 
> (a) "hdfs://path-to-file" - need to get file shard locations from hdfs. There are two ways to do this today - I've given Rayson copies of both methods. One uses Java and works with all branches of HDFS. The other uses C and only works with Hadoop 2.0 (branch-2 of the Apache Hadoop code base).
> 
> (b) "nfs://path" - network file system, should be available anywhere.
> 
> (c) "file://path" - a file on the local file system, ORTE will take responsibility for copying it to the remote nodes.
> 
> I'm sure we'll think of others - note that (b) and (c) are equivalent to "no constraint" on the allocator in terms of files. In such cases, you can essentially ignore that constraint.
> 
> 3. num_nodes - specify the total number of nodes to be allocated. If ppn is given and no num_nodes, then we want slots allocated on every node that meets the rest of the constraints.
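> 
> For concreteness, here is a minimal caller sketch against the proposal above. It is only an illustration: the header name, the constraint values, and the error handling are made up, not part of the proposal.
> 
> #include <stdio.h>
> #include <stdlib.h>
> #include "gridengine_alloc.h"   /* hypothetical header holding the declarations above */
> 
> /* callback: receives the comma-delimited node list (NULL on failure) and must free it */
> static void allocation_ready(char *nodes)
> {
>     if (NULL == nodes) {
>         fprintf(stderr, "allocation failed\n");
>         return;
>     }
>     printf("allocated nodes: %s\n", nodes);
>     free(nodes);
> }
> 
> int main(void)
> {
>     gridengine_resources_t req[] = {
>         { GRIDENGINE_RESOURCE_FILE, "hdfs://user/data/input.seq" },  /* place slots near the shards */
>         { GRIDENGINE_RESOURCE_PPN,  "4" }                            /* four slots per node */
>         /* no NUM_NODES entry: per item 3, slots go on every node meeting the other constraints */
>     };
> 
>     if (gridengine_allocate(req, 2, allocation_ready) < 0) {
>         fprintf(stderr, "could not register allocation request\n");
>         return 1;
>     }
>     /* non-blocking: a real caller would keep an event loop running until allocation_ready fires */
>     return 0;
> }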
> 
> Comments welcome - this is just a first cut!
> Ralph
> 
> 
> 
> On Jun 12, 2012, at 12:29 PM, Ralph Castain wrote:
> 
>> I'll be happy to do so - I'm running behind as I've been hit with a high priority issue. I hope to get back to this in the not-too-distant future (probably August).
>> 
>> Version 2.0 is in alpha now - no announced schedule for release, but I would guess the August time frame again.
>> 
>> MR+ doesn't replace HDFS, but replaces the "Job" class underneath Hadoop MR so that it uses the OMPI infrastructure instead of YARN - this allows it to run on any HPC system, and to use MPI. So people can write their mappers and reducers with MPI calls in them (remember, OMPI now includes Java bindings!).
>> 
>> The current version of MR+ (in the OMPI trunk) lets you write and execute your own mappers and reducers, but does not support the Hadoop MR classes. So if you want to write something that uses your own algorithms, you can do so. There is a "word-count" example in the "examples" directory of the OMPI code base.
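>> 
>> To give a feel for what "mappers and reducers with MPI calls in them" can look like, here is a rough, self-contained sketch in plain C/MPI. It is not the MR+ API or the word-count example itself, just the general pattern: each rank does its "map" work locally, then an MPI reduction plays the "reduce" role.
>> 
>> #include <mpi.h>
>> #include <stdio.h>
>> 
>> int main(int argc, char **argv)
>> {
>>     MPI_Init(&argc, &argv);
>> 
>>     int rank, size;
>>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>     MPI_Comm_size(MPI_COMM_WORLD, &size);
>> 
>>     /* "map" phase: stand-in for counting words in this rank's local shard */
>>     long local_count = 1000 + rank;
>> 
>>     /* "reduce" phase: combine the partial counts on rank 0 */
>>     long total = 0;
>>     MPI_Reduce(&local_count, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
>> 
>>     if (0 == rank) {
>>         printf("total word count across %d ranks: %ld\n", size, total);
>>     }
>> 
>>     MPI_Finalize();
>>     return 0;
>> }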
>> 
>> Ralph
>> 
>> 
>> On Jun 5, 2012, at 12:10 AM, Ron Chen wrote:
>> 
>>> Ralph, when you have your C HDFS API ready, can you please let us know? And do you know when Apache Hadoop is going to release version 2.0?
>>> 
>>> 
>>> -Ron
>>> 
>>> 
>>> 
>>> ----- Original Message -----
>>> From: Rayson Ho <rayson at scalablelogic.com>
>>> To: Prakashan Korambath <ppk at ats.ucla.edu>
>>> Cc: Ralph Castain <rhc at open-mpi.org>; Ron Chen <ron_chen_123 at yahoo.com>; "users at gridengine.org" <users at gridengine.org>
>>> Sent: Monday, June 4, 2012 1:53 PM
>>> Subject: Re: [gridengine users] Hadoop Integration HOWTO (was: Hadoop Integration - how's it going)
>>> 
>>> Prakashan,
>>> 
>>> Ralph mentioned to me before that the C API bindings will be available
>>> in Apache Hadoop 2.0, which adds Google protocol buffers as one of the new
>>> features and thus supports non-Java HDFS bindings.
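>>> 
>>> For reference, the JNI-backed libhdfs that already ships with Hadoop exposes roughly the shape of call a scheduler plugin would need; the pure-C binding coming in 2.0 should look similar. A sketch (connection settings and the path are placeholders):
>>> 
>>> #include <stdio.h>
>>> #include "hdfs.h"
>>> 
>>> int main(void)
>>> {
>>>     hdfsFS fs = hdfsConnect("default", 0);    /* use the configured default namenode */
>>>     if (!fs) { fprintf(stderr, "hdfsConnect failed\n"); return 1; }
>>> 
>>>     const char *path = "/user/data/input.seq";
>>>     hdfsFileInfo *info = hdfsGetPathInfo(fs, path);
>>>     if (!info) { fprintf(stderr, "no such file\n"); return 1; }
>>> 
>>>     /* hosts[block][replica] = hostname holding that replica of that block */
>>>     char ***hosts = hdfsGetHosts(fs, path, 0, info->mSize);
>>>     for (int b = 0; hosts && hosts[b]; b++) {
>>>         for (int r = 0; hosts[b][r]; r++) {
>>>             printf("block %d replica on %s\n", b, hosts[b][r]);
>>>         }
>>>     }
>>> 
>>>     hdfsFreeHosts(hosts);
>>>     hdfsFreeFileInfo(info, 1);
>>>     hdfsDisconnect(fs);
>>>     return 0;
>>> }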
>>> 
>>> AFAIK, EMC MapR replaces HDFS with something that has more HA features
>>> & performance. I don't know all the specific details but I do believe
>>> that most of the API interfaces are going to be the same as or very
>>> similar to the existing HDFS APIs.
>>> 
>>> Rayson
>>> 
>>> 
>>> 
>>> On Mon, Jun 4, 2012 at 1:24 PM, Prakashan Korambath <ppk at ats.ucla.edu> wrote:
>>>> Hi Rayson,
>>>> 
>>>> Let me know when you have the C API bindings from Ralph ready.  I can help
>>>> you guys with testing them out.
>>>> 
>>>> Prakashan
>>>> 
>>>> 
>>>> On 06/04/2012 10:17 AM, Rayson Ho wrote:
>>>>> 
>>>>> Hi Prakashan & Ron,
>>>>> 
>>>>> I thought about this issue while I was writing & testing the HOWTO...
>>>>> 
>>>>> but I didn't spend much more time on it as I needed to work on
>>>>> something else, and it requires an upcoming C API binding for HDFS
>>>>> from Ralph. Plus... I didn't want to pre-announce too many upcoming
>>>>> new features. :-)
>>>>> 
>>>>> With the architecture of Prakashan's On-demand Hadoop Cluster, we can
>>>>> take advantage of Ralph's C HDFS API, and we can then easily write a
>>>>> scheduler plugin that queries HDFS block information. This scheduler
>>>>> plugin then affects scheduling decisions such that Open Grid
>>>>> Scheduler/Grid Engine can send jobs to the data, which IMO is the core
>>>>> idea behind Hadoop - scheduling jobs & tasks to the data.
>>>>> 
>>>>> 
>>>>> Note that we will also need to productionize the "Parallel Environment
>>>>> Queue Sort (PQS) Scheduler API", which was under technology preview in
>>>>> GE 2011.11:
>>>>> 
>>>>> http://gridscheduler.sourceforge.net/Releases/ReleaseNotesGE2011.11.pdf
>>>>> 
>>>>> Rayson
>>>>> 
>>>>> 
>>>>> 
>>>>> On Mon, Jun 4, 2012 at 12:55 PM, Prakashan Korambath <ppk at ats.ucla.edu>
>>>>> wrote:
>>>>>> 
>>>>>> Hi Ron,
>>>>>> 
>>>>>> I don't have anything planned beyond what I released right now.  The idea
>>>>>> is to leave what Hadoop does best to Hadoop and what SGE (or any other
>>>>>> scheduler) does best to the scheduler.  I believe somebody from SDSC also
>>>>>> released a similar strategy for PBS/Torque.  I worked only on SGE because
>>>>>> I mostly use SGE.
>>>>>> 
>>>>>> Prakashan
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On 06/04/2012 09:45 AM, Ron Chen wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> Hi Prakashan,
>>>>>>> 
>>>>>>> 
>>>>>>> I am trying to understand your integration, and it looks like Ravi
>>>>>>> Chandra
>>>>>>> Nallan's Hadoop Integration.
>>>>>>> 
>>>>>>> One of the improvements in Daniel Templeton's Hadoop Integration is that he
>>>>>>> models HDFS data as resources, and thus can schedule jobs to data. Is
>>>>>>> scheduling jobs to data a planned feature of your "On-Demand Hadoop
>>>>>>> Cluster"
>>>>>>> integration?
>>>>>>> 
>>>>>>> For those who didn't know Ravi Chandra Nallan, he was with Sun Micro
>>>>>>> when
>>>>>>> he developed the integration. Last I checked, he was with Oracle.
>>>>>>> 
>>>>>>> -Ron
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> ----- Original Message -----
>>>>>>> From: Rayson Ho <rayson at scalablelogic.com>
>>>>>>> To: Prakashan Korambath <ppk at ats.ucla.edu>
>>>>>>> Cc: "users at gridengine.org" <users at gridengine.org>
>>>>>>> Sent: Friday, June 1, 2012 3:04 PM
>>>>>>> Subject: Re: [gridengine users] Hadoop Integration HOWTO (was: Hadoop
>>>>>>> Integration - how's it going)
>>>>>>> 
>>>>>>> Thanks again Prakashan for the contribution!
>>>>>>> 
>>>>>>> Rayson
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Fri, Jun 1, 2012 at 1:25 PM, Prakashan Korambath <ppk at ats.ucla.edu>
>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Thank you Rayson!  Appreciate you taking the time to upload the tar files
>>>>>>>> and write the HOWTO.
>>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> 
>>>>>>>> Prakashan
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On 06/01/2012 10:19 AM, Rayson Ho wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I've reviewed the integration, and wrote a short Grid Engine Hadoop
>>>>>>>>> HOWTO:
>>>>>>>>> 
>>>>>>>>> http://gridscheduler.sourceforge.net/howto/GridEngineHadoop.html
>>>>>>>>> 
>>>>>>>>> The difference between the 2 methods (original SGE 6.2u5 vs
>>>>>>>>> Prakashan's) is that with Prakashan's approach, Grid Engine is used
>>>>>>>>> for resource allocation, and the Hadoop job scheduler/Job Tracker is
>>>>>>>>> used to handle all the MapReduce operations. A Hadoop cluster is
>>>>>>>>> created on demand with Prakashan's approach, but in the original SGE
>>>>>>>>> 6.2u5 method Grid Engine replaces the Hadoop job scheduler.
>>>>>>>>> 
>>>>>>>>> As standard Grid Engine PEs are used in this new approach, one can
>>>>>>>>> call "qrsh -inherit" and use Grid Engine's method to start Hadoop
>>>>>>>>> services on remote nodes, and thus get full job control, job
>>>>>>>>> accounting, and cleanup-at-termination benefits like any other tightly
>>>>>>>>> integrated PE job!
>>>>>>>>> 
>>>>>>>>> Rayson
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Tue, May 29, 2012 at 10:36 AM, Prakashan
>>>>>>>>> Korambath <ppk at ats.ucla.edu>
>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> I put my scripts in a tar file and sent it to Rayson yesterday so that
>>>>>>>>>> he can put it in a common place for download.
>>>>>>>>>> 
>>>>>>>>>> Prakashan
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On 05/29/2012 07:18 AM, Jesse Becker wrote:
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, May 28, 2012 at 12:00:24PM -0400, Prakashan
>>>>>>>>>>> Korambath wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> This is how we run Hadoop using Grid Engine (or, for that matter,
>>>>>>>>>>>> any scheduler with appropriate alterations):
>>>>>>>>>>>> 
>>>>>>>>>>>> http://www.ats.ucla.edu/clusters/hoffman2/hadoop/default.htm
>>>>>>>>>>>> 
>>>>>>>>>>>> Basically, either run a prolog or call a script inside the
>>>>>>>>>>>> submission command file itself to parse the contents of
>>>>>>>>>>>> PE_HOSTFILE and create the Hadoop *-site.xml, masters and slaves
>>>>>>>>>>>> files at run time. This methodology is suitable for any
>>>>>>>>>>>> scheduler as it does not depend on any of them. If there is
>>>>>>>>>>>> interest I can post the prolog script. Thanks.
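>>>>>>>>>>>> 
>>>>>>>>>>>> Purely as an illustration of that parsing step (not the actual
>>>>>>>>>>>> prolog), a minimal C sketch could read the PE_HOSTFILE lines of the
>>>>>>>>>>>> form "host nslots queue processors" and emit masters and slaves
>>>>>>>>>>>> files; the output file names and the choice of the first host as
>>>>>>>>>>>> master are assumptions:
>>>>>>>>>>>> 
>>>>>>>>>>>> #include <stdio.h>
>>>>>>>>>>>> #include <stdlib.h>
>>>>>>>>>>>> 
>>>>>>>>>>>> int main(void)
>>>>>>>>>>>> {
>>>>>>>>>>>>     const char *hf = getenv("PE_HOSTFILE");
>>>>>>>>>>>>     FILE *in = hf ? fopen(hf, "r") : NULL;
>>>>>>>>>>>>     FILE *masters = fopen("masters", "w");
>>>>>>>>>>>>     FILE *slaves  = fopen("slaves", "w");
>>>>>>>>>>>>     if (!in || !masters || !slaves) { perror("setup"); return 1; }
>>>>>>>>>>>> 
>>>>>>>>>>>>     char line[1024], host[256];
>>>>>>>>>>>>     int first = 1;
>>>>>>>>>>>>     while (fgets(line, sizeof line, in)) {
>>>>>>>>>>>>         if (sscanf(line, "%255s", host) != 1) continue;
>>>>>>>>>>>>         if (first) { fprintf(masters, "%s\n", host); first = 0; } /* first host acts as master */
>>>>>>>>>>>>         fprintf(slaves, "%s\n", host);                           /* every host runs a slave daemon */
>>>>>>>>>>>>     }
>>>>>>>>>>>>     fclose(in); fclose(masters); fclose(slaves);
>>>>>>>>>>>>     return 0;
>>>>>>>>>>>> }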
>>>>>>>>>>> 
>>>>>>>>>>> Please do.
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> _______________________________________________
>>>>>>> users mailing list
>>>>>>> users at gridengine.org
>>>>>>> https://gridengine.org/mailman/listinfo/users
>>>>>>> 
>>>>>> 
>>>> 
>>> 
>> 
> 



