Brief Introduction to Oozie

Last updated on Feb 09, 2021

edureka.co

Oozie is a workflow scheduler system for managing Apache Hadoop jobs. It is integrated with the rest of the Hadoop stack and supports several types of Hadoop jobs, such as Java MapReduce, Streaming MapReduce, Pig, Hive, and Sqoop. Oozie is a scalable, reliable, and extensible system. It is used in production at Yahoo!, where it runs more than 200,000 jobs every day.

Features:

Workflow – Directed Acyclic Graph of Jobs:

Workflow Example:

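<!-- The xmlns attribute and the kill message are additions that Oozie's workflow schema expects. -->
<workflow-app name='wordcount-wf' xmlns='uri:oozie:workflow:0.1'>
  <start to='wordcount'/>
  <!-- The action name must match the target of the start transition. -->
  <action name='wordcount'>
    <map-reduce>
      <job-tracker>foo.com:9001</job-tracker>
      <name-node>hdfs://bar.com:9000</name-node>
      <configuration>
        <property>
          <name>mapred.input.dir</name>
          <value>${inputDir}</value>
        </property>
        <property>
          <name>mapred.output.dir</name>
          <value>${outputDir}</value>
        </property>
      </configuration>
    </map-reduce>
    <ok to='end'/>
    <error to='kill'/>
  </action>
  <kill name='kill'>
    <message>Word count job failed</message>
  </kill>
  <end name='end'/>
</workflow-app>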

Workflow Definition:

A workflow definition is a DAG of control flow nodes and action nodes, in which the nodes are connected by transition arrows.
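To make the DAG idea concrete, here is a minimal Python sketch that reads the transitions out of a trimmed copy of the wordcount workflow and checks that they form an acyclic graph. This is only an illustration and not how Oozie itself validates workflows. The helper names (`transitions`, `is_dag`) and the `:start:` label are assumptions made for this example, and fork paths are not handled.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the wordcount workflow; the map-reduce body is left empty
# because only the transitions matter for this check.
WORKFLOW_XML = """
<workflow-app name='wordcount-wf'>
  <start to='wordcount'/>
  <action name='wordcount'>
    <map-reduce/>
    <ok to='end'/>
    <error to='kill'/>
  </action>
  <kill name='kill'/>
  <end name='end'/>
</workflow-app>
"""

START = ":start:"  # label chosen for this sketch; the start node has no name attribute


def local_name(tag):
    """Drop an XML namespace prefix, e.g. '{uri}action' -> 'action'."""
    return tag.rsplit("}", 1)[-1]


def transitions(xml_text):
    """Map each node name to the node names it can transition to."""
    root = ET.fromstring(xml_text)
    graph = {}
    for node in root:
        name = START if local_name(node.tag) == "start" else node.get("name")
        # Collect every 'to' attribute on the node itself or on its children
        # (start/to, ok/to, error/to). Fork paths are not covered here.
        graph[name] = [el.get("to") for el in node.iter() if "to" in el.attrib]
    return graph


def is_dag(graph):
    """Return True if the graph has no cycle (depth-first search)."""
    missing = {t for targets in graph.values() for t in targets} - set(graph)
    if missing:
        raise ValueError(f"transitions to undefined nodes: {sorted(missing)}")

    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {name: WHITE for name in graph}

    def visit(name):
        color[name] = GRAY
        for target in graph[name]:
            if color[target] == GRAY:  # back edge: a cycle
                return False
            if color[target] == WHITE and not visit(target):
                return False
        color[name] = BLACK
        return True

    return all(visit(name) for name in graph if color[name] == WHITE)


if __name__ == "__main__":
    print(is_dag(transitions(WORKFLOW_XML)))  # True
```

In the wordcount example the start node leads to the action, and the action leads to either end or kill, so no path ever returns to an earlier node.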

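Control Flow Nodes:

Control flow nodes determine the execution path of a workflow. Oozie provides six of them: start, end, and kill mark where a workflow begins and finishes, decision chooses between alternative paths, and fork and join split execution into parallel paths and merge them again.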
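As a rough illustration, the Python sketch below sorts the top-level nodes of a small workflow into control flow nodes (start, end, kill, decision, fork, join) and action nodes. The sample workflow, its node names (`split`, `count-a`, `count-b`, `merge`), and the `classify` helper are hypothetical and exist only for this example.

```python
import xml.etree.ElementTree as ET

# Element names of Oozie's control flow nodes.
CONTROL_TAGS = {"start", "end", "kill", "decision", "fork", "join"}

# Hypothetical workflow that runs two actions in parallel between a fork and a join.
SAMPLE_XML = """
<workflow-app name='sample-wf'>
  <start to='split'/>
  <fork name='split'>
    <path start='count-a'/>
    <path start='count-b'/>
  </fork>
  <action name='count-a'><map-reduce/><ok to='merge'/><error to='kill'/></action>
  <action name='count-b'><map-reduce/><ok to='merge'/><error to='kill'/></action>
  <join name='merge' to='end'/>
  <kill name='kill'/>
  <end name='end'/>
</workflow-app>
"""


def classify(xml_text):
    """Return (control_nodes, action_names) for the top-level workflow nodes."""
    root = ET.fromstring(xml_text)
    control, actions = [], []
    for node in root:
        kind = node.tag.rsplit("}", 1)[-1]  # ignore any XML namespace prefix
        label = node.get("name", kind)      # the start node has no name attribute
        if kind in CONTROL_TAGS:
            control.append((kind, label))
        elif kind == "action":
            actions.append(label)
    return control, actions


if __name__ == "__main__":
    control_nodes, action_names = classify(SAMPLE_XML)
    print(control_nodes)
    print(action_names)
```

Everything that does real work (here the two map-reduce jobs) is an action node, while the control flow nodes only route execution between them.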

Action Nodes:

Workflow Application:

A workflow application is a ZIP file that includes the workflow definition and the files needed to run all of its actions. It contains the following files:

Application Deployment:

$ hadoop fs -put wordcount-wf hdfs://bar.com:9000/usr/abc/wordcount

Workflow Job Parameters:

$ cat job.properties
oozie.wf.application.path=hdfs://bar.com:9000/usr/abc/wordcount
inputDir=/usr/abc/input-data
outputDir=/usr/abc/output-data
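The job parameters only take effect if their names match the `${inputDir}` and `${outputDir}` variables used in the workflow definition. The Python sketch below shows that substitution in a simplified form, assuming the properties are named `inputDir` and `outputDir`. Oozie resolves these variables with its own expression language, so the `parse_properties` and `resolve` helpers are illustrative stand-ins and not part of Oozie.

```python
import re

# Job parameters, assuming names that match the workflow's ${...} variables.
JOB_PROPERTIES = """\
oozie.wf.application.path=hdfs://bar.com:9000/usr/abc/wordcount
inputDir=/usr/abc/input-data
outputDir=/usr/abc/output-data
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def resolve(template, props):
    """Replace each ${name} with its value; raise KeyError if it is undefined."""
    return re.sub(r"\$\{([\w.]+)\}", lambda m: props[m.group(1)], template)


if __name__ == "__main__":
    props = parse_properties(JOB_PROPERTIES)
    print(resolve("${inputDir}", props))   # /usr/abc/input-data
    print(resolve("${outputDir}", props))  # /usr/abc/output-data
```

A property whose name does not match, such as `Input` in place of `inputDir`, would leave the variable undefined, which this sketch reports as a `KeyError`.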

Job Execution:

$ oozie job -run -config job.properties
Job:1-20090525161321-oozie-xyz-W
