From 0 to 1: The Oozie Orchestration Framework
What you'll learn
- Install and set up Oozie
- Configure Workflows to run jobs on Hadoop
- Configure time-triggered and data-triggered Workflows
- Configure data pipelines using Bundles
Prerequisites: Working with Oozie requires some basic knowledge of the Hadoop eco-system and the ability to run MapReduce jobs on Hadoop
Taught by a team that includes 2 Stanford-educated ex-Googlers and 2 ex-Flipkart Lead Analysts. This team has decades of practical experience in working with large-scale data processing jobs.
Oozie is like the formidable, yet super-efficient admin assistant who can get things done for you, if you know how to ask
Let's parse that
formidable, yet super-efficient: Oozie is formidable because its workflows are specified entirely in XML, which is hard to debug when things go wrong. However, once you've figured out how to work with it, it's like magic. Complex dependencies, a multitude of jobs on different time schedules and entire data pipelines are all made easy to manage with Oozie
get things done for you: Oozie allows you to manage Hadoop jobs as well as Java programs, scripts and any other executable with the same basic setup. It manages your dependencies cleanly and logically.
if you know how to ask: Knowing the right configuration parameters to get the job done is the key to mastering Oozie
Workflow Management: Workflow specifications, Action nodes, Control nodes, Global configuration, real examples with MapReduce and Shell actions which you can run and tweak
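To give a flavour of what a Workflow specification looks like, here is a minimal sketch with one Shell action node and the start, kill and end control nodes. It is an illustration rather than an example from the course: the names demo-wf, shell-node and fail are made up, the schema versions are assumptions that depend on your Oozie installation, and ${jobTracker} and ${nameNode} are assumed to be defined in a job.properties file.

```xml
<!-- workflow.xml: hypothetical minimal Workflow, names are illustrative -->
<workflow-app name="demo-wf" xmlns="uri:oozie:workflow:0.5">
    <!-- Control node: where execution begins -->
    <start to="shell-node"/>

    <!-- Action node: runs a shell command on the cluster -->
    <action name="shell-node">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>echo</exec>
            <argument>hello</argument>
        </shell>
        <!-- Transitions taken on success and on failure -->
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <!-- Control node: stops the Workflow with an error message -->
    <kill name="fail">
        <message>Shell action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>

    <!-- Control node: marks successful completion -->
    <end name="end"/>
</workflow-app>
```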
Time-based and data-based triggers for Workflows: Coordinator specification, mimicking simple cron jobs, specifying time and data availability triggers for Workflows, dealing with backlog, running time-triggered and data-triggered coordinator actions
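As a rough sketch of how the two kinds of trigger fit together, the Coordinator below runs a Workflow once a day, but only once that day's input data has appeared. Everything specific here is an assumption for illustration: the names daily-coord and logs, the /data/logs path, the schema version, and the ${start}, ${end}, ${nameNode} and ${workflowAppPath} properties, which would be supplied in job.properties.

```xml
<!-- coordinator.xml: hypothetical daily Coordinator, names and paths are illustrative -->
<coordinator-app name="daily-coord" frequency="${coord:days(1)}"
                 start="${start}" end="${end}" timezone="UTC"
                 xmlns="uri:oozie:coordinator:0.4">
    <!-- Data trigger: one dataset instance is expected per day -->
    <datasets>
        <dataset name="logs" frequency="${coord:days(1)}"
                 initial-instance="${start}" timezone="UTC">
            <uri-template>${nameNode}/data/logs/${YEAR}/${MONTH}/${DAY}</uri-template>
        </dataset>
    </datasets>

    <!-- Each run waits until the current day's instance is available -->
    <input-events>
        <data-in name="input" dataset="logs">
            <instance>${coord:current(0)}</instance>
        </data-in>
    </input-events>

    <!-- The Workflow to launch once time and data conditions are met -->
    <action>
        <workflow>
            <app-path>${workflowAppPath}</app-path>
        </workflow>
    </action>
</coordinator-app>
```

Dropping the datasets and input-events elements leaves a purely time-triggered Coordinator, which is the cron-like case.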
Data Pipelines using Bundles: Bundle specification, the kick-off time for bundles, running a bundle on Oozie
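A Bundle specification is the shortest of the three. The sketch below groups a single Coordinator and sets a kick-off time. The names demo-bundle and daily-coord, the schema version, and the ${kickOffTime} and ${coordAppPath} properties are all assumptions made for illustration.

```xml
<!-- bundle.xml: hypothetical Bundle wrapping one Coordinator -->
<bundle-app name="demo-bundle" xmlns="uri:oozie:bundle:0.2">
    <!-- When the Bundle should start submitting its Coordinators -->
    <controls>
        <kick-off-time>${kickOffTime}</kick-off-time>
    </controls>

    <!-- One entry per Coordinator in the pipeline -->
    <coordinator name="daily-coord">
        <app-path>${coordAppPath}</app-path>
    </coordinator>
</bundle-app>
```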
Who this course is for:
- Yep! Engineers, analysts and sysadmins who are interested in big data processing on Hadoop
- Nope! Beginners who have no knowledge of the Hadoop eco-system
Loonycorn is us, Janani Ravi and Vitthal Srinivasan. Between us, we have studied at Stanford, been admitted to IIM Ahmedabad and have spent years working in tech, in the Bay Area, New York, Singapore and Bangalore.
Janani: 7 years at Google (New York, Singapore); Studied at Stanford; also worked at Flipkart and Microsoft
Vitthal: Also Google (Singapore) and studied at Stanford; Flipkart, Credit Suisse and INSEAD too
We think we might have hit upon a neat way of teaching complicated tech courses in a funny, practical, engaging way, which is why we are so excited to be here on Udemy!
We hope you will try our offerings, and think you'll like them :-)