From 0 to 1: Hive for Processing Big Data
- 15.5 hours on-demand video
- 137 downloadable resources
- Full lifetime access
- Access on mobile and TV
- Certificate of Completion
- Write complex analytical queries on data in Hive and uncover insights
- Leverage partitioning and bucketing to optimize queries in Hive
- Customize Hive with user-defined functions in Java and Python
- Understand what goes on under the hood of Hive with HDFS and MapReduce
Data warehousing systems - which have become the rage with the rise of 'Big Data' - are quite different from traditional transaction processing systems. Hive is a prototypical data warehousing system.
Hive has a whole bunch of useful functions available out-of-the-box. This is an introduction to the 3 types of functions available: standard, aggregate and table-generating functions.
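The three function types differ mainly in how many rows go in and how many come out. The sketch below mimics that shape in plain Python rather than HiveQL; the sample rows and helper names are hypothetical, and the helpers only loosely mirror Hive's `upper`, `sum` and `explode`.

```python
from collections import defaultdict

# Hypothetical sample rows: (customer, comma-separated items, amount)
rows = [
    ("alice", "pen,ink", 12.5),
    ("bob", "paper", 3.0),
    ("alice", "stapler", 7.5),
]

def standard_upper(value):
    """Standard function: one value in, one value out (per row)."""
    return value.upper()

def aggregate_sum(rows):
    """Aggregate function: many rows in, one value out per group."""
    totals = defaultdict(float)
    for customer, _items, amount in rows:
        totals[customer] += amount
    return dict(totals)

def table_generating_explode(rows):
    """Table-generating function: one row in, zero or more rows out."""
    out = []
    for customer, items, _amount in rows:
        for item in items.split(","):
            out.append((customer, item))
    return out

upper_names = [standard_upper(r[0]) for r in rows]   # same row count as input
totals = aggregate_sum(rows)                          # one entry per customer
exploded = table_generating_explode(rows)             # more rows than input
```

Here three input rows stay three rows under the standard function, collapse to two groups under the aggregate, and expand to four rows under the table-generating function.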
Sub-queries in Hive are rather quirky. For instance, union is fine, but intersect is not.
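Where INTERSECT is unavailable, one common workaround is an inner join with DISTINCT. The sketch below checks that equivalence using Python's built-in `sqlite3` module as a stand-in engine, not Hive itself; the tables `a` and `b` and their contents are hypothetical.

```python
import sqlite3

# SQLite is only a stand-in here; it supports INTERSECT, which lets us
# compare the join-based rewrite against the real operator.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER);
CREATE TABLE b (id INTEGER);
INSERT INTO a VALUES (1), (2), (2), (3);
INSERT INTO b VALUES (2), (3), (3), (4);
""")

# Reference result using INTERSECT.
via_intersect = sorted(
    r[0] for r in conn.execute("SELECT id FROM a INTERSECT SELECT id FROM b")
)

# Rewrite: inner join keeps ids present in both tables, DISTINCT removes duplicates.
via_join = sorted(
    r[0] for r in conn.execute("SELECT DISTINCT a.id FROM a JOIN b ON a.id = b.id")
)

conn.close()
```

The two results match for this data. The rewrite is not a drop-in replacement when the compared columns contain NULLs, since a join condition never matches NULL to NULL.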
Partitioning in Hive is conceptually similar to indexing in a traditional DBMS: a way to quickly look up rows with specific values in a particular column.
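A rough way to picture partitioning: rows are stored in a separate location per value of the partition column, so a query that filters on that column only has to read the matching location. The Python sketch below models this with a dictionary; the `orders` data, the `country` partition column and the helper names are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (order_id, country, amount)
orders = [(1, "US", 10.0), (2, "IN", 5.0), (3, "US", 7.0), (4, "DE", 2.0)]

def write_partitioned(rows):
    """Group rows under one key per partition value, e.g. 'country=US'."""
    partitions = defaultdict(list)
    for order_id, country, amount in rows:
        # The partition value lives in the key, not in the stored row.
        partitions[f"country={country}"].append((order_id, amount))
    return dict(partitions)

def read_partition(partitions, country):
    """A filter on the partition column touches only one partition."""
    return partitions.get(f"country={country}", [])

partitions = write_partitioned(orders)
us_orders = read_partition(partitions, "US")  # IN and DE rows are never scanned
```

The trade-off is that a column with many distinct values produces many small partitions, which is one reason to consider bucketing instead.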
Bucketing is conceptually quite close to partitioning - and indeed to Indexing in traditional RDBMS - but with a key difference.
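The difference usually pointed to is that bucketing spreads rows over a fixed number of buckets by hashing the column, instead of creating one partition per distinct value. A minimal Python sketch of that idea follows; `NUM_BUCKETS`, the `events` data and the use of `zlib.crc32` are assumptions for illustration, and crc32 is not Hive's actual hash function.

```python
import zlib

NUM_BUCKETS = 4  # hypothetical; the bucket count is fixed up front

def bucket_for(key, num_buckets=NUM_BUCKETS):
    """Map a key to a bucket index via a deterministic hash modulo the bucket count."""
    # crc32 is a stand-in deterministic hash, not the one Hive uses.
    return zlib.crc32(key.encode("utf-8")) % num_buckets

def write_bucketed(rows, num_buckets=NUM_BUCKETS):
    """Distribute rows into a fixed list of buckets by hashing the first column."""
    buckets = [[] for _ in range(num_buckets)]
    for user_id, amount in rows:
        buckets[bucket_for(user_id, num_buckets)].append((user_id, amount))
    return buckets

# Hypothetical rows: (user_id, amount)
events = [("u1", 1.0), ("u2", 2.0), ("u1", 3.0), ("u3", 4.0), ("u2", 5.0)]
buckets = write_bucketed(events)
```

However many distinct user IDs arrive, the number of buckets stays at four, and all rows for a given ID land in the same bucket.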
- Hive requires knowledge of SQL. If you don't know SQL, please head to the SQL primer at the end of the course first.
- You'll need to know Java if you are interested in the sections on custom user defined functions
- No other prerequisites: The course covers everything you need to install Hive and run queries!
Prerequisites: Hive requires knowledge of SQL. The course includes an SQL primer at the end. Please do that first if you don't know SQL. You'll need to know Java if you want to follow the sections on custom functions.
Taught by a 4-person team including 2 Stanford-educated ex-Googlers and 2 ex-Flipkart Lead Analysts. This team has decades of practical experience in working with large-scale data.
Hive is like a new friend with an old face (SQL). This course is an end-to-end, practical guide to using Hive for Big Data processing.
Let's parse that.
A new friend with an old face: Hive helps you leverage the power of distributed computing and Hadoop for analytical processing. Its interface is like an old friend: the very SQL-like HiveQL. This course will fill in all the gaps between SQL and what you need to use Hive.
End-to-End: The course is an end-to-end guide for using Hive. Whether you are an analyst who wants to process data or an engineer who needs to build custom functionality or optimize performance, everything you'll need is right here. New to SQL? No need to look elsewhere. The course has a primer on all the basic SQL constructs.
Practical: Everything is taught using real-life examples, working queries and code.
Analytical Processing: Joins, Subqueries, Views, Table Generating Functions, Explode, Lateral View, Windowing and more
Tuning Hive for better functionality: Partitioning, Bucketing, Join Optimizations, Map-Side Joins, Indexes, Writing custom user-defined functions in Java (UDF, UDAF, GenericUDF, GenericUDTF), Custom functions in Python, Implementation of MapReduce for Select, Group By and Join
For SQL Newbies: SQL In Great Depth
- Yep! Analysts who want to write complex analytical queries on large scale data
- Yep! Engineers who want to know more about managing Hive as their data warehousing solution