A course about Apache Pig, a data analysis tool in the Hadoop ecosystem. We will start with the concepts of Hadoop and its components, HDFS and MapReduce, covering the design of HDFS and the Map and Reduce phases of analysis.
Then we will look into Apache Pig: what it is, where it can be used, and where it cannot. We will cover the basics of Pig and gradually proceed to more advanced topics. We will also look into Apache Flume, a tool for collecting log data and ingesting it into HDFS or any other sink. We will go through its configuration in detail and write our own configuration file to fetch data from Twitter into HDFS.
We will also cover Pig UDFs and use them in our project.
Finally, we will analyze tweets for sentiment using Apache Pig.
Gaurav Vyas has more than 10 years of experience.
He works with Hadoop and its ecosystem, including MapReduce, Apache Pig, Hive, Impala, ZooKeeper, and Spark with both its Scala and Python APIs. He also has solid experience with the NoSQL databases HBase and MongoDB.
He has also used functional programming languages such as R and Scala on various projects.