- 28 hours on-demand video
- 5 downloadable resources
- Full lifetime access
- Access on mobile and TV
- Certificate of Completion
- Entire curriculum of CCA Spark and Hadoop Developer
- Apache Sqoop
- HDFS Commands
- Scala Fundamentals
- Core Spark - Transformations and Actions
- Spark SQL and Data Frames
- Streaming analytics using Kafka, Flume and Spark Streaming
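The transformation-and-action model at the heart of the Core Spark module follows the same functional style as Scala's own collection API. As a rough, cluster-free illustration (plain Scala only; the sample data and names below are made up for this sketch and are not from the course):

```scala
// Plain-Scala sketch of the transformation/action pattern covered in the
// Core Spark module. Sample data is made up for illustration; on a real
// cluster the equivalent RDD code would use map and reduceByKey.
val orders = List(
  ("2014-07-25", 299.98),
  ("2014-07-25", 199.99),
  ("2014-07-26", 129.99)
)

// "Transformation" step: compute total revenue per date
// (analogous to reduceByKey on an RDD of (date, revenue) pairs)
val dailyRevenue = orders.
  groupBy { case (date, _) => date }.
  map { case (date, recs) => (date, recs.map(_._2).sum) }

// "Action" step: materialize and print, analogous to collect/foreach
dailyRevenue.toSeq.sortBy(_._1).foreach(println)
```

The same shape — build up a lazy pipeline of transformations, then trigger it with an action — is what the certification exercises drill on.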
- Basic programming skills
- Cloudera Quickstart VM, a valid account for IT Versity Big Data labs, or any Hadoop cluster where Hadoop, Hive and Spark are well integrated
- A 64 bit operating system with the minimum memory required for the environment you are using
CCA 175 Spark and Hadoop Developer is one of the most widely recognized Big Data certifications. This scenario-based certification exam demands basic programming in Python or Scala along with Spark and other Big Data technologies.
This comprehensive course covers all aspects of the certification using Scala as the programming language.
Core Spark - Transformations and Actions
Spark SQL and Data Frames
Flume, Kafka and Spark Streaming
Exercises will be provided so you can prepare before attempting the certification. The intention of the course is to boost your confidence for the certification.
All the demos are given on our state-of-the-art Big Data cluster. You can avail one week of complimentary lab access by filling in the form provided as part of the welcome message.
- Any IT aspirant/professional willing to learn Big Data and take the CCA 175 certification
```scala
// Sort the data by date in ascending order and by daily revenue per product in descending order
val dailyRevenuePerProductSorted = dailyRevenuePerProductJoin.
  map(rec => ((rec._2._1._1, -rec._2._1._2), (rec._2._1._1, rec._2._1._2, rec._2._2))).
  sortByKey()

// ((order_date_asc, daily_revenue_per_product_desc), (order_date, daily_revenue_per_product, product_name))
dailyRevenuePerProductSorted.take(100).foreach(println)
```
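The negated revenue in the key is what lets a single ascending sortByKey deliver date ascending and revenue descending at once. The same composite-key trick works on any ordered tuple; here is a small plain-Scala sketch (made-up sample rows, not course data) you can try without a cluster:

```scala
// Composite-key sort sketch: negate the numeric field so that one
// ascending sort yields date ascending, revenue descending.
// Sample rows are made up for illustration.
val rows = List(
  ("2014-07-25", 129.99, "Gadget"),
  ("2014-07-25", 299.98, "Widget"),
  ("2014-07-26", 199.99, "Gizmo")
)

val sorted = rows.sortBy { case (date, revenue, _) => (date, -revenue) }
sorted.foreach(println)
// Within 2014-07-25, Widget (299.98) now precedes Gadget (129.99)
```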
```scala
// Get data to desired format – order_date,daily_revenue_per_product,product_name
val dailyRevenuePerProduct = dailyRevenuePerProductSorted.
  map(rec => rec._2._1 + "," + rec._2._2 + "," + rec._2._3)

dailyRevenuePerProduct.take(10).foreach(println)
```
```scala
// Save final output into HDFS in avro file format as well as text file format
// HDFS location – avro format: /user/YOUR_USER_ID/daily_revenue_avro_scala
// HDFS location – text format: /user/YOUR_USER_ID/daily_revenue_txt_scala
dailyRevenuePerProduct.saveAsTextFile("/user/dgadiraju/daily_revenue_txt_scala")

// Preview the saved data to validate it
sc.textFile("/user/dgadiraju/daily_revenue_txt_scala").take(10).foreach(println)
```

```shell
# Copy both from HDFS to the local file system
# Local location – /home/YOUR_USER_ID/daily_revenue_scala
mkdir daily_revenue_scala
hadoop fs -get /user/dgadiraju/daily_revenue_txt_scala \
  /home/dgadiraju/daily_revenue_scala/daily_revenue_txt_scala
cd daily_revenue_scala/daily_revenue_txt_scala/
ls -ltr
```
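The comments above also call for an avro-format save, which the snippet does not show. One way to do it — a sketch only, assuming the spark-avro package is available on the cluster (common in the Spark 1.x / CCA 175 era); the package coordinates and the toDF column names below are assumptions, not part of the original:

```scala
// Sketch only -- assumes spark-avro is on the classpath, e.g.
//   spark-shell --packages com.databricks:spark-avro_2.10:2.0.1
// Column names are illustrative assumptions.
import sqlContext.implicits._
import com.databricks.spark.avro._

val dailyRevenuePerProductDF = dailyRevenuePerProductSorted.
  map(rec => (rec._2._1, rec._2._2, rec._2._3)).
  toDF("order_date", "daily_revenue_per_product", "product_name")

dailyRevenuePerProductDF.write.avro("/user/dgadiraju/daily_revenue_avro_scala")

// Read it back to validate
sqlContext.read.avro("/user/dgadiraju/daily_revenue_avro_scala").show(10)
```

This fragment only runs inside a Spark shell with a live `sqlContext`, so it is not standalone-executable.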