Spark Scala coding framework, testing, and Structured Streaming
What you'll learn
- Spark Scala industry standard coding practices - Logging, Exception Handling, Reading from Configuration File
- Unit Testing Spark Scala using JUnit, ScalaTest, FlatSpec, and assertions
- Building a data pipeline using Hive, Spark and PostgreSQL
- Spark Scala development with Intellij, Maven
- Cloudera QuickStart VM setup on GCP
Requirements
- Basic programming skills
- Basic database skills
- Hadoop entry level knowledge
Description
This course will bridge the gap between your academic knowledge and real-world skills, preparing you for an entry-level Big Data Spark Scala developer role. You will learn the following:
- Spark Scala coding best practices
- Logging with log4j and slf4j
- Exception handling
- Configuration using Typesafe Config
- Developing with IntelliJ and Maven
- Using your local environment as a Hadoop/Hive environment
- Reading from and writing to a Postgres database using Spark
- Unit testing Spark Scala code using JUnit, ScalaTest, FlatSpec, and assertions
- Building a data pipeline with Hadoop, Spark, and Postgres
- Bonus: setting up the Cloudera QuickStart VM on Google Cloud Platform (GCP)
- Structured Streaming
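The exception-handling topic can be sketched in plain Scala: wrapping a parse that may fail in `scala.util.Try` is the kind of pattern a Spark job applies so one malformed record doesn't crash the whole run. The record format and field names below are hypothetical, not taken from the course material.

```scala
import scala.util.{Try, Success, Failure}

// Hypothetical record format: "id,amount" as a comma-separated line.
// Wrapping the parse in Try turns a NumberFormatException into a value
// instead of letting one bad record abort the whole job.
def parseAmount(line: String): Try[Double] =
  Try {
    val fields = line.split(",")
    fields(1).trim.toDouble
  }

val lines = Seq("1,42.5", "2,oops", "3,7.0")

// Partition into successfully parsed records and failures.
val (good, bad) = lines.map(l => l -> parseAmount(l)).partition(_._2.isSuccess)

val amounts = good.collect { case (_, Success(v)) => v }
val errors  = bad.collect { case (l, Failure(e)) => s"bad record '$l': ${e.getMessage}" }
// amounts == Seq(42.5, 7.0); errors has one entry for "2,oops"
```

The same `Try`/`Failure` values can then be routed to a logger or a rejects table rather than silently dropped.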
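Configuration with Typesafe Config is driven by a HOCON file loaded at startup. A minimal sketch of what such a file might look like for a pipeline (all keys and values here are illustrative, not the course's actual config):

```hocon
# application.conf -- read via ConfigFactory.load() (illustrative keys only)
app {
  name = "sales-pipeline"

  spark {
    master = "local[*]"        # typically overridden for cluster deploys
    shuffle-partitions = 8
  }

  postgres {
    host = "localhost"
    port = 5432
    database = "salesdb"
    user = ${?POSTGRES_USER}   # optional environment-variable substitution
  }
}
```

In Scala, `ConfigFactory.load().getString("app.postgres.host")` would read a value; keeping hosts and credentials in a file like this, instead of hard-coding them, is the practice the course teaches.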
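Unit testing Spark code usually means factoring transformation logic into pure functions that can be asserted on without a SparkSession. The course does this with ScalaTest's FlatSpec; the core idea can be shown dependency-free with plain assertions (the function name and data here are made up):

```scala
// A transformation factored out of a Spark job so it is unit-testable
// without a cluster: given (category, amount) pairs, total per category.
def totalsByCategory(rows: Seq[(String, Double)]): Map[String, Double] =
  rows.groupBy(_._1).map { case (cat, rs) => cat -> rs.map(_._2).sum }

// In ScalaTest FlatSpec this would read:
//   "totalsByCategory" should "sum amounts per category" in { ... }
// Here the same checks use plain assert.
val result = totalsByCategory(Seq(("food", 2.0), ("food", 3.0), ("toys", 1.5)))
// result contains food -> 5.0 and toys -> 1.5
assert(result("food") == 5.0)
assert(result("toys") == 1.5)
```

The same function can later be wired into the Spark job via `map`/`reduce` over a Dataset, while the test stays fast and cluster-free.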
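Reading from and writing to Postgres with Spark goes through the JDBC data source; the connection options can be assembled with `java.util.Properties`. The URL, table names, and credentials below are placeholders, and the Spark calls are shown as comments since they need a live SparkSession:

```scala
import java.util.Properties

// JDBC connection details for a hypothetical salesdb database.
val jdbcUrl = "jdbc:postgresql://localhost:5432/salesdb"

val connProps = new Properties()
connProps.setProperty("user", "spark_user")   // placeholder credentials
connProps.setProperty("password", "changeme")
connProps.setProperty("driver", "org.postgresql.Driver")

// With a SparkSession in scope, reading and writing would look like:
//   val df = spark.read.jdbc(jdbcUrl, "public.sales", connProps)
//   df.write.mode("append").jdbc(jdbcUrl, "public.sales_copy", connProps)
```

In practice the URL and credentials would come from the Typesafe Config file rather than being hard-coded as above.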
Prerequisites:
- Basic programming skills
- Basic database knowledge
- Entry-level Big Data and Spark knowledge
Who this course is for:
- Students looking to move from an academic Big Data/Spark background to a real-world developer role
Instructor
We are a group of Solution Architects and Developers with expertise in Java, Python, Scala, Big Data, Machine Learning, and Cloud.
We have years of experience in building Data and Analytics solutions for global clients.
Our primary goal is to simplify learning for our students.
We take a very practical use case based approach in all our courses.