PySpark - Python Spark Hadoop coding framework & testing
What you'll learn
- Industry-standard PySpark coding practices: logging, error handling, reading configuration, and unit testing
- Building a data pipeline using Hive, Spark and PostgreSQL
- PySpark and Hadoop development using PyCharm
Requirements
- Basic programming skills
- Basic database skills
- Entry-level Hadoop knowledge
Description
This course bridges the gap between your academic and real-world knowledge and prepares you for an entry-level Big Data PySpark developer role. You will learn the following:
- PySpark coding best practices
- Logging
- Error handling
- Reading configuration from a properties file
- Doing development work in PyCharm
- Using your local environment as a Hadoop Hive environment
- Reading from and writing to a Postgres database using Spark (see the pipeline sketch after this list)
- The Python unit testing framework (see the test sketch after this list)
- Building a data pipeline using Hadoop, Spark and Postgres
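To give a flavour of how these pieces fit together, here is a minimal sketch of a Hive-to-Postgres job with logging, error handling and a properties file. It is an illustration only, not the course's own code: the file name pipeline.ini, the config sections, the Hive table course_catalog and the JDBC settings are all assumptions.

```python
# pipeline.py - minimal sketch; file, table and config names are placeholders.
import configparser
import logging
import sys

from pyspark.sql import SparkSession

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s - %(message)s",
)
logger = logging.getLogger(__name__)


def load_config(path="pipeline.ini"):
    """Read connection settings from a properties/INI file."""
    config = configparser.ConfigParser()
    config.read(path)
    return config


def run(config):
    spark = (
        SparkSession.builder
        .appName("hive-to-postgres-pipeline")
        .enableHiveSupport()  # lets spark.sql() see local Hive tables
        .getOrCreate()
    )
    # Read from a Hive table (table name is a placeholder).
    df = spark.sql("SELECT * FROM course_catalog")
    logger.info("Read %d rows from Hive", df.count())

    # Write to Postgres over JDBC using values from the config file.
    (df.write
       .format("jdbc")
       .option("url", config.get("postgres", "url"))
       .option("dbtable", config.get("postgres", "table"))
       .option("user", config.get("postgres", "user"))
       .option("password", config.get("postgres", "password"))
       .option("driver", "org.postgresql.Driver")
       .mode("overwrite")
       .save())
    logger.info("Write to Postgres finished")


if __name__ == "__main__":
    try:
        run(load_config())
    except Exception:
        # Error handling: log the full traceback and exit with a non-zero code.
        logger.exception("Pipeline failed")
        sys.exit(1)
```

A script like this would typically be launched with spark-submit, with the Postgres JDBC driver made available on the classpath (for example via --jars).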
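The unit testing topic can be previewed with a similarly minimal unittest sketch that spins up a local SparkSession; the transformation under test (add_full_name) is a hypothetical example, not taken from the course.

```python
# test_transformations.py - minimal unittest sketch with a local SparkSession.
import unittest

from pyspark.sql import SparkSession
from pyspark.sql import functions as F


def add_full_name(df):
    """Example transformation under test: concatenate first and last name."""
    return df.withColumn("full_name", F.concat_ws(" ", "first_name", "last_name"))


class TransformationTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # local[2] keeps the test self-contained; no cluster is needed.
        cls.spark = (
            SparkSession.builder.master("local[2]").appName("unit-tests").getOrCreate()
        )

    @classmethod
    def tearDownClass(cls):
        cls.spark.stop()

    def test_add_full_name(self):
        df = self.spark.createDataFrame(
            [("Ada", "Lovelace")], ["first_name", "last_name"]
        )
        result = add_full_name(df).collect()
        self.assertEqual(result[0]["full_name"], "Ada Lovelace")


if __name__ == "__main__":
    unittest.main()
```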
Prerequisites:
Basic programming skills
Basic database knowledge
Entry-level Hadoop knowledge
Who this course is for:
- Students looking to move from an academic Big Data Spark background to a real-world developer role
Course content
- Preview (01:22)
- What is Big Data Spark? (02:12)
Instructor
We are a group of Solution Architects and Developers with expertise in Java, Python, Scala, Big Data, Machine Learning and Cloud.
We have years of experience in building Data and Analytics solutions for global clients.
Our primary goal is to simplify learning for our students.
We take a very practical, use-case-based approach in all our courses.