Udemy
2022-07-16T03:50:31Z


Spark SQL and Spark 3 using Scala Hands-On with Labs

A comprehensive course on Spark SQL and the Data Frame APIs using Scala, with complimentary lab access
Rating: 4.5 out of 5 (2,464 ratings)
19,372 students
Created by Durga Viswanatha Raju Gadiraju, Asasri Manthena
Last updated 2/2022
English
Italian [Auto]

What you'll learn

  • HDFS commands relevant for validating files and folders in HDFS.
  • Enough Scala to work on Data Engineering projects using Scala as the programming language
  • Spark Data Frame APIs to solve problems in a programmatic, Data Frame style.
  • Basic transformations such as projection, filtering, and total as well as by-key aggregations using Spark Data Frame APIs
  • Inner as well as outer joins using Spark Data Frame APIs
  • Spark SQL to solve problems using SQL-style syntax.
  • Basic transformations such as projection, filtering, and total as well as by-key aggregations using Spark SQL
  • Inner as well as outer joins using Spark SQL
  • Basic DDL to create and manage tables using Spark SQL
  • Basic DML or CRUD Operations using Spark SQL
  • Create and Manage Partitioned Tables using Spark SQL
  • Manipulating Data using Spark SQL Functions
  • Advanced Analytical or Windowing Functions to perform aggregations and ranking using Spark SQL

Requirements

  • Basic programming skills
  • Self-supported lab (instructions provided) or ITVersity lab at additional cost for an appropriate environment.
  • A 64-bit operating system, with the minimum memory depending on the environment you use
  • 4 GB RAM with access to proper clusters, or 16 GB RAM with virtual machines such as Cloudera QuickStart VM

Description

As part of this course, you will learn all the key skills to build Data Engineering Pipelines using Spark SQL and Spark Data Frame APIs, with Scala as the programming language. This course used to be a CCA 175 Spark and Hadoop Developer course for preparing for the certification exam. The exam was retired on 10/31/2021, and we have renamed the course to Spark SQL and Spark 3 using Scala, as it covers industry-relevant topics beyond the scope of the certification.

About Data Engineering

Data Engineering is the processing of data according to downstream needs. As part of Data Engineering, we need to build different kinds of pipelines, such as Batch Pipelines, Streaming Pipelines, etc. All roles related to Data Processing are consolidated under Data Engineering. Conventionally, they are known as ETL Development, Data Warehouse Development, etc. Apache Spark has evolved into a leading technology for Data Engineering at scale.

I have prepared this course for anyone who would like to transition into a Data Engineer role using Spark (Scala). I am a Data Engineering Solution Architect with proven experience in designing solutions using Apache Spark.

Let us go through the details about what you will be learning in this course. Keep in mind that the course is created with a lot of hands-on tasks which will give you enough practice using the right tools. Also, there are tons of tasks and exercises to evaluate yourself.

Setup of Single Node Big Data Cluster

Many of you would like to transition to Big Data from Conventional Technologies such as Mainframes, Oracle PL/SQL, etc., and you might not have access to Big Data Clusters. It is very important for you to set up the environment in the right manner. Don't worry if you do not have a cluster handy; we will guide you through support via Udemy Q&A.

  • Set up an Ubuntu-based AWS Cloud9 Instance with the right configuration

  • Ensure Docker is set up

  • Set up Jupyter Lab and other key components

  • Set up and validate Hadoop, Hive, YARN, and Spark

Are you feeling a bit overwhelmed about setting up the environment? Don't worry! We will provide complimentary lab access for up to 2 months. Here are the details.

  • Training using an interactive environment. You will get 2 weeks of lab access to begin with. If you like the environment and acknowledge it by providing a 5* rating and feedback, the lab access will be extended by an additional 6 weeks (2 months in total). Feel free to send an email to support@itversity.com to get complimentary lab access. Also, if your employer provides a multi-node environment, we will help you set up the material for practice as part of the live session. On top of Q&A support, we also provide the required support via live sessions.

A quick recap of Scala

This course requires a decent knowledge of Scala. To make sure you understand Spark from a Data Engineering perspective, we added a module to quickly warm up with Scala. If you are not familiar with Scala, we suggest you go through relevant courses on Scala as a programming language.

Data Engineering using Spark SQL

Let us deep-dive into Spark SQL to understand how it can be used to build Data Engineering Pipelines. Spark SQL lets us leverage the distributed computing capabilities of Spark through an easy-to-use, developer-friendly SQL-style syntax.

  • Getting Started with Spark SQL

  • Basic Transformations using Spark SQL

  • Managing Spark Metastore Tables - Basic DDL and DML

  • Managing Spark Metastore Tables - DML and Partitioning

  • Overview of Spark SQL Functions

  • Windowing Functions using Spark SQL
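
As a rough illustration of the SQL-style approach, here is a minimal sketch (not taken from the course material) that runs a by-key aggregation and a ranking window function through `spark.sql`. The `orders` view, its columns, the sample rows, and the `SparkSqlSketch` object are all hypothetical, and a local Spark 3 installation on the classpath is assumed.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SparkSqlSketch {
  // Local session for experimentation only; on a real cluster the master
  // is normally supplied by spark-submit rather than hard-coded.
  lazy val spark: SparkSession = SparkSession.builder()
    .appName("spark-sql-sketch")
    .master("local[*]")
    .getOrCreate()

  // Hypothetical sample data registered as a temporary view named "orders".
  def registerOrders(): Unit = {
    import spark.implicits._
    Seq(
      (1, "2022-01-01", "COMPLETE", 100.0),
      (2, "2022-01-01", "PENDING", 50.0),
      (3, "2022-01-02", "COMPLETE", 75.0),
      (4, "2022-01-02", "COMPLETE", 25.0)
    ).toDF("order_id", "order_date", "order_status", "order_amount")
      .createOrReplaceTempView("orders")
  }

  // Basic transformations: filtering followed by aggregation by key.
  def dailyRevenue(): DataFrame = spark.sql(
    """SELECT order_date, sum(order_amount) AS revenue
      |FROM orders
      |WHERE order_status = 'COMPLETE'
      |GROUP BY order_date
      |ORDER BY order_date""".stripMargin)

  // Windowing function: rank orders within each date by amount, highest first.
  def rankedOrders(): DataFrame = spark.sql(
    """SELECT order_id, order_date, order_amount,
      |       rank() OVER (PARTITION BY order_date ORDER BY order_amount DESC) AS rnk
      |FROM orders
      |ORDER BY order_date, rnk""".stripMargin)
}
```

Calling `SparkSqlSketch.registerOrders()` and then `SparkSqlSketch.dailyRevenue().show()` in a Spark shell or notebook should display one revenue row per date. A temporary view is used here only to keep the sketch self-contained; Metastore tables and partitioning involve additional DDL that this sketch does not cover.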

Data Engineering using Spark Data Frame APIs

Spark Data Frame APIs are an alternative way of building Data Engineering applications at scale leveraging distributed computing capabilities of Spark. Data Engineers from application development backgrounds might prefer Data Frame APIs over Spark SQL to build Data Engineering applications.

  • Data Processing Overview using Spark Data Frame APIs leveraging Scala as Programming Language

  • Processing Column Data using Spark Data Frame APIs leveraging Scala as Programming Language

  • Basic Transformations using Spark Data Frame APIs leveraging Scala as Programming Language - Filtering, Aggregations, and Sorting

  • Joining Data Sets using Spark Data Frame APIs leveraging Scala as Programming Language
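
For comparison, the same kind of processing can be expressed with Data Frame APIs. The sketch below is an illustration under assumptions, not the course's own code: the two data sets, their column names, and the `DataFrameSketch` object are hypothetical, and a local Spark 3 installation is assumed. It chains a filter, an inner join, an aggregation by key, and a sort.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, sum}

object DataFrameSketch {
  // Local session for experimentation only.
  lazy val spark: SparkSession = SparkSession.builder()
    .appName("dataframe-sketch")
    .master("local[*]")
    .getOrCreate()

  import spark.implicits._

  // Hypothetical sample data sets.
  def orders: DataFrame =
    Seq((1, "COMPLETE"), (2, "PENDING"), (3, "COMPLETE"))
      .toDF("order_id", "order_status")

  def orderItems: DataFrame =
    Seq((1, 100.0), (1, 50.0), (2, 30.0), (3, 20.0))
      .toDF("order_item_order_id", "order_item_subtotal")

  // Filtering, inner join, aggregation by key, and sorting.
  def revenuePerOrder(): DataFrame =
    orders
      .filter(col("order_status") === "COMPLETE")
      .join(orderItems, col("order_id") === col("order_item_order_id"), "inner")
      .groupBy("order_id")
      .agg(sum("order_item_subtotal").alias("revenue"))
      .orderBy("order_id")
}
```

`DataFrameSketch.revenuePerOrder().show()` prints the revenue for each completed order. Changing the join type string from `"inner"` to `"left_outer"` or `"full_outer"` is one way to experiment with outer joins.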

All the demos are given on our state-of-the-art Big Data cluster. You can avail of one-month complimentary lab access by reaching out to support@itversity.com with a Udemy receipt.

Who this course is for:

  • Any IT aspirant/professional willing to learn Data Engineering using Apache Spark
  • Python Developers who want to learn Spark using Scala as an additional skill on the way to becoming a Data Engineer
  • Java or Scala Developers who want to learn Spark using Scala to add Data Engineering skills to their profile

Featured review

Pavol Vadkerti
Pavol V.
29 courses
13 reviews
Rating: 5.0 out of 5, 3 years ago
Really one of the "from zero to hero" courses. I was typing all the commands like in the video, making notes and had an ITversity lab account and still it took me 4 Months to finish the course (did it in my free time). The instructor was very skilled, easy to understand. I'm looking forward to the next course. Big thumb up!

Instructors

Durga Viswanatha Raju Gadiraju
CEO at ITVersity and CTO at Analytiqs, Inc
  • 4.4 Instructor Rating
  • 10,245 Reviews
  • 243,531 Students
  • 17 Courses

20+ years of experience in executing complex projects using a vast array of technologies including Big Data and the Cloud.

ITVersity, Inc. is a US-based organization that provides quality training for IT professionals. We have a track record of training hundreds of thousands of professionals globally.

Helping people build IT careers by providing the tools they need to upskill and cross-skill, such as high-quality material, labs, and live support, is paramount for our organization.

At this time our training offerings are focused on the following areas:

* Application Development using Python and SQL

* Big Data and Business Intelligence

* Cloud

* Data Warehousing, Databases

Asasri Manthena
  • 4.4 Instructor Rating
  • 10,245 Reviews
  • 125,087 Students
  • 17 Courses

© 2022 Udemy, Inc.