Udemy

PySpark Essentials for Data Scientists (Big Data + Python)

Learn how to wrangle Big Data for Machine Learning using Python in PySpark taught by an industry expert!
Rating: 4.5 out of 5 (627 ratings)
4,257 students
Created by Layla AI
Last updated 5/2022
English
English [Auto]

What you'll learn

  • Use Python with Big Data on a distributed framework (Apache Spark)
  • Work with REAL datasets on realistic consulting projects
  • How to stream LIVE data from Twitter using Spark Structured Streaming
  • Learn how to create a "Pandora-like" app that classifies songs into genres using machine learning
  • Flag suspicious job postings using Natural Language Processing
  • Use machine learning to predict optimal cement strength and the factors that affect it
  • Classify Christmas cooking recipes using Topic Modeling (LDA)
  • Customer Segmentation using Gaussian Mixture Modeling (Clustering)
  • Use cluster analysis to develop a strategy designed to increase college graduation rates for underprivileged populations
  • How to use the k-means clustering algorithm to define a marketing outreach strategy
  • Integrate a UI to monitor your model training and development process with MLflow
  • Theory and application of cutting edge data science algorithms
  • Manipulate, Join and Aggregate Dataframes in Spark with Python
  • Learn how to apply Spark's machine learning techniques on distributed Dataframes
  • Cross Validation & Hyperparameter Tuning
  • Frequent Pattern Mining Techniques
  • Classification & Regression Techniques
  • Data Wrangling for Natural Language Processing
  • How to write SQL Queries in Spark
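
To give a feel for the k-means clustering idea named in the list above, here is a minimal sketch in plain Python. It is not the course's code and it does not use PySpark's MLlib; the function name `kmeans`, the first-k-points seeding and the `customers` data are all hypothetical, chosen only to keep the example small and deterministic.

```python
from math import dist


def kmeans(points, k, iterations=20):
    """Plain-Python k-means (Lloyd's algorithm) on tuples of floats."""
    # Assumption: seed centroids with the first k points so the run is deterministic.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Hypothetical customers: (visits per month, average spend)
customers = [(1.0, 10.0), (9.0, 95.0), (2.0, 12.0), (10.0, 100.0), (1.5, 11.0), (9.5, 90.0)]
centroids, labels = kmeans(customers, k=2)
print(labels)
print(centroids)
```

With this toy data the low-activity and high-activity customers end up in separate clusters, which is the kind of grouping a marketing outreach strategy could build on. On a real cluster the same two steps would be run by a distributed implementation rather than by Python loops.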

Requirements

  • Familiarity with Python is helpful but not required
  • Some background in data science is helpful but not required
  • A hunger to LEARN

Description

This course is for data scientists (or aspiring data scientists) who want to get PRACTICAL training in PySpark (Python for Apache Spark) using REAL WORLD datasets and APPLICABLE coding knowledge that you’ll use every day as a data scientist! By enrolling in this course, you’ll gain access to over 100 lectures, hundreds of example problems and quizzes and over 100,000 lines of code!

I’m going to provide the essentials you need to know to become an expert in PySpark by the end of this course, which I’ve designed based on my EXTENSIVE experience consulting as a data scientist for clients like the IRS, the US Department of Labor and the US Department of Veterans Affairs.

I’ve structured the lectures and coding exercises for real world application, so you can understand how PySpark is actually used on the job. We are also going to dive into custom functions that I wrote MYSELF to get you up and running in the MLlib API fast and make building your first machine learning models a breeze! We will also touch on MLflow, which will help us manage and track our model training and evaluation process in a custom user interface and make you even more competitive on the job market!

Each section will have a concept review lecture as well as code-along activities and structured problem sets for you to work through to help you put what you have learned into action, along with the solutions to each problem in case you get stuck. Additionally, real world consulting projects have been provided in every section with AUTHENTIC datasets to help you think through how to apply each of the concepts we have covered.

Lastly, I’ve written up some condensed review notebooks and handouts of all the course content to make it super easy for you to reference later on. This will be super helpful once you land your first job programming in PySpark!

I can’t wait to see you in the lectures! And I really hope you enjoy the course! I’ll see you in the first lecture!

Who this course is for:

  • Data Scientists interested in learning PySpark
  • PySpark developers looking to strengthen their coding skills
  • Python developers who need to work with big data
  • Data Scientists who want to learn to work with big data

Featured review

Daryl Mitchell
90 courses
20 reviews
Rating: 5.0 out of 5, a year ago
This has been a wonderful class. The PySpark piece has been excellent and allowed me to review my Data Science group's DataBricks notebooks, and clean them up significantly to become nightly "production" runs. The material is complete, concise and very well presented. I am skipping over the ML stuff for now as I don't have a need for this at the moment but look forward to getting back to them as soon as possible. Thanks!

Instructor

Layla AI
Seasoned Data Scientist Consultant & Passionate Instructor
  • 4.5 Instructor Rating
  • 627 Reviews
  • 4,257 Students
  • 1 Course

Layla AI is quickly becoming one of Udemy's leading female instructors in the data science realm. She began her career as a data scientist in 2012 while earning her master's degree in Quantitative Analytics and has been a federal consultant since 2016 for clients like the IRS, Veterans Affairs and Department of Labor.

Her skills are predominantly in predictive modeling, artificial intelligence, natural language processing, topic modeling, trend analysis, frequent pattern mining, machine learning, deep learning and cluster analysis. She began teaching in 2020.

Her primary programming language is Python, but she also has extensive experience with non-object-oriented languages like SAS and SQL.

Most notably, however, she is a passionate teacher who loves to share her knowledge with the world!

© 2022 Udemy, Inc.