Udemy
  •  
  •  
  •  
  •  
  •  
  •  
  •  
  •  
  •  
  •  
  •  
  •  
  •  
Development
Web Development Data Science Mobile Development Programming Languages Game Development Database Design & Development Software Testing Software Engineering Software Development Tools No-Code Development
Business
Entrepreneurship Communication Management Sales Business Strategy Operations Project Management Business Law Business Analytics & Intelligence Human Resources Industry E-Commerce Media Real Estate Other Business
Finance & Accounting
Accounting & Bookkeeping Compliance Cryptocurrency & Blockchain Economics Finance Finance Cert & Exam Prep Financial Modeling & Analysis Investing & Trading Money Management Tools Taxes Other Finance & Accounting
IT & Software
IT Certifications Network & Security Hardware Operating Systems & Servers Other IT & Software
Office Productivity
Microsoft Apple Google SAP Oracle Other Office Productivity
Personal Development
Personal Transformation Personal Productivity Leadership Career Development Parenting & Relationships Happiness Esoteric Practices Religion & Spirituality Personal Brand Building Creativity Influence Self Esteem & Confidence Stress Management Memory & Study Skills Motivation Other Personal Development
Design
Web Design Graphic Design & Illustration Design Tools User Experience Design Game Design 3D & Animation Fashion Design Architectural Design Interior Design Other Design
Marketing
Digital Marketing Search Engine Optimization Social Media Marketing Branding Marketing Fundamentals Marketing Analytics & Automation Public Relations Paid Advertising Video & Mobile Marketing Content Marketing Growth Hacking Affiliate Marketing Product Marketing Other Marketing
Lifestyle
Arts & Crafts Beauty & Makeup Esoteric Practices Food & Beverage Gaming Home Improvement & Gardening Pet Care & Training Travel Other Lifestyle
Photography & Video
Digital Photography Photography Portrait Photography Photography Tools Commercial Photography Video Design Other Photography & Video
Health & Fitness
Fitness General Health Sports Nutrition & Diet Yoga Mental Health Martial Arts & Self Defense Safety & First Aid Dance Meditation Other Health & Fitness
Music
Instruments Music Production Music Fundamentals Vocal Music Techniques Music Software Other Music
Teaching & Academics
Engineering Humanities Math Science Online Education Social Science Language Learning Teacher Training Test Prep Other Teaching & Academics
Web Development JavaScript React Angular CSS Node.Js PHP HTML5 Vue JS
AWS Certification Microsoft Certification AWS Certified Solutions Architect - Associate AWS Certified Cloud Practitioner CompTIA A+ Amazon AWS Cisco CCNA CompTIA Security+ Microsoft AZ-900
Microsoft Power BI SQL Tableau Data Modeling Business Analysis Business Intelligence MySQL Qlik Sense Data Analysis
Unity Unreal Engine Game Development Fundamentals C# 3D Game Development C++ Unreal Engine Blueprints 2D Game Development Mobile Game Development
Google Flutter iOS Development Android Development Swift React Native Dart (programming language) Kotlin Mobile App Development SwiftUI
Graphic Design Photoshop Adobe Illustrator Drawing Digital Painting Canva InDesign Character Design Procreate Digital Illustration App
Life Coach Training Personal Development Neuro-Linguistic Programming Personal Transformation Life Purpose Mindfulness Sound Therapy CBT Cognitive Behavioral Therapy Coaching
Business Fundamentals Entrepreneurship Fundamentals Freelancing Business Strategy Startup Business Plan Online Business Blogging Leadership
Digital Marketing Social Media Marketing Marketing Strategy Internet Marketing Google Analytics Copywriting Email Marketing Startup YouTube Marketing

Teaching & AcademicsOther Teaching & AcademicsData Cleaning

Data Cleaning in Python

Preprocessing, structuring and normalizing data
Rating: 4.4 out of 54.4 (76 ratings)
1,972 students
Created by Taimoor khan
Last updated 7/2020
English
English [Auto]

What you'll learn

  • Data cleaning or cleansing as a preprocessing step towards making the data more consistent and high quality before training predictive models.

Requirements

  • Basics of Python

Description

Data cleaning or Data cleansing is very important from the perspective of building intelligent automated systems. Data cleansing is a preprocessing step that improves the data validity, accuracy, completeness, consistency and uniformity. It is essential for building reliable machine learning models that can produce good results. Otherwise, no matter how good the model is, its results cannot be trusted. Beginners with machine learning starts working with the publicly available datasets that are thoroughly analyzed with such issues and are therefore, ready to be used for training models and getting good results. But it is far from how the data is, in real world. Common problems with the data may include missing values, noise values or univariate outliers, multivariate outliers, data duplication, improving the quality of data through standardizing and normalizing it, dealing with categorical features. The datasets that are in raw form and have all such issues cannot be benefited from, without knowing the data cleaning and preprocessing steps. The data directly acquired from multiple online sources, for building useful application, are even more exposed to such problems. Therefore, learning the data cleansing skills help users make useful analysis with their business data. Otherwise, the term 'garbage in garbage out' refers to the fact that without sorting out the issues in the data, no matter how efficient the model is, the results would be unreliable. 

In this course, we discuss the common problems with data, coming from different sources. We also discuss and implement how to resolve these issues handsomely. Each concept has three components that are theoretical explanation, mathematical evaluation and code. The lectures *.1.* refers to the theory and mathematical evaluation of a concept while the lectures *.2.* refers to the practical code of each concept.  In *.1.*, the first (*) refers to the Section number, while the second (*) refers to the lecture number within a section. All the codes are written in Python using Jupyter Notebook.

Who this course is for:

  • The target students are beginners to data science and machine learning.

Instructor

Taimoor khan
Asst. Professor
Taimoor khan
  • 4.1 Instructor Rating
  • 226 Reviews
  • 2,638 Students
  • 3 Courses

I am a researcher and an academician since 2011, and have a background of professional software development for around 3 years. As an Assistant Professor in Computer Science faculty I have taught various courses to undergraduate and graduate students. I am particularly interested in courses related to software design and development, databases, artificial intelligence, machine learning and data mining etc.


My PhD research is related to data science and computational linguistics, having worked with large-scale textual data for building knowledge-based systems that are adaptive and evolve with the growing needs without having to explicitly trained for a specific scenario. I have published papers in internationally recognized journals and conferences where we proposed solutions to real-world data analysis issues. I have supervised tens of projects that offered software based solutions for social content analytics, recommendations and tracking evolving public interests.

Top companies choose Udemy Business to build in-demand career skills.
NasdaqVolkswagenBoxNetAppEventbrite
  • Udemy Business
  • Teach on Udemy
  • Get the app
  • About us
  • Contact us
  • Careers
  • Blog
  • Help and Support
  • Affiliate
  • Investors
  • Impressum Kontakt
  • Terms
  • Privacy policy
  • Cookie settings
  • Sitemap
  • Accessibility statement
Udemy
© 2022 Udemy, Inc.