Udemy
Data Cleaning and Visualization in Python
1 student

Imputation techniques | Outlier analysis | Data transformation | Data visualization
Created by Sairam V A
Last updated 2/2025
English

What you'll learn

  • Understand the various issues that can be present in real-world data
  • Understand imputation techniques and outlier analysis
  • Understand skewness and the data transformation techniques used to correct it
  • Understand univariate, bivariate and multivariate feature visualization techniques
  • Implement the above-mentioned concepts on a real-world dataset using Python

Course content

5 sections • 10 lectures • 3h 5m total length
  • Introduction (3:35)

    This lecture introduces Exploratory Data Analysis (EDA) and outlines the sections covered in this course.

Requirements

  • This course is for beginners who don't have much experience in data cleaning and analytics.
  • Only minimal prior knowledge is needed. A basic grasp of Python programming (variables, loops, conditional statements) is enough to follow the course.
  • It is important to understand the theoretical aspects of the concepts, which is why this course leans more towards theory.

Description

This course provides a comprehensive understanding of Exploratory Data Analysis (EDA), a crucial step in the machine learning lifecycle. EDA helps in diagnosing issues within datasets and applying appropriate techniques to improve data quality.

The first phase of the course focuses on data cleaning, covering essential techniques such as handling missing values (imputation), data transformation, and outlier detection. Understanding these processes ensures the dataset is refined and structured for better model performance. Various imputation methods, including statistical, neighbor-based, and predictive filling, are discussed along with transformations like log, square root, and Box-Cox. Outlier detection techniques such as Z-score, IQR, and Mahalanobis distance are also explored.
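As a rough illustration of the cleaning steps named above (not taken from the course material), the sketch below applies median imputation, a log transformation, and the IQR outlier rule with pandas and numpy. The `income` column and its values are hypothetical toy data, and the 1.5 × IQR cutoff is the conventional default rather than something the course prescribes.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one missing value and one obvious outlier.
df = pd.DataFrame({"income": [32.0, 35.0, np.nan, 38.0, 41.0, 36.0, 400.0]})

# Statistical imputation: fill the gap with the column median,
# which is less sensitive to the outlier than the mean would be.
median_income = df["income"].median()
df["income_filled"] = df["income"].fillna(median_income)

# Log transformation to compress the long right tail.
# log1p computes log(1 + x), so zero values would not cause errors.
df["income_log"] = np.log1p(df["income_filled"])

# IQR rule: flag values more than 1.5 * IQR outside the quartiles.
q1 = df["income_filled"].quantile(0.25)
q3 = df["income_filled"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
df["is_outlier"] = (df["income_filled"] < lower) | (df["income_filled"] > upper)

print(df)
```

On this toy data only the value 400 falls outside the IQR fences. A Z-score check with the usual threshold of 3 would miss it, because with so few rows the outlier itself inflates the standard deviation; that is one reason to compare several detection methods before removing points.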

The second phase delves into data visualization, covering univariate, bivariate, and multivariate analysis. It provides an extensive discussion on various plots, including histograms, box plots, scatter plots, heatmaps, and more, ensuring clarity in data interpretation.
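The three levels of analysis can be sketched with one plot each, shown here as an assumption-laden example rather than the course's own code: a histogram (univariate), a scatter plot (bivariate), and a correlation heatmap (multivariate). The variables `x`, `y`, `z` are synthetic, the output file name `eda_plots.png` is hypothetical, and plain matplotlib is used for the heatmap, although seaborn's `heatmap` would serve equally well.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data: y depends on x, z is independent noise.
rng = np.random.default_rng(0)
x = rng.normal(50, 10, 200)
y = 2 * x + rng.normal(0, 15, 200)
z = rng.normal(0, 1, 200)

fig, axes = plt.subplots(1, 3, figsize=(12, 4))

# Univariate: distribution of a single feature.
axes[0].hist(x, bins=20)
axes[0].set_title("Univariate: histogram of x")

# Bivariate: relationship between two features.
axes[1].scatter(x, y, s=10)
axes[1].set_title("Bivariate: x vs y")

# Multivariate: pairwise correlations shown as a heatmap.
corr = np.corrcoef([x, y, z])
im = axes[2].imshow(corr, vmin=-1, vmax=1, cmap="coolwarm")
axes[2].set_xticks(range(3))
axes[2].set_xticklabels(["x", "y", "z"])
axes[2].set_yticks(range(3))
axes[2].set_yticklabels(["x", "y", "z"])
axes[2].set_title("Multivariate: correlation heatmap")
fig.colorbar(im, ax=axes[2])

fig.tight_layout()
fig.savefig("eda_plots.png")  # hypothetical output file name
```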

The course concludes with real-world case studies, demonstrating how EDA helps derive meaningful insights. All implementations are carried out in Python, using libraries such as pandas, numpy, seaborn, and matplotlib. By the end of this course, participants will have hands-on experience performing EDA effectively on their own datasets and will be able to use these techniques to improve data quality for better results in machine learning analysis.


This course prioritizes the theoretical side of each concept, since a solid understanding of the theory is expected in industry. It gives an in-depth view of the practical issues found in data and how to resolve them, followed by a range of visualization techniques. This knowledge is useful for working on real-world datasets and for developing Python programs for effective and insightful analysis. Furthermore, mastering the EDA process can help boost the performance of machine learning algorithms, which is valuable for a career as a data analyst, data scientist, or machine learning engineer.

Who this course is for:

  • Beginner engineering aspirants who want to learn data science, machine learning and deep learning.
  • Learners who want to understand and apply the fundamental steps that can boost the performance of machine learning models.
  • Engineering students from various backgrounds who can apply these concepts in their own domain.
  • AI and data science aspirants who are looking for a single course on data cleaning, analysis and visualization using Python.