
This video gives you an overview of the course.
Before moving on to the coding part of the course, we must lay the foundation of descriptive statistics which will be used heavily throughout the course.
• Explore the various statistical measures such as mean, median, and mode
• Understand the various properties of these measures
• Learn how to calculate these statistical measures
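As a minimal sketch of the three measures named above, the following uses Python's standard `statistics` module. The sample values are hypothetical and chosen only for illustration; the course itself may compute these differently (for example with Pandas).

```python
import statistics

# Hypothetical sample values, chosen only for illustration.
values = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(values)      # arithmetic average of the values
median = statistics.median(values)  # middle of the sorted data (average of the two middle values here)
mode = statistics.mode(values)      # most frequently occurring value

print(mean, median, mode)
```

Note that the mean is pulled toward the large value 10, while the median and mode are not; this is one of the properties the section discusses.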
Once we have learned how to calculate these statistical measures, we move on to visualizing them in the form of graphs for better understanding.
• Explore the various graphs through which we can visualize the statistical measures
• Understand how the visualization changes as the values of these measures change
• Explore alternate graphs for visualizations
We must understand the importance of variance in data and how it ties in with the measures of central tendency.
• Explore the concept of variance
• Visualize variance in data
• Understand how it depends on other statistical measures
Percentiles allow us to interpret data in a more readable format. We will explore how they are calculated and what information they give regarding the dataset.
• Understand what are iterators and the iterator protocol
• Implement iterators in Python
• Implement generators in Python using the yield keyword
Once we are done with percentiles and how they can be calculated, we move on to the concept of Quartiles and how to visualize them using box plots.
• Understand the concept of Quartiles
• Visualize percentiles and Quartiles using box plots
• Get a better understanding of box plots
Most of the real-world datasets contain missing values due to various reasons. In this video, we find out how we can know whether we have missing values in our dataset using Pandas library in Python.
• Explore the various reasons for the missing values in datasets
• Understand the various Pandas functions that can be used to find the missing values
• Learn about the different types of missing values and how Pandas does type conversion for them
Once we have learned how to find missing values in the dataset, we move on to discussing the different ways to deal with missing values.
• First, we discuss why simply ignoring rows with missing values might not work
• Understand how we can impute missing values with measures of central tendency
• Demonstrate via an example how we can fill missing values based on other columns
Now, we move on to using Pandas library to deal with missing data.
• Explore the df.dropna function and its various parameters
• Explore the various ways of filling missing values via df.fillna, df.ffill, and df.bfill
• Implement an example in which we fill missing values based on values in other columns
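The first two items above can be sketched as follows. This is only an illustration on a hypothetical data frame (the column names `age` and `city` are assumptions, not taken from the course material), and it does not cover filling values based on other columns.

```python
import numpy as np
import pandas as pd

# Hypothetical data frame with gaps; column names are illustrative only.
df = pd.DataFrame({
    "age": [22.0, np.nan, 30.0, np.nan],
    "city": ["A", "B", None, "D"],
})

missing_per_column = df.isna().sum()              # number of missing values per column
dropped = df.dropna()                             # keep only rows with no missing values
filled_mean = df["age"].fillna(df["age"].mean())  # impute with the column mean
forward = df["age"].ffill()                       # carry the last seen value forward
backward = df["age"].bfill()                      # pull the next seen value backward

print(missing_per_column)
print(dropped)
```

Backward fill leaves the final row empty in this example because no later value exists to copy from, which is worth checking for whenever `ffill` or `bfill` is used.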
We need to apply the concepts that we have learnt in this section over the real-world Titanic Dataset.
• Load the Titanic Dataset and explore the various columns
• Find out the descriptive statistics of the dataset
• Impute missing values in the dataset
Sometimes we might encounter values in our dataset which are abnormally high, low, or simply weird as compared to other values in the dataset. We must understand what outliers are and what causes them to occur.
• Understand what outliers are
• Understand the causes of outliers
• Explore via examples, the different types of outliers
Z-scores are one of the most commonly used methods to identify outliers. In this video, we understand the idea behind Z-scores and how they can be used to identify outliers.
• Discuss what Z-scores are and what they signify
• Visualize Z-scores over a normal distribution for more clarity
• Implement Z-scores to find outliers in a dummy dataset
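A minimal sketch of Z-score based outlier detection on a dummy dataset is shown below. The data, the helper name `zscore_outliers`, and the cutoff of 2.5 are all assumptions made for illustration; cutoffs between 2 and 3 are common, and the video may use a different one.

```python
import statistics

def zscore_outliers(values, threshold=2.5):
    """Return the values whose absolute Z-score exceeds the threshold (hypothetical helper)."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, so nothing can be an outlier
    return [v for v in values if abs((v - mean) / std) > threshold]

# Dummy data with one obviously unusual value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 11, 50]
outliers = zscore_outliers(data)
print(outliers)
```

In this small sample the value 50 has a Z-score just under 3, because the outlier itself inflates the mean and standard deviation. That is why a lower cutoff is used here.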
Z-scores are not always reliable, since the mean and standard deviation they use to detect outliers are themselves affected by those outliers. In this video, we use a modified version of the Z-score which is based on the median.
• Understand why Z-score might fail in some cases
• Understand the idea of Median, Standard Deviation, and Modified Z-scores
• Implement an example in which we find outliers using Modified Z-scores
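One common formulation of the modified Z-score replaces the mean with the median and the standard deviation with the median absolute deviation (MAD). The sketch below assumes that formulation, with the conventional constant 0.6745 and cutoff 3.5; the video may define it differently. The helper name and the data are hypothetical.

```python
import statistics

def modified_zscore_outliers(values, threshold=3.5):
    """Flag values using median and MAD instead of mean and standard deviation."""
    median = statistics.median(values)
    # Median absolute deviation: the median distance from the median.
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return []  # score is undefined when more than half the values are identical
    return [v for v in values if abs(0.6745 * (v - median) / mad) > threshold]

# Dummy data with one obviously unusual value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 11, 50]
outliers = modified_zscore_outliers(data)
print(outliers)
```

Because the median and MAD barely move when a single extreme value is added, the unusual value stands out far more clearly than it does with an ordinary Z-score.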
Finally, we also learn how to use Interquartile Range (IQR) to detect outliers in a dataset and visualize them via box plots.
• Explore the concept of IQR and how it can be used to identify outliers
• Visualize IQR and outliers over a box plot
• Implement an example using IQR and box plots to detect outliers
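The IQR rule can be sketched with the standard library alone. The data and helper name are hypothetical, the 1.5 multiplier is the usual box-plot convention, and the `inclusive` quantile method is an assumption; other quantile methods can give slightly different quartiles.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the first or third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr  # the whisker limits of a box plot
    return [v for v in values if v < lower or v > upper]

# Dummy data with one obviously unusual value.
data = [1, 2, 3, 4, 5, 6, 7, 8, 100]
outliers = iqr_outliers(data)
print(outliers)
```

These are the same points a box plot would draw individually beyond its whiskers.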
Before moving on to analyzing the various types of variables in a dataset, we must understand the different variables that might occur in a dataset.
• Understand what the different types of variables are
• Explore the different types of numeric variables
• Explore the different types of categorical variables
Now that we have understood the different types of variables, let’s take a look at the different ways of analyzing variables using Python.
• Create dummy data for our analysis
• Implement code for plotting different types of graphs in Python
• Explore the different graphs and libraries available in Python
After learning about the various graphs that we can use to explore columns in Python, we must understand the concepts of Skewness and Kurtosis in Statistics and how they affect the shape of a distribution.
• Understand what Skewness is
• Understand the idea behind Kurtosis
• Explore how Skewness and Kurtosis affect the shape of the curve
Finally, we will apply the different techniques that we have learned for Univariate Analysis over the Olympics Dataset.
• Explore the different columns in Olympics Dataset
• Draw density plots, histograms, and so on over various columns
• Find Skewness of the data using SciPy module in Python
Now that we have explored univariate analysis, we move ahead to bivariate analysis where we explore two variables at the same time.
• Understand what bivariate analysis is
• Understand how bivariate analysis helps us understand our data better
• List out various graphs used for bivariate analysis
Before moving on to doing practical bivariate analysis, we must understand the theoretical concept behind correlation coefficients.
• Explore the concept of correlation coefficient
• Understand the different types of correlation coefficient
• Understand what correlation coefficient signifies for our data
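As an illustration of what a correlation coefficient measures, here is a small sketch of the Pearson coefficient computed by hand. Pearson is only one of the types the section refers to, and the helper name and data are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance scaled by the spread of both variables."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(spread_x * spread_y)

x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # increases with x
falling = [10, 8, 6, 4, 2]  # decreases as x increases

r_positive = pearson(x, rising)
r_negative = pearson(x, falling)
print(r_positive, r_negative)
```

A value near +1 means the two variables rise together, a value near -1 means one falls as the other rises, and a value near 0 means there is no linear relationship.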
After understanding the theoretical concepts behind correlation coefficients, we now move on to visualizing correlation between two sets of variables.
• Implement code for positive and negative correlation
• Use seaborn library to visualize scatterplots
• Use heatmaps to visualize correlation between multiple pairs of columns at once
In this video, we will apply various techniques of bivariate analysis over the Titanic Dataset.
• Load the Titanic Dataset
• Implement bivariate graphs using Seaborn
• Identify trends if they exist in the data
In this video, we will apply various techniques of bivariate analysis over the video game sales dataset.
• Load the video game sales dataset and understand the various columns
• Implement interactive graphs using Bokeh library in Python
• Identify trends if they exist in the data using bivariate graphs
Now that we have explored univariate and bivariate analysis, we move ahead to multivariate analysis where we explore more than two variables at the same time.
• Understand what multivariate analysis is
• Understand the various advantages of multivariate analysis
• Visualize a graph depicting multivariate analysis
In this video, we will apply various techniques of multivariate analysis over the Titanic Dataset.
• Load the Titanic Dataset and find descriptive statistics of the various variables
• Implement multivariate graphs using Seaborn
• Identify trends if they exist in the data
In this video, we will apply various techniques of multivariate analysis over the Pokemon Dataset.
• Load the Pokemon Dataset and find descriptive statistics of the various variables
• Implement interactive graphs using Bokeh
• Identify trends if they exist in the data using multivariate graphs
Simpson’s Paradox is a phenomenon that may occur in real-world data, leading to conflicting results. We understand why it happens and what we can do to prevent it.
• Understand what Simpson’s Paradox is
• Understand what causes it and how we can prevent it from happening
• Demonstrate Simpson’s Paradox using an example
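The paradox can be demonstrated with a few lines of arithmetic. The counts below are illustrative, in the style of the often-quoted two-treatment example, and are not taken from the course. Each entry is (successes, total).

```python
# Illustrative counts only: (successes, total) per treatment and subgroup.
results = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, total):
    return successes / total

def overall_rate(groups):
    """Success rate after pooling all subgroups together."""
    successes = sum(s for s, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return successes / total

# Within each subgroup, treatment A has the higher success rate...
a_wins_small = rate(*results["A"]["small"]) > rate(*results["B"]["small"])
a_wins_large = rate(*results["A"]["large"]) > rate(*results["B"]["large"])
# ...yet after pooling, treatment B looks better.
b_wins_overall = overall_rate(results["B"]) > overall_rate(results["A"])

print(a_wins_small, a_wins_large, b_wins_overall)
```

The reversal arises because the subgroups are of very different sizes for the two treatments, so the pooled rate is dominated by a different subgroup in each case.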
The relationship between correlation and causation is one of the most widely misinterpreted phenomena in the real world. We understand why the confusion happens and what we can do to prevent it.
• Understand why Correlation does not necessarily imply causation
• Understand what causes it and how we can prevent it from happening
• Demonstrate that correlation does not imply causation using various examples
In this video, we will apply all the different techniques that we have learned in the previous sections to a real-world dataset.
• Download and load the dataset
• Explore the different variables in the dataset
• Create a set of questions that we will answer through our analysis
Here we will do Exploratory Data Analysis over Red Wine Data.
• Download and load the dataset
• Explore the different variables in the dataset
• Identify trends if they exist in the data
In this video, we will do Exploratory Data Analysis over White Wine Data.
• Download and load the dataset
• Explore the different variables in the dataset
• Identify trends if they exist in the data
Here, we will do a comparative analysis about how these wines are different from each other.
• Download and load the dataset
• Explore the different variables in the dataset based on the type of wines
• Identify trends if they exist in the data
This video explains the course prerequisites and provides an entire overview of the course.
Which Python distribution to use in this course?
Install Anaconda Navigator and verify the installation
Choose an IDE (Spyder)
Most data comes in CSV form. We will look at how we can use Python to import it and extract information from it.
Import and parse CSV file using CSV module
Import and parse using Pandas module
In industry, data is mostly exposed in web services and JSON is used to represent the data. So we will parse data out of JSON in this video.
Analyze the JSON file by opening it
Use JSON module in Python to parse the data out of JSON
Much data is available on the public web, embedded in HTML markup, so a need arises to extract it. We will look at the basics of web parsing in this video.
Explore the modules used for web scraping
Scrape the HTML markup of a Wikipedia page and get the basic information out of it
In this video, we will look at a practical demo that extracts the HTML markup of a table tag and then stores that information in structured form.
Look into the correct table tag which we want to extract into our program
Hands-On approach in Python to get the relevant information out of the table tag
Store the information in the form of a table
Sometimes organizations/companies find it convenient to store relevant information in the sheets of an Excel file. We will look into the xlrd module to extract information from an Excel file.
Analyze the dataset by manually opening the file
Import the dataset into Python using the xlrd module and then print the sheet names and the number of sheets
Print the rows of a sheet on the console
People prefer a small portion of code to get big things done so we will use Pandas module in this video to do the same.
Import the Pandas module
Import the different sheets of Excel files into the Pandas DataFrames and extract some basic information
Sometimes a need arises to extract information from PDF files and then process it. We are going to look at how we can do that in this video.
Extract information out of a PDF file and then store each page of PDF file in a separate index of a list
Print each page of PDF files
Sometimes we need to write data to a PDF file, so in this video we will look at how to write to a PDF file.
Construct a sample resume in the code example
Add text and images to a PDF file at the proper positions
Database administrators choose their databases based on the characteristics of databases. We will just look into the basics.
Understand when to choose relational database and when to choose the non-relational database
Look at the links to the software that needs to be installed for this section
We will store the JSON file in SQLite, a lightweight database, and look at a code example that accomplishes this.
Create the table containing fields from the JSON file in SQLite
Dump the JSON file by parsing it into the SQLite databases
Verify the dump using DB Browser for SQLite
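The steps above can be sketched with the standard `json` and `sqlite3` modules. The records, the table name `people`, and its columns are hypothetical, and an in-memory database stands in for a file on disk so that the example is self-contained.

```python
import json
import sqlite3

# Hypothetical JSON content; in practice this would be read from a file.
raw = '[{"name": "Ada", "age": 36}, {"name": "Linus", "age": 28}]'
records = json.loads(raw)  # list of dictionaries, one per record

# An in-memory database keeps the sketch self-contained; pass a file path to persist it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")

# Named placeholders pull the matching keys out of each dictionary.
conn.executemany("INSERT INTO people (name, age) VALUES (:name, :age)", records)
conn.commit()

rows = conn.execute("SELECT name, age FROM people ORDER BY age").fetchall()
print(rows)
```

With a file-backed database, the same table could then be inspected in a tool such as DB Browser for SQLite.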
In industry, people often prefer non-relational databases over relational databases because of the complexity of schemas. We will dump information into MongoDB (a popular document-oriented database).
Make MongoDB up and running
Write the code to dump the CSV file into MongoDB
Verify the dump using Robo3T (Robomongo)
In this video, we are going to use Elasticsearch with Kibana (to display information from Elasticsearch) to store the JSON file in Elasticsearch.
Explore Elasticsearch and Kibana
Import the CSV file and then convert each row to a format which can be dumped to Elasticsearch
Verify the dump using Kibana
Often people are interested in the pros and cons of the databases, so in this video we are going to look into that in detail.
Understand advantages and disadvantages of relational and non-relational databases
Data cleansing holds an important part in Data Science. We will look into why it is important and some common tips and methods to do it.
Explore importance of data cleansing in Data Science
Learn about data cleansing tips and techniques
We are going to look at data frames: what they are and how they display structured information in a readable format.
Read datasets and display the column names and the number of rows and columns in a data frame
Change the data type of columns and retrieve certain rows from the data frame
Sometimes there is a need to give proper names to columns coming in the datasets, adding more and removing the irrelevant ones so this video will show you how to perform that.
Edit the column names
Delete the irrelevant columns
Add more columns into the data frame
Duplicate rows, identified by the values in a column, can be redundant when performing Data Science operations, so it is good to drop them.
Drop all the duplicate rows
Drop all but keep the first duplicate row
Drop all but keep the last duplicate row
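The three options above correspond to the `keep` argument of Pandas' `drop_duplicates`. The sketch below uses a hypothetical data frame in which duplicates are judged on the `name` column only.

```python
import pandas as pd

# Hypothetical data; "a" appears three times in the name column.
df = pd.DataFrame({
    "name": ["a", "b", "a", "c", "a"],
    "score": [1, 2, 3, 4, 5],
})

no_duplicates = df.drop_duplicates(subset="name", keep=False)  # drop every duplicated row
keep_first = df.drop_duplicates(subset="name", keep="first")   # keep the first occurrence
keep_last = df.drop_duplicates(subset="name", keep="last")     # keep the last occurrence

print(no_duplicates)
print(keep_first)
print(keep_last)
```

Leaving out `subset` compares whole rows instead of a single column.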
Sometimes we need to extract just the required columns and rows from a data frame. In this video, we are going to look at the lines of code which can be used to do so.
Read in the dataset and then retrieve the first and last rows and columns
Retrieve the first five columns and first five rows
Retrieve certain rows and columns
Data distributed across different sheets can be concatenated or merged/joined, depending on the use case. In this video, we are going to solve this problem, which occurs a lot in industry.
Look at the syntax of how we can create a data frame from a dictionary
Concatenate data frame in Python
Perform join/merge operation between two data frames on a column
Real-world datasets contain many missing values in their columns. In this video, we will look at how we can solve this problem.
Understand how missing values appear in a dataset and learn how to deal with them
Introduce some missing values in a data frame
Drop the rows/columns which contain missing values, or fill them with values such as the mean of the column in which they are present
Analyzing a dataset sometimes requires rows to be sorted. We also need to filter rows based on various conditions. In this video, we will look at how to do both.
Edit certain columns for the sake of computation
Sort the data frame on a column and look into the syntax of how we can do that
Filter rows on certain conditions and look into the syntax of how we can do that
Computers understand numbers. Sometimes machine learning algorithms require columns to be in equivalent numeric form so we will look in this video how we can do that.
Drop the rows which contain missing values
Use LabelEncoder from the pre-processing module to encode the gender column values
Add the encoded gender column back into the data frame
We will look into another technique of mapping more than two unique values in a column.
Drop the rows having the missing values and get the unique values in a column
Construct a dictionary mapping those unique values to different values
Apply the new encoding onto the column of data frame and look at the changes
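The three steps above might look like the following sketch. The column name `size`, its values, and the chosen codes are all hypothetical.

```python
import pandas as pd

# Hypothetical column with three categories and one missing value.
df = pd.DataFrame({"size": ["small", "large", None, "medium", "small"]})

# Step 1: drop rows with missing values and inspect the unique categories.
df = df.dropna(subset=["size"]).reset_index(drop=True)
unique_values = df["size"].unique()

# Step 2: build a dictionary from each category to the number that should represent it.
mapping = {"small": 0, "medium": 1, "large": 2}

# Step 3: apply the mapping to create the encoded column.
df["size_encoded"] = df["size"].map(mapping)
print(df)
```

Unlike an automatic encoder, a hand-written dictionary lets the numbers reflect a meaningful order among the categories. Any category missing from the dictionary would become NaN, so it is worth comparing the dictionary keys against the unique values first.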
Rescaling maps the numeric values in a column to the range 0 to 1, which can help machine learning algorithms converge faster. Standardization maps column values so that they have a mean of zero and a standard deviation of one. This helps to compare features measured on different scales.
Rescale a column using the MinMaxScaler of the pre-processing module and then look into the results
Standardize a column using the StandardScaler class of the pre-processing module and then look into the results
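To show what the two transformations compute, the sketch below applies the formulas directly with NumPy rather than using the scikit-learn classes named above. The column values are hypothetical.

```python
import numpy as np

# Hypothetical numeric column.
x = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

# Rescaling (min-max): subtract the minimum, divide by the range, giving values in [0, 1].
rescaled = (x - x.min()) / (x.max() - x.min())

# Standardization: subtract the mean, divide by the (population) standard deviation.
standardized = (x - x.mean()) / x.std()

print(rescaled)
print(standardized.mean(), standardized.std())
```

Both formulas divide by a measure of spread, so a column with a single repeated value would need special handling.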
We will be looking into the common cleaning operations good to have in the toolbox while playing with data frames.
Drop rows having missing values in them and then reset the index of the data frame
Lower case the column names and then apply a strip function on a column to remove spaces from the values at the beginning and end of the strings
Apply a function to each value of a column
Sometimes we need to store the data frames after doing processing back into the CSV/JSON files.
Drop rows having missing values in them and then reset the index of the data frame
Delete a column from the data frame
Dump the data frame into the CSV file and a JSON file
We will look into the common uses of Python modules in Data Science (Pandas, NumPy, SciPy, and Matplotlib).
Learn about usage of Pandas module
We will look into the types of Dataframe columns people come across in industry: numeric columns, which contain numbers, and their types, and categorical variables and their types.
Understand numerical data
Sometimes a need arises to group similar rows to perform operations on them and get stats out of them.
Perform group by to compute the number of elements in all the groups
Perform group by to compute the average water consumption in every group of animal
Describing features of a column comes in descriptive statistics.
Compute the mean, median and mode of values in a column
Compute sum, standard deviation and range of values in a column
We will look into advanced statistical techniques of computing stats.
Compute geometric mean, harmonic mean and trimmed mean
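A minimal sketch of the three means using the standard library follows. The `trimmed_mean` helper is hypothetical and written here for illustration (SciPy offers comparable functions in `scipy.stats`), and the data is made up.

```python
import statistics

def trimmed_mean(values, proportion):
    """Mean after discarding the given proportion of values from each end."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of values to cut from each end
    return statistics.mean(ordered[k:len(ordered) - k])

growth_factors = [1, 4, 16]
speeds = [40, 60]
with_outlier = [1, 2, 3, 4, 100]

geo = statistics.geometric_mean(growth_factors)  # suited to ratios and growth rates
harm = statistics.harmonic_mean(speeds)          # suited to averaging rates such as speeds
trim = trimmed_mean(with_outlier, 0.2)           # drops 1 and 100 before averaging

print(geo, harm, trim)
```

The trimmed mean of the last list is 3, whereas the ordinary mean would be 22 because of the single extreme value.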
We are going to look into visualizations: why they are important and what the different types are.
Understand visualization rules in detail
We will see an amazing site showing cool visualizations regarding the happenings in the World over the last 60 years.
Understand visualization with the help of World population dataset
We will look into how we can plot the relationship between variables (scatter plot), look into line plots and the histograms.
Explore the iris dataset and extract the relevant columns out of them
Plot a scatter plot between two columns, plot a line plot of one variable in the data frame
Plot a histogram to understand its concept in a better way
We will look into box plots and pie charts and how they can be used to visualize data.
Plot a box plot of a column in a data frame, which summarizes many things about that column
Make a pie chart for visualizing the utilization of hours of a person in a day
Sometimes it is good to use online tools that make visualization easy for a non-technical person. We will look into the RAWGraphs site for making visualizations.
Explore RAWGraphs
This video gives you an overview of the course.
Dask is a Python library for parallel and distributed computing. Before moving ahead, we must understand the idea behind Dask and the various use cases in which it can be used.
Understand what Dask is
Overview of the features of Dask
Use cases for Dask
Now that we have a basic idea of what Dask is, we move to discussing the various features it has to offer for parallel/distributed processing.
See how Dask helps in parallelizing code
Understand how Dask helps in scaling out your code
Understand the various data-structures, algorithms, schedulers, etc., that Dask can offer
We also need to cover the limitations of Dask, to get a better idea of the assumptions to be made while writing code for Dask.
Explore the limitations of Dask for parallelizing code
Study the limitations of Dask for running code over a cluster of nodes
Look at the limitations of Dask for task scheduling
Now that we have covered the features and limitations of Dask, we now move on to setting up Dask on our system.
Install Dask from Conda/PIP
Install Dask from source (GitHub)
Install Dask over Google Colab notebook
Now that we have a fair idea of what Dask is, we move on to discussing Blocked Algorithms and Dask Arrays.
Understand what Dask arrays are
Overview of Blocked Algorithms
Look at the example of Blocked Algorithm
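The idea of a blocked algorithm can be illustrated without Dask: split a large array into blocks, compute a partial result per block, then combine the partial results. The sketch below does this sequentially with NumPy; the block size is an arbitrary assumption, and Dask's contribution would be to schedule the per-block work in parallel.

```python
import numpy as np

x = np.arange(1_000_000, dtype=np.int64)
block_size = 100_000  # arbitrary choice for illustration

# Step 1: compute one partial sum per block.
partial_sums = [
    x[start:start + block_size].sum()
    for start in range(0, len(x), block_size)
]

# Step 2: combine the partial results into the final answer.
total = sum(partial_sums)

print(len(partial_sums), total)
```

Because each partial sum depends only on its own block, the blocks could be processed independently and never need to be held in memory all at once.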
Once we have understood how blocked algorithms work over Dask arrays, we move on to implementing some basic operations over Dask arrays.
Explore Dask array API
Create Dask arrays
Visualize Task Graphs for Dask arrays
In this video, we will look at some more advanced operations that we can perform on Dask Arrays.
Perform scalar operations, reduction, etc. over Dask Arrays
Perform operations like slicing, indexing, and broadcasting over Dask arrays
Implement Stacking and Concatenation over Dask Arrays
We do a performance analysis of Dask Arrays versus NumPy arrays.
Implement a computationally expensive operation over large NumPy arrays
Implement the same operation using Dask Arrays
Do a performance analysis over both methods
Universal NumPy Functions are one of the building blocks of NumPy API. Now we will learn to implement them for Dask arrays.
Understand what Universal NumPy Functions are
Explore the different properties of NumPy Universal Functions
Implement Universal NumPy Functions for Dask arrays
Finally, we will discuss the current limitations of Dask arrays.
Discuss the current limitations of Dask Arrays API, as per latest documentation
Before we move on to parallelizing our code using Dask, we must first understand the concept of Lazy evaluation and how it works.
Understand what lazy evaluation is
Understand how it is different from eager evaluation
Look at the example of Lazy evaluation in Python
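The difference between eager and lazy evaluation can be seen in plain Python by comparing a list comprehension with a generator expression. This sketch is an illustration only; the `calls` list exists purely to record when the function actually runs.

```python
calls = []  # records each time square() really executes

def square(n):
    calls.append(n)
    return n * n

# Eager: the list comprehension runs square() for every item immediately.
eager = [square(n) for n in range(3)]
calls_after_eager = len(calls)

# Lazy: the generator expression only describes the work; nothing runs yet.
lazy = (square(n) for n in range(3))
calls_after_lazy_definition = len(calls)

# Work happens only when a value is requested, one item at a time.
first = next(lazy)
calls_after_first_request = len(calls)

print(calls_after_eager, calls_after_lazy_definition, calls_after_first_request)
```

The call count does not change when the generator is defined, only when a value is pulled from it.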
Once we have understood how lazy evaluation works, we move on to exploring dask.delayed function and how it can be used to parallelize existing Python code.
Explore dask.delayed function
Implement examples using dask.delayed
Implement examples using @delayed decorators and visualize task graphs
Task graphs are the basic representation of computations in Dask. In this video, we look into what these graphs mean, how different computations affect task graphs, and so on.
Understand what task graphs are
Visualize task graphs for basic computations
Visualize task graphs over complex computations
We do a performance analysis of dask.delayed versus Sequential Execution.
Implement a sequential program over multiple arrays
Implement the same operation using dask.delayed
Do a performance analysis over both methods
Before we move on to analyzing data with Dask Dataframes, we will understand how Dask Dataframes work and what features they provide.
Overview of Dask Dataframes
Look at the features of Dask Dataframes
Discuss its similarity to Pandas API
In this video, we will implement some basic examples for manipulating Dask Dataframes.
Perform basic Dataframe manipulation operation
Perform aggregation operations on Dask Dataframes
Highlight the differences and similarity with Pandas API
We will discuss some of the different ways of creating and loading Dask Dataframes.
Discuss the different ways of creating Dask Dataframes
Use glob patterns to load multiple files at once
Discuss the multiple formats from which we can load Dask Dataframes
We will try to load larger than memory datasets via Dask Dataframes and perform operations on it.
Load larger than memory dataset into Dask Dataframes
Perform analysis over the data
Analyze the time taken and performance of Dask Dataframes
In this video, we will take a real-world dataset and analyze it using Dask.
Load dataset using Dask Dataframe
Perform analysis on the data
Finally, we will discuss some of the current limitations of Dask Dataframes.
Discuss some of the limitations of Dask Dataframes
First, we need to understand the basic concept of Dask bags, their features and use cases, before we move on to the implementation part.
Understand the concept behind Dask Bags
Understand the features of Dask Bags
Explore the various use-cases for Dask Bags
In this video, we will discuss the various functions available in the Dask API, to create and store Dask Bags.
Explore the various functions to create Dask Bags
Explore the various functions to store Dask Bags
Explore different options for creating/storing Dask Bags
Now that we have a fair idea of what Dask Bags are, let us explore the various ways through which we can manipulate Dask Bags.
Create dummy Dask Bags for manipulation
Use various functions for Dask Bags API
Visualize task graphs over these functions
In this example, we create a word counter using Dask Bags.
Create a Dask Bag using a URL
Clean the text via Dask Bag Functions
Explore multiple ways of creating a word counter
In this example, we will explore how we can manipulate JSON data using Dask Bags.
Create a Dask Bag using a glob pattern of JSON files.
Clean the data
Visualize the data after processing
In the final video, we will discuss some of the current limitations of Dask Bags.
Discuss the limitations of Dask Bags as per the latest documentation of Dask Version 1.2.1
In this video, we will understand the various features offered by dask.distributed and compare it with Apache Spark.
Overview of dask.distributed
Look at the features of dask.distributed
Compare dask.distributed with Apache Spark
In this video, we will focus on setting your own local Dask cluster.
Understand the various options available for setting up your local Dask cluster
Create a local Dask cluster
Before we move on to submitting jobs to Dask clusters, we must understand the different types of schedulers available with Dask.
Discuss the different types of schedulers
Implement a program using dask.delayed
Compare the performances of different schedulers based on the same example
In this video, we understand the Dask Dashboard UI available for Dask cluster.
Setup a local Dask cluster
Load data via Dask Dataframes and perform computations on it
Analyze the Dask Dashboard UI
Finally, we will discuss some of the limitations of dask.distributed.
Overview of current limitations of dask.distributed
In this video, we will discuss how you can save up on computation and memory by persisting data on your cluster.
Setup a local cluster and load a Dataframe
Perform some operation and analyze the task graphs
Persist the data and see the difference in task graphs
Dask provides an interface to Python’s concurrent.futures API. In this video, we will discuss how we can leverage that interface for asynchronous computation.
Understand what exactly Python’s concurrent.futures interface is
Explore how Dask provides an interface to concurrent.futures
Implement Examples using Dask for asynchronous computation
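For reference, the sketch below shows the submit-and-collect pattern of the standard library's `concurrent.futures` module on its own, without Dask. The function `slow_square` is a hypothetical stand-in for real work; the Dask version covered in the video would submit tasks to a Dask cluster instead of a local thread pool.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def slow_square(n):
    """Hypothetical stand-in for an expensive computation."""
    return n * n

with ThreadPoolExecutor(max_workers=4) as pool:
    # submit() returns immediately with a Future for each task.
    futures = [pool.submit(slow_square, n) for n in range(5)]
    # as_completed() yields futures as they finish, in no fixed order,
    # so the results are sorted to get a stable list.
    results = sorted(future.result() for future in as_completed(futures))

print(results)
```

Each call to `result()` blocks only until that particular task is done, which is what makes the pattern asynchronous.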
Finally, we will discuss some of the best practices to be followed while developing applications for Dask.
Discuss some of the best practices for developing application with Dask
In this video, we will discuss a brief overview of what features Dask has to offer with respect to Machine Learning.
Overview of Dask-ML
Look at the features of Dask-ML
In this video, we go over an example of Regression using scikit-learn and combine it with Dask.
Implement a basic regression example using scikit-learn
Create a local Dask cluster
Combine Dask with scikit-learn for regression
In this video, we go over an example of Classification using scikit-learn and combine it with Dask.
Implement a basic Classification example using scikit-learn
Create a local Dask cluster
Combine Dask with scikit-learn for classification
In this video, we go over an example of Hyper-Parameter Tuning using scikit-learn and combine it with Dask.
Implement a basic Hyper-Parameter tuning example using scikit-learn
Create a local Dask cluster
Combine Dask with scikit-learn for Hyper-Parameter tuning
This video gives a glimpse of the entire course.
Learn to create plots with Matplotlib 3 by getting data from various sources.
Plot data from lists
Plot data from NumPy
Plot data from Pandas Dataframes
Learn to create Scatter plots with Matplotlib 3.
Import dataset
Draw scatter plots
Customize scatter plots
Learn to create line plots with Matplotlib 3.
Import dataset
Draw line plots
Customize line plots
Learn to create bar plots with Matplotlib 3.
Import dataset
Draw bar plots
Customize bar plots
Learn to visualize and compare datasets using subplots.
Import dataset
Create subplots
Various ways to create subplots
Learn to create Histograms with Matplotlib 3.
Import dataset
Draw Histograms
Customize Histograms
Learn to create Heatmaps with Matplotlib 3.
Import dataset
Draw Heatmaps
Customize Heatmaps
Learn to create box plots with Matplotlib 3.
Import dataset
Draw box plots
Customize box plots
Learn to create pie charts with Matplotlib 3.
Import dataset
Draw pie charts
Customize pie charts
Learn how to customize the Matplotlib plots
Customize Labels
Customize titles
Customize legends
Learn how to add grids to plots and customize ticks
Learn how to add grids to plots
Customize grids
Customize ticks
Learn what Matplotlib styles are and how to use them
Introduce styles
List of styles
Apply styles
Learn to customize styles
Modify configuration files
Add your own style
Apply your style
Learn plot annotation
Introduce Plot Annotation
Various types of annotations
Apply annotations
Learn to build plots in matplotlib using plot scaffolding.
Introduce plot scaffolding
Build a plot step by step
Learn to build custom plots using figures
Introduce figure method
Build plots using figures method
Learn to customize plot axes
Axes and optimizations
Apply the axes customization
Learn to build 3D graphs using wireframes
Import modules and datasets
Create 3d wireframe graphs
Create 3D scatter plots
Import modules and datasets
Create 3d scatter plots
Draw 3D bar charts
Import modules and datasets
Create 3d bar charts
Learn to Customize wireframes
Import modules and datasets
Customize wireframes
Learn to draw animated graphs with data from your datasets
Import modules
Read-in datasets
Create animated plots
Learn to build animated histograms
Import modules
Read-in datasets
Create animated histograms
Learn to create animated subplots
Import modules
Read-in datasets
Build animated subplots
Learn how to make your plots interactive.
Import modules
Read-in datasets
Add interactivity to your plots.
Learn to create plots that update data interactively.
Import modules
Read-in datasets
Build interactive plots.
Learn to change the plot sizes
Import modules and dataset
Plot data
Change plot size
How to save plot image to a file
Import modules and dataset
Save plots to an image file
Save plots to a pdf file.
Move legend outside the plot area
Import modules and dataset
Display default legend
Move legend out of plot area
How to display plots inline in a Jupyter notebook
Matplotlib Jupyter notebook magic command
Behavior without the command
Behavior with the inline command
Learn how to use Matplotlib clear plot methods
Introduce various plot clearing methods
Demonstrate plot clearing methods
How to change font properties of the plot elements
Change font sizes
Change font type
Change font color
Learn to fix matplotlib value errors
Type of value errors
Introduce value errors
Fix value error
Python is an open-source, community-supported, general-purpose programming language that, over the years, has also become one of the bastions of data science. Thanks to its flexibility and vast popularity, data analysis, visualization, and machine learning can be easily carried out with Python.
This practical course is designed to teach you how to perform data science tasks such as data analysis, data manipulation, and data visualization. You will begin with performing data analysis on real-world datasets. You will then work on large datasets and perform exploratory data analysis to investigate the dataset and to come up with the findings from it. You will also learn to scale your data analysis and execute distributed data science projects right from data ingestion to data manipulation and visualization using Dask. Next, you will explore Dask frameworks and see how Dask can be used with other common Python tools such as NumPy, Pandas, Matplotlib, scikit-learn, and more. Finally, you will perform data visualization using Python and Matplotlib 3.
By the end of this course, you will be able to use the power of Python to analyze data, create beautiful visualizations, and use powerful machine learning algorithms.
Meet Your Expert(s):
We have the best work of the following esteemed author(s) to ensure that your learning journey is smooth:
Mohammed Kashif works as a Data Scientist at Nineleaps, India, dealing mostly with graph data analysis. Prior to this, he worked as a Python developer at Qualcomm. He completed his Master's degree in Computer Science from IIT Delhi, with a specialization in data engineering. His areas of interest include recommender systems, NLP, and graph analytics. In his spare time, he likes to solve questions on StackOverflow and help debug other people out of their misery. He is also an experienced teaching assistant with a demonstrated history of working in the Higher-Education industry.
Jamshaid Sohail is a Data Scientist who is highly passionate about Data Science, Machine learning, Deep Learning, big data, and other related fields. He spends his free time learning more about the field and learning to use its emerging tools and technologies. He is always looking for new ways to share his knowledge with other people and add value to other people's lives. He has also attended Cambridge University for a summer course in Computer Science where he studied under great professors and would like to impart this knowledge to others. He has extensive experience as a Data Scientist in a US-based company. In short, he would be extremely delighted to educate and share knowledge with other people.
Harish Garg is a co-founder and software professional with more than 18 years of software industry experience. He currently runs a software consultancy that specializes in the data analytics and data science domain. He has been programming in Python for more than 12 years and has been using Python for data analytics and data science for 6 years. He has developed numerous courses in the data science domain and has also published a book involving data science with Python, including Matplotlib.