Data Mining with Python: Classification and Regression
3.1 (6 ratings)
80 students enrolled

A practical guide that will give you hands-on experience with the popular Python data mining algorithms
Created by Packt Publishing
Last updated 8/2016
Current price: $10 Original price: $75 Discount: 87% off
30-Day Money-Back Guarantee
  • 2 hours on-demand video
  • 1 Supplemental Resource
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Understand the basic data mining concepts to implement efficient models using Python
  • Know how to use Python libraries and mathematical toolkits such as NumPy, pandas, Matplotlib, and scikit-learn
  • Build your first application that makes predictions from data and see how to evaluate the regression model
  • Analyze and implement Logistic Regression and the KNN model
  • Dive into the most effective data cleaning process to get accurate results
  • Master the classification concepts and implement the various classification algorithms
Requirements
  • A rudimentary knowledge of Python and its libraries would be useful.

Python is a dynamic programming language used in a wide range of domains by programmers who find it simple yet powerful. In today’s world, everyone wants to gain insights from the deluge of data coming their way. Data mining provides a way of finding these insights, and Python is one of the most popular languages for data mining, providing both power and flexibility in analysis. Python has become the language of choice for data scientists for data analysis, visualization, and machine learning.

In this course, you will discover the key concepts of data mining and learn how to apply different data mining techniques to find the valuable insights hidden in real-world data. You will also tackle some notorious data mining problems to get a concrete understanding of these techniques.

We begin by introducing you to the important data mining concepts and the Python libraries used for data mining. You will understand the process of cleaning data and the steps involved in filtering out noise so that the available data can be used for accurate analysis. You will also build your first intelligent application that makes predictions from data. Then you will learn classification and regression techniques such as logistic regression, the k-NN classifier, and SVM, and implement them in real-world scenarios such as predicting house prices and the number of TV show viewers.

By the end of this course, you will be able to apply the concepts of classification and regression using Python and implement them in a real-world setting.

About The Author

Saimadhu Polamuri is a data science educator and the founder of Data Aspirant, a data science portal for beginners. He has 3 years of experience in data mining and 5 years of experience in Python. He is also interested in big data technologies such as Hadoop, Pig, and Spark. He has a good command of the R programming language and MATLAB. He has a rudimentary understanding of the C++ computer vision library OpenCV and of big data technologies.

Who is the target audience?
  • This course is for data analysts or aspiring data scientists who want to learn more about data mining with Python.
Curriculum For This Course
21 Lectures
Introduction to Data Mining
3 Lectures 15:37

We need to lay the groundwork for the course, and for this, we need a strong understanding of the concepts of data mining. 

A Brief Introduction to Data Mining

It's time to deep-dive into the core concepts of data mining. For that, we need a breakdown of the important topics. 

Data Mining Basic Concepts and Applications
Setting Up the Data Mining Python Packages Environment
7 Lectures 26:58

Many programming languages are available, so why choose Python? This video explains what makes it a good choice for data mining.

Preview 03:31

To build your Python skills, this video introduces the basics of the language that you will need in the upcoming sections.

Basics of Python

We will get introduced to IPython, which is an important step in our journey. 

Installing IPython

Solving real-life problems with data mining algorithms requires a lot of scientific computing, so we need to learn about NumPy, a package built specifically for scientific computing.

Installing the Numpy Library
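
As a rough illustration of the kind of scientific computing NumPy is built for, the sketch below standardizes a small array without writing an explicit loop. The measurements and variable names are hypothetical and are not taken from the course material.

```python
import numpy as np

# Hypothetical measurements; the values are illustrative only.
heights_cm = np.array([150.0, 160.0, 170.0, 180.0])

# Vectorized arithmetic applies to every element without an explicit loop.
heights_m = heights_cm / 100.0

mean_height = heights_m.mean()
std_height = heights_m.std()  # population standard deviation (ddof=0)

# Standardize the values to zero mean and unit variance.
z_scores = (heights_m - mean_height) / std_height

print(mean_height)
print(z_scores)
```

Operating on whole arrays at once is usually both shorter and faster than looping over Python lists, which is one reason NumPy sits underneath pandas and scikit-learn.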

Working with tabular data is a painful process until we get some hands-on experience with a tabular data analysis library. pandas is specially built for tabular data analysis. So let's get introduced to the pandas data analysis library. 

Installing the pandas Library

In data science, visualization plays a key role: the results of data mining algorithms have to be visualized to be understood, and plotting the data gives a clear picture of what we are working with. So let's take a look at the Python visualization library.

Installing Matplotlib

Introducing scikit-learn, a widely used Python library that contains many built-in data mining algorithms.

Installing scikit-learn
Cleaning Data and Preprocessing Techniques
2 Lectures 10:39

Data preprocessing comprises data cleaning and several other techniques. Let's start with data cleaning and its importance.

Preview 05:31

Take a look at other data preprocessing techniques, such as data integration, data reduction, and data transformation. 

Data Preprocessing Techniques
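
To make cleaning and transformation concrete, here is a minimal sketch using only the standard library. It assumes that missing values are represented as None and that min-max scaling is the transformation of interest; the course may well use other techniques, and the helper names and sample rows are hypothetical.

```python
def drop_missing(rows):
    """Data cleaning step: discard rows that contain a missing (None) field."""
    return [row for row in rows if all(field is not None for field in row)]


def min_max_scale(values):
    """Data transformation step: rescale so the smallest value maps to 0.0
    and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so there is no spread to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical (area, bedrooms) rows; two of them have a missing field.
raw = [(1200, 3), (None, 2), (2000, 4), (1600, None)]

clean = drop_missing(raw)
areas = min_max_scale([area for area, _ in clean])

print(clean)
print(areas)
```

Dropping incomplete rows is the simplest cleaning policy; filling gaps with a mean or median is a common alternative when data is scarce.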
Linear Regression Model
4 Lectures 33:00

Get insights into linear regression, explore the areas where it works well, and build a clear picture of the linear regression model.

Preview 08:23

You will probably face challenging problems when fitting a linear regression model. This video gives you a clear idea of these model-fitting problems, shows how to evaluate different regression models, and suggests ways to pick the best one.

Evaluating Regression Models
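
The listing does not say which evaluation metrics the lecture covers. As an assumption, the sketch below computes three commonly used ones (mean squared error, root mean squared error, and R squared) for a set of hypothetical predictions.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, and R^2 for paired actual and predicted values.

    Assumes y_true is non-empty and not constant (otherwise R^2 is undefined).
    """
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(r * r for r in residuals)              # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares

    mse = ss_res / n
    return {"mse": mse, "rmse": math.sqrt(mse), "r2": 1.0 - ss_res / ss_tot}


# Hypothetical actual values and model predictions.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 6.5, 9.5]

metrics = regression_metrics(actual, predicted)
print(metrics)
```

When comparing candidate models, a lower RMSE and a higher R squared on held-out data generally indicate the better fit.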

Let's put the Python data mining libraries to use by implementing a simple linear regression model in Python to predict house prices.

Basic Regression Model Implementation to Predict House Prices
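
A minimal sketch of simple linear regression, assuming a single feature (floor area) and ordinary least squares. The data is made up for illustration and the function name is hypothetical; the lecture's own dataset and code are not shown in this listing.

```python
def fit_simple_linear_regression(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: floor area in square feet and sale price.
square_feet = [1000, 1500, 2000, 2500]
prices = [200000, 300000, 400000, 500000]

slope, intercept = fit_simple_linear_regression(square_feet, prices)

# Predict the price of a hypothetical 1800 square foot house.
predicted_price = slope * 1800 + intercept
print(slope, intercept, predicted_price)
```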

Use scikit-learn, a library of data mining algorithms, to implement a multiple regression model that predicts the number of television show viewers.

Regression Model Implementation to Predict Television Show Viewers
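
The lecture uses scikit-learn for this model. To show the underlying computation instead, the sketch below solves the same kind of multiple regression with NumPy's least-squares routine. The two features, the coefficients, and the data are all hypothetical.

```python
import numpy as np

# Hypothetical features per episode: [promotion score, lead-in audience score].
X = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 5.0],
])

# Synthetic target built from known coefficients so the fit can be checked:
# viewers = 1.0 + 2.0 * promotion + 3.0 * lead_in
y = 1.0 + X @ np.array([2.0, 3.0])

# Prepend a column of ones so the first fitted coefficient is the intercept.
X_design = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: minimize ||X_design @ coefs - y||^2.
coefs, *_ = np.linalg.lstsq(X_design, y, rcond=None)

intercept, promo_weight, lead_in_weight = coefs
print(intercept, promo_weight, lead_in_weight)
```

scikit-learn's LinearRegression fits the same model and handles the intercept column for you.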
Classification Concepts
5 Lectures 37:01

Introducing the logistic regression algorithm for solving classification problems. Build a clear understanding of its basic concepts so that you can implement it.

Preview 04:01
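
At its core, logistic regression passes a weighted sum of the features through the sigmoid function to obtain a probability and then thresholds it. The sketch below illustrates that idea with hypothetical, hand-picked weights; in practice the weights are learned from data.

```python
import math


def sigmoid(z):
    """Map any real number to the interval (0, 1), avoiding overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(features, weights, bias):
    """Probability of the positive class under a logistic model."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def predict_class(features, weights, bias, threshold=0.5):
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    return 1 if predict_probability(features, weights, bias) >= threshold else 0


# Hypothetical one-feature model: weight 1.0 and bias -1.0.
weights, bias = [1.0], -1.0

print(predict_probability([2.0], weights, bias))
print(predict_class([2.0], weights, bias), predict_class([0.0], weights, bias))
```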

Introducing the k-nearest neighbors algorithm, a classification algorithm. We then look at special cases in the k-nearest neighbors classifier model and introduce different distance metrics.

K – Nearest Neighbors Classifier
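
The sketch below shows a bare-bones k-nearest neighbors classifier with two interchangeable distance metrics. It assumes Euclidean and Manhattan distance are among the metrics discussed; the function names and sample points are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_points, train_labels, query, k=3, distance=euclidean):
    """Label the query with the majority label among its k nearest neighbors."""
    ranked = sorted(
        zip(train_points, train_labels),
        key=lambda pair: distance(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-cluster training set.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5), (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (2.0, 2.0)))
print(knn_predict(points, labels, (6.0, 6.5), distance=manhattan))
```

With two classes, choosing an odd k avoids tied votes, which is one of the special cases a k-NN implementation has to consider.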

Introducing the support vector machine algorithm by explaining its key concepts, such as hyperplanes, support vectors, and margins.

Support Vector Machine
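
To connect the three terms, the sketch below takes a linear separating hyperplane w·x + b = 0 with hand-picked (not learned) parameters and computes the decision value, the margin width 2/||w||, and whether a point lies on a margin boundary, which is where support vectors sit. All names and numbers are hypothetical.

```python
import math

# Hypothetical hyperplane parameters; a real SVM would learn these from data.
w = (1.0, 1.0)
b = -3.0


def decision_value(x):
    """Signed score w.x + b; its sign gives the predicted side of the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if decision_value(x) >= 0 else -1


def is_on_margin(x, tol=1e-9):
    """Points with |w.x + b| = 1 lie on the margin boundaries,
    which is where support vectors are found."""
    return abs(abs(decision_value(x)) - 1.0) <= tol


# Distance between the two margin boundaries w.x + b = +1 and w.x + b = -1.
margin_width = 2.0 / math.sqrt(sum(wi * wi for wi in w))

print(classify((3.0, 3.0)), classify((0.0, 0.0)))
print(margin_width)
print(is_on_margin((2.0, 2.0)), is_on_margin((3.0, 3.0)))
```

Training an SVM amounts to choosing w and b so that this margin is as wide as possible while still separating the classes.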

Implement the logistic regression model using Python data mining libraries. We then explore the ANES 1996 dataset and use the implemented model to predict whom a voter is going to vote for.

Logistic Regression Model Implementation

Implement the k-NN classifier using the Python scikit-learn library, get introduced to the iris data, and finally predict the iris category using the implemented classifier.

K – Nearest Neighbor Classifier Implementation
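
A minimal sketch of this workflow with scikit-learn, assuming the copy of the iris dataset bundled with the library. The split ratio, the value of k, and the sample flower measurements are illustrative choices, not the lecture's.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load the iris dataset that ships with scikit-learn (no download needed).
iris = load_iris()

# Hold out a quarter of the samples for evaluation, keeping class balance.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.25, random_state=0, stratify=iris.target
)

# k = 5 is an illustrative choice; other values may perform better.
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)

# Fraction of held-out flowers classified correctly.
accuracy = model.score(X_test, y_test)

# Predict the species for one flower given as
# [sepal length, sepal width, petal length, petal width] in centimeters.
sample = [[5.1, 3.5, 1.4, 0.2]]
predicted_species = iris.target_names[model.predict(sample)[0]]

print(accuracy, predicted_species)
```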
About the Instructor
Packt Publishing
3.9 Average rating
7,196 Reviews
51,408 Students
616 Courses
Tech Knowledge in Motion

Packt has been committed to developer learning since 2004. A lot has changed in software since then, but Packt has remained responsive to these changes, continuing to look forward at the trends and tools defining the way we work and live, and at how to put them to work.

With an extensive library of content (more than 4000 books and video courses), Packt's mission is to help developers stay relevant in a rapidly changing world. From new web frameworks and programming languages to cutting-edge data analytics and DevOps, Packt takes software professionals in every field to what's important to them now.

From skills that will help you develop and future-proof your career to immediate solutions to everyday tech challenges, Packt is a go-to resource to make you a better, smarter developer.

Packt Udemy courses continue this tradition, bringing you comprehensive yet concise video courses straight from the experts.