Python is a dynamic programming language used in a wide range of domains by programmers who find it simple yet powerful. In today’s world, everyone wants to gain insights from the deluge of data coming their way. Data mining provides a way of finding these insights, and Python is one of the most popular languages for data mining, providing both power and flexibility in analysis. Python has become the language of choice for data scientists for data analysis, visualization, and machine learning.
In this course, you will discover the key concepts of data mining and learn how to apply different data mining techniques to find the valuable insights hidden in real-world data. You will also tackle some notorious data mining problems to get a concrete understanding of these techniques.
We begin by introducing you to the important data mining concepts and the Python libraries used for data mining. You will understand the process of cleaning data and the steps involved in filtering out noise and ensuring that the data available can be used for accurate analysis. You will also build your first intelligent application that makes predictions from data. Then you will learn about the classification and regression techniques such as logistic regression, k-NN classifier, and SVM, and implement them in real-world scenarios such as predicting house prices and the number of TV show viewers.
By the end of this course, you will be able to apply the concepts of classification and regression using Python and implement them in a real-world setting.
About The Author
Saimadhu Polamuri is a data science educator and the founder of Data Aspirant, a data science portal for beginners. He has 3 years of experience in data mining and 5 years of experience in Python. He is also interested in big data technologies such as Hadoop, Pig, and Spark. He has a good command of the R programming language and MATLAB, and a rudimentary understanding of the C++ computer vision library OpenCV and of big data technologies.
We need to lay the groundwork for the course, and for this, we need a strong understanding of the concepts of data mining.
It's time to deep-dive into the core concepts of data mining. For that, we need a breakdown of the important topics.
There are plenty of programming languages available, but there are good reasons why Python is a strong choice for data mining. This video explains them.
To build up your Python expertise, this video introduces the basics of the Python programming language that the upcoming sections rely on.
We will get introduced to IPython, which is an important step in our journey.
Solving real-life problems with data mining algorithms requires a lot of scientific computing. So we need to learn about NumPy, a package built specifically for scientific computing.
Working with tabular data is a painful process until we get some hands-on experience with a tabular data analysis library. pandas is specially built for tabular data analysis. So let's get introduced to the pandas data analysis library.
In data science, visualization plays a key role: the results obtained from different data mining algorithms have to be visualized to be understood, and visualizing the data gives a clear picture of what we are working with. So let's take a look at the Python visualization library.
Introducing scikit-learn, a widely used Python library that contains many built-in data mining algorithms.
Data preprocessing comprises data cleaning and several other techniques. Let's take a look at data cleaning and its importance.
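As a rough illustration of what data cleaning can involve, the sketch below removes duplicate records and fills missing values with the column mean. It uses only the standard library; the function names and the sample records are hypothetical and are not taken from the course material, which may use pandas or other tools instead.

```python
from statistics import mean


def drop_duplicates(rows):
    """Return the rows with exact duplicates removed, keeping first occurrences."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the record
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def fill_missing_with_mean(rows, column):
    """Return copies of the rows where None in `column` is replaced by the mean."""
    observed = [row[column] for row in rows if row[column] is not None]
    if not observed:
        raise ValueError(f"column {column!r} has no observed values")
    column_mean = mean(observed)
    return [
        {**row, column: column_mean if row[column] is None else row[column]}
        for row in rows
    ]


# Hypothetical records: one duplicate row and one missing price.
records = [
    {"city": "A", "price": 100.0},
    {"city": "B", "price": None},
    {"city": "A", "price": 100.0},
    {"city": "C", "price": 200.0},
]

cleaned = fill_missing_with_mean(drop_duplicates(records), "price")
```

Mean imputation is only one of several possible strategies; dropping incomplete rows or using the median are common alternatives, and the right choice depends on the data.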
Take a look at other data preprocessing techniques, such as data integration, data reduction, and data transformation.
Get insights into linear regression, explore the areas where linear regression is effective, and finally visualize the linear regression model to get a clear picture of it.
You will probably run into problems when fitting a linear regression model. This video explains those model-fitting problems, shows how to evaluate different regression models, and presents ways to pick the best one.
Let's put the Python data mining libraries to use by implementing a simple linear regression model in Python to predict house prices.
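To make the idea of a simple linear regression model concrete, here is a minimal from-scratch sketch that fits a line by ordinary least squares. It is not the course's implementation (which relies on Python data mining libraries); the function names and the house size and price figures are made up for illustration.

```python
def fit_simple_linear_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def predict(x, slope, intercept):
    """Predict y for a single x using the fitted line."""
    return intercept + slope * x


# Hypothetical data: house size (square metres) and price (thousands).
sizes = [100.0, 150.0, 200.0, 250.0]
prices = [200.0, 300.0, 400.0, 500.0]

slope, intercept = fit_simple_linear_regression(sizes, prices)
estimated_price = predict(300.0, slope, intercept)
```

Because the made-up prices lie exactly on a line, the fit recovers a slope of 2 and an intercept of 0; real house-price data would be noisy and leave residual error.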
Use scikit-learn, a library of data mining algorithms, to implement a multiple regression model that predicts the number of television show viewers.
Introducing the logistic regression algorithm for solving classification problems. Build a clear understanding of the basic concepts of logistic regression before implementing it.
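One of the basic concepts behind logistic regression is that a linear score is passed through the sigmoid function to obtain a probability, which is then thresholded to produce a class. The sketch below shows only this prediction step; the weights, bias, and function names are hypothetical, and fitting the weights from data is left out.

```python
import math


def sigmoid(z):
    """Map a real-valued score to the interval (0, 1) in a numerically stable way."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(features, weights, bias):
    """Probability of the positive class for one sample."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)


def classify(features, weights, bias, threshold=0.5):
    """Return 1 if the predicted probability reaches the threshold, else 0."""
    return 1 if predict_probability(features, weights, bias) >= threshold else 0


# Hypothetical, hand-picked parameters (not learned from any dataset).
weights = [0.8, -0.4]
bias = -1.0

probability = predict_probability([3.0, 1.0], weights, bias)  # score is about 1.0
label = classify([3.0, 1.0], weights, bias)
```

The 0.5 threshold is a common default rather than a requirement; it can be moved when false positives and false negatives carry different costs.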
Introducing the k-nearest neighbors algorithm, a classification algorithm. We then look at special cases in the k-nearest neighbors classifier model and introduce different distance metrics.
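The following is a minimal sketch of a k-nearest neighbors classifier with two interchangeable distance metrics, Euclidean and Manhattan. It is written from scratch with the standard library for illustration; the function names and sample points are hypothetical, and the course itself implements k-NN with scikit-learn.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_points, train_labels, query, k=3, distance=euclidean):
    """Predict the label of `query` by majority vote among its k nearest points."""
    if not 1 <= k <= len(train_points):
        raise ValueError("k must be between 1 and the number of training points")
    neighbors = sorted(
        zip(train_points, train_labels),
        key=lambda pair: distance(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbors)
    # On a tied vote, the tied label that appears first among the neighbors wins.
    return votes.most_common(1)[0][0]


# Hypothetical two-cluster training data.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

near_first_cluster = knn_predict(points, labels, (1.1, 1.0), k=3)
near_second_cluster = knn_predict(points, labels, (5.0, 5.0), k=3, distance=manhattan)
```

An even k with two classes can produce tied votes, which is one of the special cases worth handling explicitly; choosing an odd k is a common way to avoid it.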
Introducing the support vector machine algorithm by explaining its key concepts, such as hyperplanes, support vectors, and margins.
Implementing the logistic regression model using Python data mining libraries. We explore the ANES 1996 dataset and then use the implemented logistic regression model to predict whom a voter is going to vote for.
Implementing the k-NN classifier using the Python scikit-learn library: we introduce the iris data and then predict the iris category using the implemented k-NN classifier.
Packt has been committed to developer learning since 2004. A lot has changed in software since then, but Packt has remained responsive to these changes, continuing to look forward at the trends and tools defining the way we work and live, and at how to put them to work.
With an extensive library of content (more than 4000 books and video courses), Packt's mission is to help developers stay relevant in a rapidly changing world. From new web frameworks and programming languages to cutting-edge data analytics and DevOps, Packt takes software professionals in every field to what's important to them now.
From skills that will help you develop and future-proof your career to immediate solutions to everyday tech challenges, Packt is a go-to resource to make you a better, smarter developer.
Packt Udemy courses continue this tradition, bringing you comprehensive yet concise video courses straight from the experts.