Unsupervised Machine Learning with Python
What you'll learn
- Clustering Algorithms: Hierarchical, DBSCAN, K Means, Gaussian Mixture Model
- Dimension Reduction: Principal Component Analysis (PCA)
- Implementation of clustering algorithms and principal component analysis in Python
- Applications of clustering and PCA using real world data
Requirements
- Basic knowledge of Linear Algebra including vectors, matrices, transpose, matrix multiplication, linear spaces
- Basic knowledge of Probability and Statistics including mean, covariance, and normal distributions
- Ability to program in Python 3
- Ability to run Python 3 programs on a local machine in Jupyter notebooks and in the command window
After taking this course, students will be able to understand the algorithms of Unsupervised Machine Learning, implement them in Python, and apply them to real-world datasets.
Course Topics and Approach:
Unsupervised Machine Learning involves finding patterns in datasets. The core of this course is the study of the following algorithms:
Clustering: Hierarchical, DBSCAN, K Means & Gaussian Mixture Model
Dimension Reduction: Principal Component Analysis
Unlike many other courses, this course:
Has a detailed presentation of the math underlying the above algorithms, including normal distributions, expectation maximization, and singular value decomposition.
Has a detailed explanation of how algorithms are converted into Python code, with lectures on code design and use of vectorization.
Has questions (programming and theory) with solutions that allow learners to practice the course material.
The course code is then applied in case studies that perform dimension reduction and clustering on real-world data: the Iris Flowers Dataset, MNIST Digits Dataset (images), and BBC Text Dataset (articles).
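As a rough illustration of the kind of vectorized implementation mentioned above, here is a minimal sketch (not the course's own code) that performs PCA through the singular value decomposition and then clusters the projected points with a basic K-means loop. It assumes NumPy is installed and uses synthetic two-blob data rather than the Iris, MNIST, or BBC datasets. The function names `pca_svd` and `kmeans` and all parameter values are hypothetical choices made for this example.

```python
import numpy as np


def pca_svd(X, n_components):
    """Project the rows of X onto the top principal components via SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Variance captured along each retained direction.
    explained_variance = S[:n_components] ** 2 / (X.shape[0] - 1)
    return X_centered @ components.T, components, explained_variance


def kmeans(X, k, n_iter=100, seed=0):
    """Basic K-means with a vectorized assignment step (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct data points chosen at random.
    centers = X[rng.choice(X.shape[0], size=k, replace=False)]
    for _ in range(n_iter):
        # Broadcasting gives all squared distances at once: shape (n_samples, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each center as the mean of its points; keep the old
        # center if a cluster happens to be empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


# Synthetic data (an assumption for this sketch): two separated blobs in 4D.
rng = np.random.default_rng(42)
blob_a = rng.normal(loc=0.0, scale=0.5, size=(50, 4))
blob_b = rng.normal(loc=5.0, scale=0.5, size=(50, 4))
X = np.vstack([blob_a, blob_b])

Z, components, explained_variance = pca_svd(X, n_components=2)
labels, centers = kmeans(Z, k=2)

print("Projected shape:", Z.shape)
print("Cluster sizes:", np.bincount(labels))
```

The distance computation relies on broadcasting instead of an explicit loop over samples, which is the usual meaning of vectorization in NumPy code. A library implementation such as scikit-learn's would normally be preferred in practice. The sketch is only meant to show how the underlying math maps onto array operations.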
This course is designed for:
Scientists, engineers, programmers, and others interested in machine learning/data science
No prior experience with machine learning is needed
Students should have knowledge of
Basic linear algebra (vectors, transpose, matrices, matrix multiplication, inverses, determinants, linear spaces)
Basic probability and statistics (mean, covariance matrices, normal distributions)
Python 3 programming
Students should have a Python installation, such as the Anaconda platform, on their machine and be able to run programs in the command window and in Jupyter Notebooks
Teaching Style and Resources:
Course includes many examples with plots and animations to help students better understand the material
Course has many exercises with solutions (theoretical, Jupyter Notebook, and programming) to give students additional practice
All resources (presentations, supplementary documents, demos, code, solutions to exercises) are downloadable from the course GitHub site.
Section 9.5: added Autoencoder example
Section 9.6: added this new section with an Autoencoder Demo
Sections 2.3, 2.4, 3.4, 4.3: updated so the code runs in more recent versions of Python and Matplotlib, and updated the presentations to point out the changes
Who this course is for:
- Scientists, engineers and programmers interested in data science/machine learning
PhD in Applied Math from Massachusetts Institute of Technology
10 years of experience doing research in applied math and teaching undergraduate and graduate courses at New York University, Oregon State University, and the University of British Columbia.
17 years of experience in the financial risk management space, working at a software start-up, a financial information services company, and a large international bank.
Currently consulting on machine learning projects.