Advanced Data Analysis with Haskell
3.8 (7 ratings)
119 students enrolled

Learn advanced data analysis techniques to gain insights into real-world data sets using Haskell
Created by Packt Publishing
Last updated 3/2017
  • 4 hours on-demand video
  • 1 Supplemental Resource
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Get to know the basics of data analysis: SQLite3 basics, regular expressions, and visualization
  • Understand the process involved in linear regression and its pitfalls
  • Study a corpus of text to discover interesting features using TF-IDF analysis
  • Determine the likelihood of an event using Naïve Bayesian Classification
  • Reduce the size of data without affecting the data’s effectiveness using Principal Component Analysis
  • Generate Eigenvalues and Eigenvectors using HMatrix
  • Untangle the different varieties of clusters
  • Master the techniques necessary to perform multivariate regression using Haskell code
Requirements
  • This video course is for advanced users with some prior knowledge of functional programming.

Every business and organization that collects data is capable of tapping into its own data to gain insights on how to improve. Haskell is a purely functional and lazy programming language that is well suited to handling large data analysis problems. This video picks up where Beginning Haskell Data Analysis leaves off. This video series will take you through the more difficult problems of data analysis in a conversational style.

You will be guided through finding correlations in data, including regression with multiple variables. You will be given a theoretical overview of the types of regression and we’ll show you how to install the LAPACK and HMatrix libraries. By the end of the first part, you’ll be familiar with the application of N-grams and TF-IDF.

Once you’ve learned how to analyze data, the next step is organizing that data with the help of machine learning algorithms. You will be briefed on the mathematics and statistical theorems involved, such as Bayes’ law and its application, as well as eigenvalues and eigenvectors using HMatrix.

By the end of this course, you’ll have an understanding of data analysis, different ways to analyze data, and the various clustering algorithms available. You’ll also understand Haskell and will be ready to write code with it.

About the Author

James Church lives in Clarksville, Tennessee, United States, where he enjoys teaching, programming, and playing board games with his wife, Michelle. He is an assistant professor of computer science at Austin Peay State University. He has consulted for various companies and a chemical laboratory for the purpose of performing data analysis work. James is the author of Learning Haskell Data Analysis.

Who is the target audience?
  • It is ideal for those who wish to use their knowledge of Haskell to improve their understanding of their data sets. It’s also great for those who have a good grasp of data analysis but wish to expand their knowledge of Haskell.
Curriculum For This Course
32 Lectures
Brushing up on the Basics
5 Lectures 32:19

This video gives an overview of the entire course.

Preview 02:36

Data is frequently presented in raw data files called CSV files. CSV files are often cumbersome to use. We'll take a CSV file and convert it to SQLite3.

CSV Files to SQLite3

Data is often raw and sometimes does not meet our specifications.

Regular Expressions

Data is hard to understand unless we visualize it. We'll do that in this video.


It's hard to get a feel for the shape of data by just plotting it. For that, we need KDE.

Kernel Density Estimation
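
To make the idea concrete, here is a minimal sketch of a Gaussian kernel density estimate using only the Prelude. The names `gaussian` and `kde` and the choice of a Gaussian kernel are assumptions made for illustration, not code taken from the course.

```haskell
-- Gaussian kernel density estimate (illustrative sketch, not the course's code).

-- Standard normal probability density.
gaussian :: Double -> Double
gaussian u = exp (-0.5 * u * u) / sqrt (2 * pi)

-- Estimate the density at x from a non-empty list of samples,
-- using bandwidth h (assumed to be > 0).
kde :: Double -> [Double] -> Double -> Double
kde h samples x =
  sum [gaussian ((x - s) / h) | s <- samples]
    / (fromIntegral (length samples) * h)
```

Evaluating `kde h samples` over a grid of x values (for example `map (kde 0.5 samples) [0, 0.1 .. 10]`) gives a smooth curve that can be plotted. A smaller bandwidth follows the samples more closely; a larger one smooths more.
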
Regression Analysis
5 Lectures 50:24

Data often has gaps, and we'd like to estimate what a data point might be if we had an observation. Linear regression attempts to solve this.

Preview 10:31
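
As a rough illustration of the idea, the sketch below fits a straight line by ordinary least squares and uses it to estimate a missing point. The function names (`mean`, `linearFit`, `predict`) are hypothetical and chosen for this example; the course may organise its code differently.

```haskell
-- Simple least-squares line fit (illustrative sketch).

mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

-- Returns (intercept, slope) of the least-squares line y = a + b * x.
-- Assumes both lists have the same length and the xs are not all equal.
linearFit :: [Double] -> [Double] -> (Double, Double)
linearFit xs ys = (intercept, slope)
  where
    mx = mean xs
    my = mean ys
    sxy = sum (zipWith (\x y -> (x - mx) * (y - my)) xs ys)
    sxx = sum [(x - mx) ^ (2 :: Int) | x <- xs]
    slope = sxy / sxx
    intercept = my - slope * mx

-- Estimate y at an unobserved x from a fitted (intercept, slope) pair.
predict :: (Double, Double) -> Double -> Double
predict (a, b) x = a + b * x
```

For points that lie exactly on y = 1 + 2x, such as `linearFit [1, 2, 3, 4] [3, 5, 7, 9]`, the fit recovers an intercept of 1 and a slope of 2, and `predict` can then fill in a gap such as x = 5.
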

The data we are inspecting (the year and the population) seem to be related, but how can we be sure? We need to compute the correlation coefficients.

Correlation Coefficients
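
One common measure is the Pearson correlation coefficient, sketched below with only the Prelude. Treat it as an assumption that this is the coefficient meant here; the name `pearson` is hypothetical.

```haskell
-- Pearson correlation coefficient (illustrative sketch).
-- Assumes equal-length lists in which neither variable is constant.
pearson :: [Double] -> [Double] -> Double
pearson xs ys = cov / sqrt (varX * varY)
  where
    n = fromIntegral (length xs)
    mx = sum xs / n
    my = sum ys / n
    -- Unnormalised covariance and variances; the 1/n factors cancel.
    cov = sum (zipWith (\x y -> (x - mx) * (y - my)) xs ys)
    varX = sum [(x - mx) ^ (2 :: Int) | x <- xs]
    varY = sum [(y - my) ^ (2 :: Int) | y <- ys]
```

The result lies between -1 and 1: values near 1 mean the two variables rise together, values near -1 mean one falls as the other rises, and values near 0 suggest no linear relationship.
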

Linear regression and coefficients are great! Unfortunately, it is possible for regression coefficients to be misleading. Here, we'll study a dataset which is purposely misleading.

Drawbacks of Linear Regression

In video 1, we performed linear regression on our dataset. In video 2, we studied the results. The numbers look good, but visually the fit is still poor. We need a new solution. Let's try logarithmic regression!

Logarithmic Regression

Okay, so our dataset isn't linear and it isn't logarithmic. What is it? Let's try polynomial.

Polynomial Regression
Multiple Regression
4 Lectures 40:17

We have lots of data in a database, but our data needs to be in matrix format.

Preview 11:35

Now that we have our data pulled from the database, we need to perform the actual regression.

Performing Multivariate Regression

We performed the regression and we got some results. Are they any good? Let's find out.

Calculating the Adjusted R^2
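
For reference, the usual definition is adjusted R^2 = 1 - (1 - R^2)(n - 1) / (n - p - 1), where n is the number of observations and p the number of predictors. The sketch below computes both scores; the function names are hypothetical and the code is not taken from the course.

```haskell
-- R^2 and adjusted R^2 (illustrative sketch).

-- Coefficient of determination from observed and predicted values.
-- Assumes equal-length lists and observations that are not all equal.
rSquared :: [Double] -> [Double] -> Double
rSquared observed predicted = 1 - ssRes / ssTot
  where
    m = sum observed / fromIntegral (length observed)
    ssRes = sum (zipWith (\y f -> (y - f) ^ (2 :: Int)) observed predicted)
    ssTot = sum [(y - m) ^ (2 :: Int) | y <- observed]

-- Adjusted R^2 for n observations and p predictors (requires n > p + 1).
adjustedRSquared :: Int -> Int -> Double -> Double
adjustedRSquared n p r2 =
  1 - (1 - r2) * fromIntegral (n - 1) / fromIntegral (n - p - 1)
```

The adjustment penalises extra predictors: with the same R^2, adding variables lowers the adjusted score unless they genuinely improve the fit.
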

We found in our previous video that our scores could not and should not be trusted. As an intellectual exercise, let's explore how we might improve this score.

Improving the Adjusted R^2 Score
Text Analysis
5 Lectures 30:41

Let's explore text analysis, shall we? First, we have to clean our datasets.

Preview 07:26

Our dataset is clean. How do we use this data?

Finding the Set of N-Grams
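
An n-gram is simply a run of n consecutive items, whether words or characters. A minimal sketch, assuming a hypothetical helper named `ngrams`:

```haskell
import Data.List (tails)

-- All runs of n consecutive elements, in order (illustrative sketch).
-- Works on words (a list of strings) or on characters (a string).
ngrams :: Int -> [a] -> [[a]]
ngrams n xs
  | n <= 0 = []
  | otherwise = takeWhile ((== n) . length) (map (take n) (tails xs))
```

For example, `ngrams 2 (words "the quick brown fox")` yields `[["the","quick"],["quick","brown"],["brown","fox"]]`, and `ngrams 3 "abcd"` yields `["abc","bcd"]`. Passing the result through `Data.List.nub` or into a `Data.Set` would give the set of distinct n-grams.
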

We are now able to find the n-grams of a dataset. Let's do something cool!

Cosine Similarity
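
Cosine similarity compares two documents by the angle between their frequency vectors. The sketch below represents each document as a map from token to count; the `Freq` type and the function names are assumptions made for this example, and the tokens could equally be n-grams joined into strings.

```haskell
import qualified Data.Map.Strict as Map

-- Token frequencies for one document (illustrative representation).
type Freq = Map.Map String Double

frequencies :: [String] -> Freq
frequencies ws = Map.fromListWith (+) [(w, 1) | w <- ws]

-- Dot product over the tokens the two documents share.
dot :: Freq -> Freq -> Double
dot a b = sum (Map.elems (Map.intersectionWith (*) a b))

-- Cosine of the angle between two frequency vectors;
-- defined here as 0 when either document is empty.
cosine :: Freq -> Freq -> Double
cosine a b
  | na == 0 || nb == 0 = 0
  | otherwise = dot a b / (na * nb)
  where
    na = sqrt (dot a a)
    nb = sqrt (dot b b)
```

With non-negative counts the score runs from 0 (no tokens in common) to 1 (identical proportions), regardless of document length.
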

We're changing pace here. Let's talk about TF-IDF. We're trying to figure out how important a word is to a document.

Overview of TF-IDF
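
TF-IDF has several variants. The sketch below uses one common form, with term frequency as a proportion of the document and inverse document frequency as the natural log of (number of documents / documents containing the term). That choice of variant, along with the names `tf`, `idf`, and `tfIdf`, is an assumption made for illustration.

```haskell
-- TF-IDF over tokenised documents (illustrative sketch of one common variant).

-- Term frequency: the share of the document's tokens equal to the term.
tf :: String -> [String] -> Double
tf term doc
  | null doc = 0
  | otherwise =
      fromIntegral (length (filter (== term) doc)) / fromIntegral (length doc)

-- Inverse document frequency; defined here as 0 for terms that never occur.
idf :: String -> [[String]] -> Double
idf term docs
  | containing == 0 = 0
  | otherwise = log (fromIntegral (length docs) / fromIntegral containing)
  where
    containing = length (filter (term `elem`) docs)

-- Importance of a term to one document within a corpus.
tfIdf :: String -> [String] -> [[String]] -> Double
tfIdf term doc docs = tf term doc * idf term docs
```

A word that appears in every document gets an IDF of 0 and therefore a score of 0, which is the point: common words say little about any one document, while words concentrated in a few documents score highly.
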

Good. You've learned what you need to know about TF-IDF. Let's apply that.

Applying TF-IDF
Clustering
5 Lectures 34:31

What is a cluster? That's a problem in itself.

Preview 05:55

One problem with clustering is that it's hard to get good clustering data. We can solve this problem by just generating our own data.

Random Cluster Generation

We have clusters in our dataset, but how far apart are they? We'll use the "centroid" solution.

Distances between Clusters
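
A minimal sketch of the centroid approach: average each cluster's points into a single centre, then measure the Euclidean distance between the centres. The `Point` type and function names are assumptions made for this example.

```haskell
-- Distance between clusters via their centroids (illustrative sketch).

type Point = [Double]

-- Component-wise mean of a non-empty list of equal-dimension points.
centroid :: [Point] -> Point
centroid ps = map (/ n) (foldr1 (zipWith (+)) ps)
  where
    n = fromIntegral (length ps)

euclidean :: Point -> Point -> Double
euclidean p q = sqrt (sum (zipWith (\a b -> (a - b) ^ (2 :: Int)) p q))

-- Distance between two clusters, measured centre to centre.
centroidDistance :: [Point] -> [Point] -> Double
centroidDistance c1 c2 = euclidean (centroid c1) (centroid c2)
```

This is only one way to define the distance between clusters. Alternatives such as the closest or farthest pair of points behave differently for elongated or uneven clusters.
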

We have data which needs clustering. Let's use k-means clustering.

Performing K-Means Clustering
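
The sketch below shows the core loop of k-means (Lloyd's algorithm): assign every point to its nearest centre, then move each centre to the mean of its points, and repeat. It runs for a fixed number of iterations and takes the starting centres as an argument; both are simplifications, and all names are hypothetical.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)
import qualified Data.Map.Strict as Map

type Point = [Double]

-- Squared Euclidean distance (enough for comparing distances).
distSq :: Point -> Point -> Double
distSq p q = sum (zipWith (\a b -> (a - b) ^ (2 :: Int)) p q)

-- Component-wise mean of a non-empty list of points.
centroid :: [Point] -> Point
centroid ps = map (/ fromIntegral (length ps)) (foldr1 (zipWith (+)) ps)

-- Index of the centre closest to a point (centres must be non-empty).
nearest :: [Point] -> Point -> Int
nearest centres p =
  fst (minimumBy (comparing (distSq p . snd)) (zip [0 ..] centres))

-- One iteration: group points by nearest centre, then recompute each centre.
-- A centre that attracts no points keeps its previous position.
step :: [Point] -> [Point] -> [Point]
step points centres =
  [maybe c centroid (Map.lookup i groups) | (i, c) <- zip [0 ..] centres]
  where
    groups = Map.fromListWith (++) [(nearest centres p, [p]) | p <- points]

-- Run a fixed number of iterations from the given starting centres.
kmeans :: Int -> [Point] -> [Point] -> [Point]
kmeans iters points centres = iterate (step points) centres !! iters
```

For example, `kmeans 5 [[0,0],[0,2],[10,10],[10,12]] [[0,0],[10,10]]` settles on the centres `[[0.0,1.0],[10.0,11.0]]`. A fuller version would stop once the centres no longer move and would choose the starting centres more carefully, since the result depends on them.
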

We need to cluster our data. How can we do this in a different manner?

Performing Hierarchical Clustering
Naïve Bayes Classification
3 Lectures 24:05

How does spam detection work? How can we classify documents?

Preview 08:50
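
At the heart of this kind of classifier is Bayes' law. As a small sketch, the function below computes the probability that a message is spam given that it contains one particular word. The function name and the example probabilities are made up for illustration.

```haskell
-- P(spam | word) from Bayes' law (illustrative sketch).
--   pSpam          : prior probability that a message is spam
--   pWordGivenSpam : probability the word appears in a spam message
--   pWordGivenHam  : probability the word appears in a non-spam message
posterior :: Double -> Double -> Double -> Double
posterior pSpam pWordGivenSpam pWordGivenHam = numerator / evidence
  where
    numerator = pWordGivenSpam * pSpam
    -- Total probability of seeing the word at all.
    evidence = numerator + pWordGivenHam * (1 - pSpam)
```

With a prior of 0.5, a word seen in 80% of spam and 20% of non-spam gives `posterior 0.5 0.8 0.2`, roughly 0.8: seeing the word raises the spam probability from one half to about four in five.
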

How can we write the code to perform a small portion of Bayes?

Bayes: The Code

How might we extend Naïve Bayes Classification to an entire document?

Bayes on Full Documents
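
The "naïve" step is to treat the words of a document as independent given the class, so the per-word likelihoods simply multiply. A minimal sketch, assuming each word has already been turned into a pair of likelihoods (in spam, in non-spam) that are strictly positive; the names `logScore` and `isSpam` are hypothetical.

```haskell
-- Naive Bayes over a whole document (illustrative sketch).

-- Log of prior * product of likelihoods; summing logs avoids
-- underflow when a document has many words.
logScore :: Double -> [Double] -> Double
logScore prior likelihoods = log prior + sum (map log likelihoods)

-- Classify a document from its words' (P(word | spam), P(word | ham)) pairs.
-- All probabilities are assumed to be strictly between 0 and 1.
isSpam :: Double -> [(Double, Double)] -> Bool
isSpam pSpam wordProbs = spamScore > hamScore
  where
    spamScore = logScore pSpam (map fst wordProbs)
    hamScore = logScore (1 - pSpam) (map snd wordProbs)
```

Comparing the two scores is enough to classify, because the shared denominator in Bayes' law is the same for both classes. In practice the likelihoods are usually smoothed so that a word never seen in one class does not produce a probability of zero.
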
Principal Component Analysis
5 Lectures 32:12

How do recommendation engines work? We have lots and lots of data, but really we only need a subset of data in order to make recommendations.

Preview 07:06

How do we prepare our dataset?

Preparing Our Dataset

How do we perform eigendecomposition and what is it good for?


We still have our really big dataset. Let's make it smaller. That's easy.

Dimensionality Reduction

Now that we've reduced our dataset, let's make some recommendations with it.

Recommendation Engine
About the Instructor
Packt Publishing
3.9 Average rating
7,282 Reviews
51,870 Students
616 Courses
Tech Knowledge in Motion

Packt has been committed to developer learning since 2004. A lot has changed in software since then - but Packt has remained responsive to these changes, continuing to look forward at the trends and tools defining the way we work and live, and at how to put them to work.

With an extensive library of content - more than 4000 books and video courses - Packt's mission is to help developers stay relevant in a rapidly changing world. From new web frameworks and programming languages to cutting-edge data analytics and DevOps, Packt takes software professionals in every field to what's important to them now.

From skills that will help you to develop and future-proof your career to immediate solutions to everyday tech challenges, Packt is a go-to resource to make you a better, smarter developer.

Packt Udemy courses continue this tradition, bringing you comprehensive yet concise video courses straight from the experts.