Learning Path: Julia: Explore Data Science with Julia

Use the advanced features of Julia to work with complex data

4.1 (4 ratings)
45 students enrolled
Created by Packt Publishing
Last updated 4/2017
English
Curiosity Sale
Current price: $10 Original price: $200 Discount: 95% off
30-Day Money-Back Guarantee
Includes:
  • 5.5 hours on-demand video
  • 1 Supplemental Resource
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Get to grips with the basic data structures in Julia and learn about different development environments
  • Organize your code by writing Lisp-style macros and using modules
  • Manage, analyze, and work in depth with statistical datasets using the powerful DataFrames package
  • Perform statistical computations on data from different sources and visualize those using plotting packages
  • Apply different algorithms from decision trees and other packages to extract meaningful information from the Iris dataset
  • Gain some valuable insights into interfacing Julia with an R application
  • Uncover the concepts of metaprogramming in Julia
  • Conduct statistical analysis with StatsBase.jl and Distributions.jl
Requirements
  • Although knowing the basic concepts of data science will give you a head start, it is not mandatory. Even with no previous knowledge of data science, you will find the pace of this Learning Path comfortable and easy to follow.
Description

Almost all companies these days invest thousands of dollars in getting their data analyzed. In fact, studies say that around 73% of organizations have invested in Big Data. Why do you think that is the case? What can you reap from data that is, at its core, just 1s and 0s? Moreover, how does this data shape an organization’s future?

Most of you might have guessed it: market trends and consumer habits can be predicted with precision if we are able to analyze our data efficiently. This Learning Path shows you how to achieve all of this using Julia.

Packt’s Video Learning Paths are an amalgamation of multiple video courses that are logically tied together to provide you with a broader learning journey.

With the amount of data generated in the world these days, we are faced with the challenge of analyzing it. Julia, which enjoys the benefits of a sophisticated compiler, parallel execution, and an all-encompassing mathematical function library, is a very good tool for working with data efficiently.

In this Learning Path, you embark on your journey from the basics of Julia, right from installing it on your system and setting up the environment. You will then be introduced to basic machine learning techniques, data science models, and the concepts of parallel computing.

After completing this Learning Path, you will have acquired all the skills that will help you work with data effectively. 

About the Authors

Ivo Balbaert is currently a web programming and databases lecturer at CVO Antwerpen, a community college in Belgium. He received a PhD in applied physics in 1986 from the University of Antwerp. He worked for 20 years in the software industry as a developer and consultant in several companies, and, for 10 years, as a project manager at the University Hospital of Antwerp. In 2000, he switched over to partly teach and partly develop software (KHM Mechelen, CVO Antwerp).

Jalem Raj Rohit is an IIT Jodhpur graduate with a keen interest in machine learning, data science, data analysis, computational statistics, and natural language processing (NLP). Rohit currently works as a senior data scientist at Zomato, having previously been the first data scientist at Kayako. He is part of the Julia project, where he develops data science models and contributes to the codebase. Additionally, he is a Mozilla contributor and volunteer, and has interned at Scimergent Analytics.

Who is the target audience?
  • This Learning Path is for anyone who is new to the field of data science, or anyone aspiring to get into the field who chooses Julia as the tool to do so.
Curriculum For This Course
63 Lectures
05:32:52
Julia for Data Science
26 Lectures 02:41:04

This video provides an overview of the entire course.

Preview 02:35

We are going to install Julia with any one of the common development environments available.

Installing a Julia Working Environment
05:12

Program data needs to be stored efficiently and in an easy-to-use form.

Working with Variables and Basic Types
08:07
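
As a rough illustration only (not code taken from the course), variables and basic types in Julia look like this:

    x = 42                      # Int64 on a 64-bit system
    price = 9.99                # Float64
    name = "Julia"              # String
    flag = true                 # Bool
    println(typeof(x), " ", typeof(price), " ", typeof(name), " ", typeof(flag))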

This video deals with the problem of how to control the order of execution in Julia code and what to do when errors occur.

Controlling the Flow
05:17

Julia code is much less performant and readable when it is not subdivided into functions.

Using Functions
08:35

Arrays can only be accessed by index and all the elements have to be of the same type. We want more flexible data structures; in particular, we want to also store and retrieve data by keys.

Using Tuples, Sets, and Dictionaries
05:53
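
A minimal sketch of the three collection types named in this lecture (the example values are made up):

    t = (1, "two", 3.0)                    # tuple: fixed length, mixed types
    s = Set([1, 2, 2, 3])                  # set: unique elements only
    d = Dict("alice" => 30, "bob" => 25)   # dictionary: store and retrieve by key
    println(t[2])                          # "two"
    println(3 in s)                        # true
    println(d["alice"])                    # 30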

Data is often presented in the form of a matrix. We need to know how to work with matrices in order to work on data. 

Working with Matrices for Data Storage and Calculations
08:25
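
A short sketch of matrix work in base Julia (the numbers are arbitrary):

    A = [1 2 3; 4 5 6; 7 8 10]     # 3x3 matrix
    b = [1, 2, 3]                  # column vector
    println(A * b)                 # matrix-vector product
    println(A')                    # transpose
    println(A \ [6, 15, 25])       # solve A * x = [6, 15, 25]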

The aim of the video is to show you the importance of using types and parametrized methods in writing generic and performant code. 

Preview 06:42

Coding is often a repetitive task. Shorten your code, make it more elegant and avoid repetition by making and using macros.

Optimizing Your Code by Using and Writing Macros
07:11
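
As a hedged example of the idea (this toy macro is not from the course), a macro can wrap any expression you pass to it:

    # A toy macro that times whatever expression it receives.
    macro logtime(ex)
        return quote
            t0 = time()
            result = $(esc(ex))        # esc() makes the expression resolve in the caller's scope
            println("elapsed: ", time() - t0, " s")
            result
        end
    end

    @logtime sum(rand(10^6))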

To build a Julia package, we need a way to structure it. Why? A package can contain multiple files, and different packages can have functions with the same name that would otherwise conflict.

Organizing Your Code in Modules
06:25
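
A minimal sketch of a module; the module and function names here are hypothetical:

    module MyStats
    export mymean
    mymean(xs) = sum(xs) / length(xs)
    end

    using .MyStats          # the leading dot (Julia 0.7+) loads the local module just defined
    println(mymean([1, 2, 3, 4]))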

Functionality that you need in your project is often already written and available as a package. How do you search for, install, and work with these packages?

Working with the Package Ecosystem
06:18
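
For orientation only, the package manager is driven like this in Julia 1.x (the older releases this course targets call Pkg.add directly, without the using line):

    using Pkg
    Pkg.add("DataFrames")    # install a registered package
    Pkg.status()             # list installed packages
    using DataFrames         # load it into the session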

In order to process data, we need to get it out of its data sources and into our Julia program.

Preview 07:41

Working with tabular data in matrices is possible, but not very convenient. The DataFrame offers us a more convenient data structure for data science purposes. 

Using DataArrays and DataFrames
07:41
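
A small sketch, assuming the DataFrames package is installed (the column names and values are invented; the course was recorded against an older DataFrames API):

    using DataFrames

    df = DataFrame(name = ["Ann", "Ben", "Cy"], score = [88, 92, 79])
    first(df, 2)                      # first two rows
    df[df.score .> 80, :]             # filter rows by a condition
    df.passed = df.score .>= 80       # add a derived column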

What are the possibilities that DataFrame offers for data manipulation? 

The Power of DataFrames
06:36

Relational databases are an important data source. How can we work from Julia with the data in these data sources? 

Interacting with Relational Databases Like SQL Server
07:20

In certain situations data is better stored in NoSQL databases. Julia can work with a number of these through specialized packages; amongst them are Mongo and Redis. 

Interacting with NoSQL Databases Like MongoDB
06:23

We need to calculate various statistical numbers to get insight into a dataset. How can we do this with Julia?

Preview 06:38

Data must be graphically visualized to get better insight into it. What are the possibilities Julia offers in this area?

An Overview of the Plotting Techniques in Julia
03:02

Scatterplots, histograms, and box plots are some of the basic tools of the data scientist. We investigate our iris data by using each of them in turn.

Visualizing Data with Scatterplots, Histograms, and Box Plots
04:24
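
As a rough sketch assuming the Plots package (the course itself may use a different plotting library), scatterplots and histograms can be produced like this; box plots are available through the StatsPlots package:

    using Plots

    x = randn(200)
    y = 2 .* x .+ randn(200)
    p1 = scatter(x, y, title = "Scatter plot")
    p2 = histogram(x, bins = 20, title = "Histogram")
    plot(p1, p2, layout = (1, 2))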

In statistical investigations, we need to be able to define distributions, cluster data into groups, and test hypotheses.

Distributions and Hypothesis Testing
05:34
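
A minimal sketch assuming the Distributions and HypothesisTests packages (the course names Distributions.jl explicitly; HypothesisTests is an assumption here):

    using Distributions, HypothesisTests, Statistics

    d = Normal(0.0, 1.0)
    xs = rand(d, 1000)           # draw 1000 samples
    println(mean(xs), " ", std(xs))
    println(pdf(d, 0.0))         # density at 0
    OneSampleTTest(xs)           # test whether the sample mean is 0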

A lot of useful libraries exist written in R that are not yet implemented in Julia. Can we use these R libraries from Julia code? 

Interfacing with R
04:24
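
A short sketch assuming the RCall package (and an R installation on the same machine):

    using RCall

    x = randn(100)
    @rput x                      # copy the Julia vector into the embedded R session
    R"summary(x)"                # run R code on it
    y = rcopy(R"rnorm(10)")      # pull an R result back into Julia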

Data must be prepared before machine learning algorithms can be applied. Furthermore, applying an algorithm follows a specific cycle, which we will review here. The MLBase package will be used in this section.

Preview 06:15

Data often needs to be classified in groups; Decision Tree is one of the basic algorithms to do that.

Classification Using Decision Trees and Rules
07:00

In a realistic setting, a model is first trained, and then tested.

Training and Testing a Decision Tree Model
03:58
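
A hedged sketch assuming the DecisionTree and RDatasets packages (RDatasets supplies the Iris data mentioned in the objectives; the sample values below are made up):

    using DecisionTree, RDatasets

    iris = dataset("datasets", "iris")
    features = Matrix(iris[:, 1:4])
    labels = string.(iris.Species)

    model = build_tree(labels, features)                # train a decision tree
    println(apply_tree(model, [5.9, 3.0, 5.1, 1.8]))    # predict one (made-up) sample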

To obtain better linear regression models, and to be able to work with more independent variables, we need more generalized linear modeling.

Applying a Generalized Linear Model with GLM
06:17
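
A minimal sketch assuming the GLM and DataFrames packages (the data is synthetic):

    using GLM, DataFrames

    df = DataFrame(x = randn(100))
    df.y = 3.0 .* df.x .+ 1.0 .+ 0.1 .* randn(100)

    model = lm(@formula(y ~ x), df)     # ordinary least squares
    println(coef(model))                # fitted intercept and slope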

We need a better classification algorithm than Decision Trees for more complex data, like in pattern recognition. The Support Vector Machine is developed for these tasks.

Working with Support Vector Machines
07:11
Julia Solutions
37 Lectures 02:51:48

This video gives an overview of the entire course.

Preview 05:02

In this video, we will explain ways in which you can handle files with the comma-separated values (CSV) file format.

Handling Data with CSV Files
06:28
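
A short sketch assuming the CSV and DataFrames packages ("data.csv" is a hypothetical file name; older course material may use readtable instead):

    using CSV, DataFrames

    df = CSV.read("data.csv", DataFrame)    # hypothetical input file
    CSV.write("out.csv", df)                # write it back out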

TSV files are files whose contents are separated by tabs. In this video, we will explain how to handle TSV files.

Handling Data with TSV Files
03:33
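
Assuming the CSV and DataFrames packages again, CSV.read handles tab-separated files when you pass the delimiter (the file name is hypothetical):

    using CSV, DataFrames

    df = CSV.read("data.tsv", DataFrame; delim = '\t')   # note the tab delimiter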

This video will teach you how to interact with websites by sending and receiving data through HTTP requests.

Interacting with the Web
06:42
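
A minimal sketch assuming the HTTP package (the course may use a different client; the URL is only an example):

    using HTTP

    resp = HTTP.get("https://httpbin.org/get")
    println(resp.status)          # numeric status code
    println(String(resp.body))    # response body as text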

We will look at how Julia programs are represented and interpreted.

Preview 06:38

Make tasks that need to process strings both time- and space-efficient.

Symbols
03:07
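
For illustration (the names are arbitrary), symbols in base Julia:

    s = :price                  # a Symbol literal
    t = Symbol("unit_", 42)     # build one programmatically -> :unit_42
    println(s === :price)       # symbols compare cheaply by identity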

Create an expression object from a single piece of code by quoting it.

Quoting
03:32
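
A tiny base-Julia illustration of quoting:

    ex1 = :(a + b)       # quote a single expression
    ex2 = quote          # quote a whole block
        x = 1
        x + 2
    end
    println(typeof(ex1)) # Expr
    dump(ex1)            # inspect its head and args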

Constructing expression objects that involve multiple objects and/or variables is difficult; interpolation makes this easier.

Interpolation
03:48
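
A tiny base-Julia illustration of interpolation into a quoted expression:

    y = 10
    ex = :(x + $y)       # splice the *value* of y into the expression
    println(ex)          # x + 10, with the value baked in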

Evaluating an expression object.

The eval Function
03:24
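
A tiny base-Julia illustration of eval:

    ex = :(2 + 3 * 4)
    println(eval(ex))    # 14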

Compile code directly, rather than using the conventional method of constructing expression objects and passing them to the eval function.

Macros
04:31

Metaprogramming techniques help speed up the process of dealing with data frames.

Metaprogramming with DataFrames
07:56

You will learn about doing statistics in Julia, along with common problems in handling data arrays, distributions, estimation, and sampling techniques.

Preview 05:15

Descriptive statistics helps us estimate the shape and features of data for model and algorithm selection.

Descriptive Statistics
07:04
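
A short sketch assuming the StatsBase package (Statistics ships with Julia 1.x; the data is synthetic):

    using Statistics, StatsBase

    xs = randn(1000)
    println(mean(xs), " ", median(xs), " ", std(xs))
    println(skewness(xs), " ", kurtosis(xs))    # shape statistics from StatsBase
    summarystats(xs)                            # min / quartiles / mean / max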

Deviation metrics help calculate the distance between two vectors. These metrics help us understand the relationship between different vectors and the data in them.

Deviation Metrics
03:36

Sampling is the process where sample units are selected from a large population for analysis.

Sampling
06:27
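
A short sketch assuming the StatsBase package (the population and weights are made up):

    using StatsBase

    population = collect(1:100)
    s1 = sample(population, 10)                    # with replacement (the default)
    s2 = sample(population, 10; replace = false)   # simple random sample without replacement
    w = Weights(rand(100))
    s3 = sample(population, w, 10)                 # weighted sampling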

Correlation analysis is the process that indicates the similarity and relationship between two random variables.

Correlation Analysis
07:52
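
A short sketch using the standard library plus StatsBase for rank correlations (synthetic data):

    using Statistics, StatsBase

    x = randn(500)
    y = 0.8 .* x .+ 0.2 .* randn(500)
    println(cor(x, y))             # Pearson correlation
    println(corspearman(x, y))     # Spearman rank correlation
    println(corkendall(x, y))      # Kendall's tau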

In this video, you will learn about the concept of dimensionality reduction.

Preview 05:09

This video will let you explore the linear regression model, which can be used to explain the relationship between a single dependent variable and an independent variable.

Data Preprocessing
05:16

Linear regression is a linear model that is used to determine and predict numerical values. We will deal with that in this video.

Linear Regression
03:20

What can we do in scenarios where the variable of interest is categorical in nature, such as buying a product or not, approving a credit card or not, or whether a tumor is cancerous? Logistic regression is the best solution in these cases.

Classification
03:19
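
A hedged sketch of logistic regression assuming the GLM and DataFrames packages (the outcome column and data are invented):

    using GLM, DataFrames

    df = DataFrame(x = randn(200))
    df.buy = Int.(df.x .+ 0.5 .* randn(200) .> 0)    # synthetic binary outcome

    logit = glm(@formula(buy ~ x), df, Binomial(), LogitLink())
    println(coef(logit))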

Analysis of performance is very important for any analytics and machine learning processes. In this video, we will deal with performance evaluation and model selection.


Performance Evaluation and Model Selection
04:47

In this video, we will deal with cross validation, one of the most underrated processes in the domain of data science and analytics.

Cross Validation
03:28
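
The course uses MLBase.jl for this; purely to illustrate the idea, here is a hand-rolled k-fold split in base Julia:

    using Random

    # Split indices 1:n into k roughly equal, disjoint folds.
    function kfold_indices(n, k)
        idx = shuffle(1:n)
        return [idx[i:k:n] for i in 1:k]
    end

    for (i, test_idx) in enumerate(kfold_indices(100, 5))
        train_idx = setdiff(1:100, test_idx)
        # fit on train_idx, evaluate on test_idx ...
        println("fold $i: ", length(train_idx), " train / ", length(test_idx), " test")
    end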

In statistics, the distance between vectors or datasets is computed in various ways depending on the problem statement and the properties of the data. In this video, we will deal with distances.

Distances
04:35
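
A short sketch assuming the Distances package (the vectors are arbitrary):

    using Distances

    x = [1.0, 2.0, 3.0]
    y = [4.0, 6.0, 8.0]
    println(euclidean(x, y))               # Euclidean distance
    println(cityblock(x, y))               # Manhattan distance
    println(evaluate(Euclidean(), x, y))   # same value via the generic metric interface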

In this video, we will deal with different types of distributions.

Distributions
05:14
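
A short sketch assuming the Distributions package (the parameters are arbitrary):

    using Distributions

    rand(Binomial(10, 0.3), 5)    # five draws from a binomial
    rand(Poisson(4.0), 5)         # five draws from a Poisson
    fit(Normal, randn(1000))      # fit a normal distribution to data by maximum likelihood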

Time series is another very important form of data. This video deals with time series analysis.

Time Series Analysis
01:35

Plotting arrays is important in visualization, as arrays are a quick way to store data.

Preview 06:21

DataFrames are the best way for representing tabular data.

Plotting DataFrames
05:12
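
A hedged sketch assuming the DataFrames and StatsPlots packages (the @df macro plots DataFrame columns by name; the data is synthetic):

    using DataFrames, StatsPlots

    df = DataFrame(x = 1:50, y = cumsum(randn(50)))
    @df df plot(:x, :y, title = "A column plotted straight from a DataFrame")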

Use several functions for both transformation and exploratory analytics steps, plot separate functions, and stack several functions in a single plot.

Plotting Functions
05:31

It is through exploring the data that we find possible patterns, which can be identified with basic statistics and by looking at the shape of the data using plots and visualizations.

Exploratory Data Analytics Through Plots
05:13

Line plots can be used both to understand correlations and to look at data trends.

Line Plots
02:46

Scatter plots help visualize the data distribution and the relationship between corresponding columns, which in turn helps identify prominent patterns in the data.

Scatter Plots
03:33

Histograms are one of the best ways of visualizing a dataset and getting a feel for its three main statistics: the mean, median, and mode.

Histograms
03:45

Customizing a plot enhances its visualization even further.

Aesthetic Customizations
03:49

Basic Concepts of Parallel Computing
05:46

Optimize data movement: it is quite common and should be minimized because of the time and network overhead it incurs.

Data Movement
02:45
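
A minimal sketch using the Distributed standard library (in the Julia 0.x versions this course targets, these functions live in Base):

    using Distributed
    addprocs(2)                       # start two worker processes

    r = remotecall(rand, 2, 3, 3)     # ask worker 2 to build a 3x3 matrix; returns a Future
    m = fetch(r)                      # fetching moves the data back to the master process
    s = @spawnat 2 sum(fetch(r))      # do the work where the data already lives: the fetch is cheap
    println(fetch(s))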

Learn about the famous MapReduce framework, why it is one of the most important ideas in big data and parallel computing, and how to parallelize loops and apply reducing functions to them across several CPUs and machines.

Parallel Maps and Loop Operations
03:25
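
A minimal sketch of a parallel map and a parallel reduction (in the Julia 0.x era of this course the reduction macro was called @parallel; in Julia 1.x it is @distributed):

    using Distributed
    addprocs(2)

    squares = pmap(x -> x^2, 1:10)                    # parallel map across workers
    total = @distributed (+) for i in 1:1_000_000     # parallel loop with a (+) reduction
        i % 7 == 0 ? 1 : 0
    end
    println(total)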

Channels are like the background plumbing for parallel computing in Julia. They are the reservoirs from which the individual processes access their data.

Channels
02:04
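
A tiny base-Julia sketch of a channel with one producer task and one consumer:

    ch = Channel{Int}(10)      # buffered channel holding up to 10 items

    @async for i in 1:5        # producer task
        put!(ch, i)
    end

    for _ in 1:5               # consumer
        println(take!(ch))
    end
    close(ch)
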
About the Instructor
Packt Publishing
3.9 Average rating
7,336 Reviews
52,382 Students
616 Courses
Tech Knowledge in Motion

Packt has been committed to developer learning since 2004. A lot has changed in software since then - but Packt has remained responsive to these changes, continuing to look forward at the trends and tools defining the way we work and live. And how to put them to work.

With an extensive library of content - more than 4000 books and video courses - Packt's mission is to help developers stay relevant in a rapidly changing world. From new web frameworks and programming languages to cutting-edge data analytics and DevOps, Packt takes software professionals in every field to what's important to them now.

From skills that will help you to develop and future-proof your career to immediate solutions to everyday tech challenges, Packt is a go-to resource to make you a better, smarter developer.

Packt Udemy courses continue this tradition, bringing you comprehensive yet concise video courses straight from the experts.