Learning Path: Julia: Explore Data Science with Julia

Use the advanced features of Julia to work with complex data

45 students enrolled

Curiosity Sale

Current price: $10
Original price: $200
Discount: 95% off

30-Day Money-Back Guarantee

- 5.5 hours on-demand video
- 1 Supplemental Resource
- Full lifetime access
- Access on mobile and TV

- Certificate of Completion

What Will I Learn?

- Get to grips with the basic data structures in Julia and learn about different development environments
- Organize your code by writing Lisp-style macros and using modules
- Manage, analyze, and work in depth with statistical datasets using the powerful DataFrames package
- Perform statistical computations on data from different sources and visualize those using plotting packages
- Apply different algorithms from decision trees and other packages to extract meaningful information from the Iris dataset
- Gain some valuable insights into interfacing Julia with an R application
- Uncover the concepts of metaprogramming in Julia
- Conduct statistical analysis with StatsBase.jl and Distributions.jl

Requirements

- Although knowing the basic concepts of data science will give you a head start, it is not mandatory. Even with no previous knowledge of data science, you will find the pace of the Learning Path comfortable and easy to follow.

Description

Almost all companies these days are investing heavily in data analysis. In fact, studies say that around 73% of organizations have invested in Big Data. Why do you think that is? What can you reap from data that is, at its core, just 1s and 0s? Moreover, how does this data help shape an organization’s future?

Most of you might have guessed it: market trends and consumer habits can all be precisely predicted, if we are able to analyze our data efficiently. This Learning Path will show you how to achieve all this using **Julia**.

Packt’s Video Learning Paths are an amalgamation of multiple video courses that are logically tied together to provide you with a complete learning journey.

With the amount of data that is generated in the world these days, we are faced with the challenge of analyzing this data. Julia, which enjoys the benefits of a sophisticated compiler, parallel execution, and an all-encompassing mathematical function library, acts as a very good tool that helps us work with data more efficiently.

In this Learning Path, embark on your journey from the basics of Julia, right from installing it on your system and setting up the environment. You will then be introduced to the basic machine learning techniques, data science models, and concepts of parallel computing.

**After completing this Learning Path, you will have acquired all the skills that will help you work with data effectively.**

**About the Authors**

**Ivo Balbaert** is currently a web programming and databases lecturer at CVO Antwerpen, a community college in Belgium. He received a PhD in applied physics in 1986 from the University of Antwerp. He worked for 20 years in the software industry as a developer and consultant in several companies, and, for 10 years, as a project manager at the University Hospital of Antwerp. In 2000, he switched over to partly teach and partly develop software (KHM Mechelen, CVO Antwerp).

**Jalem Raj Rohit** is an IIT Jodhpur graduate with a keen interest in machine learning, data science, data analysis, computational statistics, and natural language processing (NLP). Rohit currently works as a senior data scientist at Zomato, having also worked as the first data scientist at Kayako. He is part of the Julia project, where he develops data science models and contributes to the codebase. Additionally, Raj is a Mozilla contributor and volunteer, and has interned at Scimergent Analytics.

Who is the target audience?

- This Learning Path is for anyone who is new to data science, or aspiring to get into the field, and chooses Julia as the tool to do so.

Curriculum For This Course

63 Lectures

05:32:52
Julia for Data Science
26 Lectures
02:41:04

This video provides an overview of the entire course.

Preview
02:35

We are going to install Julia with any one of the common development environments available.

Installing a Julia Working Environment

05:12

Program data needs to be stored efficiently and in an easy-to-use form.

Working with Variables and Basic Types

08:07

This video deals with the problem of how to control the order of execution in Julia code and what to do when errors occur.

Controlling the Flow

05:17

Julia code is less performant and less readable when it is not subdivided into functions.

Using Functions

08:35

Arrays can only be accessed by index and all the elements have to be of the same type. We want more flexible data structures; in particular, we want to also store and retrieve data by keys.

Using Tuples, Sets, and Dictionaries

05:53
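As a quick illustration of the flexibility described above, here is a minimal sketch; the values and names are invented for illustration:

```julia
# A tuple groups heterogeneous values; a Set holds unique items;
# a Dict stores and retrieves values by key instead of by index.
point = (1.0, 2.0, "label")                        # immutable, mixed types
species = Set(["setosa", "versicolor", "setosa"])  # duplicates collapse
scores = Dict("setosa" => 0.92, "versicolor" => 0.87)

println(point[3])          # "label"
println(length(species))   # 2
println(scores["setosa"])  # 0.92
scores["virginica"] = 0.90 # add a new key-value pair
```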

Data is often presented in the form of a matrix. We need to know how to work with matrices in order to work on data.

Working with Matrices for Data Storage and Calculations

08:25

The aim of the video is to show you the importance of using types and parametrized methods in writing generic and performant code.

Preview
06:42

Coding is often a repetitive task. Shorten your code, make it more elegant, and avoid repetition by writing and using macros.

Optimizing Your Code by Using and Writing Macros

07:11
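To give a flavor of what a macro looks like, here is a minimal sketch; the `@timeit` macro below is a hypothetical example, not one taken from the course:

```julia
# A macro receives expressions at parse time and returns new code.
# This hypothetical @timeit macro wraps any expression with timing.
macro timeit(ex)
    quote
        t0 = time()
        result = $(esc(ex))            # esc keeps the user's variables intact
        println("elapsed: ", time() - t0, " s")
        result
    end
end

x = @timeit sum(1:1_000_000)
println(x)   # 500000500000
```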

To build a Julia package, we need a way to structure the code, for two reasons: a package can contain multiple files, and different packages can have functions with the same name that would otherwise conflict.

Organizing Your Code in Modules

06:25
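A minimal sketch of how a module fences off names; the `IrisStats` module and its functions are invented for illustration:

```julia
module IrisStats
export sepal_mean

# only sepal_mean is exported; _internal stays private by convention
sepal_mean(xs) = sum(xs) / length(xs)
_internal(xs) = maximum(xs)
end

using .IrisStats   # the leading dot loads a locally defined module

println(sepal_mean([5.1, 4.9, 4.7]))   # 4.9
```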

Functionality that you need in your project is often already written and exists as a package. How do we search for, install, and work with these packages?

Working with the Package Ecosystem

06:18

In order to process data, we need to get it out of its data sources and into our Julia program.

Preview
07:41

Working with tabular data in matrices is possible, but not very convenient. The DataFrame offers us a more convenient data structure for data science purposes.

Using DataArrays and DataFrames

07:41

What are the possibilities that DataFrame offers for data manipulation?

The Power of DataFrames

06:36
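To hint at the kind of manipulation meant here, a minimal sketch using the DataFrames package (which must be installed separately); the column names and values are invented, and the calls shown assume a recent DataFrames release:

```julia
using DataFrames, Statistics

df = DataFrame(species = ["setosa", "setosa", "virginica"],
               sepal_length = [5.1, 4.9, 6.3])

# select rows by condition, then summarize by group
long = df[df.sepal_length .> 5.0, :]
by_species = combine(groupby(df, :species), :sepal_length => mean => :mean_len)
println(by_species)
```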

Relational databases are an important data source. How can we work from Julia with the data in these data sources?

Interacting with Relational Databases Like SQL Server

07:20

In certain situations data is better stored in NoSQL databases. Julia can work with a number of these through specialized packages; amongst them are Mongo and Redis.

Interacting with NoSQL Databases Like MongoDB

06:23

We need to calculate various statistical numbers to get insight into a dataset. How can we do this with Julia?

Preview
06:38

Data must be graphically visualized to get better insight into it. What are the possibilities Julia offers in this area?

An Overview of the Plotting Techniques in Julia

03:02

Scatterplots, histograms, and box plots are some of the basic tools of the data scientist. We investigate our iris data by using each of them in turn.

Visualizing Data with Scatterplots, Histograms, and Box Plots

04:24

In statistical investigations, we need to be able to define distributions, cluster data into groups, and test hypotheses.

Distributions and Hypothesis Testing

05:34
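A minimal sketch of the idea, assuming the Distributions and HypothesisTests packages are installed; the sample here is randomly generated, so the p-value will vary from run to run:

```julia
using Distributions, HypothesisTests

d = Normal(0.0, 1.0)
println(pdf(d, 0.0))        # density at the mean, about 0.3989
sample = rand(d, 1000)      # draw 1000 values from the distribution

# one-sample t-test of the null hypothesis that the mean is zero
t = OneSampleTTest(sample)
println(pvalue(t))
```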

Many useful libraries written in R are not yet implemented in Julia. Can we use these R libraries from Julia code?

Interfacing with R

04:24

Data must be prepared before machine learning algorithms can be applied. Furthermore, applying an algorithm follows a specific cycle, which we will review here. The MLBase package will be used in this section.

Preview
06:15

Data often needs to be classified in groups; Decision Tree is one of the basic algorithms to do that.

Classification Using Decision Trees and Rules

07:00

In a realistic setting, a model is first trained, and then tested.

Training and Testing a Decision Tree Model

03:58

To obtain better linear regression models, and to be able to work with more independent variables, we need more generalized linear modeling.

Applying a Generalized Linear Model with GLM

06:17
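A minimal sketch of fitting a linear model with the GLM package (installed separately, together with DataFrames); the data points are invented for illustration:

```julia
using DataFrames, GLM

df = DataFrame(x = [1.0, 2.0, 3.0, 4.0],
               y = [2.1, 3.9, 6.2, 7.8])

# ordinary least squares; glm() generalizes this to other
# response distributions and link functions
model = lm(@formula(y ~ x), df)
println(coef(model))   # [intercept ≈ 0.15, slope ≈ 1.94]
```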

We need a better classification algorithm than Decision Trees for more complex data, like in pattern recognition. The Support Vector Machine is developed for these tasks.

Working with Support Vector Machines

07:11

+
–

Julia Solutions
37 Lectures
02:51:48

This video gives an overview of the entire course.

Preview
05:02

In this video, we will explain ways in which you can handle files with the comma-separated values (CSV) file format.

Handling Data with CSV Files

06:28

TSV files are files whose values are separated by tabs. In this video, we will explain how to handle TSV files.

Handling Data with TSV Files

03:33

This video will teach you how to interact with websites by sending and receiving data through HTTP requests.

Interacting with the Web

06:42

We will see how Julia programs are represented and interpreted.

Preview
06:38

Make string-processing tasks both time- and space-efficient by using symbols.

Symbols

03:07

Create an expression with a single argument.

Quoting

03:32

Constructing expression objects by hand is difficult when multiple objects and/or variables are involved; interpolation makes this easier.

Interpolation

03:48

Evaluating an expression object.

The eval Function

03:24
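Putting the last few recipes together (quoting, interpolation, and `eval`), a minimal sketch with invented values:

```julia
# Quoting builds an Expr without running it; $ interpolates values in;
# eval runs the finished expression in the global scope.
ex = :(a + b)            # quoted expression, nothing evaluated yet
println(typeof(ex))      # Expr

n = 10
ex2 = :(2 * $n)          # interpolation splices the value 10 in
println(ex2)             # 2 * 10

a, b = 3, 4
println(eval(ex))        # 7
println(eval(ex2))       # 20
```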

Compile code directly, rather than using the conventional method of constructing expression objects and calling the eval function.

Macros

04:31

Metaprogramming techniques help speed up the process of dealing with data frames.

Metaprogramming with DataFrames

07:56

You will learn about doing statistics in Julia, along with common problems in handling data arrays, distributions, estimation, and sampling techniques.

Preview
05:15

Descriptive statistics helps us estimate the shape and features of data for model and algorithm selection.

Descriptive Statistics

07:04

Deviation metrics help calculate the distance between two vectors. These metrics help us understand the relationship between different vectors and the data in them.

Deviation Metrics

03:36

Sampling is the process whereby sample units are selected from a larger population for analysis.

Sampling

06:27

Correlation analysis is the process that indicates the similarity and relationship between two random variables.

Correlation Analysis

07:52
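A minimal sketch of correlation analysis using the standard-library Statistics module; the vectors are invented for illustration:

```julia
using Statistics

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]   # perfectly linear in x
z = [5.0, 3.0, 4.0, 1.0, 2.0]    # roughly decreasing in x

println(cor(x, y))   # 1.0  (perfect positive correlation)
println(cor(x, z))   # -0.8 (strong negative correlation)
```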

In this video, you will learn about the concept of dimensionality reduction.

Preview
05:09

This video will let you explore the linear regression model, which can be used to explain the relationship between a single dependent variable and an independent variable.

Data Preprocessing

05:16

Linear regression is a linear model that is used to determine and predict numerical values. We will deal with that in this video.

Linear Regression

03:20

What can we do in scenarios where the variable of interest is categorical in nature, such as buying a product or not, approving a credit card or not, or whether a tumor is cancerous? Logistic regression is the best solution for these.

Classification

03:19

Analysis of performance is very important for any analytics and machine learning processes. In this video, we will deal with performance evaluation and model selection.

Performance Evaluation and Model Selection

04:47

In this video, we will deal with cross-validation, one of the most underrated processes in the domain of data science and analytics.

Cross Validation

03:28

In statistics, the distance between vectors or datasets is computed in various ways depending on the problem statement and the properties of the data. In this video, we will deal with distances.

Distances

04:35

In this video, we will deal with different types of distributions.

Distributions

05:14

Time series is another very important form of data. This video deals with time series analysis.

Time Series Analysis

01:35

Plotting arrays is important in visualization, as arrays are a quick way to store data.

Preview
06:21

DataFrames are the best way for representing tabular data.

Plotting DataFrames

05:12

Use several functions for both transformation and exploratory analytics, and learn to plot functions separately as well as stack several functions in a single plot.

Plotting Functions

05:31

It is through the exploration of the data that we find possible patterns, identified through basic statistics and the shape of the data using plots and visualizations.

Exploratory Data Analytics Through Plots

05:13

Line plots can be used both to understand correlations and to look at data trends.

Line Plots

02:46

Scatter plots help visualize the data distribution and show the relationship between corresponding columns, which in turn helps identify prominent patterns in the data.

Scatter Plots

03:33

Histograms are one of the best ways for visualizing and finding out the three main statistics of a dataset—the mean, median, and mode.

Histograms

03:45

Customizing a plot enhances its visualization even further.

Aesthetic Customizations

03:49

Basic Concepts of Parallel Computing

05:46

Optimize data movement: it is quite common and should be minimized due to time and network overhead.

Data Movement

02:45

Learn about the famous MapReduce framework, why it is one of the most important ideas in the domains of big data and parallel computing, and how to parallelize loops and use reducing functions on them across several CPUs and machines.

Parallel Maps and Loop Operations

03:25
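A minimal sketch of the parallel map and reducing-loop pattern, using the standard-library Distributed module; worker counts and inputs are invented:

```julia
using Distributed
addprocs(2)   # spawn two local worker processes

# pmap farms each call out to a worker; @distributed parallelizes
# a loop and combines the iteration results with (+) as the reducer
squares = pmap(x -> x^2, 1:10)
total = @distributed (+) for i in 1:100
    i
end

println(squares)   # [1, 4, 9, ..., 100]
println(total)     # 5050
```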

Channels are like background plumbing for parallel computing in Julia. They are the reservoirs from which the individual processes access their data.

Channels

02:04
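A minimal sketch of the producer/consumer pattern that channels enable; the values are invented for illustration:

```julia
# A Channel is a FIFO buffer that tasks use to hand data to each other.
ch = Channel{Int}(5)    # buffered channel holding up to 5 Ints

# producer task fills the channel, then closes it
@async begin
    for i in 1:3
        put!(ch, i^2)
    end
    close(ch)
end

# the consumer drains the channel until it is closed
received = collect(ch)
println(received)   # [1, 4, 9]
```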

About the Instructor

Tech Knowledge in Motion

Packt has been committed to developer learning since 2004. A lot has changed in software since then - but Packt has remained responsive to these changes, continuing to look forward at the trends and tools defining the way we work and live. And how to put them to work.

With an extensive library of content - more than 4,000 books and video courses - Packt's mission is to help developers stay relevant in a rapidly changing world. From new web frameworks and programming languages to cutting-edge data analytics and DevOps, Packt takes software professionals in every field to what's important to them now.

From skills that will help you develop and future-proof your career to immediate solutions to everyday tech challenges, Packt is a go-to resource for making you a better, smarter developer.

Packt Udemy courses continue this tradition, bringing you comprehensive yet concise video courses straight from the experts.

- Copyright © 2017 Udemy, Inc.