Learn to Analyze Text Data in Bash Shell and Linux
4.2 (30 ratings)
1,271 students enrolled
An animated course for learning data analytics and big data pre-processing with Bash shell scripting in Linux
Created by Ahmed Arefin, PhD
Last updated 7/2017
English
Includes:
  • 1 hour on-demand video
  • 52 Articles
  • 16 Supplemental Resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Use Bash to quickly sort, search, match, replace, clean and optimise various aspects of a data set
  • Use Bash to process real-world data sets (included)
  • Use Bash commands and scripting
  • Use regular expressions (RegEx) in Bash
  • Use AWK programming language commands to tweak and format data
  • Use SED and GREP to quickly search in large-scale data sets
Requirements
  • A Linux-based operating system installed on your computer
Description
  • Last updated on 4 Aug 2017: added the project data sets and many new lessons on Bash Shell scripting, RegEx, AWK and Sed.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

This course is designed to show you how to use Bash commands and shell programming to handle textual data, such as CSV files or system log files. In this course you will learn Bash by doing projects.

However, Bash may not be the best way to handle all kinds of data. But there often comes a time when you are given a pure Bash environment, such as the one on common Linux-based supercomputers, and you just want an early result or view of the data before you dive into the real programming with Python, R, SQL, SPSS and so on. Expertise in these data-intensive languages also comes at the price of spending a lot of time on them.

In contrast, Bash scripting is simple, easy to learn and perfect for mining textual data, particularly if you deal with genomics, microarrays, social networks, life sciences and so on. It can help you quickly sort, search, match, replace, clean and optimise various aspects of your data, and you won’t need to go through any tough learning curves. We strongly believe that learning and using Bash shell scripting should be the first step if you want to say, Hello Big Data!
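As a rough illustration of the sort, search, replace and clean steps mentioned above, here is a minimal sketch. The sample rows, the column layout (name,city,score) and the helper names `sample` and `clean` are all made up for this example; they are not taken from the course data sets.

```shell
#!/usr/bin/env bash
# Hypothetical sample data (name,city,score); not one of the course data sets.
sample() {
cat <<'EOF'
name,city,score
alice,Sydney,82
bob,Perth,75
carol,Sydney,91

dave,Hobart,68
EOF
}

# Clean: drop blank lines and the header row.
clean() { sed -e '/^$/d' -e '1d'; }

# Sort: order rows by the numeric third column, highest first.
sample | clean | sort -t, -k3,3nr    # first row printed: carol,Sydney,91

# Search/match: count the rows whose city field is Sydney.
sample | clean | grep -c ',Sydney,'  # prints: 2

# Replace: rewrite one city name wherever it appears.
sample | clean | sed 's/Perth/Darwin/'
```

Each step reads from standard input and writes to standard output, so the same commands can be pointed at a real file by replacing `sample` with `cat yourfile.csv`.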

This course starts with some practical Bash-based flat-file data mining projects involving:

  • University ranking data
  • Facebook data
  • Crime data
  • Shakespeare-era plays and poems data

If you haven’t used Bash before, feel free to skip the projects and go to the tutorials (supporting materials: eBook). Read the tutorials and then come back to the projects. The tutorial section introduces Bash scripting, regular expressions, AWK, sed, grep and so on. Finally, the course gives you a concise, beginner-friendly guide to the big data landscape, including an overview of critical Big Data tools such as HDFS, MapReduce, YARN, Flume, Hive and more. The course finishes with a near-complete list of references to all the relevant command-line and Big Data tools.

Who is the target audience?
  • Students who want to learn Bash and the command line to improve their career prospects
  • Researchers who want to add Bash and other command line tools to their bag of tricks
  • Scientists who want to learn to explore and analyze the data that their lab generates
  • Journalists who want to polish their reporting by analyzing publicly-available datasets
  • Anyone who wants to work with Big Data
Curriculum For This Course
72 Lectures
01:24:53
Course Intro
1 Lecture 01:59

In this section, I will introduce you to my Hello Big Data @ Bash course.

Bash may not be the best way to handle all kinds of data. But there often comes a time when you are given a pure Bash environment, such as the one on common Linux-based supercomputers, and you just want some early results or a view of the data before you dive into the real programming with Python, R, SQL, SPSS and so on. Expertise in these data-intensive languages also comes at the cost of spending a lot of time on them.

In contrast, Bash scripting is simple, easy to learn and perfect for mining textual data, particularly if you deal with genomics, microarrays, social networks, life sciences and so on. It can help you quickly sort, search, match, replace, clean and optimise various aspects of your data, and you won’t need to go through any tough learning curves. We strongly believe that learning and using Bash shell scripting should be the first step if you want to say, Hello Big Data!

Preview 01:59
Project 1: University ranking data mining
7 Lectures 11:17



Finding the correlation between university tuition and ranks (tail and redirect)
02:27

Uni rank data project commands demo.

Preview 01:23

Project documentation
00:02


Bash commands quiz
3 questions
Project 2: Facebook data mining
6 Lectures 10:20
Data preview (head command)
02:00

Find the number of statuses and the most popular status entry (cut, sort, grep, awk)
03:02

Building a function to find the most vibrant Facebook status (Bash functions)
02:17

Facebook Project Demo

Demonstration
02:59

Project data set (Facebook like and share data)
00:01

Project documentation (how-to)

Project documentation
00:00
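The counting and ranking steps in this project could look something like the sketch below. The column layout (id,type,likes,shares), the sample rows and the function names are assumptions made for illustration; the actual project file and the course's own function may differ.

```shell
#!/usr/bin/env bash
# Hypothetical layout: id,type,likes,shares (not the course's actual file).
fb_sample() {
cat <<'EOF'
id,type,likes,shares
1,status,120,10
2,photo,300,45
3,status,250,5
4,link,40,2
5,status,90,60
EOF
}

# Count the rows whose type column is exactly "status".
count_status() { cut -d, -f2 | grep -c '^status$'; }

# Print the id of the status entry with the most likes.
top_status() {
  awk -F, '$2 == "status"' | sort -t, -k3,3nr | head -n 1 | cut -d, -f1
}

fb_sample | count_status   # prints: 3
fb_sample | top_status     # prints: 3
```

Wrapping each pipeline in a function keeps the steps reusable: the same function works on any input with this layout, for example `count_status < yourfile.csv`.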
Project 3: Australian cities crime statistics
8 Lectures 15:20
Data preview (head and csvlook commands)
04:13

Finding rows and columns stats (wc, sed, csvstat)
03:09

Finding the top crime per city (awk)
02:05

Finding the best city in Australia (Bash shell programming)
02:05

Demonstration of commands used in this project

Demonstration
02:40

AU crime per city data set (from data.gov.au).

Project data set (Au Crime data set)
00:01

Crimestat.sh source code description
01:06

Project documentation

Project documentation
00:00
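A minimal sketch of the row, column and per-city steps using only wc, sed, head and awk follows. The layout (city,crime,count), the sample numbers and the function names are invented for this example and are not the data.gov.au figures; csvlook and csvstat come from the separate csvkit package and are left out so the sketch runs on a plain system.

```shell
#!/usr/bin/env bash
# Hypothetical layout: city,crime,count (made-up numbers, not the real data set).
crime_sample() {
cat <<'EOF'
city,crime,count
Sydney,theft,120
Sydney,assault,80
Perth,theft,60
Perth,fraud,95
EOF
}

# Number of data rows, excluding the header line.
row_count() { sed '1d' | wc -l | tr -d ' '; }

# Number of columns, read from the header line.
col_count() { head -n 1 | awk -F, '{ print NF }'; }

# For each city, keep the crime with the highest count seen so far.
top_crime_per_city() {
  awk -F, 'NR > 1 && $3 + 0 > max[$1] { max[$1] = $3 + 0; top[$1] = $2 }
           END { for (c in top) print c "," top[c] "," max[c] }' | sort
}

crime_sample | row_count            # prints: 4
crime_sample | col_count            # prints: 3
crime_sample | top_crime_per_city   # prints: Perth,fraud,95 then Sydney,theft,120
```

The final `sort` is there because awk does not guarantee the order in which `for (c in top)` visits the cities.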
Project 4: Shakespearean-era plays and poems data mining
6 Lectures 11:28
Data preview (head command)
02:51

Counting Plays/Poems
01:39

Finding plays/poems by each author (complex example)
02:15

Find the most frequent words by Shakespeare (Complex project)
04:41

Project data set (Plays and poems data)
00:01

Project documentation
00:00
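One common way to find the most frequent words in a text is the tr, sort and uniq pipeline sketched below. This is a generic approach, not necessarily the one used in the lesson, and the function name `word_freq` and the sample sentence are chosen only for illustration.

```shell
#!/usr/bin/env bash
# word_freq (hypothetical name): print "word count" pairs, most frequent first.
word_freq() {
  tr -cs '[:alpha:]' '\n' |        # put every run of letters on its own line
    tr '[:upper:]' '[:lower:]' |   # fold to lower case so "To" and "to" match
    sort | uniq -c |               # count identical neighbouring words
    sort -k1,1nr -k2,2 |           # highest count first, ties in word order
    awk '{ print $2, $1 }'
}

printf '%s\n' 'To be, or not to be, that is the question.' | word_freq | head -n 2
# prints:
# be 2
# to 2
```

For a full text file, feed it in with `word_freq < yourfile.txt | head -n 10`.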
Tutorial 1: Bash Shell Scripting!
13 Lectures 15:25
Hello! Bash
01:05

"Hello World!" Bash Shell Programming
00:59

Bash variables and functions
01:37

Bash command execution and passing arguments
00:19

Bash meta characters
02:12

Bash quotation basics
00:56

Bash’s built-in read command reads user input and stores it in the variable named after read. See an example in this lesson. Too easy!

Preview 00:10

Bash redirections
00:40

Bash conditional statements
02:23

Bash `loop` statements
01:53

Bash arithmetic operations
02:29


Download all the Bash scripting tutorials as a single pdf file.

Download the Bash shell scripting tutorials
00:01
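Several of the topics in this tutorial (variables, functions, read, conditionals, loops, arithmetic and redirection) fit into one short script. The sketch below is only an illustration; the function names `classify`, `sum_to` and `greet` are hypothetical and do not come from the course material.

```shell
#!/usr/bin/env bash
# All function names below are hypothetical, chosen for illustration only.

# Variables and a conditional: report whether a number is odd or even.
classify() {
  local n=$1
  if [ $(( n % 2 )) -eq 0 ]; then
    echo "even"
  else
    echo "odd"
  fi
}

# A loop with arithmetic expansion: add up the integers from 1 to limit.
sum_to() {
  local limit=$1
  local total=0
  local i=1
  while [ "$i" -le "$limit" ]; do
    total=$(( total + i ))
    i=$(( i + 1 ))
  done
  echo "$total"
}

# read stores one line of input in the variable named after it.
greet() {
  local name
  read -r name
  echo "Hello, $name!"
}

classify 7             # prints: odd
sum_to 10              # prints: 55
echo "Ada" | greet     # prints: Hello, Ada!
sum_to 4 > /dev/null   # redirection: the output goes to /dev/null instead
```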
Tutorial 2: RegEx (Regular Expressions)
8 Lectures 05:35
Hello! RegEx
01:20

Basic Regular Expressions (BRE)
01:06

Extended Regular Expressions (ERE)
00:27

RegEx character classes
00:31

RegEx lookarounds
00:47

RegEx atomic groups
00:56


Download all the RegEx tutorials as a single pdf file.

Download the regular expressions tutorials
00:01
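To make the difference between basic and extended regular expressions concrete, here is a small sketch using grep. The word list and the helper name `lines` are made up for this example. Lookarounds and atomic groups are left out because they need a Perl-compatible engine, which plain grep does not provide on every system.

```shell
#!/usr/bin/env bash
# Hypothetical word list used only for this illustration.
lines() { printf '%s\n' 'color' 'colour' 'colr' 'cat42' 'dog'; }

# BRE (plain grep): * means "zero or more of the previous character".
lines | grep 'colou*r'            # matches: color, colour

# ERE (grep -E): ? means "zero or one of the previous character".
lines | grep -E 'colou?r'         # matches: color, colour

# Character class: one or more digits at the end of the line.
lines | grep -E '[[:digit:]]+$'   # matches: cat42

# Alternation and grouping, available unescaped in ERE.
lines | grep -E '^(cat|dog)'      # matches: cat42, dog
```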
Tutorial 3: AWK scripting
5 Lectures 04:39
Hello! AWK
01:12

AWK Built-in Variables
00:39

AWK built-in functions
01:06

AWK useful examples
01:39

Download all the AWK tutorials as a single pdf file.

Download the AWK programming tutorial
00:01
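The built-in variables and functions covered in this tutorial can be tried on a couple of colon-separated records. This is a minimal sketch with made-up data; the helper name `rows` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical colon-separated records (name:age:city).
rows() { printf '%s\n' 'alice:30:Sydney' 'bob:25:Perth'; }

# Built-in variables: FS (input separator), OFS (output separator),
# NR (current record number) and NF (number of fields in the record).
rows | awk 'BEGIN { FS = ":"; OFS = "," } { print NR, NF, $1 }'
# prints:
# 1,3,alice
# 2,3,bob

# Built-in functions: toupper, substr and length.
rows | awk -F: '{ print toupper(substr($1, 1, 1)) substr($1, 2), length($3) }'
# prints:
# Alice 6
# Bob 5
```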
Tutorial 4: SED (Stream Editor)
4 Lectures 03:48
Hello! SED - Stream Editor
01:08


SED substitutions
01:29

SED and regular expressions
00:35
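A few typical sed substitutions, including one that uses a regular expression with back-references, are sketched below. The input strings are invented for illustration.

```shell
#!/usr/bin/env bash
# Replace the first match on each line.
printf '%s\n' '2017-08-04 error disk full' | sed 's/error/ERROR/'
# prints: 2017-08-04 ERROR disk full

# The g flag replaces every match on the line, not just the first.
printf '%s\n' 'a-b-c' | sed 's/-/_/g'
# prints: a_b_c

# Regular expression with groups: \( \) captures, \1 \2 \3 refer back.
# Using | as the delimiter avoids having to escape the slashes in the output.
printf '%s\n' '2017-08-04' | sed 's|^\([0-9]*\)-\([0-9]*\)-\([0-9]*\)$|\3/\2/\1|'
# prints: 04/08/2017
```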

Tutorial 5: GREP and Find
3 Lectures 01:21
What is GREP?
00:58

Hello! Find command
00:21

This lesson contains tutorials on the following:

  • sed
  • grep
  • find

Download all the tutorials as a single PDF file.


Download the SED, GREP and Find tutorials
00:02
1 More Section
About the Instructor
Ahmed Arefin, PhD
4.4 Average rating
32 Reviews
1,293 Students
2 Courses
Computation Scientist

Ahmed Arefin, PhD is an enthusiastic computer programmer with more than a decade of well-rounded computational experience. He likes to code, but loves to write, research and teach. Following a PhD and postdoctoral research in the area of data parallelism, he moved on to become a scientific computing professional, keeping up his research interests in parallel, distributed and accelerated computing.

In his day job, he pets a few of the world’s fastest T500 supercomputers at a large Australian agency for scientific research.