Learn to Analyze Text Data in Bash Shell and Linux
3.9 (34 ratings)
1,299 students enrolled
An animated course for learning data analytics and big data pre-processing with Bash shell scripting in Linux
Last updated 9/2017
English
Includes:
  • 38 mins on-demand video
  • 50 Articles
  • 13 Supplemental Resources
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Use Bash to quickly sort, search, match, replace, clean and optimise various aspects of a data set
  • Use Bash to process real-world data sets (included)
  • Use Bash commands and scripting
  • Use Regular Expressions (RegEx) in Bash
  • Use AWK programming language commands to tweak and format data
  • Use SED and GREP to quickly search in large-scale data sets
Requirements
  • A Linux based operating system installed on your computer
Description

Udemy Early Access Program Reviews (5 out of 5 Stars):

"This is one of the best courses I have reviewed in Udemy. All the chapters are very useful. The instructor explained exactly what you need to use Bash as your data analysis tool in your pocket. I look forward to more courses from this Instructor. The instructor is very experienced, explanations are on point. Thank you for creating a great course." -  Tarique Syed

"The instructor was very engaging. Changed a boring, hard-to-understand tool into something usable and easy-to-use, all the while making it fun to learn." - Prat Ram

"Well done. Well - structured and explained course. Will definitely recommend the course to my course. From my point of view, everything was OK in the course." - Sem Milaserdov

"Overall, the course delivered what promised with a good resource for those who want to learn and do more. The course is filled with resource and the educator attached his own book on the subject for the learners." - Afshin Kalantari

"It's a very well organized course, from the background, basic Linux cli which everyone should be to build data processing scenarios. wonderful class." - Charley Guan

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

This beginner-friendly course is designed to show you how to use Bash commands and shell programming to handle textual data, such as CSV files or system log files. In this course you will learn Bash by doing projects.

However, you need to understand that Bash may not be the best way to handle all kinds of data. But there often comes a time when you are given a pure Bash environment, such as the one on common Linux-based supercomputers, and you just want an early result or view of the data before you dive into the real programming with Python, R, SQL, SPSS, and so on. Expertise in these data-intensive languages also comes at the price of spending a lot of time on them.

In contrast, Bash scripting is simple, easy to learn and well suited to mining textual data, particularly if you deal with genomics, microarrays, social networks, life sciences, and so on. It can help you quickly sort, search, match, replace, clean and optimise various aspects of your data without a steep learning curve. We strongly believe that learning and using Bash shell scripting should be the first step if you want to say, Hello Big Data!

This course starts with some practical bash-based flat file data mining projects involving:

  • University ranking data
  • Facebook data
  • AU Crime Data

(Data sets and documentation are provided at the end of each section.)

If you haven’t used Bash before, feel free to skip the projects and go to the tutorials first (supporting materials: eBook), then come back to the projects. The tutorial section introduces Bash scripting, regular expressions, AWK, sed, grep and so on. Finally, the course gives you a concise, beginner-friendly guide to the big data landscape, including an overview of critical Big Data tools such as HDFS, MapReduce, YARN, Flume, Hive and more. The course finishes with a near-complete list of references to all the relevant command-line and Big Data tools.

Authored by Ahmed Arefin, PhD and thankfully voiced by A. Collinwood (voice artist). This course is a core component of the 'Learn Scientific Programming' project at scientificprogramming io.

Who is the target audience?
  • Students who want to learn Bash and the command line to improve their career prospects
  • Researchers who want to add Bash and other command line tools to their bag of tricks
  • Scientists who want to learn to explore and analyze the data that their lab generates
  • Journalists who want to polish their reporting by analyzing publicly-available datasets
  • Anyone who wants to work with Big Data
Curriculum For This Course
66 Lectures
01:12:53
Course Intro
1 Lecture 01:27

This course starts with some practical bash-based flat file data mining projects involving:

  • University ranking data
  • Facebook data
  • Crime Data; and
  • Shakespeare-era plays and poems data (coming soon!)

If you haven’t used Bash before, feel free to skip the projects and go to the tutorials first, then come back to the projects. The tutorial section introduces Bash scripting, regular expressions, AWK, sed, grep and so on.

Finally, it gives you a concise, beginner-friendly guide to the big data landscape, including an overview of critical Big Data tools such as HDFS, MapReduce, YARN, Flume, Hive and more. The book finishes with a near-complete list of references to all the relevant command-line and Big Data tools.

Preview 01:27
Project 1: University ranking data mining
7 Lectures 11:17



Finding the correlation between university tuition and ranks (tail and redirect)
02:27

Uni rank data project commands demo.

Preview 01:23

Project documentation
00:02


Bash commands quiz
3 questions
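
The lecture titles in this project name tail and redirection as the tools for relating tuition to rank. The following is a minimal sketch of that idea, not the course's own script: the file contents and the column layout (rank,name,tuition) are made up for illustration, and the awk correlation step is an added assumption about how the final number might be computed.

```shell
# Hypothetical sample data; the course's university ranking data set is not reproduced here.
ranks=$(mktemp)
pairs=$(mktemp)
cat > "$ranks" <<'EOF'
rank,name,tuition
1,Uni A,30000
2,Uni B,20000
3,Uni C,10000
EOF

# tail -n +2 starts output at line 2, which drops the header row;
# cut keeps only the rank and tuition columns; > redirects the result to a new file.
tail -n +2 "$ranks" | cut -d, -f1,3 > "$pairs"

# Pearson correlation from running sums, computed in awk (an assumed approach).
r=$(awk -F, '
  { n++; sx += $1; sy += $2; sxx += $1 * $1; syy += $2 * $2; sxy += $1 * $2 }
  END { printf "%.2f", (n * sxy - sx * sy) / sqrt((n * sxx - sx * sx) * (n * syy - sy * sy)) }
' "$pairs")

echo "correlation between rank and tuition: $r"
rm -f "$ranks" "$pairs"
```

A value near -1 here would mean that better-ranked (lower-numbered) universities charge more in this toy sample; real data will rarely be this clean.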
Project 2: Facebook data mining
6 Lectures 10:20
Data preview (head command)
02:00

Find the number of status and most popular status entry (cut, sort, grep, awk)
03:02

Building a function to find the most vibrant Facebook status (Bash functions)
02:17

Facebook Project Demo

Demonstration
02:59

Project data set (Facebook like and share data)
00:01

Project documentation (how-to)

Project documentation
00:00
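
The commands named in this project's lecture titles (cut, sort, grep, awk and Bash functions) can be combined roughly as follows. This is a sketch under stated assumptions: the column layout (status_id,status_type,num_likes,num_shares), the sample rows and the function name most_popular are all hypothetical and do not come from the course data set.

```shell
# Hypothetical sample data standing in for the Facebook like-and-share data.
fb=$(mktemp)
cat > "$fb" <<'EOF'
status_id,status_type,num_likes,num_shares
s1,photo,120,10
s2,video,300,45
s3,photo,80,2
s4,link,15,1
EOF

# Number of status entries: every line except the header.
total=$(tail -n +2 "$fb" | wc -l | tr -d ' ')

# Count of one status type: cut picks column 2, grep -c counts exact matches.
photos=$(cut -d, -f2 "$fb" | grep -c '^photo$')

# A small function that prints the id of the row with the largest value in a column.
# Arguments: $1 = file, $2 = column number to rank by.
most_popular() {
  tail -n +2 "$1" | sort -t, -k"$2","$2"nr | head -n 1 | cut -d, -f1
}

best=$(most_popular "$fb" 3)
echo "statuses: $total, photos: $photos, most liked: $best"
rm -f "$fb"
```

Passing the column number as an argument lets the same function rank by likes (column 3) or shares (column 4) without duplicating the pipeline.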
Project 3: Australian cities crime statistics
8 Lectures 15:20
Data preview (head and csvlook commands)
04:13

Finding rows and columns stats (wc, sed, csvstat)
03:09

Finding the top most crime per city (awk)
02:05

Finding the best city in Australia (Bash shell programming)
02:05

Demonstration of commands used in this project

Demonstration
02:40

AU crime per city data set (from data.gov.au).

Project data set (Au Crime data set)
00:01

Crimestat.sh source code description
01:06

Project documentation

Project documentation
00:00
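
As a rough illustration of the row, column and per-city steps listed in this project, the sketch below uses only wc, sed, head and awk. The layout (city,offence,count) and the figures are invented for the example and are not taken from the data.gov.au data set; csvlook and csvstat, which the lectures also mention, belong to the separate csvkit package and are left out so the sketch runs on a plain system.

```shell
# Hypothetical sample data; not the real AU crime data set.
crime=$(mktemp)
cat > "$crime" <<'EOF'
city,offence,count
Sydney,theft,500
Sydney,assault,200
Perth,theft,150
Perth,fraud,310
EOF

# Rows: sed '1d' deletes the header line before wc -l counts the rest.
rows=$(sed '1d' "$crime" | wc -l | tr -d ' ')

# Columns: awk's NF holds the number of fields; only the header line is read.
cols=$(head -n 1 "$crime" | awk -F, '{ print NF }')

# Top offence per city: remember the largest count seen for each city,
# then sort the summary so the output order is stable.
top=$(awk -F, 'NR > 1 && $3 + 0 > max[$1] + 0 { max[$1] = $3 + 0; name[$1] = $2 }
               END { for (c in max) print c ":" name[c] }' "$crime" | sort)

echo "rows=$rows cols=$cols"
echo "$top"
rm -f "$crime"
```

The awk arrays are keyed by city name, so the same one-pass approach works however many cities the file contains.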
Tutorial 1: Bash Shell Scripting!
13 Lectures 15:25
Hello! Bash
01:05

"Hello World!" Bash Shell Programming
00:59

Bash variables and functions
01:37

Bash command execution and passing arguments
00:19

Bash meta characters
02:12

Bash quotation basics
00:56

Bash’s built-in read command reads a line of user input and stores it in the variable named after read. See an example in this lesson. Too easy!

Preview 00:10

Bash redirections
00:40

Bash conditional statements
02:23

Bash `loop` statements
01:53

Bash arithmetic operations
02:29


Download all the Bash scripting tutorials as a single pdf file.

Download the Bash shell scripting tutorials
00:01
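
The read lesson in this tutorial describes storing user input in a variable. A minimal sketch of that behaviour follows; it is not the lesson's own example, and a here-document stands in for keyboard input so the snippet can run unattended. The variable names are arbitrary.

```shell
# read stores one line of input in the variable named after it.
# -r keeps backslashes in the input literal.
read -r name <<'EOF'
Ada
EOF

# With two variable names, read splits the line on whitespace:
# the first word goes to the first variable, the remainder to the last.
read -r first second <<'EOF'
hello world
EOF

echo "Hello, $name"
echo "first=$first second=$second"
```

In an interactive script you would drop the here-document and simply write read -r name, which waits for the user to type a line.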
Tutorial 2: RegEx (Regular Expressions)
8 Lectures 05:35
Hello! RegEx
01:20

Basic Regular Expressions (BRE)
01:06

Extended Regular Expressions (ERE)
00:27

REGEX character classes
00:31

REGEX look arounds
00:47

REGEX atomic groups
00:56


Download all the RegEx tutorials as a single pdf file.

Download the regular expressions tutorials
00:01
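
To make the difference between basic and extended regular expressions from this tutorial concrete, here is a short sketch using grep. The word list is made up, and the patterns are illustrative choices rather than examples from the course. Lookarounds and atomic groups are not shown because they need a Perl-compatible engine, which not every grep build provides.

```shell
# Hypothetical word list, one word per line.
words=$(printf 'cat\ncats\ndog\nhotdog\n')

# BRE (plain grep): anchors and bracket expressions work as written.
bre=$(printf '%s\n' "$words" | grep -c '^[cd]')

# ERE (grep -E): grouping with ( ) and alternation with | need no backslashes.
ere=$(printf '%s\n' "$words" | grep -E -c '^(cats|hotdog)$')

# POSIX character class inside a bracket expression: lines containing a digit.
digits=$(printf 'a1\nbb\nc33\n' | grep -c '[[:digit:]]')

echo "BRE matches: $bre, ERE matches: $ere, lines with digits: $digits"
```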
Tutorial 3: AWK scripting
5 Lectures 04:39
Hello! AWK
01:12

AWK Built-in Variables
00:39

AWK built-in functions
01:06

AWK useful examples
01:39

Download all the AWK tutorials as a single pdf file.

Download the AWK programming tutorial
00:01
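
The built-in variables and functions covered in this tutorial can be tried on a couple of lines of input. The sketch below assumes comma-separated input invented for the example; it shows FS, NR, NF and the standard toupper and length functions, and is not drawn from the course PDF.

```shell
# FS sets the field separator, NR is the current record (line) number,
# NF is the number of fields, and $NF is therefore the last field.
report=$(printf 'a,b,c\nd,e\n' | awk 'BEGIN { FS = "," } { print NR ":" NF ":" $NF }')

# Two built-in string functions: toupper and length.
upper=$(echo 'bash' | awk '{ print toupper($1), length($1) }')

echo "$report"
echo "$upper"
```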
Tutorial 4: SED (Stream Editor)
4 Lectures 03:48
Hello! SED - Stream Editor
01:08


SED substitutions
01:29

SED and regular expressions
00:35
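
The substitution and regular expression lessons above can be summarised in a few lines of sed. This is a sketch with a made-up log line, not material from the course; the -E flag for extended regular expressions is available in GNU and BSD sed.

```shell
# Hypothetical log line used only for illustration.
line='2017-09-01 error: disk full; error: retry'

# s/old/new/ replaces only the first match on each line.
first=$(printf '%s\n' "$line" | sed 's/error/warn/')

# The g flag replaces every match on the line.
all=$(printf '%s\n' "$line" | sed 's/error/warn/g')

# Captured groups: reorder the leading date as DD/MM/YYYY and drop the rest.
# '#' is used as the delimiter so the slashes in the replacement need no escaping.
date_only=$(printf '%s\n' "$line" | sed -E 's#^([0-9]{4})-([0-9]{2})-([0-9]{2}).*#\3/\2/\1#')

echo "$first"
echo "$all"
echo "$date_only"
```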

Tutorial 5: GREP and Find
3 Lectures 01:21
What is GREP?
00:58

Hello! Find command
00:21

This lesson contains tutorials on the following

  • sed
  • grep
  • find
  • Download all the tutorials as a single pdf file.


Download the SED, GREP and Find tutorials
00:02
Tutorial 6: Beyond the Text files (Big Data File Concepts)
11 Lectures 08:25
Introduction
00:42

HDFS - Hadoop Distributed File System
01:31

Map Reduce
00:33

YARN - Yet Another Resource Negotiator!
00:33

Flume
00:30

SQOOP
00:56

Hive
00:47

Pig
00:47

Spark
01:06

HBase
00:56

Download the Big data tools and concepts
00:00
About the Instructor
Ahmed Arefin, PhD
3.8 Average rating
42 Reviews
1,380 Students
3 Courses
Computation Scientist | Founder, Learn Scientific Programming

Ahmed Arefin, PhD is an enthusiastic computer programmer with more than a decade of well-rounded computational experience. He likes to code, but loves to write, research and teach. Following a PhD and postdoctoral research in the area of data parallelism, he moved on to become a scientific computing professional, keeping up his research interests in parallel, distributed and accelerated computing.

In his day job, he pets a few of the world’s fastest T500 supercomputers at a large Australian agency for scientific research.

Learn Scientific Programming
3.8 Average rating
42 Reviews
1,380 Students
3 Courses

Learn Scientific Programming is an innovative e-learning school that aims to demonstrate the use of scientific programming languages and tools, e.g., Julia, OpenMP, MPI, C++, Matlab, Octave, Bash, Python, Sed and AWK, including RegEx, in processing scientific and real-world data.

We help you solve large-scale scientific, biological, engineering, and humanities problems, gain adequate understanding through the analysis of mathematical models implemented on high-performance computers, and share the knowledge.

scientificprogramming io