A Real World Introduction to Amazon's Redshift
3.4 (29 ratings)
Instead of using a simple lifetime average, Udemy calculates a course's star rating by considering a number of different factors such as the number of ratings, the age of ratings, and the likelihood of fraudulent ratings.
145 students enrolled
Learn how to spin up Redshift Clusters and load data into Redshift Tables
Created by Mike West
Last updated 3/2016
English
Current price: $12 Original price: $20 Discount: 40% off
30-Day Money-Back Guarantee
Includes:
  • 1 hour on-demand video
  • 5 Articles
  • 1 Supplemental Resource
  • Full lifetime access
  • Access on mobile and TV
  • Certificate of Completion
What Will I Learn?
  • Be able to provision a Redshift Data Warehouse Cluster
  • You'll learn how to create tables and load data
  • The course was designed for DBAs and developers who want to get up to speed with Redshift as quickly as possible.
Requirements
  • You'll need an Amazon AWS account. You can take advantage of the free preview feature to see how easy it is to create one.
  • We will download several pieces of software in the course. I'll walk through each of these steps.
Description

Note: The vast majority of the first section is free. Please view these free videos. They will give you an idea of how the rest of the course is structured. Thank you. 

Although it is a gross oversimplification, Amazon Redshift can be thought of as a traditional data warehouse platform.

Data warehousing has been around for quite a number of years now. There have been many evolutions in data modeling, storage, and ultimately the vast variety of tools that the business user now has available to help utilize their quickly growing stores of data.

As the industry is moving more towards self service business intelligence solutions for business users, there are also changes in how data is being stored. Amazon Redshift is one of those "game-changing" platforms that is not only driving down the total cost, but also driving up the ability to store even more data to enable even better business decisions to be made.

One of the greatest features of all of Amazon's services is that many of the mundane administration tasks have been removed. The hardware, software patching, and disk management (all of which are no small tasks) have been taken on by Amazon. Disk management, particularly the automated recovery from disk failure, and even the ability to begin querying a cluster that is being restored (before the restore is done) are powerful and compelling things Amazon has done to reduce your workload and increase uptime.

In the course we will create a cluster, which is a group of nodes. Once we have spun up a cluster, we can upload our data sets and perform data analysis. We will walk through all the steps necessary to begin using a Redshift cluster in the real world.

One of the greatest benefits of Redshift is blazing fast query performance. Two core items are responsible for this: the use of columnar storage technology to improve I/O efficiency, and the parallelizing of queries across multiple nodes. Parallelizing queries across many nodes is known as MPP, or Massively Parallel Processing.
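
To make the columnar idea concrete, here is a minimal Python sketch. It is a toy illustration, not Redshift code: the sample table and the helper names are made up for this example. It stores the same three rows row-wise and column-wise, then counts how many values each layout has to touch to total a single column.

```python
# Toy illustration only: Redshift's real storage engine is far more involved.
rows = [
    {"order_id": 1, "customer": "acme", "amount": 120.0},
    {"order_id": 2, "customer": "globex", "amount": 75.5},
    {"order_id": 3, "customer": "initech", "amount": 310.0},
]

# Column-oriented layout: one list per column instead of one dict per row.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def sum_amount_row_store(table):
    """Total `amount` while touching every field of every row."""
    values_read = 0
    total = 0.0
    for row in table:
        values_read += len(row)  # a row store reads the whole row
        total += row["amount"]
    return total, values_read


def sum_amount_column_store(cols):
    """Total `amount` while touching only the one column the query needs."""
    amounts = cols["amount"]
    return sum(amounts), len(amounts)


row_total, row_reads = sum_amount_row_store(rows)
col_total, col_reads = sum_amount_column_store(columns)
print(row_total, row_reads)
print(col_total, col_reads)
```

Both layouts give the same total, but the column layout reads a third of the values here. On a wide warehouse table with many columns, the gap is what improves I/O efficiency.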

The underlying hardware is designed for high performance data processing, using locally attached storage to maximize throughput between the CPUs and drives, and a 10GigE mesh network to maximize throughput between nodes.

The last nail in the coffin for the traditional brick-and-mortar data warehouse is cost. Redshift accomplishes all this at a fraction of the cost of the traditional data warehouse.

If you are looking to expand your knowledge about Amazon’s data platform, and specifically about their Redshift service, then this course is for you.

Thank you and welcome to Redshift.

Who is the target audience?
  • You should have a solid foundation in database concepts. An understanding of data warehouses would be very beneficial.
  • This is a beginner's course, but it's a more advanced topic than traditional OLTP courses.
Curriculum For This Course
28 Lectures
01:08:06
An Introduction to Redshift
9 Lectures 15:17

What are we going to learn in this course? 

Preview 02:05

I want to make sure you are in the right place. 

This course is about Redshift. 

Redshift is Amazon's data warehouse inside AWS (their cloud).

Preview 01:39

Let's define what a data warehouse is. 

It's not the same as an OLTP system. 

Preview 02:55

You'll need an AWS account for this course. 

The account is free and if you stick to the free tier you'll only incur small usage fees. 

Don't leave a cluster up. 

While you are learning, spin up and delete the cluster in the same session. 

Preview 02:52

This is one of the largest benefits to Redshift. 

Let's learn what it is. 

Preview 02:25

MPP breaks data sets down into bite-size segments. 

Let's learn more about MPP in this short video. 

What is MPP?
01:21
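
Here is a rough Python sketch of the MPP idea, assuming a simple round-robin split. It is not how Redshift is implemented; the function names are hypothetical, and threads stand in for compute nodes. The data set is cut into segments, each segment is summed independently, and a coordinating step (Redshift calls this the leader node) combines the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_slices(values, slice_count):
    """Deal values round-robin into `slice_count` segments."""
    return [values[i::slice_count] for i in range(slice_count)]


def compute_node_sum(slice_values):
    """Work done independently on one segment: a partial sum."""
    return sum(slice_values)


def leader_node_total(values, slice_count=4):
    """Fan the work out, then combine the partial sums into one answer."""
    slices = split_into_slices(values, slice_count)
    with ThreadPoolExecutor(max_workers=slice_count) as pool:
        partial_sums = list(pool.map(compute_node_sum, slices))
    return sum(partial_sums)


sales = list(range(1, 101))  # the numbers 1 through 100
total = leader_node_total(sales)
print(total)
```

The answer matches a plain `sum(sales)`. The point is only that each segment can be processed without waiting on the others.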

The great thing about the cloud is that much of the mundane is offloaded to our cloud provider. 

Let's learn what is offloaded in this short video. 

What Does Fully Managed Mean?
01:14

Download the course content here. 

The download button is on the top right-hand side of this lecture. 

Download Course Content Here
00:01

Let's wrap up what we've learned in our first section. 

Summary
00:44

Quiz
5 questions
Redshift
10 Lectures 29:23

An Amazon Redshift data warehouse is a collection of computing resources called nodes, which are organized into a group called a cluster.

The Cluster
01:16

Each cluster has a leader node and one or more compute nodes.

Let's learn more about nodes in this short lesson. 

Cluster Node Types
02:23

Before you create your real world cluster you have some things to think about. 

Let's take a look at a few of the more important ones. 

What Do I Need to Consider Before I Spin Up the Cluster?
01:03

Let's provision our first cluster. 

Provision is the swanky word and spin up is what the nerds use. 

Demo: Creating Our First Cluster
07:16
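
The demo uses the AWS console, but the same choices can be written down in code. Below is a minimal sketch that only builds the settings. The parameter names follow boto3's Redshift `create_cluster` call; the helper name, the cluster name, the credentials, and the `dc2.large` node type are placeholder assumptions, not values from the course.

```python
def build_create_cluster_params(cluster_id, db_name, user, password,
                                node_type="dc2.large", node_count=1):
    """Build keyword arguments shaped for boto3's Redshift create_cluster."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    params = {
        "ClusterIdentifier": cluster_id,
        "DBName": db_name,
        "MasterUsername": user,
        "MasterUserPassword": password,
        "NodeType": node_type,  # assumed node type; pick one your account offers
        "ClusterType": "single-node" if node_count == 1 else "multi-node",
    }
    if node_count > 1:
        # NumberOfNodes is only supplied for multi-node clusters.
        params["NumberOfNodes"] = node_count
    return params


# Placeholder values for illustration only.
params = build_create_cluster_params("demo-cluster", "dev", "awsuser", "ChangeMe123")
print(params["ClusterType"])

# With boto3 installed and AWS credentials configured, these settings could be
# passed on as: boto3.client("redshift").create_cluster(**params)
```

Nothing here talks to AWS, so it is safe to run while you experiment with the settings.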

You can't connect to your cluster without an inbound rule. 

Let's set one up. 

Demo: Setting Up Security Rule for Your IP
02:25
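
As a sketch of what that inbound rule contains, the snippet below builds one for a single IP address on port 5439, Redshift's default port. The dictionary is shaped like an EC2 security group `IpPermissions` entry. The helper name is hypothetical and the address is a documentation-range example, so substitute your own.

```python
import ipaddress

REDSHIFT_DEFAULT_PORT = 5439


def build_inbound_rule(my_ip, port=REDSHIFT_DEFAULT_PORT):
    """Return an ingress rule that admits exactly one IPv4 address on one TCP port."""
    address = ipaddress.IPv4Address(my_ip)  # raises ValueError on a malformed address
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": f"{address}/32"}],  # /32 means a single host
    }


rule = build_inbound_rule("203.0.113.25")  # example address, not a real client
print(rule["IpRanges"][0]["CidrIp"])
```

Limiting the rule to a /32 keeps the cluster reachable from your machine only, rather than from the whole internet.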

Deleting a cluster is very straightforward. 

Let's see how it's done in this short lesson. 

Demo: Shutting Down or Deleting a Cluster
02:50

There is no Amazon-provided tool to interact with your data when you are using Redshift. 

So, we need a third party one. 

This one is free. 

Demo: Downloading SQL Workbench/J
03:53

The client tool will need to connect to the cluster. 

Drivers do this. 

Let's connect via a JDBC driver in this lesson. 

Demo: Downloading Drivers and Connecting to Our Cluster
04:06
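
The connection string you paste into the client tool generally has the form `jdbc:redshift://endpoint:port/database`. Here is a small sketch that assembles one. The helper name and the endpoint are made up for illustration; copy the real endpoint from your own cluster's details page.

```python
def build_jdbc_url(endpoint, database, port=5439):
    """Assemble a Redshift JDBC URL of the form jdbc:redshift://host:port/database."""
    if not endpoint or not database:
        raise ValueError("endpoint and database are both required")
    return f"jdbc:redshift://{endpoint}:{port}/{database}"


# Hypothetical endpoint for illustration only.
url = build_jdbc_url("demo-cluster.abc123xyz.us-east-1.redshift.amazonaws.com", "dev")
print(url)
```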

There are a lot of moving parts to setting this up. 

Let's go over all the steps one more time before we dig into our data. 

Demo: An End to End Setup Example
03:04

Summary
01:07

Quiz
15 questions
Managing and Loading Our Data
8 Lectures 23:11

In this lecture we are going to look at the recommended approach to copying data into our Redshift tables. 

Demo: Creating Our Tables and Loading Data Using the Copy Command
06:04
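
For reference, here is a sketch of what a COPY statement for a CSV file in S3 can look like, built as a string so you can inspect it before running it in your SQL client. The table name, bucket, and role ARN are placeholders. Authorizing with `IAM_ROLE` is an assumption for this example; the lecture may use a different credentials option.

```python
def build_copy_statement(table, s3_path, iam_role_arn, ignore_header=True):
    """Return a Redshift COPY statement that loads a CSV file from S3."""
    if not s3_path.startswith("s3://"):
        raise ValueError("s3_path must start with s3://")
    lines = [
        f"COPY {table}",
        f"FROM '{s3_path}'",
        f"IAM_ROLE '{iam_role_arn}'",
        "CSV",
    ]
    if ignore_header:
        lines.append("IGNOREHEADER 1")  # skip the header row in the file
    return "\n".join(lines) + ";"


# Placeholder names for illustration only.
statement = build_copy_statement(
    "sales",
    "s3://my-example-bucket/sales.csv",
    "arn:aws:iam::123456789012:role/MyRedshiftRole",
)
print(statement)
```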

Let's load some data into an S3 bucket then use the copy command to move it into our cluster. 

Demo: Moving On Premise Data to S3
05:00

Sort keys are similar in nature to primary keys in OLTP databases. 


Let's learn the basics of sort keys for Redshift. 

Sort Keys
02:42
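
One reason sort keys matter: Redshift stores data in blocks and keeps the minimum and maximum value for each block, so a range filter on a sorted column can skip blocks that cannot contain a match. The toy Python sketch below imitates that idea. The block size of 10 values and the function names are made up for illustration and say nothing about Redshift's internals.

```python
def build_zone_map(sorted_values, block_size):
    """Split sorted values into blocks and record each block's (min, max, values)."""
    blocks = [sorted_values[i:i + block_size]
              for i in range(0, len(sorted_values), block_size)]
    return [(block[0], block[-1], block) for block in blocks]


def range_scan(zone_map, low, high):
    """Return the matching values and how many blocks had to be read."""
    matches, blocks_read = [], 0
    for block_min, block_max, block in zone_map:
        if block_max < low or block_min > high:
            continue  # min/max proves there is no match, so skip the block
        blocks_read += 1
        matches.extend(v for v in block if low <= v <= high)
    return matches, blocks_read


order_days = list(range(1, 101))           # the sorted column: days 1 through 100
zone_map = build_zone_map(order_days, 10)  # 10 blocks of 10 values each
matches, blocks_read = range_scan(zone_map, 42, 47)
print(matches, blocks_read)
```

Because the column is sorted, the filter for days 42 through 47 reads one block out of ten. On unsorted data the matching values could be scattered across every block.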

Snapshots are backups. 

Let's look at some snapshot considerations in Redshift. 

Working with Snapshots
01:44

Let's take a moment to learn how to create backups and learn some of the nuances of restoring. 

Demo: Creating and Managing Snapshots
02:21

In this lecture let's vertically scale our cluster. This simply means adding more resources to it. 


Demo: Vertically Scaling Our Cluster
02:27

Let's take a quick look at CloudWatch. This tool gives us key metrics for measuring our performance.

Demo: Monitoring Our Instances With CloudWatch. This is Amazon's Perfmon.
01:42

Summary
01:11

Quiz
10 questions
Conclusion
1 Lecture 00:16
Congratulations and Thank You
00:16
About the Instructor
Mike West
4.2 Average rating
3,020 Reviews
49,779 Students
42 Courses
SQL Server and Machine Learning Evangelist

I've been a production SQL Server DBA most of my career.

I've worked with databases for over two decades. I've worked for or consulted with over 50 different companies as a full time employee or consultant, including Fortune 500 companies as well as several small to mid-size companies. Some include: Georgia Pacific, SunTrust, Reed Construction Data, Building Systems Design, NetCertainty, The Home Shopping Network, SwingVote, Atlanta Gas and Light and Northrop Grumman.

Experience, education and passion

I learn something almost every day. I work with insanely smart people. I'm a voracious learner of all things SQL Server and I'm passionate about sharing what I've learned. My area of concentration is performance tuning. SQL Server is like an exotic sports car: it will run just fine in anyone's hands, but put it in the hands of a skilled tuner and it will perform like a race car.

Certifications

Certifications are like college degrees: they are a great starting point to begin learning. I'm a Microsoft Certified Database Administrator (MCDBA), Microsoft Certified System Engineer (MCSE) and Microsoft Certified Trainer (MCT).

Personal

Born in Ohio, raised and educated in Pennsylvania, I currently reside in Atlanta with my wife and two children.