Welcome to Serverless Data Analysis with Big Query on Google's Cloud. This is the second course in a series designed to help you attain the coveted Google Certified Data Engineer certification.
Additionally, the series of courses is going to show you the role of the data engineer on the Google Cloud Platform.
At this juncture, the Google Certified Data Engineer is the only real-world certification for data and machine learning engineers.
Note: This is not a programmer's course on BigQuery. The goal of this course, and of the entire series, is to give students a foundation in the services you'll need to know for the Google Certified Data Engineering Exam.
Because SQL is a prerequisite, this course is mostly lecture. Don't let that lull you to sleep, though: this service is heavily covered on the exam.
BigQuery is Google's fully managed, petabyte-scale, low-cost enterprise data warehouse for analytics. BigQuery is serverless. There is no infrastructure to manage and you don't need a database administrator, so you can focus on analyzing data to find meaningful insights using familiar SQL.
We are in a data revolution. Data used to be viewed as a simple necessity and lower on the totem pole. Now it is more widely recognized as the source of truth. As we move into more complex systems of data management, the role of the data engineer becomes extremely important as a bridge between the DBA, developer and the data consumer. Beyond the ubiquitous spreadsheet, graduating from RDBMS (which will always have a place in the data stack), we now work with NoSQL and Big Data technologies.
Most cloud computing vendors are moving to a serverless architecture. What's serverless? Serverless is about abstracting users away from servers, infrastructure, and having to deal with low-level configuration or the core operating system. Instead, developers make use of single purpose services to execute code.
Imagine for a second being able to upload data into a storage bucket and then run SQL-like queries against it. Many data analysts call this the holy grail of data analysis. With BigQuery, that's exactly what you do. There's no spinning up or configuring anything. You upload data in the form of a CSV or JSON file and query against it. I don't mean a hundred thousand rows. I mean a billion.
*Five Reasons to take this Course.*
1) You Want to be a Data Engineer
It's the number one job in the world (not just within the computer space). The career growth potential is second to none. You want the freedom to move anywhere you'd like. You want to be compensated for your efforts. You want to be able to work remotely. The list of benefits goes on.
2) The Google Certified Data Engineer
Google is always ahead of the game. If you were to look back at a timeline of their accomplishments in the data space, you might believe they have a crystal ball. They've been a decade ahead of everyone. Now, they are the first and the only cloud vendor to have a data engineering certification. With their track record, I'll go with Google.
3) The Growth of Data is Insane
Ninety percent of all the world's data has been created in the last two years. Businesses around the world generate approximately 450 billion transactions a day. The amount of data collected by all organizations is approximately 2.5 exabytes a day. That number doubles every month.
4) The Data Revolution is Here
We are in a data revolution. Data used to be viewed as a simple necessity and lower on the totem pole. Now it is more widely recognized as the source of truth. As we move into more complex systems of data management, the role of the data engineer becomes extremely important as a bridge between the DBA and the data consumer.
5) You Want to be Ahead of the Curve
The data engineer role is fairly new. While you're learning, building your skills and becoming certified, you are also among the first to be part of this burgeoning field. You know that the first to be certified means the first to be hired and the first to receive the top compensation package.
Thank you for your interest in Serverless Data Analysis with Big Query on Google's Cloud and we will see you in the course!!
In this lesson let's take a high-level look at what this course is about.
This is the same video as the course preview.
Let's talk about what you are going to learn in the course.
This course is all about one cloud service in Google's cloud called BigQuery.
In this lesson I'll try to answer some questions that are specific to the course.
In this lesson let's define the data engineer.
What do they do all day?
How can you become one?
Serverless simply means that the hardware is abstracted away from us.
With BigQuery, this means all we do is log into the console, upload some data and write a SQL query.
There's no hardware we have to configure.
What prompted Google to create BigQuery?
Let's learn the why in this lesson.
BigQuery uses SQL.
Let's discuss SQL in this lesson.
Data in BigQuery is not stored in rows but in columns.
Let's take a look at columnar storage.
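As a rough illustration of the row-versus-column idea, here is a minimal Python sketch. It is a toy model only and says nothing about how BigQuery is actually implemented; the sample data and the names `rows`, `to_columns` and `columns` are made up for the example.

```python
# Illustrative only: a toy comparison of row-oriented and column-oriented layouts.
# The sample data and every name below are hypothetical.

rows = [
    {"name": "alice", "state": "GA", "sales": 120},
    {"name": "bob", "state": "PA", "sales": 80},
    {"name": "carol", "state": "GA", "sales": 200},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every whole record is visited to total a single field.
total_from_rows = sum(record["sales"] for record in rows)

# Column layout: only the "sales" list is read; the other columns are never touched.
total_from_columns = sum(columns["sales"])

print(total_from_rows, total_from_columns)
```

Both totals agree, but the column layout only has to read the one field the query asks for, which is the intuition behind storing analytic data by column.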
BigQuery is a structured data store.
Let's talk about that structure in this lesson.
There are three core approaches to loading data into BigQuery.
Let's learn about them in this lesson.
BigQuery isn't a scaled up architecture.
It's a scaled out architecture.
Let's learn about scale in this lesson.
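To picture the difference, scaling out means splitting the work across many workers and combining their partial answers, rather than buying one bigger machine. The Python sketch below shows that pattern in miniature using threads on a single computer. It is an analogy under that assumption, not BigQuery's architecture, and `split` and `scaled_out_sum` are hypothetical helper names.

```python
# Illustrative only: the "split, work in parallel, combine" pattern behind scaling out.
# Threads on one machine stand in for many separate workers.
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Cut data into at most `parts` contiguous chunks of similar size."""
    size = max(1, -(-len(data) // parts))  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]


def scaled_out_sum(data, workers=4):
    """Sum each chunk on its own worker, then combine the partial sums."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, chunks))
    return sum(partials)


print(scaled_out_sum(list(range(1, 101))))
```

Adding capacity in this model means adding workers, while each worker stays the same size.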
BigQuery uses a global namespace.
Let's learn what that is in this lesson.
Jobs in BigQuery run asynchronously.
Let's learn about job basics in this lesson.
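The general shape of an asynchronous job is: submit the work, get a handle back right away, and check on the handle until the work is finished. Here is a minimal sketch of that flow using only Python's standard library. It does not use the BigQuery API; `run_query` is a hypothetical stand-in that just sleeps and returns a string.

```python
# Illustrative only: the submit / poll / fetch pattern of an asynchronous job.
# This uses Python's standard library, not any BigQuery client.
import time
from concurrent.futures import ThreadPoolExecutor


def run_query(sql):
    """Hypothetical stand-in for a long-running query."""
    time.sleep(0.1)
    return f"results for: {sql}"


executor = ThreadPoolExecutor(max_workers=1)

# Submitting returns a handle immediately; the query runs in the background.
job = executor.submit(run_query, "SELECT 1")

# The caller is free to do other work here, then poll until the job completes.
while not job.done():
    time.sleep(0.01)

result = job.result()
executor.shutdown()
print(result)
```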
In this lesson let's learn what BigQuery is not.
If you haven't taken the introductory course on Google's Cloud there will be knowledge gaps.
This lesson reinforces the importance of taking the courses in order.
This brief lesson will help define BigQuery visually.
We always need to keep in mind that in GCP everything is created within the confines of a project.
The graphical user interface for BigQuery is a Web interface.
Let's cover the very basics in this lesson.
Let's learn how to execute and save a query in this lesson.
As the query author you decide who receives access to it.
In this lesson let me show you how to grant access to users on a project.
In this lesson let's upload a dataset from our computer.
In this lesson let's learn the hierarchical naming convention for queries in BigQuery.
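As a small sketch of that hierarchy, a table is addressed by project, then dataset, then table. The helpers below build the reference in the two bracketing styles BigQuery SQL accepts: square brackets with a colon after the project for legacy SQL, and backticks with dots for standard SQL. The function names and the sample project, dataset and table names are hypothetical.

```python
# Illustrative only: building fully qualified table references.
# Function names and the sample identifiers are hypothetical.


def legacy_table_ref(project, dataset, table):
    """Legacy SQL style: [project:dataset.table]."""
    return f"[{project}:{dataset}.{table}]"


def standard_table_ref(project, dataset, table):
    """Standard SQL style: `project.dataset.table`."""
    return f"`{project}.{dataset}.{table}`"


legacy = legacy_table_ref("my-project", "sales", "orders")
standard = standard_table_ref("my-project", "sales", "orders")

print("SELECT COUNT(*) FROM " + legacy)
print("SELECT COUNT(*) FROM " + standard)
```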
In this short lesson let's learn how to export data via the cloud shell.
It's very straightforward.
In this lesson let's learn how pricing in BigQuery works.
In this lesson let's learn about data durability specific to BigQuery.
You can rent your own Dremel clusters.
Let's find out exactly what that means in this lesson.
In this lesson let's learn the basic structure of a BigQuery query.
In this lesson let's learn about the anatomy of a BigQuery query.
We will step line by line through the code.
In this lesson let's learn how to craft a simple join in BigQuery.
In this lesson let's cover the basics of the inner join in BigQuery.
In this lesson let's learn how to use ORDER BY and GROUP BY in BigQuery.
In this lesson let's learn the basics of the subquery in BigQuery.
We can easily join tables in BigQuery.
In this lesson let's learn how to do that.
In this lesson let's walk through a more advanced subquery.
Instead of using LIKE, let's take a look at CONTAINS in BigQuery in this brief lesson.
Used infrequently but powerful nevertheless, let's learn about windowing functions in this lesson.
The typical GROUP BY clause with a slight twist.
Let's learn about the EACH keyword in this lesson.
This is quite different than traditional SQL.
Let's learn about this clause in this lesson.
I've been a production SQL Server DBA most of my career.
I've worked with databases for over two decades. I've worked for or consulted with over 50 different companies as a full-time employee or consultant, including Fortune 500 companies as well as several small to mid-size companies. Some include: Georgia Pacific, SunTrust, Reed Construction Data, Building Systems Design, NetCertainty, The Home Shopping Network, SwingVote, Atlanta Gas and Light and Northrop Grumman.
Experience, education and passion
I learn something almost every day. I work with insanely smart people. I'm a voracious learner of all things SQL Server and I'm passionate about sharing what I've learned. My area of concentration is performance tuning. SQL Server is like an exotic sports car: it will run just fine in anyone's hands, but put it in the hands of a skilled tuner and it will perform like a race car.
Certifications are like college degrees: they are great starting points to begin learning. I'm a Microsoft Certified Database Administrator (MCDBA), Microsoft Certified System Engineer (MCSE) and Microsoft Certified Trainer (MCT).
Born in Ohio, raised and educated in Pennsylvania, I currently reside in Atlanta with my wife and two children.