
Please make sure to check-out this lecture! Also, I've provided a reference of all tables we're going to use in a .txt file.
Learn how to set up Google BigQuery for data analytics by pinning the class project, navigating the SQL workspace, and accessing the sample data set and customers table.
Don't skip this! Also please find the list of samples in a downloadable text file if you'd like to keep them handy.
Take advantage of this new technology, especially for getting unstuck!
Don't worry about the details, just try to focus on how I use the tool to learn something.
Explore how to interpret tables and columns, write basic select queries in BigQuery, use where and order by clauses, and understand rows, columns, and data types to generate insights.
Explore the anatomy of a select statement in BigQuery, learning how to choose columns, alias tables, join related data, and apply where filters for clean analytics.
Write select statements to count how many questions each tag has in a StackOverflow dataset, using group by, distinct counts of question IDs, and ordering by tag.
Learn to identify questions with multiple tags in BigQuery by aggregating tags into arrays, counting distinct tags with array length, and using group by and having.
Explore how to treat subqueries and common table expressions as building blocks in BigQuery, using where clauses and array length to filter questions by tags.
Analyze how to obtain accurate views for questions with Python in the title by handling repeated tags and using a distinct subquery to sum quarter views.
Extract all questions per tag from the stack overflow data set in BigQuery, deduplicate with distinct, compute per-tag counts with subqueries and a CTE, and identify the top tag.
Use a SQL case statement to classify stack overflow question titles into languages, build a language column, and count distinct titles to reveal popularity in BigQuery.
Apply case statements to map post titles and tags to languages, aggregate distinct tags into a single field, and produce one row per document for BigQuery analysis.
Explore language growth on Stack Overflow data using SQL to measure year-over-year views by extracting year, summing views, and applying window functions.
Learn to use Python to generate larger SQL statements dynamically by building a case statement from a languages list using f-strings. Apply approach by assembling substatements and handling else null.
Explore inner, left, right, and full outer joins in BigQuery, learning how to write select statements across multiple tables, handle nulls with coalesce, and visualize join behavior.
Apply inner and left joins to relate orders and customers in a real dataset, count orders and customers, and analyze statuses and year-month trends with date functions.
Analyze most popular cities by joining orders and customers, count city orders, and use row_number window functions to analyze customer purchasing sequences.
Identify new customers by year and month using a row_number window over order timestamp partitioned by customer_id to isolate first orders, then count by year and month with group by.
Explore how to use ROW_NUMBER() as a window function to count the nth purchase per product, joining order items with orders and partitioning by product ID.
Apply lag to compute previous order timestamps per product in the order_items data, then use date_diff to measure days between orders and enrich with product category information.
Discover how to compute the average days between orders for each product using row_number and lag, filter non-null intervals, and date diff in BigQuery.
Use row_number and lag in BigQuery to identify customers with multiple orders, use lag to get previous order dates, exclude single-order customers, and compute the average days between orders.
Discover how the full outer join returns all records from two tables on customer id, revealing overlaps and gaps that inner or left joins miss.
Learn to use a full outer join to identify all records from two tables, categorize each row as left-only, right-only, or in the intersection, and label with case statements.
Analyze hourly revenue trends with time series analysis by joining orders to order payments, summing payment values across installments, and extracting purchase hour to reveal peak hours.
Use case statements to classify hourly sales into day parts—overnight, morning, afternoon, evening—and compute revenue by time of day in BigQuery, enabling flexible revenue analytics.
Learn a fun trick with case statements to create conditional sums and columns in SQL, using overnight, morning, and afternoon sales as examples.
Learn how to unnest repeated fields in BigQuery, generate array values, and compute running totals using a correlated subquery pattern with inner and outer queries.
Learn to compute moving averages in BigQuery using an over clause with order by index and preceding rows, and perform a quick comprehension check with lag to detect price trends.
Apply a correlated subquery to approximate lag on a base table by taking the max dollars where the outer index is greater, comparing with the lag function.
Learn to build a customer analytics table from the dvd rental payments data, capturing first order date and total revenue, then derive first-order revenue with window functions and joins.
Analyze correlated subqueries to measure customer lifetime value within the first 30 and 60 days after the first payment, using common table expressions and revenue summaries.
Analyze stock prices in BigQuery by computing 50-day and 200-day moving averages for Google's closing price, using window functions and a CTE to generate a buy signal from crossovers.
Learn to compute, for each customer, their top two movie ratings by rental revenue using joins across payments, rentals, inventory, and film, with partition by customer and row_number.
Rank each customer's rentals by revenue within each rating to surface the top two, using partition by customer id and rating and ordering by rental revenue descending, and applying having.
Explains using multiple fields in a partition by clause with row_number to track top ratings and revenue by customer and month, illustrating shifting preferences over time for forecasting and offerings.
Explore text processing in BigQuery by extracting words from Stack Overflow titles using regex and split, then unnest the results to count common words per tag.
Analyze top occurring unigrams in stack overflow titles by tag using big query. Normalize words, remove stop words, and extract the top three words per tag.
Explore natural language processing with BigQuery ML N-grams to generate unigrams, bigrams, and trigrams from text, extract common phrases, and rank them by frequency for data analytics.
Woah, another SQL course? Yes! Here's why this one is different.
We write 100% of the code together and I explain everything precisely and with abundant context. I have over a decade of industry experience over have taught this at the university level.
100% we do is application based. I rarely use toy data to illustrate points, unless it's more illustrative. Hopefully you'll learn much more than SQL throughout the course.
ZERO SET-UP. As long as you have a Google Account, you can login to BigQuery and get started immediately. No headaches configuring databases locally. You'll be up and running in under 3 minutes.
Spaced repetition to develop mastery. This is NOT a table of contents course. This is NOT a list of disparate exercises. Everything is connected. Concepts are revisited throughout the course so you can see them from different angles and maximize understanding.
Can you solve it? I provide tons of mini-challenges throughout the lecture material. The lecture material is basically us solving the problems. No time wasted on theory without context.
I'm not boring. I am human. I do make mistakes. I dwell on them so you can master the debugging process. Debugging is much more important than writing code.