Baseball Database Queries with SQL and dplyr

Explaining the relationship between SQL and dplyr with baseball examples.
Free tutorial
Rating: 4.7 out of 5 (571 ratings)
15,156 students
Baseball Database Queries with SQL and dplyr
Free tutorial
Rating: 4.7 out of 5 (571 ratings)
15,156 students
download the Lahman baseball database for Access
install dplyr and the Lahman baseball database package in R
select, filter, and order with SQL and dplyr
group and aggregate with SQL and dplyr
perform inner joins with SQL and dplyr

Requirements

  • Students will need to have R and RStudio installed on their own computers. (I will cover how to do this.)
  • It will be best if students also have Microsoft Access, but this is not absolutely necessary.
Description

In this course, we explain the relationship between SQL and the R package dplyr. I will show you how to query a baseball database with SQL in Microsoft Access and then show you how to do exactly the same thing with dplyr in R. We will begin with simple queries, progress to aggregation and grouping, and finish with queries involving joins. By the end of the course, you should be able to use dplyr to explore your own data sets.

At a relaxed pace, it should take about three weeks to complete the course. The course is for beginners in SQL, R, and dplyr. You also do not have to understand very much about baseball. We will be using the Lahman Baseball Database, R, dplyr, and Microsoft Access. I will show you how to install everything.

Who this course is for:
  • This course is for beginners who would like to learn about SQL and/or dplyr.
  • This course is for beginners interested in baseball analytics.
  • This course is NOT for those with extensive knowledge of both SQL and dplyr.
Course content
5 sections • 30 lectures • 3h 2m total length
  • Introduction
    01:05
  • Access Set-up
    05:55
  • RStudio Set-up
    07:10
  • The Rcpp Package
    01:00
  • SELECT (SQL) and select (dplyr)
    09:04
  • ORDER BY (SQL) and arrange (dplyr)
    04:58
  • WHERE (SQL) and filter (dplyr)
    07:00
  • AND and OR
    06:11
  • Grouping and Sum in Access SQL
    03:49
  • Grouping, Summarize, and Sum in dplyr
    05:33
  • Averaging in Access SQL and dplyr
    06:31
  • max and min
    05:33
  • count (SQL) and n (dplyr)
    03:40
  • WHERE vs. HAVING
    09:50
  • Batting Average and Mutate (dplyr)
    07:10
  • Career Batting Average
    10:54
  • Inner Joins with Access SQL
    07:18
  • Inner Joins with dplyr
    06:27
  • A Query with an INNER JOIN
    08:36
  • Joining on more than one field with SQL in Access
    06:22
  • Joining on more than one field with dplyr in R
    07:00
  • Joining three tables with SQL in Access
    05:07
  • Joining three tables with dplyr in R
    04:31
  • Grouping and joining with SQL in Access
    04:07
  • Grouping and joining with dplyr in R
    06:09
  • Problem #1 with SQL
    06:50
  • Problem #1 with dplyr
    07:23
  • Problem #2 with Access
    05:38
  • Problem #2 with dplyr
    04:10
  • Reading Other Data into R
    07:21

Instructor
Professor at Mercyhurst University
Charles Redmond
  • 4.6 Instructor Rating
  • 5,764 Reviews
  • 70,286 Students
  • 7 Courses

Dr. Charles Redmond is a professor in the Tom Ridge School of Intelligence Studies and Information Science at Mercyhurst University. He has been a member of the Department of Mathematics and Computer Systems at Mercyhurst for 21 years and has recently completed a term as chair of the department. Dr. Redmond received his PhD in mathematics from Lehigh University in 1993 and has published in the Annals of Applied Probability, the Journal of Stochastic Processes and Their Applications, Mathematics Magazine, the College Mathematics Journal, and Mathematics Teacher. In his spare time he enjoys making music and computer generated art, reading, and owning a Clumber Spaniel.