Baseball Database Queries with SQL and dplyr

Explaining the relationship between SQL and dplyr with baseball examples.
4.5 (117 ratings) Instead of using a simple lifetime average, Udemy calculates a
course's star rating by considering a number of different factors
such as the number of ratings, the age of ratings, and the
likelihood of fraudulent ratings.
3,442 students enrolled
Free
Start Learning Now
  • Lectures 30
  • Length 3 hours
  • Skill Level Beginner Level
  • Languages English
  • Includes Lifetime access
    30 day money back guarantee!
    Available on iOS and Android
    Certificate of Completion
Wishlisted Wishlist

How taking a course works

Discover

Find online courses made by experts from around the world.

Learn

Take your courses with you and learn anywhere, anytime.

Master

Learn and practice real-world skills and achieve your goals.

About This Course

Published 6/2015 English

Course Description

In this course, we explain the relationship between SQL and the R package dplyr. I will show you how to query a baseball database with SQL in Microsoft Access and then show you how to do exactly the same thing with dplyr in R. We will begin with simple queries, progress to aggregation and grouping, and finish with queries involving joins. By the end of the course, you should be able to use dplyr to explore your own data sets.

At a relaxed pace, it should take about three weeks to complete the course. The course is for beginners in SQL, R, and dplyr. You also do not have to understand very much about baseball. We will be using the Lahman Baseball Database, R, dplyr, and Microsoft Access. I will show you how to install everything.

What are the requirements?

  • Students will need to have R and RStudio installed on their own computers. (I will cover how to do this.)
  • It will be best if students also have Microsoft Access, but this is not absolutely necessary.

What am I going to get from this course?

  • download the Lahman baseball database for Access
  • install dplyr and the Lahman baseball database package in R
  • select, filter, and order with SQL and dplyr
  • group and aggregate with SQL and dplyr
  • perform inner joins with SQL and dplyr

What is the target audience?

  • This course is for beginners who would like to learn about SQL and/or dplyr.
  • This course is for beginners interested in baseball analytics.
  • This course is NOT for those with extensive knowledge of both SQL and dplyr.

What you get with this course?

Not for you? No problem.
30 day money back guarantee.

Forever yours.
Lifetime access.

Learn on the go.
Desktop, iOS and Android.

Get rewarded.
Certificate of completion.

Curriculum

Section 1: Getting Set Up
01:05

This is our course intro.

05:55

After viewing this lecture, you will know how to download the Access version of the Lahman Baseball Database and how to set it up for querying.

07:10

After viewing this lecture, you will be able to download R and RStudio and install R packages. You will also gain a basic familiarity with the RStudio interface.

01:00

After viewing this lecture, you will be able to install the Rcpp package.

Section 2: Basic Queries
09:04

After viewing this lecture, you will be able to use SQL in Access and dplyr in R to select columns of interest from a table or data frame.

04:58

After viewing this lecture, you will be able to use SQL in Access and dplyr in R to order records by a particular column.

07:00

After viewing this lecture, you will be able to use SQL in Access and dplyr in R to filter records according to certain criteria.

06:11

After viewing this lecture, you will be able to employ more complicated filters with both SQL and dplyr.

Section 3: Grouping and Aggregation
03:49

After viewing this lecture, you will be able to group records and sum a particular field for a grouping with SQL in Access.

05:33

After viewing this lecture, you will be able to group records and sum a particular field for a group with

dplyr in R.

06:31

After viewing this lecture, you will be able to use the avg (SQL) and mean (dplyr) aggregate functions.

05:33

After viewing this lecture, you will be able to calculate the max and min values of a field for a group, with either SQL or dplyr.

03:40

After viewing this lecture, you will be able to count the number of records in a group, with either SQL or dplyr.

09:50

After viewing this lecture, you will be able to filter before and after grouping, in either SQL or dplyr.

07:10
After viewing this lecture, you will be able to calculate batting averages for individual records with SQL or with dplyr. You will also learn how to use the mutate verb in dplyr.
10:54

In this lecture, we calculate career batting averages with both SQL and dplyr.

Section 4: Inner Joins
07:18

After viewing this lecture, you will be able to perform INNER JOINS with SQL in Access.

06:27

After viewing this lecture, you will be able to perform INNER JOINS with dplyr in R.

08:36

In this video, we practice with INNER JOINS.

06:22

After viewing this lecture, you will be able to perform an INNER JOIN with SQL in Access on more than one field.

07:00

After viewing this lecture, you will be able to perform an INNER JOIN with dplyr in R on more than one field.

05:07

After viewing this lecture, you will be able to join three tables with SQL in Access.

04:31

After viewing this lecture, you will be able to join three tables with dplyr in R.

04:07

After viewing this lecture, you will be able to write queries where you are both grouping records and joining tables with SQL in Access.

06:09

After viewing this lecture, you will be able to write queries where you are both grouping records and joining tables with dplyr in R.

Section 5: Practice Problems
06:50

This is our first practice problem, solved with SQL.

07:23

This is our first practice problem, solved with dplyr.

05:38

This is our second practice problem, solved with SQL.

04:10

This is our second practice problem, solved with dplyr.

07:21

After viewing this lecture, you will be able to read csv files into R and query them with dplyr.

Students Who Viewed This Course Also Viewed

  • Loading
  • Loading
  • Loading

Instructor Biography

Charles Redmond, Professor at Mercyhurst University

Dr. Charles Redmond is a professor in the Tom Ridge School of Intelligence Studies and Information Science at Mercyhurst University. He has been a member of the Department of Mathematics and Computer Systems at Mercyhurst for 21 years and has recently completed a term as chair of the department. Dr. Redmond received his PhD in mathematics from Lehigh University in 1993 and has published in the Annals of Applied Probability, the Journal of Stochastic Processes and Their Applications, Mathematics Magazine, the College Mathematics Journal, and Mathematics Teacher. In his spare time he enjoys making music and computer generated art, reading, and owning a Clumber Spaniel.

Ready to start learning?
Start Learning Now