
Dr. Jay Zhou has taken part in 3 head-to-head competitions to build the best models for clients, and he won them all. His work has been used by top telecommunication companies and banks in America and Canada. His favorite tool is SQL. He is the author of the blog https://www.deep-data-mining.com, a Feedspot Top 30 Big Data Blogs Winner.
All slides used in the course are downloadable as a PDF file. The course will cover the following topics.
* Why SQL for Data Science?
* How to perform common tasks using SQL, including:
** Data Validation and Understanding
** Data Cleansing and Preparation
** Feature Variable Calculation
These tasks typically take 80% or more of the time in a data science project. The course will NOT cover predictive model building.
There are 2 SQL script files used for this course.
Script File 1. sql_for_ds_data_prep.sql. This file should be run first.
-- This script file will create 3 tables and populate them with data.
-- These tables are card_txn, sales and score.
-- If these tables exist, they will be dropped first.
-- There are 110, 9 and 817 records in tables card_txn, sales and score, respectively.
Script File 2. sql_for_ds.sql. This file contains the SQL queries presented in the course. You can load them into a SQL client such as SQL Developer and run them.
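After running the first script file, it may be worth confirming that the three tables were loaded as expected. The query below is a minimal sketch of such a check and is not taken from the course scripts. It assumes only the table names given above, and the column aliases table_name and num_records are chosen here for illustration.

```sql
-- Count the records in each table created by sql_for_ds_data_prep.sql.
-- According to the description above, the counts should be
-- 110 (card_txn), 9 (sales) and 817 (score).
SELECT 'card_txn' AS table_name, COUNT(*) AS num_records FROM card_txn
UNION ALL
SELECT 'sales' AS table_name, COUNT(*) AS num_records FROM sales
UNION ALL
SELECT 'score' AS table_name, COUNT(*) AS num_records FROM score;
```

If a count differs from the numbers listed above, rerunning the first script file should restore the tables, since it drops them before recreating them.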
All slides used in the course are in the downloadable PDF file "SQL for Data Science.pdf" below.
In this course, Dr. Jay Zhou, an industry practitioner and competition winner, will share his Oracle SQL skills and best practices for performing typical data science/data analytics tasks. Hopefully, after taking the course, you will become a better data scientist/data analyst. In your future projects, you will be more efficient, make fewer mistakes, manage your data and scripts better, and be stress-free when performing complex data work. The course will NOT cover predictive model building.