Created by 365 Careers

Last updated 5/2026

English

What you'll learn

How to use Python, SQL, and Tableau together
Software integration
Data preprocessing techniques
Apply machine learning
Create a module for later use of the ML model
Connect Python and SQL to transfer data from Jupyter to Workbench
Visualize data in Tableau
Analysis and interpretation of the exercise outputs in Jupyter and Tableau

Course content

10 sections • 95 lectures • 5h 26m total length

What Does the Course Cover?3:55
Explore software integration using Python, SQL, and Tableau through a case-study workflow: from data connectivity and preprocessing to logistic regression, database integration, and Tableau visualization.

Properties and Definitions: Data, Servers, Clients, Requests and Responses4:44
Explore core concepts behind data, servers, clients, requests, and responses, including how databases store information, how web and database servers interact in the client-server model, and how browsers request data.
Properties and Definitions: Data Connectivity, APIs, and Endpoints7:04
Explore how data connectivity links clients and servers through APIs and endpoints, enabling real-time data transfer and access to data assets via apps and developers.
Further Details on APIs8:06
Explore how APIs act as gateways between clients and servers, with endpoint-driven access to data, enabling multiple services and faster, targeted information retrieval across apps.
Text Files as Means of Communication4:21
Explore how text files enable cross-language data exchange via APIs, using json as the common format to connect apps and servers across languages.
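As a minimal sketch of the idea, the round trip below serializes a record to JSON text and parses it back. The record and its field names are hypothetical and are not taken from the course.

```python
import json

# Hypothetical record a server might return; the field names are illustrative only.
record = {"employee_id": 17, "age": 33, "absent_hours": 4}

# Serialize to a JSON string: plain text that any language can parse.
payload = json.dumps(record)

# The receiving side parses the text back into a native structure.
restored = json.loads(payload)
```

Because the payload is ordinary text, the sender and the receiver do not need to be written in the same language.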
Definitions and Applications5:25
Define integration in programming as both cross-system communication and multi-tool unification, then show how SQL, Python, and Tableau drive data analysis and visualization.

Setting Up the Environment - An Introduction (Do Not Skip, Please)!0:51
Install Anaconda, Python, and Jupyter Notebook to set up your data science environment and learn how we will install packages and code in Python.
Why Python and why Jupyter?4:59
Python proves ideal for data science: open-source, cross-platform, general-purpose, and high-level with easy syntax and rich packages; Jupyter links Python to a browser-based notebook for integrated code, text, and output.
Installing Anaconda3:34
Install Anaconda to get Python, Jupyter notebook, and data science packages; learn to download, install, and launch Anaconda Navigator and Jupyter dashboard across Windows, Mac, and Linux.
Intro to Using Jupyter4:53
Explore the Jupyter Notebook dashboard, the app’s starting point, noting occasional updates, and manage folders, files, and notebooks by renaming, deleting, and uploading or creating new notebooks.
Jupyter - Working with Notebook Files4:30
Explore the Jupyter notebook shell, learn about cells, and switch between command mode and edit mode to write, run, and view code outputs.
Jupyter - Using Shortcuts7:24
Master Jupyter shortcuts to speed up coding by running cells and navigating in command mode. Collapse or expand code, show line numbers with Shift+L, and convert between code and markdown.
Jupyter Shortcuts0:09
Jupyter - Handling Error Messages5:52
Learn to read and interpret Python error messages in Jupyter, identify name errors, fix typos, and Google solutions by adjusting variables and code accordingly.
Jupyter - Restarting the Kernel2:03
Learn how to restart the Jupyter kernel and clear outputs. Run all cells from the first to the last and handle errors as you verify code execution.
The Jupyter Dashboard
Installing sklearn1:16
Install scikit-learn with pip in the Anaconda prompt, leveraging Anaconda’s bundled NumPy and Pandas, and get ready to run machine learning experiments for the course.
Installing Packages - Exercise0:09
Installing Packages - Solution0:12

Up Ahead4:08
Explore integrating SQL, Python, and Tableau to manage data like a business analyst, build predictive insights with logistic regression, and present findings through clear visualizations.
Real-Life Example: Absenteeism at Work2:48
Predict work absenteeism from data on factors like distance to work, family and education, to estimate hours away and guide productivity improvements.
Real-Life Example: The Dataset3:18
Explore a real-world dataset on predicting absenteeism at work, distinguish primary and secondary data, and practice data pre-processing to transform raw data into analysis-ready information.
Important Notice Regarding Datasets0:37

What to Expect from the Next Couple of Sections1:39
Data Sets in Python3:23
Import the pandas library as pd, load a csv into a pandas data frame with read_csv, and inspect raw_csv_data, noting quote usage and file paths.
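A small sketch of this step, assuming a tiny inline CSV in place of the course file (the column names and values below are made up for illustration). In the course the argument to read_csv would be a quoted file path instead of a StringIO object.

```python
import io
import pandas as pd

# Stand-in for the CSV file on disk; contents are illustrative, not the real dataset.
csv_text = (
    "ID,Reason for Absence,Date,Absenteeism Time in Hours\n"
    "11,26,07/07/2015,4\n"
    "36,0,14/07/2015,0\n"
)

# read_csv accepts a path string or any file-like object.
raw_csv_data = pd.read_csv(io.StringIO(csv_text))
```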
Data at a Glance5:53
Inspect column names and zero-based indices to preview data, copy the initial dataset to df for safe pre-processing, and use display options plus info to confirm no missing values.
A Note on Our Usage of Terms with Multiple Meanings3:27
ARTICLE - A Brief Overview of Regression Analysis1:51
Picking the Appropriate Approach for the Task at Hand2:17
Removing Irrelevant Data6:27
Drop the id column as it is a nominal identifier that does not explain absenteeism time, and assign the result to df to make the change permanent.
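One way this could look, using a made-up three-row frame (the column names are assumptions). Note that drop returns a new frame, so the result has to be assigned back to df for the change to persist.

```python
import pandas as pd

# Illustrative data only; the real dataset has more columns and rows.
raw_csv_data = pd.DataFrame({
    "ID": [11, 36, 3],
    "Age": [33, 50, 38],
    "Absenteeism Time in Hours": [4, 0, 2],
})

# Work on a copy so the raw data stays untouched.
df = raw_csv_data.copy()

# drop() does not modify df in place; reassign to keep the result.
df = df.drop(["ID"], axis=1)
```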
EXERCISE - Removing Irrelevant Data0:25
SOLUTION - Removing Irrelevant Data0:01
Examining the Reasons for Absence5:04
Extract the reason for absence from the data frame and use pandas unique, min, max, and len to reveal distinct values and detect a missing number.
Splitting a Column into Multiple Dummies8:37
Convert the categorical nominal reason for absence column into 28 dummy variables with pandas get_dummies, validate that rows have a single reason, and drop the check column.
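A compact sketch of the dummy split and the validation step, on a made-up series of five reason codes (the real column yields 28 dummies according to the description above).

```python
import pandas as pd

# Illustrative reason codes; not the real data.
reasons = pd.Series([1, 7, 23, 0, 23], name="Reason for Absence")

# One column per distinct reason code.
reason_columns = pd.get_dummies(reasons)

# Each row should have exactly one reason, so the row sums should all be 1.
reason_columns["check"] = reason_columns.sum(axis=1)
all_single = bool((reason_columns["check"] == 1).all())

# The check column has served its purpose and can be dropped.
reason_columns = reason_columns.drop(["check"], axis=1)
```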
EXERCISE - Splitting a Column into Multiple Dummies0:04
SOLUTION - Splitting a Column into Multiple Dummies
ARTICLE - Dummy Variables: Reasoning1:32
Dummy Variables and Their Statistical Importance1:28
Learn how to drop the first dummy variable in Python with get_dummies(drop_first=True) to avoid multicollinearity, preparing data for grouping in upcoming analysis.
Grouping - Transforming Dummy Variables into Categorical Variables8:35
Group dummy variables into categorical classes to reduce dimensionality and avoid multi-collinearity, drop the original reason column, and use loc and max to create four reason-type groups.
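The grouping can be sketched as follows. The group boundaries (1-14, 15-17, 18-21, 22 and above) and the sample codes are assumptions made for illustration; the reindex call is only there so that the toy data has every column from 1 to 28.

```python
import pandas as pd

# Illustrative reason codes; not the real data.
reasons = pd.Series([1, 7, 16, 23, 0, 22])

# drop_first=True removes the dummy for reason 0, which becomes the baseline.
dummies = pd.get_dummies(reasons, drop_first=True).astype(int)

# Make sure all columns 1..28 exist in this toy example.
dummies = dummies.reindex(columns=list(range(1, 29)), fill_value=0)

# A row belongs to a group if any dummy in that label range equals 1.
reason_type_1 = dummies.loc[:, 1:14].max(axis=1)
reason_type_2 = dummies.loc[:, 15:17].max(axis=1)
reason_type_3 = dummies.loc[:, 18:21].max(axis=1)
reason_type_4 = dummies.loc[:, 22:].max(axis=1)
```

Note that loc slices by label and includes both endpoints, so 1:14 covers columns 1 through 14.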
Concatenating Columns in Python4:35
Concatenate the reason type columns to the main dataframe with pandas pd.concat using axis=1, rename the columns for clarity, and preview the updated df with head.
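A minimal sketch with two reason-type series instead of four; the frame contents and the new column names are assumptions.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({"Age": [33, 50, 38], "Pets": [1, 0, 0]})
reason_type_1 = pd.Series([1, 0, 0])
reason_type_2 = pd.Series([0, 0, 1])

# axis=1 places the series side by side with df as new columns.
df = pd.concat([df, reason_type_1, reason_type_2], axis=1)

# Unnamed series arrive as columns 0 and 1, so rename for clarity.
df.columns = ["Age", "Pets", "Reason_1", "Reason_2"]

preview = df.head()
```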
EXERCISE - Concatenating Columns in Python0:04
SOLUTION - Concatenating Columns in Python0:01
Changing Column Order in Pandas DataFrame1:43
Learn how to reorder columns in a pandas data frame by creating a column_names_reordered list, moving the last four columns to the front, and applying the new order.
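Sketched on a small frame with two trailing reason columns rather than four (names and values are assumptions):

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({
    "Age": [33, 50],
    "Pets": [1, 0],
    "Reason_1": [1, 0],
    "Reason_2": [0, 1],
})

# List the columns in the desired order, trailing reason columns first.
column_names_reordered = ["Reason_1", "Reason_2", "Age", "Pets"]

# Indexing with a list of names returns the columns in that order.
df = df[column_names_reordered]
```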
EXERCISE - Changing Column Order in Pandas DataFrame0:06
SOLUTION - Changing Column Order in Pandas DataFrame0:12
Implementing Checkpoints in Coding2:52
Learn how to create checkpoints in Python and Jupyter by copying the current df state, using a named variable like df_reason_mod to safely test data preprocessing steps.
EXERCISE - Implementing Checkpoints in Coding0:04
SOLUTION - Implementing Checkpoint in Coding
Exploring the Initial "Date" Column7:48
Using the "Date" Column to Extract the Appropriate Month Value7:00
Introducing "Day of the Week"3:36
Create a day of the week column from a date column using date_to_weekday and apply to the data frame, illustrating zero to six weekday values and prep for SQL transfer.
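A possible shape for this step, assuming the dates are stored as day/month/year strings (the format and the sample dates are assumptions). The helper name date_to_weekday follows the description above.

```python
import pandas as pd

# Illustrative dates; the day/month/year format is an assumption.
df = pd.DataFrame({"Date": ["07/07/2015", "14/07/2015", "15/07/2015"]})
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")

def date_to_weekday(date_value):
    # Timestamp.weekday() returns 0 for Monday through 6 for Sunday.
    return date_value.weekday()

df["Day of the Week"] = df["Date"].apply(date_to_weekday)
```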
EXERCISE - Removing Columns0:37
Further Analysis of the DataFrame: Next 5 Columns3:17
Analyze the next five dataframe columns—transportation expense, distance to work, age, daily workload average, and body mass index—highlighting data types, rounding, and their role in regression analysis of absenteeism.
Further Analysis of the DataFrame: "Education", "Children", "Pets"4:38
Transform the education column into a dummy variable with map and a dictionary after computing its unique values and counts, while leaving children and pets unchanged.
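A short sketch of the mapping. The specific encoding (category 1 becomes 0, all other categories become 1) and the sample values are assumptions for illustration.

```python
import pandas as pd

# Illustrative values only.
df = pd.DataFrame({"Education": [1, 1, 3, 2, 4, 1]})

# Inspect the distinct categories and how often each occurs.
unique_values = df["Education"].unique()
counts = df["Education"].value_counts()

# Collapse the four categories into a single 0/1 dummy.
df["Education"] = df["Education"].map({1: 0, 2: 1, 3: 1, 4: 1})
```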
A Final Note on Preprocessing1:59
Master preprocessing in Python for absenteeism time in hours data. Create a df_preprocessed checkpoint from df_reason_date_mod and prepare the frame for statistical analysis before transferring to MySQL workbench.
A Note on Exporting Your Data as a *.csv File0:26

Exploring the Problem from a Machine Learning Point of View3:20
Explore framing a machine learning problem with a clean notebook, compare logistic regression and random forest, and predict absenteeism from features like reason for absence, workload, and distance to work.
Creating the Targets for the Logistic Regression6:32
Use logistic regression to classify absenteeism into two classes by using the median hours as the cutoff, creating a zeros-and-ones targets array.
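The target construction can be sketched like this, on made-up hours. Values strictly above the median are labeled 1, the rest 0.

```python
import numpy as np
import pandas as pd

# Illustrative absenteeism hours; not the real data.
hours = pd.Series([4, 0, 2, 8, 40, 3, 1], name="Absenteeism Time in Hours")

# The median serves as the cutoff between the two classes.
median_hours = hours.median()

# 1 marks above-median absence, 0 marks everything else.
targets = np.where(hours > median_hours, 1, 0)
```

Using the median as the cutoff tends to produce roughly balanced classes, which helps a classifier such as logistic regression.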
Selecting the Inputs2:41
Choose the regression inputs with pandas iloc, selecting all rows and the first fourteen columns while excluding the target column; store the result as unscaled_inputs for later scaling.
A Bit of Statistical Preprocessing3:26
Standardize data with a StandardScaler by fitting on unscaled inputs and transforming data to subtract the mean and divide by the standard deviation, producing scaled_inputs with 700 observations and 14 features.
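A minimal sketch of the fit-then-transform pattern on a made-up 4-by-2 array (the course data has 700 rows and 14 columns):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative inputs only.
unscaled_inputs = np.array([
    [289.0, 36.0],
    [118.0, 13.0],
    [179.0, 51.0],
    [279.0, 5.0],
])

scaler = StandardScaler()

# fit() learns the per-column mean and standard deviation.
scaler.fit(unscaled_inputs)

# transform() subtracts the mean and divides by the standard deviation.
scaled_inputs = scaler.transform(unscaled_inputs)
```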
Train-test Split of the Data6:12
Learn to prevent overfitting by splitting data into train and test sets using train_test_split. Control train_size and shuffle, and fix randomness with random_state for reproducible results.
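A sketch of the split on placeholder arrays. The 80/20 proportion and the random_state value are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder inputs and targets: 20 observations, 2 features.
inputs = np.arange(40).reshape(20, 2)
targets = np.array([0, 1] * 10)

# random_state fixes the shuffle so the split is the same on every run.
x_train, x_test, y_train, y_test = train_test_split(
    inputs, targets, train_size=0.8, shuffle=True, random_state=20
)
```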
Training the Model and Assessing its Accuracy5:39
Train a logistic regression with sklearn, assess accuracy on training data using both score and a manual comparison method showing 80%.
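The two ways of measuring accuracy can be sketched as below. The training data is synthetic, so the resulting accuracy will not match the figure quoted above.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic training data for illustration only.
rng = np.random.default_rng(0)
x_train = rng.normal(size=(100, 3))
y_train = (x_train[:, 0] + 0.5 * x_train[:, 1] > 0).astype(int)

reg = LogisticRegression()
reg.fit(x_train, y_train)

# Built-in accuracy.
score_accuracy = reg.score(x_train, y_train)

# Manual accuracy: share of predictions that match the targets.
model_outputs = reg.predict(x_train)
manual_accuracy = np.sum(model_outputs == y_train) / model_outputs.shape[0]
```

Both numbers should agree, since score computes mean accuracy for classifiers.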
Extracting the Intercept and Coefficients from a Logistic Regression5:16
Extract the intercept and coefficients from the logistic regression, align them with feature names from unscaled_inputs, and build a summary table for use in Tableau.
Interpreting the Logistic Regression Coefficients6:14
Interpret logistic regression coefficients as weights or log odds, convert to odds ratios, and assess feature importance using standardized coefficients and the base model with reason zero as baseline.
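A sketch of building the summary table and converting coefficients to odds ratios. The data is synthetic and the feature names are placeholders; the column labels of the table are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Placeholder feature names and synthetic data.
feature_names = ["Transportation Expense", "Age", "Pets"]
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 3))
y = (x[:, 0] - x[:, 2] > 0).astype(int)

reg = LogisticRegression().fit(x, y)

# coef_ has shape (1, n_features) for a binary problem; intercept_ has shape (1,).
summary_table = pd.DataFrame({
    "Feature name": feature_names,
    "Coefficient": reg.coef_[0],
})
intercept_row = pd.DataFrame({
    "Feature name": ["Intercept"],
    "Coefficient": [reg.intercept_[0]],
})
summary_table = pd.concat([intercept_row, summary_table], ignore_index=True)

# Exponentiating a log-odds coefficient gives the odds ratio.
summary_table["Odds_ratio"] = np.exp(summary_table["Coefficient"])
summary_table = summary_table.sort_values("Odds_ratio", ascending=False)
```

An odds ratio near 1 (coefficient near 0) suggests the feature contributes little to the prediction.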
Omitting the dummy variables from the Standardization4:12
Omit dummy variables from standardization with a CustomScaler, preserving interpretability while selectively scaling features; dummies stay untouched, with a minor accuracy drop and clearer coverage of absenteeism reasons.
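One way such a scaler might be written is shown below. This is a sketch under assumptions, not the course's own CustomScaler: it standardizes only the named columns and passes the others through unchanged. The data and column names are made up.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.preprocessing import StandardScaler

class CustomScaler(BaseEstimator, TransformerMixin):
    """Standardize only the listed columns of a DataFrame."""

    def __init__(self, columns):
        self.columns = columns

    def fit(self, X, y=None):
        # Learn mean and standard deviation for the selected columns only.
        self.scaler_ = StandardScaler()
        self.scaler_.fit(X[self.columns])
        return self

    def transform(self, X):
        scaled = pd.DataFrame(
            self.scaler_.transform(X[self.columns]),
            columns=self.columns,
            index=X.index,
        )
        not_scaled = X.drop(columns=self.columns)
        # Restore the original column order.
        return pd.concat([not_scaled, scaled], axis=1)[X.columns]

# Illustrative inputs: one dummy column and two numeric columns.
unscaled_inputs = pd.DataFrame({
    "Reason_1": [0, 1, 0, 1],
    "Age": [33, 50, 38, 39],
    "Pets": [1, 0, 0, 2],
})

absenteeism_scaler = CustomScaler(columns=["Age", "Pets"])
scaled_inputs = absenteeism_scaler.fit_transform(unscaled_inputs)
```

Leaving the dummies as 0 and 1 keeps their coefficients interpretable as the effect of switching a reason on or off.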
Interpreting the Important Predictors5:10
Identify the key predictors of excessive absence, including poisoning, diseases, pregnancy, transportation expense, and pets, and compare standardized vs. unstandardized models for accuracy and interpretability.
Simplifying the Model (Backward Elimination)4:02
Apply backward elimination to simplify a regression model, removing near-zero contributors like day of the week, daily workload average, and distance to work, then verify the simpler model remains accurate.
Testing the Machine Learning Model4:43
Test the model on unseen data to reach about 74% accuracy, compare with 77% training accuracy, and use predict_proba for class probabilities; save the model for SQL and Tableau use.
How to Save the Machine Learning Model and Prepare it for Future Deployment4:06
Save and deploy machine learning models by pickling the trained logistic regression and the pre-processing scaler, then load them in a new notebook for consistent predictions.
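A minimal pickling round trip, using a toy model and a temporary directory. The file name "model" is an assumption; the same pattern would apply to the scaler.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy model for illustration only.
x = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])
reg = LogisticRegression().fit(x, y)

# Hypothetical file location inside a temporary directory.
model_path = os.path.join(tempfile.mkdtemp(), "model")

# Save: "wb" opens the file for writing bytes.
with open(model_path, "wb") as file:
    pickle.dump(reg, file)

# Load: "rb" opens the file for reading bytes.
with open(model_path, "rb") as file:
    reg_loaded = pickle.load(file)
```

Pickle files should only be loaded from trusted sources, and ideally with the same library versions that were used to save them.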
ARTICLE - More about 'pickling'1:13
EXERCISE - Saving the Model (and Scaler)0:13
Creating a Module for Later Use of the Model4:04
Deploy a machine learning model by saving it and building a reusable Python module to load and clean data and predict probabilities and categories with an absenteeism_model class.

Installing MySQL9:27
Install MySQL Workbench and MySQL Server via the community installer, select custom features, and set a root password for startup; the video notes cross-platform steps and version considerations.
Additional Note - Installing Visual C0:22
Installing MySQL on macOS and Unix systems1:24
Setting Up a Connection2:34
Set up and manage a MySQL Workbench connection between the GUI and the MySQL server, test the connection, and navigate multiple connections including the default one.
Introduction to the MySQL Interface5:09
Explore the MySQL Workbench interface to write and run SQL queries, view results in the result grid, and monitor script output while managing schemas and SQL scripts.

Are you sure you're all set?0:13
Implementing the 'absenteeism_module' - Part I3:50
Learn to run the absenteeism module in Python using a local custom module in Jupyter, with five required files: notebook, csv, the module file, and the supplementary model and scaler.
Implementing the 'absenteeism_module' - Part II6:23
Import the absenteeism module, load and clean the new data with Absenteeism_new_data, and run predicted_outputs to generate absenteeism probabilities.
Creating a Database in MySQL6:37
Create and configure a MySQL database named predicted_outputs using drop if exists and use commands, then connect from Python to enable integration with SQL, Python, and Tableau.
Importing and Installing 'pymysql'2:44
Install PyMySQL with pip, import PyMySQL in Jupyter, and establish a bridge between MySQL Workbench and Python to connect to the predicted outputs database.
Creating a Connection and Cursor2:54
Open a MySQL connection from a Jupyter notebook with PyMySQL.connect, creating a conn variable for the database, then instantiate a cursor to run SQL in Python.
EXERCISE - Create 'df_new_obs'0:10
Creating the 'predicted_outputs' table in MySQL4:52
Drop any existing predicted_outputs table in MySQL, create it with not null constraints, and assign bit, int, and float data types for Python and Tableau compatibility.
Running an SQL SELECT Statement from Python3:04
Learn how to run a select statement from Python using a cursor, execute the query, and handle empty predicted_outputs_table while guarding against risky variable queries.
Transferring Data from Jupyter to Workbench - Part I6:15
Move data from Python to MySQL using a single multi-row insert, storing new predictions in df_new_obs and inserting into the predicted_outputs table in Workbench for efficient data transfer.
Transferring Data from Jupyter to Workbench - Part II6:35
Create insert query by looping over df_new_obs rows and columns, extracting values and converting them to strings. Remove trailing commas and append a semicolon for MySQL.
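The loop described above can be sketched as follows. The contents of df_new_obs are made up, and the result is only a query string; nothing is sent to a database here.

```python
import pandas as pd

# Illustrative predictions; the real frame has more columns.
df_new_obs = pd.DataFrame({
    "Age": [30, 41],
    "Probability": [0.25, 0.75],
    "Prediction": [0, 1],
})

insert_query = "INSERT INTO predicted_outputs VALUES "

for i in range(df_new_obs.shape[0]):
    insert_query += "("
    for j in range(df_new_obs.shape[1]):
        # Convert each value to a string and separate values with commas.
        insert_query += str(df_new_obs.iloc[i, j]) + ", "
    # Replace the trailing ", " of the row with a closing bracket.
    insert_query = insert_query[:-2] + "), "

# Replace the final trailing ", " with a semicolon.
insert_query = insert_query[:-2] + ";"
```

Building SQL by string concatenation is acceptable for trusted numeric data like this; for anything user-supplied, parameterized queries are the safer choice.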
Transferring Data from Jupyter to Workbench - Part III2:45
Apply the execute method to insert data, commit the connection, and verify 40 new records in Workbench, then export the dataset as a CSV for Tableau.

EXERCISE - Age vs Probability0:14
Analysis in Tableau: Age vs Probability8:49
Analyze the absenteeism model in Tableau, using a csv dataset of 40 observations to visualize age versus probability and compare logistic regression predictions with other models. Display probabilities as percentages.
EXERCISE - Reasons vs Probability0:14
Analysis in Tableau: Reasons vs Probability7:49
Explore how to use Tableau to relate absence reasons to the probability of excessive absence. Convert measures to dimensions, set continuous variables, and interpret four reasons to gain actionable insights.
EXERCISE - Transportation Expense vs Probability0:22
Analysis in Tableau: Transportation Expense vs Probability6:00
In Tableau, analyze how transportation expense relates to the probability of excessive absence using a scatterplot, percent formatting, and size/color filters by number of children to reveal trends.

Requirements

Basic coding skills in Python
Basic knowledge of SQL
Basic ability to use Tableau for data visualization

Description

Python, SQL, and Tableau are three of the most widely used tools in the world of data science.

Python is the leading programming language;

SQL is the most widely used means for communication with database systems;

Tableau is the preferred solution for data visualization;

To put it simply: SQL helps us store and manipulate the data we are working with, Python allows us to write code and perform calculations, and Tableau enables beautiful data visualization. A well-thought-out integration built on these three pillars could save a business millions of dollars annually in reporting personnel costs.

Therefore, it goes without saying that employers are looking for Python, SQL, and Tableau when posting Data Scientist and Business Intelligence Analyst job descriptions. Not only that, but they would want to find a candidate who knows how to use these three tools simultaneously. This is how recurring data analysis tasks can be automated.

So, in this course we will teach you how to integrate Python, SQL, and Tableau, an essential skill that will give you an edge over other candidates. In fact, the best way to differentiate your resume and get called for interviews is to acquire relevant skills other candidates lack. And because we have prepared a topic that hasn’t been addressed elsewhere, you will be picking up a skill that truly has the potential to differentiate your profile.

Many people know how to write some code in Python.

Others use SQL and Tableau to a certain extent.

Very few, however, are able to see the full picture and integrate Python, SQL, and Tableau to provide a holistic solution. In the near future, most businesses will automate their reporting and business analysis tasks by implementing the techniques you will see in this course. Being the person who automates such tasks would be invaluable for your future career, whether at a corporation or as a consultant.

Our experience in one of the large global companies showed us that a consultant with these skills could charge a four-figure amount per hour. And the company was happy to pay that money because the end-product led to significant efficiencies in the long run.

The course starts off by introducing software integration as a concept. We will discuss some important terms such as servers, clients, requests, and responses. Moreover, you will learn about data connectivity, APIs, and endpoints.

Then, we will continue by introducing the real-life example exercise the course is centered around: the ‘Absenteeism at Work’ dataset. The preprocessing part that follows will give you a taste of what BI and data science look like in real-life, on-the-job situations. This is extremely important because a significant amount of a data scientist’s work consists of preprocessing, but many learning materials omit that.

Then we will continue by applying some Machine Learning to our data. You will learn how to explore the problem at hand from a machine learning perspective, how to create targets, what kind of statistical preprocessing is necessary for this part of the exercise, how to train a Machine Learning model, and how to test it. A truly comprehensive ML exercise.

Connecting Python and SQL is not straightforward, so we devote an entire section of the course to showing how it is done. By the end of that section, you will be able to transfer data from Jupyter to Workbench.

And finally, as promised, Tableau will allow us to visualize the data we have been working with. We will prepare several insightful charts and will interpret the results together.

As you can see, this is a truly comprehensive data science exercise. There is no need to think twice. If you take this course now, you will acquire invaluable skills that will help you stand out from the rest of the candidates competing for a job.

Also, we are happy to offer a 30-day unconditional no-questions-asked-money-back-in-full guarantee that you will enjoy the course.

So, let’s do this! The only regret you will have is that you didn’t find this course sooner!

Who this course is for:

Intermediate and advanced students
Students eager to differentiate their resume
Individuals interested in a career in Business Intelligence and Data Science

Course content

Introduction • 1 lecture • 4min

What is software integration? • 5 lectures • 30min

Setting up the working environment • 12 lectures • 36min

What's next in the course? • 4 lectures • 11min

Preprocessing • 33 lectures • 1hr 30min

Machine Learning • 16 lectures • 1hr 7min

Installing MySQL and Getting Acquainted with the Interface • 5 lectures • 19min

Connecting Python and SQL • 12 lectures • 46min

Analyzing the Obtained data in Tableau • 6 lectures • 23min

Bonus lecture • 1 lecture • 1min
