
This video provides an overview of the entire course.
In this video, we will show how to install a Jupyter Notebook environment on your machine.
Cover the ways of installing a Jupyter Notebook
Show how to install Docker
Show how to use the Jupyter Notebook Data Science Docker stack
In this video, we will show you how to work with Jupyter Notebooks.
Show how to navigate cells
Show how to read documentation and run shell commands from a notebook
Show how to work with a sample notebook for analyzing life expectancies
In this video, we explain how to publish finished Jupyter Notebooks.
Explain the different notebook formats
Show how some of these formats can be obtained
Export the example notebook
In this section, we examine the Chicago crime dataset and show how to download and import it using Pandas.
Explain and download the Chicago crime dataset
Examine the dataset format in Jupyter Notebook
Choose the necessary options to successfully read it as a Pandas DataFrame
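As a rough illustration of the import step, the sketch below reads a small in-memory CSV into a Pandas DataFrame. The column names, the date format, and the two sample rows are assumptions made up for this example; the real Chicago crime file has many more columns, and the options it needs may differ.

```python
import io

import pandas as pd

# Hypothetical rows standing in for the downloaded file (assumed columns).
sample_csv = io.StringIO(
    "ID,Date,Primary Type\n"
    "1,01/01/2017 12:00:00 AM,THEFT\n"
    "2,01/02/2017 01:30:00 PM,BATTERY\n"
)

# index_col picks the row labels; other options (sep, usecols, dtype)
# can be added when the file requires them.
crimes = pd.read_csv(sample_csv, index_col="ID")

# Parse the timestamp column with an explicit format (assumed here).
crimes["Date"] = pd.to_datetime(crimes["Date"], format="%m/%d/%Y %I:%M:%S %p")

print(crimes.dtypes)
```

Stating the date format explicitly avoids ambiguous day/month guessing and is usually faster on large files.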
We will examine the core data structures available in Pandas.
Examine the 1D Series data structure
Examine the 2D DataFrame data structure
Explore the Pandas API for said data structures in a Jupyter Notebook
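The two data structures can be sketched in a few lines. The crime types and counts below are invented for illustration and are not taken from the course dataset.

```python
import pandas as pd

# A Series is a 1D labeled array: values plus an index (made-up numbers).
counts = pd.Series([120, 95, 40], index=["THEFT", "BATTERY", "ROBBERY"], name="count")

# Label-based lookup and vectorized arithmetic on a Series.
theft_count = counts["THEFT"]
shares = counts / counts.sum()

# A DataFrame is a 2D table whose columns are Series sharing one index.
df = pd.DataFrame({"count": counts, "share": shares})

# Boolean masks filter rows.
top = df[df["count"] > 50]
print(top)
```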
In this video, we will learn about Pandas hierarchical indexes and apply them to visually explore the crime dataset.
Examine the Pandas MultiIndex for hierarchically indexed data
Show a MultiIndex example in Jupyter Notebook
Use a MultiIndex to restructure the crime dataset and visualize it
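A minimal sketch of hierarchical indexing follows, assuming a toy table with `year` and `type` columns; these names and rows are hypothetical and only approximate how the crime dataset might be restructured.

```python
import pandas as pd

# Toy records (assumed columns, invented rows).
records = pd.DataFrame({
    "year": [2016, 2016, 2017, 2017, 2017],
    "type": ["THEFT", "BATTERY", "THEFT", "THEFT", "BATTERY"],
})

# Grouping by two keys yields a Series with a two-level MultiIndex.
by_year_type = records.groupby(["year", "type"]).size()

# Select everything under an outer label, or one cell by its full key.
in_2017 = by_year_type.loc[2017]
thefts_2017 = by_year_type.loc[(2017, "THEFT")]

# unstack moves the inner level into columns, a shape that is
# convenient for plotting one line or bar group per crime type.
table = by_year_type.unstack("type")
print(table)
```

In a notebook, calling `table.plot()` on the unstacked result would typically draw one series per column.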
We explain how to add basic interactivity to a Jupyter Notebook.
Explain what interactive widgets are
Create an example interactive widget using our crimes dataset
Show where to find more examples of interactive widgets
In this video, we will learn what scraping is and why it's important.
Explain what unstructured data is
Explain the different data formats and their differences: CSV, Excel, REST APIs, plain websites, scanned PDFs...
This video will teach you how to scrape data from a REST API.
Explain the weather API
Show how to set up the API key to download the data
Cover using Requests to fetch data from a REST API
This video takes the last example further to import the downloaded REST data into pandas.
Show how to convert a Python dict provided to us by Requests into a pandas DataFrame
Show how to iterate over multiple API requests to download all the data chunks
Combine data chunks into a singular Chicago weather DataFrame
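The steps above can be sketched without network access. Here `fetch_chunk` is a hypothetical stand-in for a real API call made with Requests, and the payload layout (`results`, `date`, `temp_c`) is an assumption, not the actual weather API format.

```python
import pandas as pd


def fetch_chunk(page):
    """Hypothetical stand-in for one API request; returns parsed JSON as a dict."""
    fake_pages = {
        1: {"results": [{"date": "2017-01-01", "temp_c": -3.0},
                        {"date": "2017-01-02", "temp_c": -1.5}]},
        2: {"results": [{"date": "2017-01-03", "temp_c": 0.5}]},
    }
    return fake_pages[page]


# Convert each chunk (a list of dicts) into its own DataFrame.
chunks = []
for page in (1, 2):
    payload = fetch_chunk(page)
    chunks.append(pd.DataFrame(payload["results"]))

# Stack the chunks into one table and index it by date.
weather = pd.concat(chunks, ignore_index=True)
weather["date"] = pd.to_datetime(weather["date"])
weather = weather.set_index("date")
print(weather)
```

Collecting the chunks in a list and calling `pd.concat` once tends to be cheaper than appending to a DataFrame inside the loop.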
In this video, we will show a more difficult example of scraping data from an unstructured website.
Show the website we will be using to fetch the Chicago weather data
Show how to use BeautifulSoup to download the website and parse the HTML
Show how to convert the parsed HTML object into a pandas DataFrame
In this video, we will learn what information-dense visualisations are.
Explain data visualisation as visual storytelling
Talk about Edward Tufte's books and website
Explain Charles Joseph Minard's excellent map
This section explains how to use scatter plots to examine data correlation.
Explain time series components
Show how to plot a scatter plot
Explain data correlation in more detail
In this video, we explain linear regression and use it to build a linear model in Python.
Explain what linear regression is
Explain how modeling real-world behavior relates to general scientific research
Show how to create a linear model using linear regression in Python
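One possible way to fit a linear model is a least-squares line fit with NumPy, shown below on made-up points. The course may use a different library; this is only a sketch of the idea.

```python
import numpy as np

# Made-up data lying exactly on the line y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# A degree-1 polynomial fit is ordinary least-squares linear regression.
# Coefficients are returned from highest degree to lowest.
slope, intercept = np.polyfit(x, y, deg=1)

# Use the fitted line to predict a value outside the sample.
predicted = slope * 5.0 + intercept
print(slope, intercept, predicted)
```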
In this video, we explain why correlation matrices are useful and show how to create one in Python.
Explain why correlation matrices are useful
Show how to create a correlation matrix in Python
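A correlation matrix can be computed directly from a Pandas DataFrame. The columns and values below are invented so that the expected correlations are obvious; they are not course data.

```python
import pandas as pd

# Hypothetical columns with deliberately simple relationships.
data = pd.DataFrame({
    "temperature": [1.0, 2.0, 3.0, 4.0],
    "crimes": [10.0, 20.0, 30.0, 40.0],  # rises linearly with temperature
    "heating": [8.0, 6.0, 4.0, 2.0],     # falls linearly with temperature
})

# Pairwise Pearson correlation between all numeric columns.
corr = data.corr()
print(corr)
```

Each cell holds a value between -1 and 1, so a single table summarizes how every pair of variables moves together.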
See why maps are helpful.
Talk about spatial data
Talk about John Snow's London cholera outbreak map and how it was helpful
Mention I Quant NY's visual storytelling
See how we can build a map from our dataset.
Explain how data layers can be overlaid on maps
Show how to use Basemap to tile a map
Show how to zoom into a specific area of the map and overlay the data there
In this section, we talk about adding interactivity to our map using Plotly.
Set up Plotly/Mapbox API keys
Show how to draw points on the Plotly map
Show how to render roads on a Plotly map
Closing words for the course.
Summarize what was learned
Suggest some possible next steps for the viewer
Instructions for feedback
This video course will help you get familiar with Jupyter Notebook and all of its features to perform various data science tasks in Python. Jupyter Notebook is a powerful tool for interactive data exploration and visualization and has become the standard tool among data scientists. In the course, we will start from basic data analysis tasks in Jupyter Notebook and work our way up to learn some common scientific Python tools such as pandas, matplotlib, and plotly. We will work with real datasets, such as crime and traffic accidents in New York City, to explore common issues such as data scraping and cleaning. We will create insightful visualizations, showing time-stamped and spatial data.
By the end of the course, you will feel confident about approaching a new dataset, cleaning it up, exploring it, and analyzing it in Jupyter Notebook to extract useful information in the form of interactive reports and information-dense data visualizations.
This course uses Jupyter 5.4.1. Although this is not the latest version available, the course still provides relevant and informative content for data science enthusiasts.
About the Author
Dražen Lučanin is a developer, data analyst, and the founder of Punk Rock Dev, an indie web development studio. He has been building web applications and doing data analysis in Python, JavaScript, and other technologies professionally since 2009. In the past, Dražen worked as a research assistant and completed a PhD in computer science at the Vienna University of Technology. There he studied the energy efficiency of geographically distributed datacenters and worked on optimizing VM scheduling based on real-time electricity prices and weather conditions. He also worked as an external associate at the Ruđer Bošković Institute, researching machine learning methods for forecasting financial crises. During Dražen's scientific work, Python, Jupyter Notebook (back then still IPython Notebook), Matplotlib, and Pandas were his best friends over many nights of interactive manipulation of all sorts of time series and spatial data. Dražen also holds a Master's degree in computer science from the University of Zagreb.