
This will be our first lecture, and will discuss what geo-analytics is, and why it is an important skill for your career. We’ll also talk about SQL, and how it was created to allow professionals with only moderate computer skills to perform robust analysis of scientific data sources.
Before we start our hands-on portion of the workshop, we want to download our COVID-19 data along with SQLite and SpatiaLite. we will also test our connections to make sure everything is working. Make sure to download the covid.zip file.
This video explains you how the setup scripts work. This gets in the weeds a little and might seem overly complex this early in the workshop. Don't worry, the rest of what we are doing is much more straightforward and understandable. But, some of you might want to understand how to grab data off of the Internet and load it into a database without any human interaction.
This video shows you how to install the free and open-source QGIS GIS software product. We won't use QGIS until the second week (and you don't actually have to use QGIS at all if you don't want to, I'll show you how to visualize the data in ArcGIS as well). Be forewarned, QGIS is over 2GB in size, so the download and installation will take some time. We won't be doing much with QGIS on the first day of the workshop, but it is probably still a good idea to install it now.
In our first module, we'll get an overview of the workshop, and dive right into a discussion of SQL and basic SQL queries using the real-time COVID-19 data. We'll not only interrogate the databases, but we'll use other tools to display the results in both tabular and graphic form.
At the end of this lecture you’ll learn how to issue basic SQL commands like the SELECT statement with appropriate SQL clauses like WHERE, IN, ORDER BY, LIMIT, and others.
At the end of this lecture you’ll know how to perform in-query calculations with SQL statements like division, multiplication, addition, and subtraction.
At the end of this lecture you’ll know how to advance your knowledge of SQL using aggregate statements like SUM, MIN, MAX, AVG, and GROUP BY.
We are going to integrate everything we learned with basic SQL, calculations, and aggregate functions, and apply it to countywide COVID-19 data.
In this short video, I want to show you how to connect QGIS to SQLite to display spatial information. If you are like me, you'll probably spend hours playing around with this capability before the next workshop.
So, here are a couple of questions you may want to explore:
1. How different was the spatial distribution of deaths in Maryland, New York, and Virginia during April? How about July?
2. Try to link the map using the states table, and visualize the number of deaths in the United States. The following query will get you on your way:
DROP TABLE IF EXISTS spatialtable;
CREATE TABLE spatialtable AS
SELECT state, sum(positiveIncrease) AS positive,sum(negativeIncrease) as negative,
sum(deathIncrease) AS deaths, geometry
FROM daily, state
WHERE stusps = daily.state
GROUP BY geometry, state;
SELECT recovergeometrycolumn('spatialtable','geometry',4269,'MULTIPOLYGON');
And, you can start to issue other filters in the WHERE clause, like AND dt = '2020-12-25' to visualize the deaths, cases, or hospitalizations on Christmas day.
Above all, have fun with this, and see how far you can take it.
Today, we'll expand our knowledge of SQL and begin linking multiple tables on-the-fly to discover the power for relationships in data analysis. To do this, we'll like the real-time COVID-19 data with other database tables to assess COVID-19 activity with statewide demographics. We will also explore the powerful capabilities of SQL for updating and modifying database tables.
At the end of this lecture, you’ll learn how to link multiple database tables together with primary and secondary identifiers using SQL constructs of JOIN and alter data and add records in database tables using ALTER and UPDATE.
We’re going to explore a hot topic a lot of people have strong opinions on: are mask mandates effective. This can be a very sensitive topic, especially for those who’ve been directly effected by COVID-19. So, we are going to be as respectful as we can of the subject, and focus on ways a data analyst might perform and initial assessment of the problem. While we might not come to a conclusion to this very complex question, we’ll see how pocessing the data with SQL can help us identify other areas of research to explore.
If you are using Esri products like ArcGIS Pro or ArcMap, you may be feeling a little left out. In this short video, I'm going to show you a little workaround to get ArcGIS to read the spatial tables directly from SQLite. I'll also explain why we have to do this.
Check out the .pdf file to get the SQL text to paste in.
Over the next two hours, you will work alongside me to explore the spatial capabilities of SQLite with SpatiaLite. We'll eventually incorporate COVID-19 data, but initially, we will look at just a few spatial capabilities. As you follow along, feel free to pause the videos and try the SQL code for yourself.
In this module, we'll build on our basic and intermediary work to include spatial considerations in database analysis with SQL. This will include basic visualization of maps (making the results of our queries spatial) for states and counties. The transition to counties will allow us to see the COVID-19 trends on a finer scale as well as integrate multiple tables together, providing an even richer set of analyses.
In this lecture we'll create a time-series analysis for COVID-19 data throughout the United States.
In this lecture we’ll create a time-series analysis for COVID-19 data throughout the United States.
COVID-19 has changed our world. It's frightened us, inspired us, isolated us, and drawn us closer. Obviously, the effects of the pandemic have affected us in different ways. But one thing is certain, we've all looked at COVID-19 data and wondered what's happening, and how is it affecting me?
In this workshop you will learn how to access, analyze, and interpret for yourself, the different data associated with COVID-19. We'll use all free software that can run directly off a thumb drive, so you don't have to install anything: just download the associated .zip file, and start working.
The course will teach you how to use SQL with the free and open source database, SQLite, to answer an almost unlimited number of questions related to COVID-19 at the National, State, County, and regional levels. We'll also analyze the geographic distribution of COVID-19 using the spatial mapping extension for SQLite, SpatiaLite.
With these tools, you'll be able to analyze the temporal and spatial characteristics of COVID-19. And, with the extensive SQL training in the workshop, you'll be able to come up with your own questions to analyze the data.
Finally, we'll prepare the data for statistical analysis, asking questions like:
are there regional differences in COVID cases, hospitalizations, and deaths throughout the United States?
how have cases, hospitalizations, and deaths changed over time?
what are the patterns and trends in my county, and how does that compare to the State or National numbers?