Graph Theory Introduction

William Fiset
A free video tutorial from William Fiset
Google engineer; ACM-ICPC world finalist
4.6 instructor rating • 2 courses • 106,755 students

Learn more from the full course

Graph Theory Algorithms

A complete overview of graph theory algorithms in computer science and mathematics.

09:02:45 of on-demand video • Updated July 2020

  • Storage and representation of graphs (networks) on a computer
  • Common graph theory problems
  • Breadth first search algorithm
  • Depth first search algorithm
  • Various tree algorithms including: finding the height of a tree, finding the center of a tree, and rooting a tree
  • Dijkstra's algorithm
  • Topological sort algorithm
  • Shortest/longest path on an acyclic graph
  • Bellman-Ford algorithm
  • Floyd-Warshall all pairs shortest path algorithm
  • Finding bridges/articulation points
  • Finding strongly connected components (Tarjan's)
  • Travelling salesman problem (TSP)
  • How to find the maximum flow of a flow graph
  • Finding bipartite graph matchings
  • Various network flow algorithms including: Edmonds-Karp, Capacity Scaling, and Dinic's algorithm
  • Kruskal's Minimum Spanning Tree algorithm
  • The Lowest Common Ancestor (LCA) Problem
English [Auto] Hello and welcome. My name is William, and I'm super excited to bring to you this video series focused on graph theory. Graph theory is one of my absolute favorite topics in computer science. We're going to see a lot of very awesome algorithms. The whole field is very diverse and hugely applicable to real-world applications. I think everybody should be able to learn, love and enjoy graph theory. These first few videos are going to be ramp-up videos to introduce the topics of how we store, represent and traverse graphs on a computer. By the way, this whole video series will be taking a computer science point of view of graph theory rather than a mathematical one, so we won't be covering proofs and so on per se; instead, we'll be looking at algorithm implementation details and code. So what is graph theory? In essence, it is the study of the properties and applications of graphs, which common folk, or non-mathematical folks, call networks. This is a very broad topic, and my goal with this video series is to teach you how to apply graph theory to real-world situations. Graphs can be used to represent almost any problem, which makes them so interesting because they pop up absolutely everywhere. A simple problem that can be phrased as a graph theory problem might be: given the constraints in this picture, how many different sets of clothes can I make by choosing an article from each category? Of course, this could be phrased and solved using only mathematics, but the advantage of graph theory is that it allows us to visualize the problem, using nodes to represent an article of clothing and edges to represent relationships between them. Another canonical example of a graph theory problem is a social network of friends. A graph representation enables us to answer interesting questions, such as: how many friends does person X have, or how many degrees of separation are there between person X and person Y? Next, let's talk about different types of graphs. 
There are many different types of graph representations, and it's really important, I mean really important, to be able to recognize what type of graph you're working with, especially when you're programming and trying to solve a particular problem. The first type of graph is an undirected graph. It's the simplest kind of graph you'll encounter, and it is where edges have no orientation. That is, if there's an edge from node u to node v, it is identical to the edge from v to u. For instance, in the following graph, nodes are cities and edges represent bidirectional roads, since if you drive from one city to another, you can always retrace your steps by driving the other way. In contrast to undirected graphs, there are also directed graphs, sometimes called digraphs. In these graphs, you've guessed it, the edges are directed. So if we have an edge from u to v, then you can only go from node u to node v, not the other way around. In this graph you can see that the edges are directed because of the arrowheads on the edges between nodes. This graph could represent people who bought each other gifts, so an incoming edge represents receiving a gift and an outgoing edge represents giving a gift. Therefore, person E in this graph bought person D a gift, person A bought themselves and person B a gift, and person F bought nobody any gifts and received none. So far, we've only seen unweighted graphs, but edges on graphs can contain weights to represent arbitrary values such as cost, distance, quantity, you name it. Weighted graphs come in both directed and undirected flavours. As a side note, I will usually denote an edge of a graph as a triplet (u, v, w) to indicate where an edge is coming from, where it's going to, and what its weight is. Of course, with this notation I also need to specify whether the graph is directed or undirected. 
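The (u, v, w) triplet notation above can be sketched in a few lines of code. This is a minimal illustration, not code from the course; the node names and weights are made up, and for an undirected graph each edge is simply stored in both directions.

```python
# Directed weighted edge (u, v, w): you can only travel u -> v at cost w.
# These node names and weights are hypothetical examples.
directed_edges = [("A", "B", 4), ("B", "C", 1)]

def undirect(edges):
    """For an undirected graph, an edge (u, v, w) implies (v, u, w) too,
    so expand each edge into both directions."""
    both = []
    for u, v, w in edges:
        both.append((u, v, w))
        both.append((v, u, w))
    return both

print(undirect([("A", "B", 4)]))  # [('A', 'B', 4), ('B', 'A', 4)]
```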
Next up, I just want to have a quick chat about special types of graphs. In graph theory, there are so many different types of graphs that I had to select only a few which will be most relevant for this upcoming video series. The most important type of special graph is definitely the tree. A tree is simply an undirected graph with no cycles. There are several equivalent definitions of a tree, such as a graph with N nodes and N minus one edges. All the graphs below are trees. Yes, even the leftmost one, since it has no cycles. A related but totally different type of graph is a rooted tree. The distinction here is that a rooted tree has a designated root node, where every edge either points away from or towards the root node. When edges point away from the root node, the graph is called an arborescence, or out-tree, and an anti-arborescence, or in-tree, otherwise. Out-trees are by far more common than in-trees from what I've observed. It is also fairly common for people to refer to a rooted tree simply as a tree instead of an in-tree or out-tree, but there is an important distinction there. Next are directed acyclic graphs, or DAGs. These are graphs with directed edges and no cycles. These graphs are very important and fairly common in computer science, actually, since they often represent structures with dependencies, such as a scheduler, a build system, a compiler maybe, or, perhaps more relatable, university class prerequisites. There are several efficient algorithms we'll be looking at to deal specifically with directed acyclic graphs, such as how to find the shortest path and produce a topological ordering of nodes. A topological ordering of nodes is an ordering that tells you how to process the nodes in the graph so you don't perform a task before first having completed all its dependencies. For example, a topological ordering of class prerequisites would tell you to take intro biology and intro chemistry before taking a class on, say, genomics. 
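The class-prerequisite example above can be sketched with a topological ordering. This is a minimal sketch using Kahn's algorithm (one of several ways to topologically sort a DAG, and not necessarily the one covered later in the course); the course names are hypothetical.

```python
from collections import deque

# Hypothetical prerequisite DAG: each course maps to its prerequisites.
prereqs = {
    "intro biology": [],
    "intro chemistry": [],
    "genomics": ["intro biology", "intro chemistry"],
}

def topological_order(deps):
    # indegree[c] = number of unfinished prerequisites for course c
    indegree = {c: len(reqs) for c, reqs in deps.items()}
    # dependents[p] = courses that list p as a prerequisite
    dependents = {c: [] for c in deps}
    for c, reqs in deps.items():
        for p in reqs:
            dependents[p].append(c)
    # Start with courses that have no prerequisites.
    ready = deque(c for c, d in indegree.items() if d == 0)
    order = []
    while ready:
        c = ready.popleft()
        order.append(c)
        for nxt in dependents[c]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    return order

print(topological_order(prereqs))  # genomics always comes last
```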
This next type of special graph is a bipartite graph. It is one whose vertices can be split into two independent groups, U and V, such that every edge connects between U and V. This is just a fancy way of saying that the graph is two-colorable, or that there are no odd-length cycles in the graph. Often a question we'd like to ask is: what is the maximum matching we can create on a bipartite graph? Suppose white nodes are jobs and red nodes are people. Then we can ask, how many people can be matched to jobs? In this case, there are a lot of edges in each graph, so I think the answer for both is four, but in general it's not so easy if there are fewer edges, tougher constraints and more conflicts. Bipartite graphs also play a critical role in the field of network flow, which we will talk about later. This last type of graph is a complete graph: one where there is a unique edge between every pair of nodes in the graph. A complete graph with n vertices is denoted as the graph K_n. I have listed K1 through K6 on the bottom, and you can easily see how this scales when we add more nodes. Complete graphs are often seen as the worst-case graph you can possibly encounter because of how many edges there are, so if you want to test your algorithm for performance, a complete graph is an easy way to start. One thing we're going to have to be really cognizant about is how we're actually representing our graphs on the computer. This isn't just what type of graph it is, but what type of data structure we're representing our graph with, and this can have a huge impact on performance. The simplest way is inside a 2D adjacency matrix. The idea is that the cell m[i][j] represents the edge weight of going from node i to node j. So in the graph below, there are four nodes, so I create a four-by-four matrix and populate it with the edge weights. If you look at the edge from node C to node D, you'll see that it has an edge weight of two. 
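A quick sketch of what such an adjacency matrix looks like in code. Only the edges the transcript actually names (C to A with cost 4, C to B with cost 1, C to D with cost 2, which come up again below) are filled in; using infinity for "no edge" and zero on the diagonal is one common convention, not necessarily the exact one in the video's figure.

```python
import math

nodes = ["A", "B", "C", "D"]
n = len(nodes)
INF = math.inf  # convention: INF means "no edge here"

# Diagonal is 0 (node to itself), everything else starts as "no edge".
matrix = [[0 if i == j else INF for j in range(n)] for i in range(n)]

def add_edge(u, v, w):
    """Set the weight of the directed edge u -> v."""
    matrix[nodes.index(u)][nodes.index(v)] = w

add_edge("C", "A", 4)
add_edge("C", "B", 1)
add_edge("C", "D", 2)

# Constant-time weight lookup: row for C, column for D.
print(matrix[nodes.index("C")][nodes.index("D")])  # 2
```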
So in row three and column four of the matrix, there is a value of two. Note that it is often assumed that the edge going from a node to itself has a cost of zero, which is why the diagonal of the matrix has all zero values. This matrix form has several advantages. First, it is very space efficient for dense graphs, that is, graphs with a lot of edges. The edge weight lookup can be done in constant time, which is quite nice. And lastly, I would argue that it is the simplest form of graph representation you can have. On the downside, however, the main reason people don't go for the adjacency matrix as their first pick is because it requires V squared space, which is, well, a lot of space. In practice, graphs with 10,000 nodes or more start to become infeasible very quickly. The other issue with the adjacency matrix is that it requires V squared work to iterate over all the edges of your graph. This is fine for dense graphs with lots of edges, but it isn't so great for sparse graphs, since most cells will be empty. The main alternative to the adjacency matrix is the adjacency list, which is a way to represent a graph as a map from nodes to lists of edges. The idea is that each node tracks all of its outgoing edges. For example, node C has three outgoing edges, so the map entry for C will track the edge from C to A with cost four, the edge from C to B with cost one, and the edge from C to D with cost two. Notice that in the list of edges we only need to track two things: the node we're going to and the cost to get there. We don't need to keep track of where we came from, because that's already implicitly known. The nice thing about adjacency lists is that they're great for sparse graphs, because they only track the edges that you have and don't allocate additional memory that you might not use like the adjacency matrix does. This also means it's efficient when iterating over all the edges. The main disadvantage of using an adjacency list is that it is less space efficient on denser graphs. 
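The adjacency list for node C described above can be sketched as a plain dictionary. This is an illustrative sketch: only C's three outgoing edges from the transcript are shown, and the other nodes' (unknown) edge lists are left empty as an assumption.

```python
# Map from each node to the list of (destination, cost) pairs leaving it.
graph = {
    "A": [],
    "B": [],
    "C": [("A", 4), ("B", 1), ("D", 2)],
    "D": [],
}

# Iterating over all edges only touches edges that exist -- good for
# sparse graphs, unlike scanning a mostly-empty matrix.
edges = [(u, v, w) for u, outgoing in graph.items() for v, w in outgoing]
print(edges)  # [('C', 'A', 4), ('C', 'B', 1), ('C', 'D', 2)]

# Looking up one specific edge weight means scanning C's list, which is
# the O(E)-per-lookup downside mentioned next.
weight_c_to_d = dict(graph["C"])["D"]
print(weight_c_to_d)  # 2
```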
Another subtle disadvantage is that it takes O(E) time to access a specific edge's weight, although in practice you rarely, if ever, actually need to do this. The last representation I want to talk about is the edge list. An edge list is a way to represent a graph simply as an unordered list of edges. Basically, it's exactly what it sounds like: a list of edges. Assume that the notation for any triplet (u, v, w) means the cost of the edge from node u to node v is w. So for this graph, the edge list is simply a list of six edges represented as those triplets. This representation is very simple; however, it lacks structure, and that's why it is seldom used. An advantage of the edge list is that it's great for sparse graphs, iterating over all the edges is super easy, and the structure is simple. The downside is that edge lookup can be slow and you can run into memory issues on large graphs.
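An edge list really is just a flat list of those triplets. The six edges below are hypothetical stand-ins, not the exact edges from the video's figure; they show why iterating is trivial but looking up one edge is a linear scan.

```python
# Hypothetical six-edge graph stored as an unordered list of (u, v, w).
edge_list = [
    ("A", "B", 3), ("A", "C", 4), ("B", "D", 1),
    ("C", "D", 2), ("D", "E", 7), ("E", "A", 5),
]

# Iterating over all edges is super easy...
total_weight = sum(w for _, _, w in edge_list)
print(total_weight)  # 22

# ...but finding one specific edge means scanning the whole list.
def lookup(u, v):
    for a, b, w in edge_list:
        if (a, b) == (u, v):
            return w
    return None  # no such edge

print(lookup("C", "D"))  # 2
```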