
Let's compare the pros and cons of installing Grafana locally versus using the cloud-based Grafana.
You will learn how to install and configure Grafana on Ubuntu 18.04 LTS (and above). The step-by-step instructions for setting up Grafana are attached to this lecture as well.
Windows is one of the most popular operating systems for servers and personal computers. Therefore, it is essential to know how Grafana can be installed and configured on a Windows instance.
If you are a proud Mac user, you can install Grafana directly on your Mac computer and use it to learn more about it. In this lecture you will learn how to install and configure Grafana using Homebrew.
In this course, I have provided a single docker-compose file that can launch the entire Grafana Stack (including Loki, Alloy and Tempo) as well as Prometheus in a few seconds. The stack also includes the mock data generators for the Shoe Hub company and OpenTelemetry.
Dashboards in Grafana are designed for different purposes, such as monitoring browser applications or infrastructure. Each dashboard type is used by a different role or team in the organisation, who may have different KPIs to watch.
In this lecture I will explain the most common dashboard layouts and structure for each dashboard type.
The Shoe Hub is an imaginary company we will use throughout the course to explain how you can visualise business and technical metrics.
The Graph panel is suitable for creating charts and histograms. In this lecture you will learn how to use the Graph panel and display metrics from Graphite on it.
In this lecture you will visualise the data of different payment methods in the US so that we get a good understanding of how customers prefer to pay.
Using the Data Transformations feature of Grafana, you can mix and match existing panel rows to create new rows, look up data or convert data types.
The Time Series panel is suitable for showing the data trend over time. However, we can compare different related values in percentage form using Pie Charts. For example, we can show what percentage of infrastructure failures is related to disk, what percentage is related to network and what percentage is related to power outages.
Sometimes, we want to compare a metric's current value(s) to the value(s) of the same metric in the past. For example, you could display the current shoe sales compared to last month's sales or make a week-on-week revenue comparison. Such graphs make it easy to see whether a metric is increasing or decreasing. For example, we can see if the network errors have gone down since last week, or if our marketing efforts have paid off and our sales have gone up since last month. In this lecture we will learn how to do this using Grafana and Prometheus.
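One common way to make such a comparison with Prometheus is the PromQL `offset` modifier, which evaluates a selector at a point in the past. The sketch below is only an illustration: the metric name `shoehub_sales_total`, the helper `percent_change` and the sample numbers are made up for this example, and the lecture may use a different query.

```python
# Hypothetical PromQL queries (the metric name is an assumption).
# The offset modifier shifts the selector one week into the past.
CURRENT_QUERY = "sum(shoehub_sales_total)"
LAST_WEEK_QUERY = "sum(shoehub_sales_total offset 1w)"


def percent_change(current, previous):
    """Return the change from `previous` to `current` as a percentage.

    Returns None when there is no previous value to compare against.
    """
    if previous == 0:
        return None
    return (current - previous) * 100.0 / previous


# Made-up values standing in for the results of the two queries above.
this_week = 120.0
last_week = 100.0
change = percent_change(this_week, last_week)
print(f"Week-on-week change: {change:+.1f}%")
```

A positive result means the metric has gone up since last week, and a negative result means it has gone down.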
Sometimes it is essential for us to know if the values of data points are above or below a given threshold. For example, we may want to know whether the network errors go above a certain number, or whether the orders received per hour are unusually low. We achieve this with Thresholds in Grafana.
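The idea behind thresholds can be sketched in a few lines of Python. This is not Grafana's implementation, only a model of the concept: a value takes the colour of the highest threshold step it has reached, and a base colour otherwise. The function name, the step values and the colours are assumptions chosen for this example.

```python
def threshold_color(value, steps, base="green"):
    """Return the colour of the highest step whose limit `value` has reached.

    `steps` is a list of (limit, colour) pairs; `base` applies below all limits.
    """
    color = base
    for limit, step_color in sorted(steps):
        if value >= limit:
            color = step_color
    return color


# Hypothetical thresholds for network errors per minute.
network_error_steps = [(50, "orange"), (100, "red")]

for errors in (10, 50, 250):
    print(errors, threshold_color(errors, network_error_steps))
```

The same model works for "unusually low" checks, such as orders per hour, by using the alarming colour as the base and a healthy colour for the higher steps.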
Variables, a key feature of Grafana, allow us to create dynamic dashboards and panels with less work and effort than when we hard-code everything.
In Grafana, if we show two or more lines on a Graph panel, and the values of these lines are vastly different, then one or more of those lines may become so compressed that we may see their data points as zeros. For example, if we show the response time of an IoT device that responds slowly, and the response time of an API that responds very quickly, on the same Graph panel, the response time of the API may appear as a straight line with a value of zero.
In this lecture we will learn how to overcome this problem.
Alerts are defined based on thresholds or mathematical formulations in Grafana. Over time, the alerting system in Grafana has evolved, improved and become somewhat complex. In this lecture you will learn about the concepts and terminology of the Grafana Alerting System, and you will learn how this ecosystem works.
Alerts in Grafana are based on queries written in a data source-specific language, such as PromQL for Prometheus. The results of these queries are checked periodically, and if they violate a rule, such as a threshold, that we define, alerts are raised.
In this lecture you will learn how to define an alerting rule.
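The periodic evaluation described above can be modelled with a small sketch. It is a simplification and not Grafana's code: the query results are replaced by a list of made-up sample values, the rule is a plain "greater than" threshold, and the state names and the `pending_evaluations` parameter are assumptions used to illustrate that a rule usually has to be violated for a while before the alert fires.

```python
def evaluate_rule(samples, threshold, pending_evaluations):
    """Return the rule state after each periodic evaluation.

    `samples` holds one query result per evaluation. The alert fires only
    after the threshold has been breached for more than
    `pending_evaluations` consecutive evaluations.
    """
    consecutive_breaches = 0
    states = []
    for value in samples:
        if value > threshold:
            consecutive_breaches += 1
            if consecutive_breaches > pending_evaluations:
                states.append("Firing")
            else:
                states.append("Pending")
        else:
            # The rule is satisfied again, so the alert returns to normal.
            consecutive_breaches = 0
            states.append("Normal")
    return states


# Made-up query results, one per evaluation interval.
error_counts = [1, 9, 9, 9, 1]
print(evaluate_rule(error_counts, threshold=5, pending_evaluations=2))
# ['Normal', 'Pending', 'Pending', 'Firing', 'Normal']
```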
It is not practical to constantly watch the dashboards to see if alerts are raised. Instead, we deliver notifications in various formats, such as emails or Slack messages, to inform relevant people of the alert.
In this lecture you will learn how to create contact points as well as notification policies to filter the notifications and direct them to the right people.
Slack is a popular collaboration tool that many teams use to chat, exchange team data and receive notifications. Grafana can send alert messages to Slack, too. In this lecture, you will learn how to send Slack notifications for a firing alert.
Sometimes, we want to temporarily stop sending notifications. For example, you may not want to send out notifications at midnight. In such cases we can use Grafana's ability to silence the alerts based on a defined time period.
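The logic of a quiet period can be sketched as follows. This is only an illustration of the concept, not how Grafana is configured or implemented. The helper `is_muted` and the 23:00 to 06:00 window are assumptions; the one detail worth noticing is that a night-time window crosses midnight and needs its own check.

```python
from datetime import time


def is_muted(now, start, end):
    """Return True if `now` falls inside the quiet period [start, end)."""
    if start <= end:
        # Window within a single day, e.g. 09:00 to 17:00.
        return start <= now < end
    # Window that crosses midnight, e.g. 23:00 to 06:00.
    return now >= start or now < end


# Hypothetical quiet period for night-time notifications.
NIGHT_START = time(23, 0)
NIGHT_END = time(6, 0)

for moment in (time(0, 30), time(12, 0)):
    status = "muted" if is_muted(moment, NIGHT_START, NIGHT_END) else "delivered"
    print(moment.strftime("%H:%M"), status)
```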
Annotations are a way to mark points on a graph with rich events. In this lecture you will see how you can use annotations to describe and understand your Grafana panels better.
With the advent of cloud-native applications and the microservices architecture, commercial observability platforms gained attention and popularity. However, they can be costly, and once you have integrated with one, it can be challenging to break away and adopt a different vendor's observability platform.
OpenTelemetry, or OTel, is an open-source initiative incubated by the Cloud Native Computing Foundation (CNCF) that aims to enable developers and DevOps engineers to generate, export, and collect telemetry data without being locked into a specific vendor.
Learn about the architecture of a scalable observability system based on OpenTelemetry.
Learn about configuring Prometheus to receive OpenTelemetry metrics.
Grafana Alloy is Grafana's OpenTelemetry Collector. It can receive OTel metrics from various sources and deliver them to a variety of backend databases after processing them.
In this lecture, we will install Grafana Alloy locally on a Mac computer. Instructions for installing Grafana Alloy on Windows and Linux are provided at the end of this section.
Grafana Alloy plays a pivotal role in receiving, processing, and forwarding OpenTelemetry signals to downstream systems, such as Prometheus. In this lecture, you will learn how to create receivers, processors, and exporters to achieve this goal.
In this lecture, we will analyse a microservice that produces a counter and exports it to Grafana Alloy via OTLP.
Tracing in distributed systems, particularly within a microservices architecture, is crucial for understanding and optimizing system performance and reliability. Tracing becomes complex yet essential as modern applications are built using microservices, where various components communicate over networks.
Tracing involves tracking a request's journey across multiple microservices, providing insights into each service's performance, dependencies, and bottlenecks. Distributed tracing tools like Jaeger and Zipkin enable developers to visualize this journey, often represented as a trace or a series of interconnected spans.
In microservices, where each service is responsible for a specific function, tracing helps identify latency issues, failures, and inefficiencies that may occur at any point in the system. By correlating traces across services, developers can pinpoint the root cause of problems and optimize performance.
Moreover, tracing facilitates debugging and monitoring in production environments, aiding in troubleshooting and ensuring system reliability. It also supports distributed system testing, allowing developers to simulate various scenarios and analyze system behaviour under different conditions.
In this lecture you will learn all aspects of Telemetry and its relevance to Grafana.
Grafana Tempo is part of the Grafana Stack from Grafana Labs. It is used to visualise traces produced by OpenTelemetry, e.g., from a microservice. Grafana Tempo can show the traces and spans. It can also be used with Prometheus to let you go from a trace to a metric and vice versa. In this lecture, you will be introduced to Grafana Tempo and some of the ways you can send it signals, e.g., Zipkin and New Relic, as well as some ways to visualise them.
Since many DevOps engineers use a Mac, in this lecture we learn how Grafana Tempo can be deployed on a Mac computer for learning and practice. If you use Windows or Docker, you can find the installation guides at the end of the current section.
In this lecture you will learn an easy method for installing Grafana Tempo on Ubuntu.
Although Grafana Tempo comes with a built-in OpenTelemetry Collector, the architecture recommended by Grafana Labs is to use Grafana Alloy between the source of the OpenTelemetry traces (e.g., your microservices) and Grafana Tempo. Grafana Alloy can receive the OpenTelemetry traces and forward them to Grafana Tempo. Using Grafana Alloy as the middleware will reduce vendor lock-in.
In this lecture, we have a microservice called Order Service, which is written in .NET (C#) and Python. The code sends metrics and traces to Grafana Alloy, and Grafana Alloy forwards the traces to Grafana Tempo.
In this lecture, you will learn about trace context propagation, which propagates trace and span information from a client service to a server service. You will see this concept in the code of two microservices.
Then you will learn how to create a Service Graph in Grafana Tempo, to better understand the flow of the request and the relationship of the services.
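To make trace context propagation more concrete, here is a minimal sketch that assumes the W3C Trace Context `traceparent` header, which OpenTelemetry uses by default. In practice the OpenTelemetry SDKs create and read this header for you; the helper functions below are hypothetical and exist only to show what is carried from the client to the server: the trace ID stays the same, while each service creates its own span ID.

```python
import re
import secrets

# traceparent layout: version-traceid-parentid-flags, all lowercase hex.
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent():
    """Start a new trace: random 16-byte trace ID and 8-byte span ID."""
    trace_id = secrets.token_hex(16)
    span_id = secrets.token_hex(8)
    return f"00-{trace_id}-{span_id}-01"  # 01 marks the trace as sampled


def child_traceparent(incoming):
    """Continue the caller's trace with a new span ID for this service."""
    match = TRACEPARENT_RE.match(incoming)
    if match is None:
        # No usable context was received, so start a fresh trace.
        return new_traceparent()
    trace_id, _parent_span_id, flags = match.groups()
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


# The client sends this header with its HTTP request...
client_header = new_traceparent()
# ...and the server derives its own context from it.
server_header = child_traceparent(client_header)
print("client:", client_header)
print("server:", server_header)
```

Because both services share one trace ID, a backend such as Grafana Tempo can join their spans into a single trace.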
Grafana Tempo supports a query language called TraceQL. Using TraceQL you can find traces and spans across a complex distributed system easily.
In this lecture you will practice writing TraceQL queries to get hands-on experience.
When setting up Grafana Tempo for production use, Tempo must use cloud object storage such as Amazon Web Services (AWS) S3. In this lecture you will learn the procedure for configuring Tempo to use AWS S3.
Master observability with the Grafana Stack, including Grafana Loki for logs, Grafana Tempo for distributed tracing, Grafana Alloy as an OpenTelemetry (OTel) collector, and Grafana Mimir for large-scale enterprise metrics storage, by enrolling in this top-rated course, which has been the best-selling Grafana course on Udemy for seven consecutive years!
This course provides a comprehensive, hands-on path to building modern observability systems. It starts with metrics using Prometheus and progresses to logs, traces, alerting, and custom dashboards in Grafana.
We begin with the core concepts of observability, telemetry data, and methods for metric collection. Then, you'll dive into Prometheus — learning how to install, configure, and use it like a pro.
Next, you'll deploy Grafana across Windows, macOS, Linux (including Ubuntu and Amazon Linux), and Docker. Once your stack runs, we cover Grafana dashboard design for real-world use cases: APIs, infrastructure, and microservices.
In the logging section, you'll work with Grafana Loki to ingest and visualise logs, including dynamic label extraction from unstructured logs.
Then we go deeper: you'll learn the fundamentals of OpenTelemetry and set up Grafana Alloy to receive, process, and export OTel metrics and traces. You'll instrument microservices (in Python and C#) and export signals to Grafana Tempo, where you'll trace distributed calls and analyze service graphs with TraceQL.
Now also included is Grafana Mimir, a highly scalable time-series database for storing metrics at scale. You'll learn what Mimir is, how it works, and how to deploy it locally in monolithic mode and microservices mode into Kubernetes.
To make it practical, the course is based on a fictional online retailer, ShoeHub, with mock data, dashboards, alerts, and services that simulate real-world observability use cases.
No setup headaches — you'll also get instant access to a browser-based playground powered by Killer Coda, so you can start experimenting without installing anything.
Included in the course:
Docker Compose files for Prometheus, Grafana, Loki, Alloy, Tempo, Mimir, ShoeHub metrics, and Example Microservices Tracing.
Sample dashboards and panel configurations.
Log generator script in Python.
Setup guides for multiple platforms.
Binary executable files for ShoeHub and the example microservices (if you don't want to use Docker).
Optional cloud lab environment (Killer Coda) for instant hands-on practice.
I will respond promptly via the Udemy Q&A system if you encounter any issues or have questions.
Happy learning — and welcome to the world of observability!