
Develop reliability engineering fundamentals with hands-on GCP services (GCE, GKE, Cloud Run) and observability. Build golden signals dashboards with Grafana and master Linux automation tools like kubectl and gcloud.
Connect with the instructor across Udemy, YouTube, blogs, and professional networks, and dive into the SRE bootcamp agenda on build, deploy, run and observability.
Learn the origins of site reliability engineering and core observability concepts, including golden signals, SLOs, SLIs, and error budgets, then deploy demo apps with GCP, GKE, and Cloud Run.
Set up a Google Cloud Platform free tier project, configure gcloud authentication, enable compute and container services, and establish billing export to BigQuery for cost visibility.
Introduce site reliability engineering concepts and observability, covering golden signals, SLIs, SLOs, and error budgets, while exploring the Google definition, SRE characteristics, and essential foundational skills.
Explore the core of site reliability engineering by defining reliability targets, identifying golden signals (traffic, errors, latency, saturation), and using SLIs, SLOs, and error budgets to balance availability and performance.
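As a rough illustration of how an SLO turns into an error budget, here is a minimal Python sketch. The function name, the request-based SLI, and the sample numbers are assumptions made for this example and are not taken from the course material.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarize a request-based SLI against an SLO target.

    slo_target is a fraction such as 0.999 (99.9% of requests succeed).
    All names here are illustrative, not part of any GCP API.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1, e.g. 0.999")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")

    # The error budget is the share of requests that may fail without breaching the SLO.
    allowed_failures = (1 - slo_target) * total_requests
    # The SLI is the measured success ratio over the same window.
    sli = 1 - failed_requests / total_requests

    return {
        "sli": sli,
        "allowed_failures": allowed_failures,
        "budget_consumed": failed_requests / allowed_failures,
        "slo_met": sli >= slo_target,
    }


if __name__ == "__main__":
    # Hypothetical window: one million requests, 250 of which failed.
    print(error_budget(0.999, 1_000_000, 250))
```

With a 99.9% target, one request in a thousand may fail, so the budget scales with traffic; the consumed fraction shows how much room is left for risky changes in the window.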
Learn how site reliability engineers measure health, automate tasks, and build observability with SLIs, SLOs, and log insights across cloud and Linux environments.
Review the core concepts of site reliability engineering, including observability, the golden signals, SLIs, SLOs, and error budgets, plus the Google definition and key SRE skill sets.
Provide a GCP overview, show how to interact with GCP, compare five key services with AWS and Azure, and outline GKE, Cloud Run, Cloud Logging, and Cloud Monitoring features.
Get a concise Google Cloud Platform overview, covering Compute Engine, GKE, Cloud Run, Cloud Storage, BigQuery, Cloud Logging, and Cloud Monitoring for observability.
Explore Compute Engine with spot VMs and managed instance groups with autoscaling and autohealing, GKE Autopilot and Cloud Run for scalable containers, and observability via Cloud Logging.
Review the GCP overview, catalog the products and services, and highlight the five core GCP services we will use throughout this course.
Master Linux command-line basics for SRE, including system info, navigation, file operations, troubleshooting, and tools like find and grep, with crontab and quick overviews of Red Hat, Ubuntu, Debian, and CentOS.
Explore how to find help in the Linux terminal using ls with --help and common switches. Learn to use man pages, apropos, and commands like who and who am I.
Inspect system information with uname and options to reveal Linux distribution, kernel, and full details. Monitor uptime, hostname and FQDN, internal IP, and user login history to assess runtime status.
Master essential Linux commands like cd, pwd, mkdir, touch, cp, mv, rm, vi, and utilities such as cat, less, head, tail, cal, and date.
Explore essential troubleshooting commands to monitor and diagnose systems, including top, ps, and ps aux, plus memory, disk, and network checks like free, df, ping, ip addr, and ip route.
Use the find command to locate files by name, modification time, size, and permissions, with case sensitive and size qualifiers, and learn to combine conditions with and, or, and not.
Learn to compare and manipulate CSV and text files using sed for search and replace, cut for field extraction, and diff tools for file comparison, with practical examples.
Explore grep and regex patterns to search text, handle case sensitivity with -i, and apply practical patterns like starts with, ends with, and contains using grep, egrep, and cut.
Learn to interpret octal permission values and use chmod to adjust user, group, and others permissions, guided by the default umask of 022 (which gives new files a mode of 644) and practical script examples.
Learn how to automate mundane tasks on Linux using crontab, schedule bash scripts with crontab -e and -l, and monitor outputs via mail from cron jobs.
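The relationship between a umask and the resulting octal mode can be checked with a few lines of Python. This is only a sketch: the helper names are made up for illustration, and it assumes the usual base modes of 666 for files and 777 for directories.

```python
def default_mode(umask: int, is_dir: bool = False) -> int:
    """Return the mode a new file or directory gets under the given umask."""
    base = 0o777 if is_dir else 0o666
    # The umask clears bits from the base mode; it is a mask, not a mode itself.
    return base & ~umask


def rwx(mode: int) -> str:
    """Render the user, group, and others bits as an rwx string."""
    parts = []
    for shift in (6, 3, 0):  # user, group, others
        bits = (mode >> shift) & 0o7
        parts.append(
            ("r" if bits & 4 else "-")
            + ("w" if bits & 2 else "-")
            + ("x" if bits & 1 else "-")
        )
    return "".join(parts)


if __name__ == "__main__":
    file_mode = default_mode(0o022)
    dir_mode = default_mode(0o022, is_dir=True)
    print(oct(file_mode), rwx(file_mode))
    print(oct(dir_mode), rwx(dir_mode))
```

Each octal digit is the sum of read (4), write (2), and execute (1), which is why a umask of 022 removes only the write bit for group and others.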
Learn about Linux OS distributions—CentOS, Red Hat Enterprise Linux, Debian, and Ubuntu—and their package managers dnf, yum, and apt.
Recap covers essential Linux commands for file operations, searching with find and grep, managing permissions, editing crontab schedules, and a quick overview of CentOS, Debian, Red Hat, and Ubuntu.
Explore how automation boosts efficiency in cloud platform engineering by reducing toil, automating infrastructure provisioning and deployments, and using simple utilities like getcmd and getroles.
Explore how a zsh profile boosts daily efficiency with aliases and quick path shortcuts, and connect to cloud tools like gcloud, Terraform, and kubectl.
Create a quick getcmd utility to search notes and markdown files for gcloud and kubectl commands. This tool saves minutes and boosts efficiency for daily SRE tasks, supporting observability workflows.
Use the getroles utility to search gcloud IAM roles for a permission, filter by a search phrase, and report roles granting that permission, for example monitoring.dashboards.create, to enforce least privilege.
Learn how to write bash scripts by examining two utilities, getcmd (get command) and getroles (get roles by permission), featuring if-else and for loops that iterate over files to produce output.
Explore the automation session highlights, explaining why automation matters for SREs and IaC, and demonstrating practical examples with zsh profile customization and bash-coded utilities like getcmd and getroles.
Learn to use gcloud, the Google Cloud CLI, to manage VMs, container clusters, storage, BigQuery, and Dataproc, then master command formats, interactive help, output formatting, and filtering and sorting.
Learn to use gcloud with Google documentation and cheat sheets, install and connect to GCP, explore command structure, interactive help, and working with compute instances and groups in the terminal.
Format gcloud command output with the format flag into csv, json, or yaml, selecting specific fields and columns. Apply filters, describe resources, and redirect or pipe results for automation.
Learn to filter and sort machine type results using exact and partial criteria by zone and name, narrow to high-CPU N1 types, and sort by CPU then memory.
Review how to use Google Cloud documentation and cheat sheets, explore the command-line interface interactive help, and practice formatting, filtering, and sorting compute instances to extract the information you need.
Learn to use kubectl as the main control panel for deploying and managing Kubernetes resources, defining configurations in YAML, inspecting cluster health, viewing logs, and troubleshooting on a GKE cluster.
Learn to connect to a GKE cluster with gcloud credentials, and master kubectl context management, including switching and setting a default namespace and working with deployments across namespaces.
Learn to help yourself with kubectl, cluster info, deploy and scale resources, and troubleshoot with describe, logs, and exec. Use aliases to streamline kubectl, Linux, and gcloud commands.
Learn how to use kubectl get to inspect deployments, pods, and services across namespaces, export outputs in json or yaml, and customize views with columns, field selectors, and jsonpath.
Learn to deploy Kubernetes resources with declarative YAML, defining namespaces, deployments, and services, manage replicas and exposure with ClusterIP and LoadBalancer services, and perform rolling restarts with kubectl.
Apply imperative kubectl commands to deploy Nginx and expose it as a load balancer. See the external IP and Nginx page, then compare imperative and declarative approaches.
Explore Kubernetes troubleshooting with practical commands, including kubectl describe, kubectl logs, and kubectl exec, to diagnose unhealthy pods, image pull failures, and events in production.
Connect to the GKE cluster and set the context to begin practicing kubectl. Explore kubectl options, JSON and YAML output, jsonpath, and declarative versus imperative deployments, then troubleshoot with describe, logs, and exec.
Explore the vi editor, a command-line text editor on Unix-like systems, and learn navigation, editing, search and replace, plus configuration tips and a GitHub cheat sheet.
Master vi navigation: use h j k l to move, w and b to move by word, gg and G to go to the top or bottom, and a colon with a line number, or a number followed by G, to jump to a line.
Learn core vi editing: insert modes (i, I, a, A, o, O), delete operations (dd, 5dd, dw, D), and copy-paste with yy, p, and P.
Master search and replace in the vi editor, including ignore-case and case-sensitive options, regex patterns, undo, and using set list and set nolist to show or hide invisible characters.
Set permanent vim configurations by editing the vimrc profile in your home directory, applying defaults like set number and set ignorecase, using your bash profile.
Apply the 80/20 principle while learning the vi editor, using the SRE bootcamp public GitHub repo’s cheat sheet for quick Linux command references and hands-on practice.
Navigate with vim keys j k h l, w b, gg, G; edit with i a o, d, yy, and p; search with slash; configure set options and the vim profile.
Explore how IP addresses identify devices in GCP networks, enable communication and security, and how RFC 1918, CIDR notation, and subnetting guide cloud architecture for SREs and platform engineers.
Learn the RFC 1918 private IPv4 address ranges of 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16, which are not routable on the public internet, and how NAT enables internal hosts to reach the internet.
Explore CIDR notation and its IP range representations, learn how slash prefixes define subnets like 192.168.0.0/16 (192.168.0.0 to 192.168.255.255), and use a subnetting chart showing its 65,536 addresses.
Design subnets from the 10.240.0.0/16 RFC 1918 space to support GKE clusters and GCE VMs in east and west regions within a hybrid multi-cloud GCP landing zone.
Break down the 10.240.0.0/16 network into /18, /19, and /20 subnets for GKE pods, services, and GCE apps across east and west, detailing gateway, broadcast, and Google future use IPs.
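A breakdown like this can be sanity-checked with Python's standard ipaddress module. The sketch below is one possible allocation, not the layout used in the course: the subnet names and the assignment of /18 to pods, /19 to services, and /20 to GCE apps are assumptions made for illustration.

```python
import ipaddress

# Hypothetical plan: subnet name -> prefix length (assumed, not from the course).
PLAN = {
    "gke-pods-east": 18,
    "gke-pods-west": 18,
    "gke-services-east": 19,
    "gke-services-west": 19,
    "gce-apps-east": 20,
    "gce-apps-west": 20,
}


def carve(parent_cidr: str, plan: dict) -> dict:
    """Allocate consecutive, non-overlapping subnets from parent_cidr."""
    parent = ipaddress.ip_network(parent_cidr)
    cursor = int(parent.network_address)
    allocated = {}
    # Largest blocks (smallest prefix) first, so each block stays aligned to its size.
    for name, prefix in sorted(plan.items(), key=lambda item: item[1]):
        size = 2 ** (32 - prefix)
        cursor = (cursor + size - 1) // size * size  # round up to a block boundary
        subnet = ipaddress.ip_network((cursor, prefix))
        if not subnet.subnet_of(parent):
            raise ValueError(f"{name} does not fit inside {parent}")
        allocated[name] = subnet
        cursor += size
    return allocated


if __name__ == "__main__":
    for name, subnet in carve("10.240.0.0/16", PLAN).items():
        print(f"{name:20} {str(subnet):18} {subnet.num_addresses} addresses")
```

Allocating the largest ranges first keeps every block on a valid boundary and leaves the unused tail of the /16 contiguous for later growth.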
Apply subnetting concepts to design and implement VPC subnets from scratch using CIDR ranges, such as /20, in GCP or via Terraform, avoiding overlapping networks.
Explore RFC 1918 address ranges and CIDR notations, complete a subnetting exercise, and apply findings to building private IP space and creating subnets in GCP infrastructure.
This module begins implementing observability on GCE, GKE, and Cloud Run, covering golden signals, dashboards, and error budgets, while hosting an Apache web server on GCE.
Create a Debian 11 VM in the GCP console with the default e2-medium machine type and an Apache web server. Tag the VM as a web server and note its internal and external IPs.
Learn to SSH into the VM with gcloud compute ssh or from the console, view the home directory with pwd, and inspect logs via journalctl and basic Linux commands.
Install and validate the Apache web server on a Debian-based VM using apt, sudo, and systemctl, then open HTTP and HTTPS ports via firewall rules and verify with curl.
Configure the Apache web server by editing index.html with vi, replacing the hostname and IP address using shell commands, and adding script.js to enable a button, then validate with a page refresh.
Create a VM from the console and Google Cloud Shell, then validate it with Linux commands. Open the firewall, verify Apache is running, and serve a custom HTML page.
Welcome to the SRE Bootcamp | Build, Deploy, Run and Implement Observability, the only course you need to get ready to be a rockstar SRE on the job.
With 7.5 hours of lectures and demos packed with industry experience, this course is without a doubt the most practice-oriented SRE course available anywhere online. Even if you have zero understanding of SRE concepts, this course will take you from beginner to intermediate proficiency and will enable you to implement, not just understand the theory. Here are the reasons why:
The course is taught by an industry expert on the subject, who is a daily practitioner himself.
The instructor is an SRE interviewer, and knows exactly what is needed in a candidate to succeed.
The demos and the corresponding GitHub repo access will enable you to not just follow along, but also reuse the instructor's months of hard work and apply it on the job.
The course is current with 2023 trends, which ensures that you'll be learning the latest tools and technologies used at large companies running their applications on Google Cloud.
The curriculum was developed over a period of 1 year, after a dry-run of the content with a private group of students.
I will take you step-by-step through engaging video tutorials and teach you everything you need to know to succeed as an SRE.
The course includes hands-on demos that build your SRE expertise, so you can be productive from day 1 as a GCP SRE.
Throughout this course, we cover SRE-relevant tools and technologies in detail, with demos, including:
Site Reliability Engineering origin
Observability core concepts - Golden Signals, SLIs, SLOs, Error Budgets
Characteristics of a good SRE
SRE foundational skill set - Linux, vi editor, IP subnetting, etc.
GCP CLI - gcloud and kubectl
Deploy apps in all forms of compute on GCP - GCE, GKE and Cloud Run
Automation - how to, and real world examples using Bash (Python not covered in this course).
GCP Logging and Monitoring, Log based metrics
Observability Tools - GCP Native Monitoring, and Grafana
Troubleshooting tools and techniques using Cloud logging and monitoring and kubectl.
By the end of this course, you will be confident not just in clearing SRE job interviews, but also in being productive and efficient as an SRE.
REMEMBER… I'm so confident that you'll love this course that I'm offering a FULL money-back guarantee for 30 days! So it's a complete no-brainer, sign up today with ZERO risk and EVERYTHING to gain.
This course is the best way to get ready to crack the toughest of SRE interviews, and be ready to work efficiently as an SRE.
Don’t waste any more time wondering what course is best for you. You’ve already found it. Get started right away!