DevOps Pro

DevOps & Infrastructure

From Linux basics to platform engineering in 10 levels. Learn Docker, Nginx, Terraform, Kubernetes, Prometheus, Grafana, GitOps, and Python for infrastructure automation.

72 Tutorials
22 Labs
72 Quizzes
9 Exams

Full Curriculum


The foundation everything else builds on. You'll learn to navigate a Linux system, create and manage files, control who can access what, and chain commands together to automate tasks. By the end, you'll be able to set up users, connect to servers over SSH, install software, and run your first Docker container.

T01 Files & Navigation (+Quiz)
Learn to move around a Linux filesystem, create and organize directories, copy, move, and remove files, and search for anything on the system.
ls, cd, pwd, mkdir, rm, cp, mv, find
T02 Viewing, Editing & Permissions (+Quiz)
Read files from the terminal, edit them with nano, and control exactly who can read, write, or execute every file on the system.
cat, head, tail, less, nano, chmod, chown
T03 Pipes & Redirection (+Quiz)
Chain commands together to filter, sort, count, and search through data. Redirect output to files, combine streams, and build one-liners that do real work.
|, >, >>, 2>&1, tee, sort, uniq, wc, grep
T04 Processes & System (+Quiz)
See what's running on your system, monitor resource usage in real time, stop runaway processes, and manage background jobs.
ps, top, kill, bg, fg, jobs
Lab 1A: Server Cleanup
Put your skills to the test. You're handed a messy server with oversized log files and runaway processes. Find them, clean them up, and bring the system back to health.
Covers: filesystem audit, log cleanup, process management, disk recovery, system health
T05 Users & SSH (+Quiz)
Create and manage user accounts, configure sudo access, generate SSH keys, and connect to remote servers securely.
useradd, passwd, sudo, ssh, ssh-keygen, authorized_keys
T06 Networking (+Quiz)
Check your network interfaces, test connectivity, make HTTP requests from the terminal, understand how DNS resolves names, and identify what's listening on which ports.
ip, ss, ping, curl, DNS resolution, ports
T07 Package Management (+Quiz)
Install, update, and remove software. Manage repositories, resolve dependency issues, and keep your system current.
apt, dpkg, repositories, dependencies
T08 Docker Basics (+Quiz)
Pull images, run containers, map ports, and manage the container lifecycle. Your first step into the tool that changed how software gets deployed.
images, containers, run, ps, stop, rm, port mapping
Lab 1B: New Server Checklist
You just got access to a fresh server. Set up users, configure SSH, install essential packages, and get it production-ready from scratch.
Covers: user setup, SSH configuration, package installation, server hardening, baseline config
Exam Level 1 Master Exam
Prove you've mastered the fundamentals. 10 questions covering everything from Level 1. Score 80% or higher to pass.
Covers: files, permissions, pipes, processes, users, SSH, networking, packages, Docker

Running a Linux server means keeping it healthy, secure, and automated. Job scheduling, log management, firewalls, disk management, backups, shell environments, systemd services, and text processing with awk and sed.

T09 Cron & Job Scheduling (+Quiz)
Automate recurring tasks with cron. Write crontab entries, schedule scripts to run at specific times, and build the habits that keep servers running without you watching.
crontab, cron.d, at, 5-field syntax, scheduling
T10 Log Management (+Quiz)
Read and analyze system logs with journalctl and syslog. Set up log rotation so your disks don't fill up, and learn to spot patterns that tell you something is wrong before it breaks.
journalctl, syslog, logrotate, /var/log, log analysis
T11 Firewalls with UFW (+Quiz)
Control what traffic gets in and out of your server. Set up allow and deny rules, configure default policies, and understand the basics of network security at the OS level.
ufw, allow, deny, status, default policies
T12 Disk Management (+Quiz)
Monitor disk usage, understand mount points, work with block devices, and get an introduction to LVM for flexible storage management.
df, du, lsblk, mount, fdisk, LVM
Lab 2A: Scheduled Maintenance
Build a maintenance routine for a production server. Set up automated backups, schedule log rotation, and write a cron job that monitors disk usage and alerts when it gets critical.
Covers: automated backups, log rotation, cron scheduling, disk monitoring, alerting
T13 Backup Strategies (+Quiz)
Use rsync and tar to create reliable backups. Learn incremental backup strategies, automate them with cron, and understand the difference between a backup you have and one you can actually restore.
rsync, tar, incremental backups, cron automation
T14 Shell Environment (+Quiz)
Customize your terminal with .bashrc and .profile. Set up PATH, create aliases for common commands, and configure environment variables that persist across sessions.
.bashrc, .profile, PATH, aliases, export
T15 Systemd Deep Dive (+Quiz)
Understand how Linux manages services at boot and runtime. Work with units, targets, and dependencies. Start, stop, enable, and debug services with systemctl and journalctl.
systemctl, units, services, targets, journalctl
T16 Text Processing (+Quiz)
Slice, transform, and analyze structured text from the command line. Use cut for columns, awk for field extraction and pattern matching, and sed for find-and-replace across files.
cut, awk, sed, sort, uniq, tr
Lab 2B: Server Health Report
Build an automated health report from scratch. Pull disk usage, memory stats, service status, and recent errors into a single report that runs on a schedule and saves to a file.
Covers: disk usage, memory stats, service status, error aggregation, report automation
Exam Level 2 Master Exam
Prove you can administer a Linux server. 10 questions covering scheduling, logs, firewalls, backups, systemd, and text processing. Score 80% or higher to pass.
Covers: scheduling, logs, firewalls, backups, systemd, text processing
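
T09 refers to cron's 5-field syntax. As a minimal sketch, written in Python because that is the language Level 9 teaches, the snippet below names the five schedule fields of a user crontab entry. The helper `split_cron_entry` and the backup script path are hypothetical, chosen only for illustration, and the helper does not validate field values.

```python
# The five schedule fields of a user crontab entry, in order.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_cron_entry(entry: str) -> dict:
    """Split a user crontab line into its five schedule fields plus the command.

    Hypothetical helper for illustration; it does not validate field values.
    """
    # At most 5 splits: five schedule fields, then the rest is the command.
    parts = entry.split(None, 5)
    if len(parts) < 6:
        raise ValueError("expected 5 schedule fields followed by a command")
    schedule = dict(zip(CRON_FIELDS, parts[:5]))
    schedule["command"] = parts[5]
    return schedule


# Example: run a (hypothetical) backup script at 02:30 every day.
entry = "30 2 * * * /usr/local/bin/backup.sh --full"
print(split_cron_entry(entry))
```

Entries under /etc/cron.d carry an extra user field before the command, so this sketch applies only to per-user crontabs.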

The DevOps toolkit. Advanced Docker Compose for multi-service stacks. Nginx configuration and reverse proxying. SSL/TLS certificates. Git workflows for team collaboration. CI/CD pipeline fundamentals. Shell scripting for automation.

T17 Docker Compose Advanced (+Quiz)
Go beyond single containers. Define multi-service applications with Docker Compose, configure health checks, manage dependencies between services, and handle environment variables across your stack.
docker-compose.yml, depends_on, healthcheck, networks, env_file
T18 Nginx Configuration (+Quiz)
Set up Nginx as a web server and reverse proxy. Write server blocks, route traffic to upstream services, configure location rules, and read access logs to understand what's hitting your server.
server blocks, proxy_pass, upstream, location, access_log
T19 SSL & Certificates (+Quiz)
Understand how TLS secures traffic between clients and servers. Generate self-signed certificates, learn how Let's Encrypt works, and configure HTTPS on your web server.
openssl, self-signed certs, certificate chains, HTTPS
T20 Git Workflows (+Quiz)
Work with branches, merge code, resolve conflicts, and follow the collaboration patterns used by real engineering teams. From feature branches to pull requests.
git branch, merge, rebase, pull request, .gitignore
Lab 3A: Deploy a Web Stack
Put it together. Deploy a multi-container web application with Nginx in front, configure SSL, and verify the entire stack is healthy and serving traffic.
Covers: multi-container deployment, Nginx config, SSL setup, health checks, traffic verification
T21 CI/CD Fundamentals (+Quiz)
Learn the pipeline stages that take code from commit to production. Understand build, test, and deploy phases, and how GitHub Actions automates the entire process.
pipeline stages, build, test, deploy, GitHub Actions
T22 Shell Scripting (+Quiz)
Write real scripts that handle variables, conditionals, loops, and functions. Learn error handling with set -euo pipefail and build automation tools you'll actually use.
variables, if/else, for/while, functions, set -euo pipefail
T23 System Monitoring (+Quiz)
Monitor server health from the command line. Check uptime, disk space, memory, and CPU usage. Write health check scripts that catch problems before they become outages.
uptime, df, free, top, health check scripts
T24 Networking Deep Dive (+Quiz)
Go deeper into how networks actually work. TCP vs UDP, DNS resolution under the hood, routing tables, network namespaces, and how Docker networking ties it all together.
TCP/UDP, DNS, routing, network namespaces, Docker networking
Lab 3B: Reverse Proxy Setup
Configure Nginx as a reverse proxy for multiple backend services. Route traffic by path, set up upstream load balancing, and verify everything works end to end.
Covers: reverse proxy, upstream routing, load balancing, path-based routing, end-to-end testing
Exam Level 3 Master Exam
Prove you can build and deploy infrastructure. 10 questions covering Docker Compose, Nginx, SSL, Git, CI/CD, and networking. Score 80% or higher to pass.
Covers: Docker Compose, Nginx, SSL, Git, CI/CD, networking

Power tools for serious Linux work. vim for efficient editing on any server. Systemd unit authoring for custom services. iptables for real firewall control. lsof and strace for debugging. Netcat for network testing. Ansible for automation.

T25 vim (+Quiz)
The editor that's on every server. Learn modes, navigation, editing, search and replace, and how to configure .vimrc. Once you're comfortable in vim, you can edit anything, anywhere.
modes, navigation, search/replace, buffers, .vimrc
T26 Systemd Unit Authoring (+Quiz)
Write your own systemd service files from scratch. Define how services start, set dependencies, create timers, and manage custom applications as first-class system services.
service files, ExecStart, dependencies, timers, targets
T27 iptables (+Quiz)
Real firewall control. Understand chains, tables, and rule ordering. Write ACCEPT, DROP, and REJECT rules. Configure NAT, log suspicious traffic, and persist rules across reboots.
chains, tables, ACCEPT, DROP, REJECT, NAT, iptables-save
T28 lsof & Debugging (+Quiz)
Find out which processes have files open, which ports are in use, and what's holding onto resources it shouldn't be. The go-to tool when something is stuck and you need to figure out why.
lsof -i, lsof +D, open files, network connections
Lab 4A: Debug a Broken Server
A server is misbehaving. Services won't start, ports are blocked, and something is eating disk space. Use vim, systemctl, iptables, and lsof to diagnose and fix every issue.
Covers: service diagnostics, port conflicts, disk triage, process debugging, system recovery
T29 strace (+Quiz)
Trace system calls to see exactly what a process is doing under the hood. Debug hanging processes, identify permission issues, and measure where latency is coming from.
strace -p, strace -e, system calls, latency tracing
T30 Netcat (+Quiz)
The Swiss army knife of networking. Test if ports are open, transfer files between machines, set up simple servers, and debug network connectivity issues in seconds.
nc, port testing, file transfer, simple servers
T31 Ansible (+Quiz)
Automate configuration across multiple servers. Write inventory files and playbooks, use modules and handlers, and understand idempotency so your automation is safe to run twice.
inventory, playbooks, modules, handlers, idempotency
T32 Make (+Quiz)
Use Makefiles as task runners for your projects. Define targets, manage dependencies between tasks, set variables, and build repeatable workflows that anyone on the team can run.
Makefile, targets, dependencies, variables, .PHONY
Lab 4B: Automate a Deployment
Write an Ansible playbook that provisions a server, installs dependencies, deploys an application, and verifies it's running. Automation that works the first time and every time after.
Covers: Ansible playbooks, provisioning, dependency management, deployment verification, idempotency
Exam Level 4 Master Exam
Prove you know the power tools. 10 questions covering vim, systemd, iptables, lsof, strace, netcat, Ansible, and Make. Score 80% or higher to pass.
Covers: vim, systemd, iptables, lsof, strace, netcat, Ansible, Make

Defining infrastructure in files instead of clicking consoles. Terraform fundamentals, AWS CLI, S3, IAM, and automating infrastructure with CI/CD pipelines.

T33 IaC Concepts (+Quiz)
Understand why infrastructure should be defined in code, not clicked in consoles. Learn the declarative model, what drift means, why idempotency matters, and how state files track what exists.
declarative model, drift, idempotency, state management
T34 Terraform Fundamentals (+Quiz)
Write your first Terraform configurations in HCL. Define providers and resources, run plan to preview changes, apply to create infrastructure, and destroy to tear it down cleanly.
HCL, providers, resources, plan, apply, destroy, state files
T35 Terraform Modules & Variables (+Quiz)
Build reusable infrastructure components with modules. Use input variables and outputs to make your configs flexible, and manage multiple environments without duplicating code.
modules, input variables, outputs, multi-environment configs
T36 AWS CLI Foundations (+Quiz)
Interact with cloud services from the terminal. Run AWS CLI commands, parse JSON responses with jq, filter and project output, and build scripts that automate cloud operations.
aws cli, jq, JSON parsing, filtering, projecting output
Lab 5A: Terraform from Scratch
Define a complete infrastructure stack in Terraform. Write the config, plan it, apply it, modify a resource, re-apply, and tear it all down. The full IaC lifecycle in one lab.
Covers: Terraform config, plan, apply, modify, destroy, IaC lifecycle
T37 S3 & Object Storage (+Quiz)
Work with cloud object storage. Create buckets, upload and manage objects, configure lifecycle policies for automatic cleanup, enable versioning, and control access permissions.
S3 buckets, objects, lifecycle policies, versioning, ACLs
T38 IAM & Access Control (+Quiz)
Control who can do what in the cloud. Create roles, write policy documents, apply least privilege principles, and understand how identity federation works at the organizational level.
IAM roles, policies, least privilege, identity federation
T39 Infrastructure Pipelines (+Quiz)
Automate Terraform with CI/CD. Run plan on pull requests, automate apply on merge, handle state locking to prevent conflicts, and build a pipeline that deploys infrastructure safely.
Terraform in CI/CD, plan review, automated apply, state locking
T40 Cloud Architecture Patterns (+Quiz)
Think about infrastructure at the design level. Multi-tier architectures, high availability patterns, disaster recovery strategies, and cost optimization decisions that matter in production.
multi-tier, high availability, disaster recovery, cost optimization
Lab 5B: Multi-Environment IaC
Build a Terraform setup that manages dev, staging, and production environments from a single codebase. Use variables and modules to keep it DRY while keeping environments isolated.
Covers: environment isolation, variables, modules, DRY infrastructure, state management
Exam Level 5 Master Exam
Prove you can define and manage infrastructure as code. 10 questions covering Terraform, AWS CLI, S3, IAM, and cloud architecture. Score 80% or higher to pass.
Covers: Terraform, AWS CLI, S3, IAM, cloud architecture

Container orchestration at scale. Kubernetes concepts, pods, deployments, services, ConfigMaps, namespaces, debugging, and deploying a multi-tier application.

T41 Kubernetes Concepts (+Quiz)
Understand the big picture before touching a cluster. The desired state model, how the control plane works, what worker nodes do, and why Kubernetes thinks about infrastructure differently than you're used to.
control plane, worker nodes, etcd, kube-apiserver, desired state
T42 Pods & kubectl (+Quiz)
The fundamental building block and your command line tool. Write pod YAML, use get, describe, logs, and exec to inspect running workloads, and understand labels and selectors.
kubectl get, describe, logs, exec, pod YAML, labels, selectors
T43 Deployments & ReplicaSets (+Quiz)
Scale your applications and update them without downtime. Manage replicas, perform rolling updates, roll back to previous versions, and understand how Kubernetes tracks revision history.
replicas, rolling updates, rollbacks, revision history, scaling
T44 Services & Networking (+Quiz)
Give your pods stable endpoints that don't change when containers restart. ClusterIP for internal traffic, NodePort for external access, LoadBalancer for production, and how DNS ties it all together.
ClusterIP, NodePort, LoadBalancer, DNS, endpoints, selectors
Lab 6A: Deploy a Microservice
Deploy a containerized application to Kubernetes. Create the deployment, expose it with a service, scale it up, perform a rolling update, and verify zero downtime throughout.
Covers: K8s deployment, service exposure, scaling, rolling updates, zero downtime
T45 ConfigMaps & Secrets (+Quiz)
Separate your application code from its configuration. Inject settings through environment variables and volume mounts. Store sensitive data in Secrets, whose values are base64-encoded (an encoding, not encryption).
ConfigMap, Secret, env vars, volume mounts, base64 encoding
T46 Namespaces & Resources (+Quiz)
Isolate workloads and control resource consumption. Create namespaces for different teams or environments, set CPU and memory requests and limits, and enforce quotas.
namespaces, resource requests, limits, ResourceQuota, LimitRange
T47 Debugging & Troubleshooting (+Quiz)
When things go wrong in Kubernetes, you need to know where to look. Diagnose CrashLoopBackOff, ImagePullBackOff, and Pending pods. Read events, check previous logs, and exec into containers.
CrashLoopBackOff, ImagePullBackOff, Pending, events, logs --previous
T48 Kubernetes Capstone (+Quiz)
Bring it all together. Deploy a multi-tier application with frontend, backend, and database. Configure services, manage config with ConfigMaps, perform a rolling update, and debug a failure.
multi-tier deployment, services, rolling update, debugging
Lab 6B: Multi-Tier App
Deploy a complete application stack on Kubernetes. Multiple services talking to each other, proper resource limits, config externalized, and everything recoverable if a pod dies.
Covers: multi-service stack, resource limits, config externalization, pod recovery, networking
Exam Level 6 Master Exam
Prove you can work with Kubernetes. 10 questions covering pods, deployments, services, ConfigMaps, namespaces, and troubleshooting. Score 80% or higher to pass.
Covers: pods, deployments, services, ConfigMaps, namespaces, troubleshooting
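
T45 mentions that Secrets use base64 encoding. A short Python sketch shows why that matters: Kubernetes stores values under a Secret's `data` field base64-encoded, and anyone can reverse that encoding without a key. The secret value here is made up for illustration.

```python
import base64

# Hypothetical secret value, for illustration only.
plaintext = "s3cr3t-password"

# Values under a Kubernetes Secret's `data` field are base64-encoded strings.
encoded = base64.b64encode(plaintext.encode("utf-8")).decode("ascii")

# Decoding needs no key, so base64 is an encoding, not encryption.
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(decoded == plaintext)  # prints True
```

In practice this means access to Secrets still has to be restricted by other means, since the encoding itself protects nothing.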

Knowing what your systems are doing. Prometheus, PromQL, alerting, structured logging, dashboards, SLIs/SLOs, and investigating real incidents through metrics and logs.

T49 Observability Concepts (+Quiz)
Learn the three pillars of observability: metrics, logs, and traces. Understand the RED and USE methods for measuring system health, and how SLIs, SLOs, and SLAs define reliability targets.
metrics, logs, traces, RED method, USE method, SLI, SLO, SLA
T50 Prometheus Fundamentals (+Quiz)
Understand how Prometheus collects and stores time series data. Learn the difference between counters, gauges, and histograms, configure scrape targets, and write your first PromQL queries.
time series, counter, gauge, histogram, scrape config, PromQL
T51 PromQL Deep Dive (+Quiz)
Go beyond basic queries. Calculate request rates with rate(), aggregate across labels with sum by, compute percentiles with histogram_quantile, and build the error rate calculations that drive real alerting.
rate(), increase(), sum by, avg by, histogram_quantile, error rate
T52 Alerting (+Quiz)
Set up alerts that wake people up only when they should. Write alert rules, configure routing and severity, use silences during maintenance, and learn the on-call practices that reduce noise without missing real incidents.
alert rules, routing, severity, silences, on-call practices
Lab 7A: Build an Alert Pipeline
Design and configure an alerting pipeline from scratch. Define what matters, set thresholds, write the rules, route alerts by severity, and test that the right people get notified for the right reasons.
Covers: alert design, thresholds, routing rules, severity tiers, notification testing
T53 Structured Logging (+Quiz)
Move beyond unstructured log lines. Write JSON-formatted logs that machines can parse, add correlation IDs to trace requests across services, and understand how log aggregation works at scale.
JSON-lines, log levels, correlation IDs, log aggregation
T54 Dashboards & Visualization (+Quiz)
Build dashboards that answer questions at a glance. Apply the RED method for request-driven services, the USE method for resources, and learn which chart types actually communicate what you need.
RED dashboard, USE dashboard, chart types, Grafana concepts
T55 SLIs, SLOs & Error Budgets (+Quiz)
Define reliability in terms your team and your business can agree on. Set service level indicators, establish objectives, calculate error budgets, and make data-driven decisions about when to ship and when to stabilize.
SLI definition, SLO targets, error budget calculation, burn rate
T56 Observability Capstone (+Quiz)
An incident just happened. Use metrics to identify the affected service, read logs to trace the root cause, check deployment events for the trigger, and write up what went wrong and how to prevent it.
incident metrics, log correlation, deployment events, root cause
Lab 7B: Incident Investigation
You're on call and the alerts are firing. Triage the situation, use dashboards and queries to narrow down the problem, identify the root cause, and document the incident for the team.
Covers: alert triage, dashboard analysis, root cause isolation, incident documentation, postmortem
Exam Level 7 Master Exam
Prove you can monitor and observe production systems. 10 questions covering Prometheus, PromQL, alerting, logging, dashboards, and SLOs. Score 80% or higher to pass.
Covers: Prometheus, PromQL, alerting, logging, dashboards, SLOs
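
T55 covers error budget calculation and burn rate. As a rough sketch of the arithmetic, assuming an availability SLO expressed as a fraction and a fixed time window, the two quantities can be computed as below. The function names and the sample numbers are illustrative assumptions, not taken from the course.

```python
def error_budget_minutes(slo_target: float, window_days: int) -> float:
    """Allowed downtime, in minutes, for an availability SLO over a window."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be a fraction between 0 and 1")
    window_minutes = window_days * 24 * 60
    return (1 - slo_target) * window_minutes


def burn_rate(observed_error_ratio: float, slo_target: float) -> float:
    """Observed error ratio divided by the allowed one; 1.0 means on budget."""
    return observed_error_ratio / (1 - slo_target)


# A 99.9% SLO over 30 days leaves about 43.2 minutes of allowed downtime.
print(round(error_budget_minutes(0.999, 30), 1))

# A 0.5% error ratio against that SLO burns the budget about 5x too fast.
print(round(burn_rate(0.005, 0.999), 1))
```

A burn rate above 1.0 means the budget will run out before the window ends if the error ratio holds.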

The capstone. Deploy a real application stack with web frontend, API, PostgreSQL, Prometheus, and Grafana. Platform engineering, GitOps, deployment pipelines, and full lifecycle operations.

T57 Platform Engineering (+Quiz)
Shift from managing infrastructure to building platforms. Understand internal developer tools, self-service workflows, and how platform teams improve developer experience across an organization.
internal platforms, self-service, developer experience, golden paths
T58 GitOps Principles (+Quiz)
Use git as the single source of truth for your infrastructure. Learn automated reconciliation, environment promotion through branches, and why every change should go through a pull request.
git as source of truth, reconciliation, environment promotion
T59 Deploying the Stack (+Quiz)
Deploy a real application stack with Docker Compose: web frontend, API, PostgreSQL, Prometheus, and Grafana. Verify every service is healthy, read container logs, and troubleshoot startup failures.
docker compose up, service verification, health checks, container logs
T60 Configuration Management (+Quiz)
Manage configuration across environments without hardcoding values. Use environment files, compose overrides, feature flags, and detect config drift before it causes production issues.
env files, compose overrides, feature flags, config drift detection
Lab 8A: Blue-Green Deploy
Deploy a new version of your application alongside the old one. Shift traffic to the new version, verify it's working, and roll back instantly if something goes wrong. Zero downtime.
Covers: blue-green deployment, traffic shifting, rollback, zero downtime, verification
T61 Deployment Pipelines (+Quiz)
Build deploy scripts that track versions, run health check gates before promoting, and automatically roll back when checks fail. The difference between deploying and deploying safely.
deploy scripts, version tracking, health gates, automated rollback
T62 GitOps Workflow (+Quiz)
Put GitOps into practice. Deploy by merging to main, promote changes across environments through git, detect drift between what's in git and what's running, and roll back by reverting a commit.
git-based deploys, promotion, drift detection, rollback via revert
T63 Monitoring the Real Stack (+Quiz)
Point Prometheus at your running stack and watch real metrics flow in. Write PromQL queries against live data, generate traffic to see the numbers move, and set up Grafana dashboards that show what matters.
live PromQL queries, traffic generation, real alerts, Grafana
T64 Scaling & Performance (+Quiz)
Scale your application horizontally with Docker Compose. Distribute load across replicas, set resource limits so one service can't starve the others, and understand capacity planning at a practical level.
horizontal scaling, load distribution, resource limits, capacity
Lab 8B: Full Lifecycle
The ultimate DevOps test. Deploy the stack, push an update, monitor the rollout, intentionally break something, detect the failure through alerts, roll back, and verify recovery. The full lifecycle.
Covers: deployment, monitoring, failure injection, alert detection, rollback, recovery
Exam Level 8 Master Exam
Prove you can run production infrastructure. 10 questions covering platform engineering, GitOps, deployment pipelines, monitoring, and scaling. Score 80% or higher to pass.
Covers: platform engineering, GitOps, deployment pipelines, monitoring, scaling
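
T61 describes health check gates and automated rollback. One way to picture the control flow is the sketch below: the candidate version is promoted only if a health check passes within a few attempts, and otherwise the current version stays in service. The function `deploy_with_gate`, its parameters, and the version strings are hypothetical. A real deploy script would also shift traffic and wait between checks.

```python
from typing import Callable


def deploy_with_gate(
    current: str,
    candidate: str,
    is_healthy: Callable[[str], bool],
    attempts: int = 3,
) -> str:
    """Return the version left serving traffic after a gated deploy."""
    for _ in range(attempts):
        if is_healthy(candidate):
            return candidate  # gate passed: promote the candidate
    return current  # gate failed on every attempt: roll back


# Illustrative health checks: one always passes, one always fails.
print(deploy_with_gate("v1.0", "v2.0", lambda version: True))   # prints v2.0
print(deploy_with_gate("v1.0", "v2.0", lambda version: False))  # prints v1.0
```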

Python as a DevOps power tool. Scripting, API automation, configuration management, infrastructure tooling, and building production-grade CLI tools.

T65 Python Fundamentals (+Quiz)
Learn Python the way DevOps engineers use it. Variables, data types, functions, conditionals, and loops. No theory-heavy computer science approach, just the practical foundation you need to start automating.
variables, data types, functions, conditionals, loops, dicts, lists
T66 File I/O & Config Parsing (+Quiz)
Read and write files, parse configuration in JSON and YAML, process log files line by line, and build scripts that extract the data you need from the files your infrastructure generates.
open(), read, write, json module, yaml, csv, log parsing
T67 API Automation (+Quiz)
Talk to APIs from Python. Use the requests library to hit REST endpoints, handle authentication, parse JSON responses, and build scripts that automate tasks you've been doing manually in a browser.
requests, GET, POST, authentication, JSON responses, webhooks
T68 Infrastructure Scripting (+Quiz)
Connect to remote servers with paramiko, interact with AWS using boto3, run shell commands from Python with subprocess, and build the glue scripts that tie your infrastructure together.
paramiko, boto3, subprocess, fabric, remote execution
Lab 9A: API Health Monitor
Build a Python script that checks multiple API endpoints on a schedule, tracks response times and status codes, and alerts you when something goes down. A real monitoring tool you wrote yourself.
Covers: endpoint monitoring, response tracking, status alerts, scheduling, Python scripting
T69 CLI Tools (+Quiz)
Build command line tools that your team can actually use. Parse arguments with argparse, add subcommands, validate inputs, and create reusable scripts that feel like real utilities, not throwaway hacks.
argparse, subcommands, click, input validation, help text
T70 Error Handling & Logging (+Quiz)
Write scripts that fail gracefully instead of crashing silently. Use try/except properly, set up structured logging with Python's logging module, and build automation that tells you what went wrong and why.
try/except, logging module, log levels, structured error output
T71 Testing (+Quiz)
Test your automation before it runs in production. Write unit tests with pytest, mock external services so tests run fast, and build the confidence that your scripts do what you think they do.
pytest, unittest, mocking, fixtures, test coverage
T72 DevOps Automation Capstone (+Quiz)
Build a production-grade DevOps tool from scratch. It takes arguments, reads config, calls APIs, handles errors, logs everything, and has tests. The kind of tool you'd actually commit to your team's repo.
argparse, config, API calls, error handling, logging, tests
Lab 9B: Build a Deploy Tool
Write a Python deployment tool that pulls the latest code, runs health checks, swaps traffic to the new version, and rolls back automatically if anything fails. Real automation for real infrastructure.
Covers: automated deployment, health checks, traffic swap, rollback logic, error handling
Exam Level 9 Master Exam
Prove you can automate with Python. 10 questions covering scripting, APIs, infrastructure tooling, error handling, and testing. Score 80% or higher to pass.
Covers: Python scripting, APIs, infrastructure tooling, error handling, testing

Four capstone labs that combine everything you've learned across all levels.

Lab: Zero-Downtime Migration
Migrate a running application from v1.0 to v2.0 without dropping a single request. Plan the cutover, deploy the new version alongside the old, shift traffic, verify, and decommission. The real-world upgrade path.
Covers: version migration, traffic cutover, parallel deployment, verification, decommission
Lab: Chaos Engineering
Intentionally break your infrastructure and prove it can recover. Kill containers mid-request, corrupt configuration files, drop database tables, and verify that your monitoring catches it and your recovery procedures work.
Covers: failure injection, container recovery, config corruption, monitoring validation, recovery procedures
Lab: Security Hardening Sprint
Take a running Docker stack and audit it against CIS benchmarks. Find every weakness, apply every fix, re-scan to verify, and document what you changed and why. Security as a hands-on practice, not a checklist.
Covers: CIS benchmarks, vulnerability audit, fix application, re-scan verification, documentation
Lab: Capacity Planning
Analyze 4 weeks of Prometheus metrics, identify usage trends, project when you'll hit capacity, find the bottleneck before your users do, and build a scaling plan with actual numbers behind it.
Covers: metrics analysis, trend projection, bottleneck detection, scaling plan, capacity forecasting
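
The Capacity Planning lab asks you to project when usage will hit capacity. A simple way to do that, assuming growth is roughly linear, is a least-squares trend line. The sketch below uses made-up weekly disk-usage samples, and the function names are hypothetical. Real Prometheus data would need a check that a straight line actually fits.

```python
def linear_fit(xs, ys):
    """Least-squares slope and intercept for the points (xs[i], ys[i])."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    variance = sum((x - mean_x) ** 2 for x in xs)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def days_until_full(days, usage_pct, capacity_pct=100.0):
    """Days after the last sample until the trend line reaches capacity."""
    slope, intercept = linear_fit(days, usage_pct)
    if slope <= 0:
        return None  # usage is flat or shrinking: no projected exhaustion
    day_full = (capacity_pct - intercept) / slope
    return day_full - days[-1]


# Made-up samples: disk usage grows 5 percentage points per week.
days = [0, 7, 14, 21, 28]
usage = [50.0, 55.0, 60.0, 65.0, 70.0]
print(days_until_full(days, usage))  # roughly 42 days at this growth rate
```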

Where these skills take you

Real job titles that use the tools taught in this course.

Entry-level
$60K – $80K
  • Junior DevOps Engineer
  • Systems Administrator
  • Junior Site Reliability Engineer
  • Cloud Support Engineer
  • Build & Release Engineer
  • IT Infrastructure Engineer
2 Years Experience
$110K – $150K
  • DevOps Engineer
  • Site Reliability Engineer
  • Platform Engineer
  • Cloud Infrastructure Engineer
  • CI/CD Engineer
  • Kubernetes Administrator
4+ Years Experience
$160K – $220K+
  • Senior DevOps Engineer
  • Staff SRE
  • Infrastructure Architect
  • Principal Platform Engineer
  • Engineering Manager (Infra)
  • Cloud Architect

Salary ranges are based on 2025-2026 US market data. The first role in each column is the most common entry point from this course.

Start building infrastructure skills

One purchase. Lifetime access. No subscription.
