
Grafana Dashboard with Prometheus: The Ultimate Monitoring Setup Guide for 2025



In today's cloud-native world, knowing what's happening inside your infrastructure in real time isn't optional; it's essential. Whether you're running microservices on Kubernetes, managing bare-metal servers, or operating a hybrid cloud environment, the Prometheus + Grafana stack has become a de facto standard for metrics collection, visualization, and alerting. In this guide, you'll learn how to set up a production-grade Grafana dashboard powered by Prometheus from scratch in 2025.


What Is Prometheus and Why Does It Matter?


Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud in 2012 and now a graduated project of the Cloud Native Computing Foundation (CNCF). It collects metrics from configured targets at given intervals, evaluates rule expressions, displays results, and can trigger alerts when specified conditions are met.


Key features of Prometheus include: a multi-dimensional data model in which each time series is identified by a metric name and key/value label pairs; PromQL, a flexible query language that takes advantage of this dimensionality; no reliance on distributed storage (single server nodes are autonomous); time series collection via a pull model over HTTP; and support for multiple modes of graphing and dashboarding.


What Is Grafana and How Does It Complement Prometheus?



Grafana is a widely used open-source observability and analytics platform. While Prometheus excels at collecting and storing metrics, its built-in visualization is basic. Grafana bridges that gap by connecting to Prometheus as a data source and rendering interactive, real-time dashboards. Together they form the core of the modern observability stack, often extended with Loki for logs and Tempo for traces.


Architecture Overview: How Prometheus and Grafana Work Together



The flow is straightforward: your applications and infrastructure expose metrics endpoints (usually at /metrics). Prometheus scrapes these endpoints at regular intervals and stores the time series data in its local TSDB (time series database). Grafana then queries Prometheus using PromQL to fetch and display that data. Alertmanager handles alert routing when Prometheus rules fire.

The key components are: Prometheus Server (scrapes and stores metrics), Exporters (node_exporter, kube-state-metrics, blackbox_exporter), Alertmanager (handles alerts), Grafana (visualizes data), and Pushgateway (for batch jobs).



Step 1: Install Prometheus



For Kubernetes environments, the recommended approach in 2025 is the kube-prometheus-stack Helm chart, which bundles Prometheus, Grafana, Alertmanager, node-exporter, and kube-state-metrics together. Install it with:


helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --namespace monitoring \
  --create-namespace \
  --set grafana.adminPassword='YourSecurePassword'

For standalone Linux installation, download the latest Prometheus binary from the official releases page. Extract it, create a prometheus.yml configuration file, and run ./prometheus --config.file=prometheus.yml from the extracted directory. Prometheus listens on port 9090 by default.
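As a starting point, a minimal prometheus.yml might look like the sketch below. It assumes Prometheus scrapes itself plus a node_exporter running on the same host on its default port 9100; the job names and the 15-second intervals are illustrative choices, not requirements.

```yaml
# prometheus.yml: minimal sketch for a single standalone host
global:
  scrape_interval: 15s      # how often targets are scraped
  evaluation_interval: 15s  # how often rules are evaluated

scrape_configs:
  # Prometheus scraping its own /metrics endpoint
  - job_name: prometheus
    static_configs:
      - targets: ["localhost:9090"]

  # Assumes node_exporter is running locally on its default port
  - job_name: node
    static_configs:
      - targets: ["localhost:9100"]
```

Running promtool check config prometheus.yml (promtool ships alongside the Prometheus binary) is a quick way to catch syntax errors before starting the server.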



Step 2: Install Grafana

If you used kube-prometheus-stack above, Grafana is already installed. Access it by port-forwarding the Grafana service:

kubectl port-forward svc/kube-prometheus-stack-grafana 3000:80 -n monitoring

For standalone Docker installation, run: docker run -d -p 3000:3000 --name grafana grafana/grafana-enterprise. Navigate to http://localhost:3000 and log in with admin/admin (change the password immediately).
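If you prefer to run the standalone pieces together, a Docker Compose file is one way to do it. The following is a local sketch rather than a hardened setup: it assumes a prometheus.yml exists in the current directory, and the service and volume names are arbitrary.

```yaml
# docker-compose.yml: local sketch, not a hardened production setup
services:
  prometheus:
    image: prom/prometheus
    ports:
      - "9090:9090"
    volumes:
      # Mounts a prometheus.yml from the current directory (assumed to exist)
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro

  node-exporter:
    image: prom/node-exporter
    ports:
      - "9100:9100"

  grafana:
    image: grafana/grafana-enterprise
    ports:
      - "3000:3000"
    volumes:
      # Named volume so dashboards and users survive container restarts
      - grafana-data:/var/lib/grafana

volumes:
  grafana-data:
```

Inside this Compose network, service names resolve as hostnames, so Prometheus would scrape node-exporter:9100 and Grafana would reach Prometheus at http://prometheus:9090.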



Step 3: Connect Prometheus as a Grafana Data Source

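In Grafana, navigate to Connections > Data Sources > Add new data source. Select Prometheus. Set the server URL to http://prometheus-server:9090 (or your Prometheus endpoint; the service name depends on how Prometheus was installed). Click Save & Test, and Grafana should report that the connection succeeded. If you installed kube-prometheus-stack, the chart normally provisions a Prometheus data source for you, so check the existing list before adding a duplicate. This is the bridge between your metrics store and your dashboards.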
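The same data source can also be declared in a provisioning file instead of through the UI, which keeps the setup reproducible. A minimal sketch, assuming the default provisioning directory and the example URL used above:

```yaml
# Example path: /etc/grafana/provisioning/datasources/prometheus.yaml
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                        # Grafana's backend proxies the queries
    url: http://prometheus-server:9090   # replace with your Prometheus endpoint
    isDefault: true
```

Grafana reads this directory at startup, so a restart is typically needed after adding or changing the file.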



Step 4: Master PromQL — The Key to Powerful Dashboards

PromQL (Prometheus Query Language) is what makes Prometheus dashboards so powerful. Here are the most important queries every DevOps engineer should know:

CPU Usage per Node:

100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

Shows CPU usage as a percentage for each node, averaged across its cores over a 5-minute window.

Memory Usage:

(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100

Calculates the percentage of memory in use per node.

HTTP Request Rate:

rate(http_requests_total[5m])

Shows the per-second rate of HTTP requests averaged over 5 minutes, assuming your application exports a counter named http_requests_total. Filter by status code or endpoint with label selectors such as {status="500"} for error tracking; the exact label names depend on how the application is instrumented.

Kubernetes Pod Restarts:

increase(kube_pod_container_status_restarts_total[1h])

Tracks container restarts per pod over the last hour, which is useful for detecting unstable deployments.
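Queries like the CPU and memory expressions above are re-evaluated on every dashboard refresh. One option is to precompute them with Prometheus recording rules, sketched below. The file name, group name, and recorded metric names are illustrative assumptions; only the expressions come from this guide.

```yaml
# recording-rules.yml: hypothetical file name, listed under rule_files in prometheus.yml
groups:
  - name: node-recording-rules
    interval: 1m
    rules:
      # CPU usage percentage per node
      - record: instance:node_cpu_usage:percent
        expr: 100 - (avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)

      # Memory usage percentage per node
      - record: instance:node_memory_usage:percent
        expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100
```

A dashboard panel can then query instance:node_cpu_usage:percent directly, which is cheaper than recomputing the rate on each refresh.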



Step 5: Build Your Grafana Dashboard

Click the + icon > New Dashboard > Add visualization. Select your Prometheus data source. In the query editor, enter your PromQL query. Grafana supports multiple visualization types: Time series (for trending metrics), Gauge (for current values like CPU %), Stat panels (for KPIs), Heatmaps (for latency distributions), and Table panels (for pod status lists).

Pro tip: Instead of building from scratch, import community dashboards from grafana.com/dashboards. The Node Exporter Full dashboard (ID: 1860) is among the most widely used and gives you comprehensive server metrics immediately. For Kubernetes, the Kubernetes Cluster (ID: 7249) and Kubernetes Pods (ID: 6417) dashboards are good starting points.
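Dashboards that you build or import can also be loaded from JSON files on disk through a dashboard provider, which makes them easy to keep under version control. A minimal sketch follows; the provider name and the dashboards directory are assumptions to adapt to your installation.

```yaml
# Example path: /etc/grafana/provisioning/dashboards/default.yaml
apiVersion: 1

providers:
  - name: default
    type: file
    disableDeletion: false
    updateIntervalSeconds: 30   # how often Grafana rescans the folder
    allowUiUpdates: false       # keep the files on disk as the source of truth
    options:
      # Directory holding exported dashboard JSON files (assumed location)
      path: /var/lib/grafana/dashboards
```

With this in place, exporting a dashboard as JSON and dropping the file into that directory is enough for Grafana to pick it up on the next scan.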



Step 6: Configure Alerting in Grafana + Prometheus

Grafana 10+ includes a unified alerting system. Navigate to Alerting > Alert rules > New alert rule. Define your condition (e.g., CPU > 85% for 5 minutes), set the evaluation interval, and route notifications through contact points such as Slack, PagerDuty, email, or webhooks. You can also define alerts directly in Prometheus by writing alerting rules in a separate rules file that prometheus.yml loads through its rule_files setting; Alertmanager then delivers the resulting notifications. A rules file looks like this:


groups:
  - name: critical-alerts
    rules:
      - alert: HighCPUUsage
        expr: 100 - (avg by(instance)(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100) > 85
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High CPU usage detected on {{ $labels.instance }}"
          description: "CPU usage is above 85% for more than 5 minutes."
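For Prometheus to load a rules file and forward firing alerts, prometheus.yml needs to reference both the file and an Alertmanager instance. The fragment below is a sketch: the rules directory is a hypothetical path, and the target assumes Alertmanager runs on the same host on its default port 9093.

```yaml
# Fragment of prometheus.yml
rule_files:
  - "rules/*.yml"   # hypothetical directory containing the rule group above

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]   # assumes Alertmanager on its default port
```

Running promtool check rules against the rules file before reloading Prometheus helps catch syntax and expression errors early.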


Kubernetes Monitoring with Grafana + Prometheus in 2025


For Kubernetes environments, the kube-prometheus-stack automatically deploys ServiceMonitors that tell Prometheus which services to scrape. Key metrics to monitor include: node CPU/memory/disk, pod CPU and memory limits vs requests, deployment replica availability, PVC usage, network I/O per pod, and API server latency. The kube-state-metrics exporter provides rich Kubernetes object state metrics, while node-exporter handles OS-level hardware metrics.
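To have Prometheus scrape one of your own services, you would typically add a ServiceMonitor next to the ones the chart deploys. The sketch below assumes a hypothetical application called my-app in the default namespace whose Service exposes a port named metrics. It also assumes the chart's default behavior of selecting ServiceMonitors by a release label that matches the Helm release name.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app                       # hypothetical application name
  namespace: monitoring
  labels:
    release: kube-prometheus-stack   # assumed to match the Helm release name
spec:
  namespaceSelector:
    matchNames:
      - default                      # namespace where the Service lives (assumed)
  selector:
    matchLabels:
      app: my-app                    # must match the labels on the Service
  endpoints:
    - port: metrics                  # name of the Service port, not a number
      path: /metrics
      interval: 30s
```

If the target does not show up on the Prometheus Targets page, the label selector and the port name are the first things to check.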

A critical best practice: always set resource requests and limits on your Prometheus pods to prevent them from consuming excessive cluster resources. In production, use Thanos or Cortex for long-term metrics storage and high availability across multiple Prometheus instances.
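With kube-prometheus-stack, resource settings for the Prometheus pods can be supplied through a Helm values file. The numbers below are placeholders rather than sizing advice; appropriate values depend on how many series you ingest and how long you retain them.

```yaml
# values.yaml: passed with `helm upgrade --install ... -f values.yaml`
prometheus:
  prometheusSpec:
    retention: 15d        # how long samples are kept locally (illustrative)
    resources:
      requests:
        cpu: 500m
        memory: 2Gi
      limits:
        memory: 4Gi       # placeholder; size from observed usage
```

Leaving the CPU limit unset while capping memory is one common choice, since CPU throttling can slow scrapes and rule evaluation; treat that as a design decision to revisit for your cluster.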


10 Pro Tips for Production Grafana Dashboards


1. Use dashboard variables (templating) to make dashboards reusable across clusters, namespaces, and services.
2. Set sensible time ranges and refresh rates; a 5m refresh is enough for most dashboards.
3. Use recording rules in Prometheus for expensive PromQL queries to pre-compute them and reduce load.
4. Organize panels in rows with clear labels and collapse non-critical sections by default.
5. Always set Y-axis units (%, bytes, requests/s) for readability.
6. Use threshold colors (green/yellow/red) on gauge and stat panels for instant visual health status.
7. Export dashboards as JSON and store them in Git for version control.
8. Use Grafana annotations to mark deployments, incidents, and maintenance windows on graphs.
9. Enable RBAC in Grafana to control who can view vs edit dashboards.
10. Test your alerting regularly: unit-test Prometheus rules with promtool test rules, and check Alertmanager routing with amtool, to ensure alerts fire and reach the right receiver.
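As an illustration of testing alert rules, promtool can run unit tests against a rules file using synthetic series. The sketch below assumes the HighCPUUsage rule from Step 6 is saved in a file named alerts.yml (a hypothetical name). The input series is made up: an idle-CPU counter that grows by only 3 seconds per minute, so computed CPU usage stays well above 85%.

```yaml
# alert-tests.yml: run with `promtool test rules alert-tests.yml`
rule_files:
  - alerts.yml   # hypothetical file containing the HighCPUUsage rule

evaluation_interval: 1m

tests:
  - interval: 1m
    input_series:
      # Idle counter rising by 3s each minute, i.e. the CPU is almost never idle
      - series: 'node_cpu_seconds_total{instance="node1", mode="idle", cpu="0"}'
        values: '0+3x20'
    alert_rule_test:
      # Checked well after the rule's 5m "for" duration has elapsed
      - eval_time: 15m
        alertname: HighCPUUsage
        exp_alerts:
          - exp_labels:
              severity: critical
              instance: node1
            exp_annotations:
              summary: "High CPU usage detected on node1"
              description: "CPU usage is above 85% for more than 5 minutes."
```

Tests like this fit well in CI, so a change to a rule expression or threshold is checked before it reaches production.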



Conclusion: Build Your Observability Stack Today

The Grafana + Prometheus combination remains the most powerful, flexible, and cost-effective observability stack available in 2025. Whether you're a solo developer monitoring a side project or a DevOps team managing hundreds of Kubernetes nodes, this stack scales with you. Start with the kube-prometheus-stack Helm chart, import community dashboards, write your first PromQL alerts, and within a few hours you'll have production-grade visibility into your entire infrastructure. The investment in observability always pays dividends — the first time an alert wakes you up before your users notice an outage, you'll understand why.
