Troubleshoot an SLO breach with Grafana Cloud Knowledge Graph
This topic shows you how to interpret an error budget burndown chart and use the knowledge graph to troubleshoot an SLO breach.
Before you begin
Before you begin, ensure that you have defined a knowledge graph SLO.
Steps
To use the knowledge graph to troubleshoot an SLO breach, perform the following steps:
Sign in to Grafana Cloud and click Observability > SLO.
Expand Objective for the SLO you want to investigate.
Use the following table to interpret the Error Budget Burndown panel.
| Number | Element | Description |
| --- | --- | --- |
| 1 | Target vs Actual | Shows the target SLO compared to the actual SLO. |
| 2 | Incidents in Window | Counts the number of SLO incidents that occurred during the compliance window defined when the SLO was created. |
| 3 | Budget Used | The amount of error budget used, expressed as a ratio where 1 equals 100% of the budget. If the number is greater than 1, then more than 100% of the error budget has been used. |
| 4 | Recent Budget Usage | The error burn rate calculated over a recent, specific time window. For example, calculating the error burn rate over the last hour gives you a sense of how quickly you're burning through your error budget. |
| 5 | Current Incident Status | Shows an icon that indicates whether error budget is currently being consumed. |
| 6 | Events Query | Shows the Bad and Total Events Query used to calculate the SLO. |
| 7 | Error Budget Burndown chart | The yellow dashed line indicates the ideal error budget burndown rate. The green line indicates the actual burndown rate. In this example, the error budget remains untouched until the end of the compliance window, when there is consistent and dramatic use of the error budget. |
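To make the Budget Used and Recent Budget Usage values easier to reason about, the following is a minimal sketch of the arithmetic commonly used for event-based SLOs. This topic doesn't state the exact formulas the knowledge graph uses, so treat the definitions as assumptions. The names `SloWindow`, `actual_sli`, `budget_used`, and `burn_rate` are hypothetical and aren't part of any Grafana API.

```python
from dataclasses import dataclass


@dataclass
class SloWindow:
    """Event counts for one compliance window (hypothetical structure)."""
    target: float       # SLO target as a fraction, for example 0.99
    total_events: int   # assumed result of the Total Events Query
    bad_events: int     # assumed result of the Bad Events Query


def actual_sli(w: SloWindow) -> float:
    """Fraction of good events, to compare against w.target."""
    if w.total_events == 0:
        return 1.0
    return 1.0 - w.bad_events / w.total_events


def budget_used(w: SloWindow) -> float:
    """Bad events divided by the number of bad events the target allows.

    A result above 1 means more than 100% of the error budget is spent.
    """
    allowed_bad = (1.0 - w.target) * w.total_events
    if allowed_bad == 0:
        return 0.0 if w.bad_events == 0 else float("inf")
    return w.bad_events / allowed_bad


def burn_rate(target: float, recent_total: int, recent_bad: int) -> float:
    """Error ratio over a recent window divided by the allowed error ratio.

    Under this definition, a sustained burn rate of 1 spends the budget
    exactly over the compliance window; higher values spend it faster.
    """
    allowed_ratio = 1.0 - target
    if allowed_ratio <= 0:
        raise ValueError("target must be below 1.0 to leave an error budget")
    if recent_total == 0:
        return 0.0
    return (recent_bad / recent_total) / allowed_ratio


# Illustrative numbers only; they are not taken from the example panel.
window = SloWindow(target=0.99, total_events=100_000, bad_events=1_500)
print(actual_sli(window), budget_used(window), burn_rate(0.99, 1_000, 50))
```

With these illustrative numbers, the budget used comes out at about 1.5, which corresponds to the "greater than 1" case described in the table, and the recent burn rate is about 5, meaning the budget is being consumed roughly five times faster than the ideal rate.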
Scroll down the page and review the SLI Zoomed In panel.
In this example, you can see a large spike in error budget usage.
In the Error Budget Burndown panel, click and drag your cursor to select the time range you want to investigate and click Open in RCA workbench.
The Open in RCA workbench button appears only if you added a search expression in the RCA workbench Context section when you created the SLO.
Use RCA workbench to explore entities and insights.
For more information about RCA workbench, refer to Perform root cause analysis in RCA workbench.