Troubleshoot the Kubernetes Monitoring Helm chart configuration
Grafana Alloy has a web user interface that shows every configuration
component the Alloy instance is using and the component status.
By default, the web UI runs on each Alloy Pod on port 12345.
Because the UI is typically not exposed outside the Cluster, you can access it with port forwarding:

```bash
kubectl port-forward svc/grafana-k8s-monitoring-alloy 12345:12345
```

Then open a browser to http://localhost:12345 to view the UI.
Specific Cluster platform providers
Certain Kubernetes Cluster platforms require specific configuration for this Helm chart. If your Cluster runs on one of these platforms, see the corresponding example for the changes required to run this Helm chart.
Common issues
The following are frequently seen problems related to configuring this Helm chart.
Authentication error: invalid scope requested
To deliver telemetry data to Grafana Cloud, you use an Access Policy Token with the appropriate scopes.
Scopes define the actions that can be performed on a specific data type.
For example, `metrics:write` permits writing metrics.
When sending data to Grafana Cloud, this Helm chart uses the `<data>:write` scopes to deliver data.
If your token does not have the correct scope, you will see errors in the Grafana Alloy logs.
For example, when trying to deliver profiles to Pyroscope without the `profiles:write` scope:

```
msg="final error sending to profiles to endpoint" component=pyroscope.write.profiles_service endpoint=http://tempo-prod-1-prod-eu-west-2.grafana.net:443 err="unauthenticated: authentication error: invalid scope requested"
```
The following table shows the scopes required for various actions done by this chart:
| Data type | Server | Scope for writing | Scope for reading |
|---|---|---|---|
| Metrics | Grafana Cloud Metrics (Prometheus or Mimir) | `metrics:write` | `metrics:read` |
| Logs & Cluster Events | Grafana Cloud Logs (Loki) | `logs:write` | `logs:read` |
| Traces | Grafana Cloud Traces (Tempo) | `traces:write` | `traces:read` |
| Profiles | Grafana Cloud Profiles (Pyroscope) | `profiles:write` | `profiles:read` |
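For reference, the following is a minimal sketch of how the Access Policy token and these scopes fit together, assuming this chart's `externalServices` values layout. The hosts, instance IDs, and token shown are placeholders, and the exact keys may differ depending on your chart version:

```yaml
externalServices:
  prometheus:
    host: https://prometheus-prod-01-eu-west-0.grafana.net  # placeholder host
    basicAuth:
      username: "123456"                 # placeholder: metrics instance ID
      password: "<access-policy-token>"  # token must include the metrics:write scope
  loki:
    host: https://logs-prod-eu-west-0.grafana.net  # placeholder host
    basicAuth:
      username: "654321"                 # placeholder: logs instance ID
      password: "<access-policy-token>"  # token must include the logs:write scope
```

A single Access Policy token can carry several write scopes, so the same token is often reused across these sections.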
Kepler Pods crashing on AWS Graviton Nodes
Kepler cannot run on AWS Graviton (ARM-based) Nodes, and its Pods on those Nodes will enter CrashLoopBackOff. To prevent this, add a Node selector to the Kepler deployment so it only runs on amd64 Nodes:

```yaml
kepler:
  nodeSelector:
    kubernetes.io/arch: amd64
```
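Alternatively, if you don't need Kepler's energy metrics at all, you can turn the feature off. This is a minimal sketch assuming your chart version exposes Kepler behind a `kepler.enabled` toggle; verify the key against your values.yaml:

```yaml
kepler:
  enabled: false  # skip deploying Kepler entirely (assumed toggle; check your chart version)
```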
ResourceExhausted error when sending traces
If you have traces enabled, you might see log entries in your Alloy instance that look like this:
```
Permanent error: rpc error: code = ResourceExhausted desc = grpc: received message after decompression larger than max (5268750 vs. 4194304)" dropped_items=11226
ts=2024-09-19T19:52:35.16668052Z level=info msg="rejoining peers" service=cluster peers_count=1 peers=6436336134343433.grafana-k8s-monitoring-alloy-cluster.default.svc.cluster.local.:12345
```
This error occurs when an exported batch of spans is larger than the maximum gRPC message size the endpoint accepts. To fix this, reduce the maximum batch size:
```yaml
receivers:
  processors:
    batch:
      maxSize: 2000
```
Start with 2000 and adjust as needed.
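The `maxSize` value caps the number of spans per exported batch, which indirectly keeps each gRPC message under the endpoint's decompressed size limit (4194304 bytes in the log above). If your chart version also exposes a target batch `size` setting, keep `maxSize` greater than or equal to it. A sketch, assuming both keys exist in your values.yaml:

```yaml
receivers:
  processors:
    batch:
      size: 2000     # target spans per batch (assumed key; verify in your values.yaml)
      maxSize: 2000  # hard cap per batch; keep this >= size when both are set
```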