
Monitoring Stack — Overview

Manifests: k8s/clusters/main/infrastructure/monitoring/

Data Flow

Cluster pods
  ├── logs ─────► alloy-logs ─────► loki-gateway (loki ns) ────┐
  │                                                            │
  └── metrics ──► alloy-metrics ──► mimir-gateway (mimir ns) ──┤
                  (PodMonitor/ServiceMonitor                   │
                   CRDs → prometheus-crds)                     │
                                                     ┌─────────┴────────┐
                                                     │     Grafana      │
                                                     │   (grafana ns)   │
                                                     └─────────┬────────┘
                                                               │
                                                               ▼
                                                       Slack #it-alerts

Namespaces

The monitoring namespace holds only Flux control-plane objects; no workloads run there. Each component is deployed into its own namespace through targetNamespace in its Flux Kustomization, so manifests that declare namespace: monitoring are still applied to the component namespace at deploy time.
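As an illustration of the namespace override, a Flux Kustomization for one component might look roughly like the sketch below. The resource name, the loki subdirectory, the interval, and the GitRepository source name are assumptions made for the example, not values taken from the actual manifests.

```yaml
# Sketch only: name, path suffix, interval and sourceRef are hypothetical.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: loki                     # hypothetical name
  namespace: monitoring          # the Kustomization CR itself lives in the control-plane namespace
spec:
  interval: 10m                  # assumed reconcile interval
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system            # assumed source name
    namespace: flux-system
  # Manifests directory from this page plus a hypothetical per-component subdirectory
  path: ./k8s/clusters/main/infrastructure/monitoring/loki
  # Every namespaced object rendered from `path` is applied to this namespace,
  # regardless of the namespace written in the YAML files
  targetNamespace: loki
```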

Namespace            What runs there
monitoring           Flux Kustomization CRs, HelmRepository sources, prometheus-crds HelmRelease
grafana              Grafana pod, datasources, alert rules, contact points
grafana-operator     Grafana Operator controller
loki                 Loki distributed cluster, COSI bucket objects, S3 setup job
mimir                Mimir distributed cluster, COSI bucket objects, S3 setup job
alloy                Alloy log & metrics collectors
alloy-operator       Alloy Operator controller
kube-state-metrics   kube-state-metrics exporter
node-exporter        node-exporter DaemonSet

COSI Credentials Pattern

Both Loki and Mimir use Garage (in-cluster S3) via the Container Object Storage Interface (COSI). The same bootstrapping pattern is used for both:

  1. BucketClaim → COSI creates the bucket in Garage
  2. BucketAccess → COSI writes S3 credentials into a *-cosi-credentials secret
  3. A one-shot setup Job reads the COSI secret, reshapes the credentials into component-specific env vars, and writes them into a *-s3-credentials secret
  4. The HelmRelease injects that secret as environment variables via extraEnvFrom; ${VAR} references in the component config (structuredConfig) are expanded at startup because config.expand-env=true is set
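Steps 1, 2 and 4 could be sketched as follows. This is a minimal example under stated assumptions: the resource names, the BucketAccessClass name, the S3_* variable names, and the exact values paths are hypothetical, and the values layout depends on the chart and chart version in use. Only the garage-ssd BucketClass and the *-cosi-credentials / *-s3-credentials naming pattern come from this page.

```yaml
# Step 1: claim a bucket from the Garage-backed BucketClass
apiVersion: objectstorage.k8s.io/v1alpha1
kind: BucketClaim
metadata:
  name: mimir-bucket                      # hypothetical name
spec:
  bucketClassName: garage-ssd
  protocols:
    - S3
---
# Step 2: request credentials; COSI writes them into the named secret
apiVersion: objectstorage.k8s.io/v1alpha1
kind: BucketAccess
metadata:
  name: mimir-bucket-access               # hypothetical name
spec:
  bucketClaimName: mimir-bucket
  bucketAccessClassName: garage-ssd       # assumed BucketAccessClass name
  protocol: S3
  credentialsSecretName: mimir-cosi-credentials
---
# Step 4: excerpt of HelmRelease spec.values (not a standalone resource).
# Assumes a chart that exposes global.extraEnvFrom and mimir.structuredConfig.
global:
  extraEnvFrom:
    - secretRef:
        name: mimir-s3-credentials        # written by the setup Job in step 3
mimir:
  structuredConfig:
    blocks_storage:
      backend: s3
      s3:
        # Hypothetical variable names; expanded at startup by config.expand-env
        endpoint: ${S3_ENDPOINT}
        bucket_name: ${S3_BUCKET}
        access_key_id: ${S3_ACCESS_KEY_ID}
        secret_access_key: ${S3_SECRET_ACCESS_KEY}
```

One thing to watch if the Flux Kustomization ever enables postBuild variable substitution: Flux would then try to replace the same ${VAR} syntax at apply time, before the component sees it.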

The setup Jobs are idempotent: they render the secret client-side with --dry-run=client and pipe the result into kubectl apply, so re-running them is safe.
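A setup Job following this pattern might look like the sketch below. The Job name, ServiceAccount, container image, and S3_* key names are placeholders, and the jq paths assume the COSI v1alpha1 BucketInfo JSON layout (spec.bucketName, spec.secretS3.*), which should be checked against the secret the driver actually writes.

```yaml
# Sketch only: names, image and key names are hypothetical.
apiVersion: batch/v1
kind: Job
metadata:
  name: mimir-s3-setup                    # hypothetical name
spec:
  backoffLimit: 3
  template:
    spec:
      restartPolicy: OnFailure
      # Hypothetical ServiceAccount; needs RBAC to get/create/patch secrets
      serviceAccountName: mimir-s3-setup
      containers:
        - name: setup
          image: example.invalid/kubectl-jq:latest   # placeholder image providing kubectl and jq
          command: ["/bin/sh", "-ec"]
          args:
            - |
              # The COSI secret is mounted as files; BucketInfo holds a JSON document
              info=/cosi/BucketInfo
              # Render the target secret client-side, then apply it.
              # `apply` creates or updates, so a re-run converges to the same state.
              kubectl create secret generic mimir-s3-credentials \
                --from-literal=S3_ENDPOINT="$(jq -r '.spec.secretS3.endpoint' "$info")" \
                --from-literal=S3_BUCKET="$(jq -r '.spec.bucketName' "$info")" \
                --from-literal=S3_ACCESS_KEY_ID="$(jq -r '.spec.secretS3.accessKeyID' "$info")" \
                --from-literal=S3_SECRET_ACCESS_KEY="$(jq -r '.spec.secretS3.accessSecretKey' "$info")" \
                --dry-run=client -o yaml | kubectl apply -f -
          volumeMounts:
            - name: cosi
              mountPath: /cosi
              readOnly: true
      volumes:
        - name: cosi
          secret:
            secretName: mimir-cosi-credentials
```

Using kubectl apply rather than a bare kubectl create is what makes the re-run safe: create alone would fail once the secret already exists.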

Prerequisites

Flux CD, Grafana Operator, Alloy Operator, Istio ambient mesh, cert-manager (letsencrypt ClusterIssuer), mikrolb, COSI controller + Garage driver, StorageClass ssd-replicated-retain, BucketClass garage-ssd, Keycloak (idp.astaup.de).