---
layout: post
title: 'Step 3: Observability (LGTM, KSM)'
date: 2025-12-28 07:00:00 -0400
categories:
  - blog_app
highlight: true
---

[[2025-12-27-part-2]]

# 3. Observability: The LGTM Stack

In a distributed cluster, logs and metrics are scattered across different pods and nodes. We centralized monitoring with the LGTM stack (in our case Loki, Grafana, and Prometheus), plus **Kube State Metrics** and the **Prometheus Adapter**.

## 3.1 The Databases (StatefulSets)

- **Prometheus:** Scrapes metrics. We updated the config to scrape **Kube State Metrics** via its internal DNS Service.
- **Loki:** Aggregates logs. Configured with a 168h (7-day) retention period.

**`infra/observer/prometheus.yaml`**

```yaml
# Configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
    storage:
      tsdb:
        out_of_order_time_window: 1m
    scrape_configs:
      # 1. Scrape Prometheus itself (Health Check)
      - job_name: 'prometheus'
        static_configs:
          - targets: ['localhost:9090']
      # 2. Scrape Kube State Metrics (KSM)
      # We use the internal DNS: service-name.namespace.svc.cluster.local:port
      - job_name: 'kube-state-metrics'
        static_configs:
          - targets: ['kube-state-metrics.monitoring.svc.cluster.local:8080']
---
# Service
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitoring
spec:
  type: ClusterIP
  selector:
    app: prometheus
  ports:
    - port: 9090
      targetPort: 9090
---
# The Database (StatefulSet)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
  namespace: monitoring
spec:
  serviceName: prometheus
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus:latest
          args:
            - '--config.file=/etc/prometheus/prometheus.yml'
            - '--web.enable-remote-write-receiver'
            - '--storage.tsdb.path=/prometheus'
            - '--web.console.libraries=/usr/share/prometheus/console_libraries'
            - '--web.console.templates=/usr/share/prometheus/consoles'
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: config
              mountPath: /etc/prometheus
            - name: data
              mountPath: /prometheus
      volumes:
        - name: config
          configMap:
            name: prometheus-config
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ['ReadWriteOnce']
        storageClassName: 'openebs-hostpath'
        resources:
          requests:
            storage: 5Gi
```

**`infra/observer/loki.yaml`**

```yaml
# --- Configuration ---
apiVersion: v1
kind: ConfigMap
metadata:
  name: loki-config
  namespace: monitoring
data:
  local-config.yaml: |
    auth_enabled: false
    server:
      http_listen_port: 3100
    common:
      path_prefix: /loki
      storage:
        filesystem:
          chunks_directory: /loki/chunks
          rules_directory: /loki/rules
      replication_factor: 1
      ring:
        instance_addr: 127.0.0.1
        kvstore:
          store: inmemory
    schema_config:
      configs:
        - from: 2020-10-24
          store: tsdb
          object_store: filesystem
          schema: v13
          index:
            prefix: index_
            period: 24h
---
# --- Service ---
# Gives the StatefulSet a stable DNS entry inside the cluster.
apiVersion: v1
kind: Service
metadata:
  name: loki
  namespace: monitoring
spec:
  type: ClusterIP
  selector:
    app: loki
  ports:
    - port: 3100
      targetPort: 3100
      name: http-metrics
---
# --- The Database (StatefulSet) ---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: loki
  namespace: monitoring
spec:
  serviceName: loki
  replicas: 1
  selector:
    matchLabels:
      app: loki
  template:
    metadata:
      labels:
        app: loki
    spec:
      containers:
        - name: loki
          image: grafana/loki:latest
          args:
            - -config.file=/etc/loki/local-config.yaml
          ports:
            - containerPort: 3100
              name: http-metrics
          volumeMounts:
            - name: config
              mountPath: /etc/loki
            - name: data
              mountPath: /loki
      volumes:
        - name: config
          configMap:
            name: loki-config
  # Persistent Storage
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ['ReadWriteOnce']
        storageClassName: 'openebs-hostpath'
        resources:
          requests:
            storage: 5Gi
```

## 3.2 The Bridge: Prometheus Adapter & KSM

The standard HPA only understands CPU and memory. To scale on **Requests Per Second**, we needed two extra components.

**Helm (Package Manager)**

You will notice `kube-state-metrics` and `prometheus-adapter` are missing from our file tree. That is because we install them with **Helm**. Helm lets us install complex, pre-packaged applications ("Charts") without writing thousands of lines of YAML; we only provide a `values.yaml` file to override specific settings. The install commands are sketched right after this list.

1. **Kube State Metrics (KSM):** A service that listens to the Kubernetes API and generates metrics about the state of objects (e.g., `kube_pod_created`).
2. **Prometheus Adapter:** Installed via Helm. We use `infra/observer/adapter-values.yaml` to configure how it translates Prometheus queries into Kubernetes metrics.
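Concretely, the installs look roughly like the sketch below. This is not the exact content of our deploy script: it assumes the charts come from the `prometheus-community` repository, and the release names are chosen so that the resulting KSM Service matches the DNS name Prometheus scrapes above.

```bash
# Assumed chart source: the prometheus-community Helm repository.
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Kube State Metrics: defaults are fine; this release name keeps the Service
# at kube-state-metrics.monitoring.svc.cluster.local:8080 (our scrape target).
helm install kube-state-metrics prometheus-community/kube-state-metrics \
  --namespace monitoring

# Prometheus Adapter: our values file tells it where Prometheus lives
# and which custom rules to serve.
helm install prometheus-adapter prometheus-community/prometheus-adapter \
  --namespace monitoring \
  --values infra/observer/adapter-values.yaml
```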
**`infra/observer/adapter-values.yaml`**

```yaml
prometheus:
  url: http://prometheus.monitoring.svc.cluster.local
  port: 9090
rules:
  custom:
    - seriesQuery: 'nginx_http_requests_total{pod!="",namespace!=""}'
      resources:
        overrides:
          namespace: { resource: 'namespace' }
          pod: { resource: 'pod' }
      name:
        matches: '^(.*)_total'
        as: 'nginx_http_requests_total'
      metricsQuery: 'rate(<<.Series>>{<<.LabelMatchers>>}[1m])'
```
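Once the adapter is running, the rule above should surface `nginx_http_requests_total` through the custom metrics API, which is what the HPA will consume later. One quick way to check is sketched below; it assumes `jq` is installed and that the blog pods live in the `default` namespace.

```bash
# The adapter registers itself as the custom.metrics.k8s.io API service
kubectl get apiservice v1beta1.custom.metrics.k8s.io

# List every metric the adapter currently exposes
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq '.resources[].name'

# Fetch the per-pod value the HPA will read
# (assumes the blog pods run in the "default" namespace)
kubectl get --raw \
  "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/nginx_http_requests_total" | jq .
```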
## 3.3 The Agent: Grafana Alloy (DaemonSets)

We need to collect logs from every node in the cluster.

- **DaemonSet vs. Deployment:** A Deployment ensures _n_ replicas exist somewhere. A **DaemonSet** ensures exactly **one** Pod runs on **every** Node. This is perfect for infrastructure agents (logging, networking, monitoring).
- **Downward API:** We need to inject the Pod's own name and namespace into its environment variables so it knows "who it is."

**`infra/alloy-env.yaml`**

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: monitoring-env
  namespace: monitoring
data:
  LOKI_URL: 'http://loki.monitoring.svc:3100/loki/api/v1/push'
  PROM_URL: 'http://prometheus.monitoring.svc:9090/api/v1/write'
```

**`infra/alloy-setup.yaml`**

```yaml
# --- RBAC configuration ---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: alloy-sa
  namespace: monitoring
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: alloy-cluster-role
rules:
  # 1. Standard API Access
  - apiGroups: ['']
    resources: ['nodes', 'nodes/proxy', 'services', 'endpoints', 'pods']
    verbs: ['get', 'list', 'watch']
  # 2. ALLOW METRICS ACCESS (Crucial for cAdvisor/Kubelet)
  - apiGroups: ['']
    resources: ['nodes/stats', 'nodes/metrics']
    verbs: ['get']
  # 3. Log Access
  - apiGroups: ['']
    resources: ['pods/log']
    verbs: ['get', 'list', 'watch']
  # 4. Non-Resource URLs (sometimes needed for /metrics endpoints)
  - nonResourceURLs: ['/metrics']
    verbs: ['get']
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: alloy-cluster-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: alloy-cluster-role
subjects:
  - kind: ServiceAccount
    name: alloy-sa
    namespace: monitoring
---
# --- Alloy pipeline configuration ---
apiVersion: v1
kind: ConfigMap
metadata:
  name: alloy-config
  namespace: monitoring
data:
  config.alloy: |
    // 1. Discovery: Find all pods
    discovery.kubernetes "k8s_pods" {
      role = "pod"
    }

    // 2. Relabeling: Filter and label "severed-blog" pods
    discovery.relabel "blog_pods" {
      targets = discovery.kubernetes.k8s_pods.targets

      rule {
        action        = "keep"
        source_labels = ["__meta_kubernetes_pod_label_app"]
        regex         = "severed-blog"
      }

      // Explicitly set 'pod' and 'namespace' labels for the Adapter
      rule {
        action        = "replace"
        source_labels = ["__meta_kubernetes_pod_name"]
        target_label  = "pod"
      }
      rule {
        action        = "replace"
        source_labels = ["__meta_kubernetes_namespace"]
        target_label  = "namespace"
      }

      // Route to the sidecar exporter port
      rule {
        action        = "replace"
        source_labels = ["__address__"]
        target_label  = "__address__"
        regex         = "([^:]+)(?::\\d+)?"
        replacement   = "$1:9113"
      }
    }

    // 3. Direct Nginx Scraper
    prometheus.scrape "nginx_scraper" {
      targets    = discovery.relabel.blog_pods.output
      forward_to = [prometheus.remote_write.metrics_service.receiver]
      job_name   = "integrations/nginx"
    }

    // 4. Host Metrics (Unix Exporter)
    prometheus.exporter.unix "host" {
      rootfs_path = "/host/root"
      sysfs_path  = "/host/sys"
      procfs_path = "/host/proc"
    }
    prometheus.scrape "host_scraper" {
      targets    = prometheus.exporter.unix.host.targets
      forward_to = [prometheus.remote_write.metrics_service.receiver]
    }

    // 5. Remote Write: Send to Prometheus
    prometheus.remote_write "metrics_service" {
      endpoint {
        url = sys.env("PROM_URL")
      }
    }

    // 6. Logs Pipeline: Send to Loki
    loki.source.kubernetes "pod_logs" {
      targets    = discovery.relabel.blog_pods.output
      forward_to = [loki.write.default.receiver]
    }
    loki.write "default" {
      endpoint {
        url = sys.env("LOKI_URL")
      }
    }

    // 7. Kubelet Scraper (cAdvisor for Container Metrics)
    discovery.kubernetes "k8s_nodes" {
      role = "node"
    }
    prometheus.scrape "kubelet_cadvisor" {
      targets      = discovery.kubernetes.k8s_nodes.targets
      scheme       = "https"
      metrics_path = "/metrics/cadvisor"
      job_name     = "integrations/kubernetes/cadvisor"

      tls_config {
        insecure_skip_verify = true
      }
      bearer_token_file = "/var/run/secrets/kubernetes.io/serviceaccount/token"

      forward_to = [prometheus.remote_write.metrics_service.receiver]
    }
---
# --- Agent Deployment (DaemonSet) ---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: alloy
  namespace: monitoring
spec:
  selector:
    matchLabels:
      name: alloy
  template:
    metadata:
      labels:
        name: alloy
    spec:
      serviceAccountName: alloy-sa
      hostNetwork: true
      hostPID: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
        - name: alloy
          image: grafana/alloy:latest
          args:
            - run
            - --server.http.listen-addr=0.0.0.0:12345
            - --storage.path=/var/lib/alloy/data
            - /etc/alloy/config.alloy
          envFrom:
            - configMapRef:
                name: monitoring-env
                optional: false
          volumeMounts:
            - name: config
              mountPath: /etc/alloy
            - name: logs
              mountPath: /var/log
            - name: proc
              mountPath: /host/proc
              readOnly: true
            - name: sys
              mountPath: /host/sys
              readOnly: true
            - name: root
              mountPath: /host/root
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: alloy-config
        - name: logs
          hostPath:
            path: /var/log
        - name: proc
          hostPath:
            path: /proc
        - name: sys
          hostPath:
            path: /sys
        - name: root
          hostPath:
            path: /
```
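With everything applied, one quick sanity check is that the scheduler placed exactly one Alloy pod on every node and that those agents are healthy. A minimal sketch, using the `name: alloy` label from the manifest above:

```bash
# One Alloy pod per node: DESIRED/READY should match the node count
kubectl get nodes
kubectl get daemonset alloy -n monitoring
kubectl get pods -n monitoring -l name=alloy -o wide

# If logs or metrics are missing for a node, read that agent's own logs
kubectl logs -n monitoring daemonset/alloy --tail=50
```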
## 3.4 Visualization: Grafana

We deployed Grafana with pre-loaded dashboards via ConfigMaps.

**Key Dashboards Created:**

1. **Cluster Health:** CPU/Memory saturation.
2. **HPA Live Status:** A custom table showing the _real_ scaling drivers (RPS, CPU Request %) vs the HPA's reaction.

**`infra/observer/grafana.yaml`**

```yaml
# 1. Datasources (Connection to Loki/Prom)
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasources
  namespace: monitoring
data:
  datasources.yaml: |
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        access: proxy
        url: http://prometheus.monitoring.svc:9090
        isDefault: false
      - name: Loki
        type: loki
        access: proxy
        url: http://loki.monitoring.svc:3100
        isDefault: true
---
# 2. Dashboard Provider (Tells Grafana to load from /var/lib/grafana/dashboards)
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboard-provider
  namespace: monitoring
data:
  dashboard-provider.yaml: |
    apiVersion: 1
    providers:
      - name: 'Severed Dashboards'
        orgId: 1
        folder: ''
        type: file
        disableDeletion: false
        updateIntervalSeconds: 10
        # Allow editing in UI, but it resets on restart
        options:
          path: /var/lib/grafana/dashboards
---
# 3. Service
apiVersion: v1
kind: Service
metadata:
  name: grafana-service
  namespace: monitoring
spec:
  type: LoadBalancer
  selector:
    app: grafana
  ports:
    - protocol: TCP
      port: 3000
      targetPort: 3000
---
# 4. Deployment (The App)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
        - name: grafana
          image: grafana/grafana:latest
          ports:
            - containerPort: 3000
          env:
            - name: GF_SECURITY_ADMIN_USER
              valueFrom:
                secretKeyRef:
                  name: grafana-secrets
                  key: admin-user
            - name: GF_SECURITY_ADMIN_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: grafana-secrets
                  key: admin-password
            - name: GF_AUTH_ANONYMOUS_ENABLED
              value: 'true'
            - name: GF_AUTH_ANONYMOUS_ORG_ROLE
              value: 'Viewer'
            - name: GF_AUTH_ANONYMOUS_ORG_NAME
              value: 'Main Org.'
          volumeMounts:
            - name: grafana-datasources
              mountPath: /etc/grafana/provisioning/datasources
            - name: grafana-dashboard-provider
              mountPath: /etc/grafana/provisioning/dashboards
            - name: grafana-dashboards-json
              mountPath: /var/lib/grafana/dashboards
            - name: grafana-storage
              mountPath: /var/lib/grafana
      volumes:
        - name: grafana-datasources
          configMap:
            name: grafana-datasources
        - name: grafana-dashboard-provider
          configMap:
            name: grafana-dashboard-provider
        - name: grafana-dashboards-json
          configMap:
            name: grafana-dashboards-json
        - name: grafana-storage
          emptyDir: {}
```

In the Deployment above you can see references to `grafana-secrets`. That Secret, however, is **not** defined anywhere in our git repository.

```yaml
- name: GF_SECURITY_ADMIN_PASSWORD
  valueFrom:
    secretKeyRef:
      name: grafana-secrets # <--- where is this?
      key: admin-password
```

We don't commit it to version control. In our `deploy-all.sh` script, we generate this Secret imperatively using `kubectl create secret generic`. In a real production environment, we would use tools like **ExternalSecrets** or **SealedSecrets** to inject these safely.
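The imperative step looks roughly like the snippet below. The key names (`admin-user`, `admin-password`) must match the `secretKeyRef` entries in the Deployment; the actual values, and how `deploy-all.sh` generates them, are up to you (the password generation here is only illustrative).

```bash
# Rough equivalent of what deploy-all.sh does (values are illustrative);
# the key names must match the secretKeyRef entries in the Grafana Deployment.
kubectl create secret generic grafana-secrets \
  --namespace monitoring \
  --from-literal=admin-user=admin \
  --from-literal=admin-password="$(openssl rand -base64 24)"
```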
**`dashboard-json.yaml`**

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboards-json
  namespace: monitoring
data:
  severed-health.json: |
    ...
```

Just like the blog itself, Grafana needs an Ingress so we can reach it from outside the cluster. Notice that we map a different hostname (`grafana.localhost`) to the Grafana service port (`3000`).

**`infra/observer/grafana-ingress.yaml`**

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana-ingress
  namespace: monitoring
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: grafana.localhost
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grafana-service # ...send them to Grafana
                port:
                  number: 3000
```

This is what the Grafana UI should look like. Notice that we are not signed in.

![part1.png](assets/part1.png)
![part2.png](assets/part2.png)

[[2025-12-27-part-4]]