---
layout: post
title: "Step 2: The Application Engine & Auto-Scaling"
date: 2025-12-28 08:00:00 -0400
categories: blog_app
highlight: true
---


## 2. The Application Engine & Auto-Scaling

### 2.1 Decoupling Configuration (ConfigMaps)

In Docker, if you need to update an Nginx `default.conf`, you typically COPY the file into the image and rebuild it. In Kubernetes, configuration lives in a separate object: a ConfigMap. To change these rules, we update the ConfigMap and restart the Pods; no Docker build is required.

Here, a ConfigMap injects the Nginx configuration.

`apps/severed-blog-config.yaml`

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: severed-blog-config
  namespace: severed-apps
data:
  default.conf: |
    # 1. Define the custom log format
    log_format observability    '$remote_addr - $remote_user [$time_local] "$request" '
                                '$status $body_bytes_sent "$http_referer" '
                                '"$http_user_agent" "$request_time"';

    server {
      listen           80;
      server_name      localhost;
      root             /usr/share/nginx/html;
      index            index.html index.htm;

      # 2. Apply the format to stdout
      access_log       /dev/stdout observability;
      error_log        /dev/stderr;

      # gzip compression
      gzip             on;
      gzip_types       text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
      gzip_vary        on;
      gzip_min_length  1000;

      # assets (images, fonts, favicons) - cache for 1 year
      location ~* \.(jpg|jpeg|gif|png|ico|svg|woff|woff2|ttf|eot)$ {
        expires    365d;
        add_header Cache-Control "public, no-transform";
        try_files  $uri =404;
      }

      # code (css, js) - cache for 1 month
      location ~* \.(css|js)$ {
        expires    30d;
        add_header Cache-Control "public, no-transform";
        try_files  $uri =404;
      }

      # standard routing
      location / {
        try_files $uri $uri/ $uri.html =404;
      }

      error_page       404 /404.html;
      location = /404.html {
        internal;
      }

      # logging / lb config
      real_ip_header   X-Forwarded-For;
      set_real_ip_from 10.0.0.0/8;

      # metrics endpoint for Alloy/Prometheus
      location /metrics {
        stub_status on;
        access_log off; # Keep noise out of our main logs
        allow 127.0.0.1;
        allow 10.0.0.0/8;
        allow 172.16.0.0/12;
        deny all;
      }
    }
```

A better practice is to keep `default.conf` as a standalone file in the repo (e.g., `apps/config/default.conf`) and generate the ConfigMap from it:

```bash
kubectl create configmap severed-blog-config \
  -n severed-apps \
  --from-file=default.conf=apps/config/default.conf \
  --dry-run=client -o yaml | kubectl apply -f -
```
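One caveat: files mounted via `subPath` (as this setup does in the Deployment below) are not refreshed inside running Pods when the ConfigMap changes, so a config update is followed by a rollout. A minimal sketch:

```shell
# After re-applying the ConfigMap, recreate the Pods to pick it up
kubectl rollout restart deployment/severed-blog -n severed-apps
# Block until the new ReplicaSet is fully rolled out
kubectl rollout status deployment/severed-blog -n severed-apps
```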

### 2.2 Deploying the Workload: The Sidecar Pattern

The Deployment ensures the desired state is maintained. We requested `replicas: 2`, so Kubernetes keeps two instances of the blog running across our worker nodes.

The Sidecar: We added a second container (`nginx-prometheus-exporter`) to the same Pod.

  1. Web Container: Serves the blog content.
  2. Exporter Container: Scrapes the Web container's `stub_status` page at `/metrics` over localhost and re-exposes the data in Prometheus format on port 9113.

`apps/severed-blog.yaml`

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: severed-blog
  namespace: severed-apps
spec:
  replicas: 2
  selector:
    matchLabels:
      app: severed-blog
  template:
    metadata:
      labels:
        app: severed-blog
    spec:
      containers:
        - name: web
          image: severed-blog:v0.3
          imagePullPolicy: Never
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: '50m'
              memory: '64Mi'
            limits:
              cpu: '200m'
              memory: '128Mi'
          volumeMounts:
            - name: nginx-config-vol
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: default.conf

        - name: exporter
          image: nginx/nginx-prometheus-exporter:latest
          args:
            - -nginx.scrape-uri=http://localhost:80/metrics
          ports:
            - containerPort: 9113
              name: metrics
          resources:
            requests:
              cpu: '10m'
              memory: '32Mi'
            limits:
              cpu: '50m'
              memory: '64Mi'

      volumes:
        - name: nginx-config-vol
          configMap:
            name: severed-blog-config
```

The spec.volumes block references our ConfigMap, and volumeMounts places that data exactly where Nginx expects its configuration.
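Two quick sanity checks that the mount and the sidecar are wired correctly (a sketch; `web` is the container name from the Deployment above):

```shell
# 1. The ConfigMap landed where Nginx expects it, and it parses
kubectl exec -n severed-apps deploy/severed-blog -c web -- nginx -t

# 2. The exporter is translating stub_status into Prometheus metrics
kubectl port-forward -n severed-apps deploy/severed-blog 9113:9113 &
curl -s http://localhost:9113/metrics | grep '^nginx_'
```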

#### 2.2.1 Internal Networking (Services)

Pods are ephemeral; they die and get new IP addresses. If we pointed our Ingress directly at a Pod IP, the site would break every time a pod restarted.

We use a Service to solve this. A Service provides a stable virtual IP (ClusterIP) and an internal DNS name (`severed-blog-service.severed-apps.svc.cluster.local`), and load-balances traffic across every Pod matching the selector `app: severed-blog`.

`apps/severed-blog-service.yaml`

```yaml
apiVersion: v1
kind: Service
metadata:
  name: severed-blog-service
  namespace: severed-apps
spec:
  selector:
    app: severed-blog
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: ClusterIP
```
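The Service can be exercised from a throwaway Pod inside the cluster. A sketch (`curlimages/curl` is just a convenient test image, not part of the stack):

```shell
# Hit the stable DNS name from inside the cluster; expect a 200
kubectl run curl-test -n severed-apps --rm -it --restart=Never \
  --image=curlimages/curl --command -- \
  curl -s -o /dev/null -w '%{http_code}\n' \
  http://severed-blog-service.severed-apps.svc.cluster.local/
```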

### 2.3 Traffic Routing (Ingress)

External users cannot talk to Pods directly. Traffic flows: Internet → Ingress → Service → Pod.

  1. The Service: Acts as an internal load balancer with a stable DNS name.
  2. The Ingress: A set of routing rules for the reverse proxy (Traefik), which matches requests by hostname.

`apps/severed-ingress.yaml`

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: severed-ingress
  namespace: severed-apps
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    # ONLY accept traffic for this specific hostname
    - host: blog.localhost
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: severed-blog-service
                port:
                  number: 80
```
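With the rule applied, host-based routing can be verified from the machine running the cluster. A sketch, assuming Traefik's `web` entrypoint is reachable on local port 80 (e.g., via a k3d port mapping):

```shell
# Matching Host header: routed to severed-blog-service (expect 200)
curl -s -o /dev/null -w '%{http_code}\n' -H 'Host: blog.localhost' http://127.0.0.1/
# Any other hostname: Traefik has no route for it (expect 404)
curl -s -o /dev/null -w '%{http_code}\n' -H 'Host: other.localhost' http://127.0.0.1/
```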

### 2.4 Auto-Scaling (HPA)

We implemented a Horizontal Pod Autoscaler (HPA) that scales the blog based on three metrics:

  1. CPU: Target 90% of Requests (not Limits).
  2. Memory: Target 80% of Requests.
  3. Traffic (RPS): Target 500 requests per second per pod.

To prevent rapid flapping, we added 60-second Stabilization Windows in both directions and a strict scale-up limit (at most 1 new pod every 60s). This stops a one-second spike from exploding the cluster.
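For intuition: for each metric, the HPA computes desiredReplicas = ceil(currentReplicas × currentValue / targetValue) and acts on the largest result. A quick sketch of that arithmetic (the metric values here are hypothetical):

```shell
# desiredReplicas = ceil(currentReplicas * currentValue / targetValue),
# using the integer identity ceil(a/b) == (a + b - 1) / b
desired() { echo $(( ($1 * $2 + $3 - 1) / $3 )); }

desired 2 135 90   # CPU: 2 pods averaging 135% vs a 90% target -> 3
desired 2 900 500  # RPS: 2 pods averaging 900 rps vs a 500 target -> 4
```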

`apps/severed-blog-hpa.yaml`

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: severed-blog-hpa
  namespace: severed-apps
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: severed-blog
  minReplicas: 2 # Never drop below 2 for HA
  maxReplicas: 6 # Maximum number of pods to prevent cluster exhaustion
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 90 # Scale up if CPU usage exceeds 90%
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80 # Scale up if RAM usage exceeds 80%
    - type: Pods
      pods:
        metric:
          name: nginx_http_requests_total
        target:
          type: AverageValue
          averageValue: '500' # Scale up if requests per second > 500 per pod
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60 # Wait 60s before removing a pod
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Pods
          value: 1
          periodSeconds: 60
```
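Once applied, the HPA's view of all three metrics and its scaling decisions can be watched live. A sketch:

```shell
# TARGETS column shows current/target for each metric
kubectl get hpa severed-blog-hpa -n severed-apps --watch
# Events explain each scale-up/scale-down decision
kubectl describe hpa severed-blog-hpa -n severed-apps
```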
