---
layout: post
title: 'Step 2: The Application Engine & Auto-Scaling'
date: 2025-12-28 04:00:00 -0400
categories:
  - blog_app
highlight: true
---

[[2025-12-27-part-1]]

# 2. The Application Engine & Auto-Scaling

## 2.1 Decoupling Configuration (ConfigMaps)

In Docker, if you need to update an Nginx `default.conf`, you typically `COPY` the file into the image and rebuild it. In Kubernetes, we use a **ConfigMap** to treat configuration as a separate object. This lets us update the rules and simply restart the pods to apply changes, no Docker build required.

**`apps/severed-blog-config.yaml`**

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: severed-blog-config
  namespace: severed-apps
data:
  default.conf: |
    # 1. Define the custom log format
    log_format observability '$remote_addr - $remote_user [$time_local] "$request" '
                             '$status $body_bytes_sent "$http_referer" '
                             '"$http_user_agent" "$request_time"';

    server {
        listen 80;
        server_name localhost;

        root /usr/share/nginx/html;
        index index.html index.htm;

        # 2. Apply the format to stdout
        access_log /dev/stdout observability;
        error_log /dev/stderr;

        # gzip compression
        gzip on;
        gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
        gzip_vary on;
        gzip_min_length 1000;

        # assets (images, fonts, favicons) - cache for 1 year
        location ~* \.(jpg|jpeg|gif|png|ico|svg|woff|woff2|ttf|eot)$ {
            expires 365d;
            add_header Cache-Control "public, no-transform";
            try_files $uri =404;
        }

        # code (css, js) - cache for 1 month
        location ~* \.(css|js)$ {
            expires 30d;
            add_header Cache-Control "public, no-transform";
            try_files $uri =404;
        }

        # standard routing
        location / {
            try_files $uri $uri/ $uri.html =404;
        }

        error_page 404 /404.html;
        location = /404.html {
            internal;
        }

        # logging / lb config
        real_ip_header X-Forwarded-For;
        set_real_ip_from 10.0.0.0/8;

        # metrics endpoint for Alloy/Prometheus
        location /metrics {
            stub_status on;
            access_log off;  # Keep noise out of our main logs
            allow 127.0.0.1;
            allow 10.0.0.0/8;
            allow 172.16.0.0/12;
            deny all;
        }
    }
```

A better practice is to keep `default.conf` as a standalone file in the repo (e.g., `apps/config/default.conf`) and inject it like:

```shell
kubectl create configmap severed-blog-config \
  -n severed-apps \
  --from-file=default.conf=apps/config/default.conf \
  --dry-run=client -o yaml | kubectl apply -f -
```

## 2.2 Deploying the Workload: The Sidecar Pattern

The **Deployment** ensures the desired state is maintained. We requested `replicas: 2`, meaning K8s will keep two instances of the blog running across our worker nodes.

**The Sidecar:** We added a second container (`nginx-prometheus-exporter`) to the same Pod.

1. **Web Container:** Serves the blog content.
2. **Exporter Container:** Scrapes the Web container's local `/metrics` endpoint and translates it into Prometheus format on port `9113`.
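Because both containers share the Pod's network namespace, the exporter can scrape Nginx over `localhost` with no Service in between. Once the Deployment below is running, one way to spot-check the sidecar is a quick port-forward — a sketch, assuming `kubectl` access to the cluster and the exporter's standard `nginx_`-prefixed metric names:

```shell
# Forward the exporter's metrics port from one of the blog pods
kubectl port-forward -n severed-apps deploy/severed-blog 9113:9113 &

# The sidecar translates Nginx stub_status into Prometheus exposition format
curl -s http://localhost:9113/metrics | grep '^nginx_'

# Stop the forward when done
kill %1
```

Seeing counters like `nginx_http_requests_total` here confirms the exporter is up before we hand anything to Prometheus.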
**`apps/severed-blog.yaml`**

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: severed-blog
  namespace: severed-apps
spec:
  replicas: 2
  selector:
    matchLabels:
      app: severed-blog
  template:
    metadata:
      labels:
        app: severed-blog
    spec:
      containers:
        - name: web
          image: severed-blog:v0.3
          imagePullPolicy: Never
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: '50m'
              memory: '64Mi'
            limits:
              cpu: '200m'
              memory: '128Mi'
          volumeMounts:
            - name: nginx-config-vol
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: default.conf
        - name: exporter
          image: nginx/nginx-prometheus-exporter:latest
          args:
            - -nginx.scrape-uri=http://localhost:80/metrics
          ports:
            - containerPort: 9113
              name: metrics
          resources:
            requests:
              cpu: '10m'
              memory: '32Mi'
            limits:
              cpu: '50m'
              memory: '64Mi'
      volumes:
        - name: nginx-config-vol
          configMap:
            name: severed-blog-config
```

The `spec.volumes` block references our ConfigMap, and `volumeMounts` places that data exactly where Nginx expects its configuration.

### 2.2.1 Internal Networking (Services)

Pods are ephemeral; they die and get new IP addresses. If we pointed our Ingress directly at a Pod IP, the site would break every time a pod restarted. We use a **Service** to solve this. A Service provides a stable virtual IP (ClusterIP) and an internal DNS name (`severed-blog-service.severed-apps.svc.cluster.local`) that load-balances traffic to any Pod matching the selector `app: severed-blog`.

**`apps/severed-blog-service.yaml`**

```yaml
apiVersion: v1
kind: Service
metadata:
  name: severed-blog-service
  namespace: severed-apps
spec:
  selector:
    app: severed-blog
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: ClusterIP
```

## 2.3 Traffic Routing (Ingress)

External users cannot talk to Pods directly. Traffic flows: **Internet → Ingress → Service → Pod**.

1. **The Service:** Acts as an internal load balancer with a stable DNS name.
2. **The Ingress:** Acts as a reverse proxy (Traefik) that routes based on the URL hostname.
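Before putting Traefik in front, the Service can be sanity-checked from inside the cluster with a throwaway pod — a sketch, where `curlimages/curl` is just a convenient utility image (any image with `curl` works):

```shell
# Spin up a one-off pod, hit the Service's cluster DNS name, then clean up
kubectl run curl-test -n severed-apps --rm -it \
  --image=curlimages/curl --restart=Never -- \
  curl -s -o /dev/null -w '%{http_code}\n' \
  http://severed-blog-service.severed-apps.svc.cluster.local/
```

If the Service selector and pod labels line up, this prints `200`, which proves internal DNS and load balancing work before any Ingress is involved.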
**`apps/severed-ingress.yaml`**

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: severed-ingress
  namespace: severed-apps
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    # ONLY accept traffic for this specific hostname
    - host: blog.localhost
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: severed-blog-service
                port:
                  number: 80
```

## 2.4 Auto-Scaling (HPA)

We implemented a **Horizontal Pod Autoscaler (HPA)** that scales the blog based on three metrics:

1. **CPU:** Target 90% of _requests_ (not limits).
2. **Memory:** Target 80% of _requests_.
3. **Traffic (RPS):** Target 500 requests per second per pod.

To prevent scaling up and down too fast, we added a **stabilization window** and a strict **scale-up limit** (at most 1 pod every 60 seconds). This prevents the cluster from exploding due to 1-second spikes.

**`apps/severed-blog-hpa.yaml`**

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: severed-blog-hpa
  namespace: severed-apps
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: severed-blog
  minReplicas: 2 # Never drop below 2 for HA
  maxReplicas: 6 # Maximum number of pods to prevent cluster exhaustion
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 90 # Scale up if CPU usage exceeds 90% of requests
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80 # Scale up if RAM usage exceeds 80% of requests
    - type: Pods
      pods:
        metric:
          name: nginx_http_requests_total
        target:
          type: AverageValue
          averageValue: '500' # Scale up if requests per second > 500 per pod
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60 # Wait 60s before removing a pod
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Pods
          value: 1
          periodSeconds: 60
```

[[2025-12-27-part-3]]
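A footnote on the scaling math in section 2.4: for each metric, the HPA controller computes `desiredReplicas = ceil(currentReplicas × currentValue / targetValue)` and acts on the largest result (this formula is from the Kubernetes HPA documentation). A quick sketch of the arithmetic for the CPU metric, with an assumed observed load:

```shell
# HPA formula: desiredReplicas = ceil(currentReplicas * currentValue / targetValue)
# Suppose our 2 pods are averaging 180% CPU utilization against the 90% target:
current_replicas=2
current_cpu=180   # observed average utilization (% of requests), hypothetical
target_cpu=90     # our averageUtilization target

# Integer ceiling division: (a + b - 1) / b
desired=$(( (current_replicas * current_cpu + target_cpu - 1) / target_cpu ))
echo "desired replicas: $desired"   # desired replicas: 4
```

With `maxReplicas: 6` as the cap and the `scaleUp` policy allowing only 1 new pod per 60 seconds, reaching those 4 replicas takes at least two minutes — by design.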