---
layout: post
title: 'Step 2: The Application Engine & Auto-Scaling'
date: 2025-12-28 04:00:00 -0400
categories:
  - blog_app
highlight: true
---

[[2025-12-27-part-1]]

# 2. The Application Engine & Auto-Scaling

## 2.1 Decoupling Configuration (ConfigMaps)

In Docker, if you need to update an Nginx `default.conf`, you typically `COPY` the file into the image and rebuild it. In Kubernetes, we use a **ConfigMap** to treat configuration as a separate object. This lets us update the rules and simply restart the pods to apply changes, no Docker build required.

**`apps/severed-blog-config.yaml`**

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: severed-blog-config
  namespace: severed-apps
data:
  default.conf: |
    # 1. Define the custom log format
    log_format observability '$remote_addr - $remote_user [$time_local] "$request" '
                             '$status $body_bytes_sent "$http_referer" '
                             '"$http_user_agent" "$request_time"';

    server {
        listen 80;
        server_name localhost;

        root /usr/share/nginx/html;
        index index.html index.htm;

        # 2. Apply the format to stdout
        access_log /dev/stdout observability;
        error_log /dev/stderr;

        # gzip compression
        gzip on;
        gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
        gzip_vary on;
        gzip_min_length 1000;

        # assets (images, fonts, favicons) - cache for 1 year
        location ~* \.(jpg|jpeg|gif|png|ico|svg|woff|woff2|ttf|eot)$ {
            expires 365d;
            add_header Cache-Control "public, no-transform";
            try_files $uri =404;
        }

        # code (css, js) - cache for 1 month
        location ~* \.(css|js)$ {
            expires 30d;
            add_header Cache-Control "public, no-transform";
            try_files $uri =404;
        }

        # standard routing
        location / {
            try_files $uri $uri/ $uri.html =404;
        }

        error_page 404 /404.html;
        location = /404.html {
            internal;
        }

        # logging / lb config
        real_ip_header X-Forwarded-For;
        set_real_ip_from 10.0.0.0/8;

        # metrics endpoint for Alloy/Prometheus
        location /metrics {
            stub_status on;
            access_log off;  # Keep noise out of our main logs
            allow 127.0.0.1;
            allow 10.0.0.0/8;
            allow 172.16.0.0/12;
            deny all;
        }
    }
```

A better practice is to keep `default.conf` as a standalone file in the repo (e.g., `apps/config/default.conf`) and inject it like:

```shell
kubectl create configmap severed-blog-config \
  -n severed-apps \
  --from-file=default.conf=apps/config/default.conf \
  --dry-run=client -o yaml | kubectl apply -f -
```

## 2.2 Deploying the Workload: The Sidecar Pattern

The **Deployment** ensures the desired state is maintained. We requested `replicas: 2`, meaning K8s will keep two instances of the blog running across our worker nodes.

**The Sidecar:** We added a second container (`nginx-prometheus-exporter`) to the same Pod.

1. **Web Container:** Serves the blog content.
2. **Exporter Container:** Scrapes the Web container's local `/metrics` endpoint and translates it into Prometheus format on port `9113`.
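Because both containers share the Pod's network namespace, the exporter can scrape Nginx over `localhost` with no Service in between. Once the Deployment below is running, one way to spot-check the sidecar is a quick port-forward — a sketch, assuming `kubectl` access to the cluster and the exporter's standard `nginx_`-prefixed metric names:

```shell
# Forward the exporter's metrics port from one of the blog pods
kubectl port-forward -n severed-apps deploy/severed-blog 9113:9113 &

# The sidecar translates Nginx stub_status into Prometheus exposition format
curl -s http://localhost:9113/metrics | grep '^nginx_'

# Stop the forward when done
kill %1
```

Seeing counters like `nginx_http_requests_total` here confirms the exporter is up before we hand anything to Prometheus.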
**`apps/severed-blog.yaml`**

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: severed-blog
  namespace: severed-apps
spec:
  replicas: 2
  selector:
    matchLabels:
      app: severed-blog
  template:
    metadata:
      labels:
        app: severed-blog
    spec:
      containers:
        - name: web
          image: severed-blog:v0.3
          imagePullPolicy: Never
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: '50m'
              memory: '64Mi'
            limits:
              cpu: '200m'
              memory: '128Mi'
          volumeMounts:
            - name: nginx-config-vol
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: default.conf
        - name: exporter
          image: nginx/nginx-prometheus-exporter:latest
          args:
            - -nginx.scrape-uri=http://localhost:80/metrics
          ports:
            - containerPort: 9113
              name: metrics
          resources:
            requests:
              cpu: '10m'
              memory: '32Mi'
            limits:
              cpu: '50m'
              memory: '64Mi'
      volumes:
        - name: nginx-config-vol
          configMap:
            name: severed-blog-config
```

The `spec.volumes` block references our ConfigMap, and `volumeMounts` places that data exactly where Nginx expects its configuration.

### 2.2.1 Internal Networking (Services)

Pods are ephemeral; they die and get new IP addresses. If we pointed our Ingress directly at a Pod IP, the site would break every time a pod restarted. We use a **Service** to solve this. A Service provides a stable virtual IP (ClusterIP) and an internal DNS name (`severed-blog-service.severed-apps.svc.cluster.local`) that load-balances traffic to any Pod matching the selector `app: severed-blog`.

**`apps/severed-blog-service.yaml`**

```yaml
apiVersion: v1
kind: Service
metadata:
  name: severed-blog-service
  namespace: severed-apps
spec:
  selector:
    app: severed-blog
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: ClusterIP
```

## 2.3 Traffic Routing (Ingress)

External users cannot talk to Pods directly. Traffic flows: **Internet → Ingress → Service → Pod**.

1. **The Service:** Acts as an internal load balancer with a stable DNS name.
2. **The Ingress:** Acts as a reverse proxy (Traefik) that routes based on the URL hostname.
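Before putting Traefik in front, the Service can be sanity-checked from inside the cluster with a throwaway pod — a sketch, where `curlimages/curl` is just a convenient utility image (any image with `curl` works):

```shell
# Spin up a one-off pod, hit the Service's cluster DNS name, then clean up
kubectl run curl-test -n severed-apps --rm -it \
  --image=curlimages/curl --restart=Never -- \
  curl -s -o /dev/null -w '%{http_code}\n' \
  http://severed-blog-service.severed-apps.svc.cluster.local/
```

If the Service selector and pod labels line up, this prints `200`, which proves internal DNS and load balancing work before any Ingress is involved.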
**`apps/severed-ingress.yaml`**

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: severed-ingress
  namespace: severed-apps
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    # ONLY accept traffic for this specific hostname
    - host: blog.localhost
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: severed-blog-service
                port:
                  number: 80
```

## 2.4 Auto-Scaling (HPA)

We implemented a **Horizontal Pod Autoscaler (HPA)** that scales the blog based on three metrics:

1. **CPU:** Target 90% of _requests_ (not limits).
2. **Memory:** Target 80% of _requests_.
3. **Traffic (RPS):** Target 500 requests per second per pod.

To prevent scaling up and down too fast, we added a **stabilization window** and a strict **scale-up limit** (at most 1 pod every 60 seconds). This prevents the cluster from exploding due to 1-second spikes.

**`apps/severed-blog-hpa.yaml`**

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: severed-blog-hpa
  namespace: severed-apps
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: severed-blog
  minReplicas: 2 # Never drop below 2 for HA
  maxReplicas: 6 # Maximum number of pods to prevent cluster exhaustion
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 90 # Scale up if CPU usage exceeds 90% of requests
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80 # Scale up if RAM usage exceeds 80% of requests
    - type: Pods
      pods:
        metric:
          name: nginx_http_requests_total
        target:
          type: AverageValue
          averageValue: '500' # Scale up if requests per second > 500 per pod
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60 # Wait 60s before removing a pod
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Pods
          value: 1
          periodSeconds: 60
```

[[2025-12-27-part-3]]
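A footnote on the scaling math in section 2.4: for each metric, the HPA controller computes `desiredReplicas = ceil(currentReplicas × currentValue / targetValue)` and acts on the largest result (this formula is from the Kubernetes HPA documentation). A quick sketch of the arithmetic for the CPU metric, with an assumed observed load:

```shell
# HPA formula: desiredReplicas = ceil(currentReplicas * currentValue / targetValue)
# Suppose our 2 pods are averaging 180% CPU utilization against the 90% target:
current_replicas=2
current_cpu=180   # observed average utilization (% of requests), hypothetical
target_cpu=90     # our averageUtilization target

# Integer ceiling division: (a + b - 1) / b
desired=$(( (current_replicas * current_cpu + target_cpu - 1) / target_cpu ))
echo "desired replicas: $desired"   # desired replicas: 4
```

With `maxReplicas: 6` as the cap and the `scaleUp` policy allowing only 1 new pod per 60 seconds, reaching those 4 replicas takes at least two minutes — by design.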