---
layout: post
title: "Step 2: The Application Engine & Auto-Scaling"
date: 2025-12-28 08:00:00 -0400
categories:
highlight: true
---
## 2. The Application Engine & Auto-Scaling

### 2.1 Decoupling Configuration (ConfigMaps)
In Docker, if you need to update an Nginx `default.conf`, you typically `COPY` the file into the image and rebuild it. In Kubernetes, we instead treat configuration as a separate object: a ConfigMap. We can update these rules and simply restart the pods to apply the change, with no Docker build required.

We use a ConfigMap to inject the Nginx configuration:
`apps/severed-blog-config.yaml`:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: severed-blog-config
  namespace: severed-apps
data:
  default.conf: |
    # 1. Define the custom log format
    log_format observability '$remote_addr - $remote_user [$time_local] "$request" '
                             '$status $body_bytes_sent "$http_referer" '
                             '"$http_user_agent" "$request_time"';

    server {
        listen 80;
        server_name localhost;

        root /usr/share/nginx/html;
        index index.html index.htm;

        # 2. Apply the format to stdout
        access_log /dev/stdout observability;
        error_log /dev/stderr;

        # Gzip compression
        gzip on;
        gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
        gzip_vary on;
        gzip_min_length 1000;

        # Assets (images, fonts, favicons) - cache for 1 year
        location ~* \.(jpg|jpeg|gif|png|ico|svg|woff|woff2|ttf|eot)$ {
            expires 365d;
            add_header Cache-Control "public, no-transform";
            try_files $uri =404;
        }

        # Code (CSS, JS) - cache for 1 month
        location ~* \.(css|js)$ {
            expires 30d;
            add_header Cache-Control "public, no-transform";
            try_files $uri =404;
        }

        # Standard routing
        location / {
            try_files $uri $uri/ $uri.html =404;
        }

        error_page 404 /404.html;
        location = /404.html {
            internal;
        }

        # Logging / LB config
        real_ip_header X-Forwarded-For;
        set_real_ip_from 10.0.0.0/8;

        # Metrics endpoint for Alloy/Prometheus
        location /metrics {
            stub_status on;
            access_log off;  # Keep noise out of our main logs
            allow 127.0.0.1;
            allow 10.0.0.0/8;
            allow 172.16.0.0/12;
            deny all;
        }
    }
```
It is better practice to keep `default.conf` as a standalone file in the repo (e.g., `apps/config/default.conf`) and inject it like so:

```shell
kubectl create configmap severed-blog-config \
  -n severed-apps \
  --from-file=default.conf=apps/config/default.conf \
  --dry-run=client -o yaml | kubectl apply -f -
```
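One caveat: pods that are already running keep the old mounted copy until they restart. A sketch of rolling the change out, assuming the `severed-blog` Deployment from the next section:

```shell
# Recreate pods so they remount the updated ConfigMap data.
kubectl rollout restart deployment/severed-blog -n severed-apps

# Block until the new pods are up and ready.
kubectl rollout status deployment/severed-blog -n severed-apps
```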
### 2.2 Deploying the Workload: The Sidecar Pattern
The Deployment ensures the desired state is maintained. We requested `replicas: 2`, meaning Kubernetes will keep two instances of the blog running across our worker nodes.

**The Sidecar:** We added a second container (`nginx-prometheus-exporter`) to the same Pod.

- **Web container:** Serves the blog content.
- **Exporter container:** Scrapes the web container's local `/metrics` endpoint and translates it into Prometheus format on port `9113`.
`apps/severed-blog.yaml`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: severed-blog
  namespace: severed-apps
spec:
  replicas: 2
  selector:
    matchLabels:
      app: severed-blog
  template:
    metadata:
      labels:
        app: severed-blog
    spec:
      containers:
        - name: web
          image: severed-blog:v0.3
          imagePullPolicy: Never
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: '50m'
              memory: '64Mi'
            limits:
              cpu: '200m'
              memory: '128Mi'
          volumeMounts:
            - name: nginx-config-vol
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: default.conf
        - name: exporter
          image: nginx/nginx-prometheus-exporter:latest
          args:
            - -nginx.scrape-uri=http://localhost:80/metrics
          ports:
            - containerPort: 9113
              name: metrics
          resources:
            requests:
              cpu: '10m'
              memory: '32Mi'
            limits:
              cpu: '50m'
              memory: '64Mi'
      volumes:
        - name: nginx-config-vol
          configMap:
            name: severed-blog-config
```
The `spec.volumes` block references our ConfigMap, and `volumeMounts` places that data exactly where Nginx expects its configuration.
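Once the Deployment is up, you can sanity-check the sidecar without any Prometheus machinery; a sketch using a port-forward (the exact metric names beyond `nginx_http_requests_total` depend on the exporter version):

```shell
# Forward the exporter's metrics port from one of the pods to localhost.
kubectl -n severed-apps port-forward deployment/severed-blog 9113:9113 &
sleep 2  # give the tunnel a moment to establish

# The exporter translates stub_status into Prometheus text format.
curl -s http://localhost:9113/metrics | grep '^nginx_'
```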
#### 2.2.1 Internal Networking (Services)
Pods are ephemeral; they die and come back with new IP addresses. If we pointed our Ingress directly at a Pod IP, the site would break every time a pod restarted.

A Service solves this: it provides a stable virtual IP (ClusterIP) and an internal DNS name (`severed-blog-service.severed-apps.svc.cluster.local`) that load-balances traffic across every Pod matching the selector `app: severed-blog`.
`apps/severed-blog-service.yaml`:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: severed-blog-service
  namespace: severed-apps
spec:
  selector:
    app: severed-blog
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: ClusterIP
```
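A quick way to confirm the stable DNS name resolves and routes; a sketch using a throwaway curl pod:

```shell
# Launch a one-off pod, hit the Service by its cluster-internal FQDN,
# and clean the pod up afterwards (--rm).
kubectl run tmp-curl --rm -it --restart=Never -n severed-apps \
  --image=curlimages/curl -- \
  curl -s http://severed-blog-service.severed-apps.svc.cluster.local/
```

Repeated requests are spread across both replicas, which you can confirm in the pods' access logs.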
### 2.3 Traffic Routing (Ingress)

External users cannot talk to Pods directly. Traffic flows: Internet → Ingress → Service → Pod.

- **The Service:** Acts as an internal load balancer with a stable DNS name.
- **The Ingress:** Acts as a reverse proxy (Traefik) that routes based on the request hostname.
`apps/severed-ingress.yaml`:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: severed-ingress
  namespace: severed-apps
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    # ONLY accept traffic for this specific hostname
    - host: blog.localhost
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: severed-blog-service
                port:
                  number: 80
```
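Because the rule matches only `blog.localhost`, requests for any other hostname fall through. A sketch of verifying this from the host machine, assuming Traefik's `web` entrypoint is exposed on port 80:

```shell
# Matches the Ingress rule: routed to severed-blog-service (expect 200).
curl -s -o /dev/null -w '%{http_code}\n' -H 'Host: blog.localhost' http://127.0.0.1/

# No rule matches this hostname: Traefik answers with its own 404.
curl -s -o /dev/null -w '%{http_code}\n' -H 'Host: other.localhost' http://127.0.0.1/
```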
### 2.4 Auto-Scaling (HPA)
We implemented a Horizontal Pod Autoscaler (HPA) that scales the blog based on three metrics:

- **CPU:** Target 90% of requests (not limits).
- **Memory:** Target 80% of requests.
- **Traffic (RPS):** Target 500 requests per second per pod.

To prevent flapping (scaling up and down too quickly), we added a stabilization window and a strict scale-up limit (at most 1 pod every 60s). This keeps a one-second traffic spike from exploding the replica count.
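Under the hood, the HPA controller computes `desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric)` for each metric and takes the largest result. A quick arithmetic sketch for the RPS metric:

```shell
# 2 pods currently averaging 800 req/s each, against a 500 req/s target.
current_replicas=2
current_metric=800   # average per-pod value
target_metric=500

# ceil(a / b) via integer math: (a + b - 1) / b
desired=$(( (current_replicas * current_metric + target_metric - 1) / target_metric ))
echo "$desired"   # ceil(2 * 800 / 500) = ceil(3.2) = 4 pods
```

The scale-up policy then rations that jump: at 1 pod per 60 seconds, growing from 2 to 4 replicas takes roughly two minutes.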
`apps/severed-blog-hpa.yaml`:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: severed-blog-hpa
  namespace: severed-apps
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: severed-blog
  minReplicas: 2  # Never drop below 2 for HA
  maxReplicas: 6  # Cap pod count to prevent cluster exhaustion
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 90  # Scale up if CPU usage exceeds 90% of requests
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80  # Scale up if memory usage exceeds 80% of requests
    - type: Pods
      pods:
        metric:
          name: nginx_http_requests_total
        target:
          type: AverageValue
          averageValue: '500'  # Scale up if requests per second > 500 per pod
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60  # Wait 60s before removing a pod
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Pods
          value: 1
          periodSeconds: 60  # Add at most 1 pod per 60s
```
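To watch the HPA react, generate sustained load through the Ingress and observe the replica count climb; a sketch using `hey` as the load generator (any HTTP load tool works, the flags here are `hey`-specific):

```shell
# Terminal 1: stream the HPA's observed metrics and replica count.
kubectl get hpa severed-blog-hpa -n severed-apps -w

# Terminal 2: 50 concurrent clients for 2 minutes, with the Host header
# set so Traefik routes the traffic to the blog.
hey -z 2m -c 50 -host blog.localhost http://127.0.0.1/
```

When the load stops, the 60-second scale-down stabilization window delays the shrink back to `minReplicas`.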