added parts
@@ -1,17 +0,0 @@
---
layout: post
title: Architecture V1 (WIP)
date: 2025-12-27 02:00:00 -0400
categories:
- architectures
---

## Monitoring

```text
.
└── alloy
    ├── config
    │   └── config.alloy
    └── docker-compose.yml
```

_posts/blog_app/2025-12-27-concepts.md (new file)
@@ -0,0 +1,24 @@
---
layout: post
title: 'Kubernetes vs Docker'
date: 2025-12-27 02:00:00 -0400
categories:
- blog_app
---

# Kubernetes Concepts Cheat Sheet

| Object      | Docker Equivalent              | Kubernetes Purpose                                                |
| ----------- | ------------------------------ | ----------------------------------------------------------------- |
| Node        | The Host Machine               | A physical or virtual server in the cluster.                      |
| Pod         | A Container                    | The smallest deployable unit (can contain multiple containers).   |
| Deployments | `docker-compose up`            | Manages the lifecycle and scaling of Pods.                        |
| Services    | Network Aliases                | Provides a stable DNS name/IP for a group of Pods.                |
| HPA         | Auto-Scaling Group             | Automatically scales replicas based on traffic/load.              |
| Ingress     | Nginx Proxy / Traefik          | Manages external access to Services via HTTP/HTTPS.               |
| ConfigMap   | `docker run -v config:/etc...` | Decouples configuration files from the container image.           |
| Secret      | Environment Variables (Secure) | Stores sensitive data (passwords, tokens) encoded in Base64.      |
| DaemonSet   | `mode: global` (Swarm)         | Ensures one copy of a Pod runs on every Node (logs/monitoring).   |
| StatefulSet | N/A                            | Manages apps requiring stable identities and storage (Databases). |

[[2025-12-27-part-1]]

_posts/blog_app/2025-12-27-intro.md (new file)
@@ -0,0 +1,27 @@
---
layout: post
title: 'Deploying the Severed Blog'
date: 2025-12-28 02:00:00 -0400
categories:
- blog_app
highlight: true
---

# Introduction

We are taking a simple static website, the **Severed Blog**, and engineering a proper infrastructure around it.

Anyone can run `docker run nginx`. The real engineering challenge is building the **platform** that keeps that application alive, scalable, and observable.

In this project, we will build a local Kubernetes cluster that mimics a real cloud environment. We will not just deploy the app; we will implement:

- **High Availability:** Running multiple copies so the site never goes down.
- **Auto-Scaling:** Automatically detecting traffic spikes and launching new pods.
- **Observability:** Using the LGTM stack (Loki, Grafana, Prometheus) to visualize exactly what is happening inside the cluster.

The infra code can be found [here](https://git.severed.ink/Severed/Severed-Infra).
The blog code can be found [here](https://git.severed.ink/Severed/Severed-Blog).

Let's start by building the foundation.

[[2025-12-27-part-1]]

_posts/blog_app/2025-12-27-part-1.md (new file)
@@ -0,0 +1,135 @@

---
layout: post
title: 'Step 1: K3d Cluster Architecture'
date: 2025-12-28 03:00:00 -0400
categories:
- blog_app
highlight: true
---

[[2025-12-27-intro]]

# 1. K3d Cluster Architecture

In a standard Docker setup, containers share the host's kernel and networking space directly. In Kubernetes, we introduce an abstraction layer: a **Cluster**. For this project, we use **K3d**, which packages **K3s** (a lightweight production-grade K8s distribution) into Docker containers.

```text
Severed-Infra % tree
.
├── README.md
├── apps
│   ├── severed-blog-config.yaml
│   ├── severed-blog-hpa.yaml
│   ├── severed-blog-service.yaml
│   ├── severed-blog.yaml
│   └── severed-ingress.yaml
├── infra
│   ├── alloy-env.yaml
│   ├── alloy-setup.yaml
│   ├── dashboard
│   │   ├── dashboard-admin.yaml
│   │   ├── permanent-token.yaml
│   │   └── traefik-config.yaml
│   ├── observer
│   │   ├── adapter-values.yaml
│   │   ├── dashboard-json.yaml
│   │   ├── grafana-ingress.yaml
│   │   ├── grafana.yaml
│   │   ├── loki.yaml
│   │   └── prometheus.yaml
│   └── storage
│       └── openebs-sc.yaml
├── namespaces.yaml
└── scripts
    ├── README.md
    ├── access-hub.sh
    ├── deploy-all.sh
    ├── setup-grafana-creds.sh
    └── tests
        ├── generated-202-404-blog.sh
        └── stress-blog.sh
```

## 1.1 Multi-Node Simulation

- **Server (Control Plane):** The master node. Runs the API server, scheduler, and etcd.
- **Agents (Workers):** The worker nodes where our application pods run.

### Setting up the environment

We map port `8080` to the internal Traefik LoadBalancer to access services via `*.localhost`.

```bash
k3d cluster create severed-cluster \
  --agents 2 \
  -p "8080:80@loadbalancer" \
  -p "8443:443@loadbalancer"
```
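
To confirm the topology before moving on, a quick sanity check (standard k3d/kubectl commands; exact output varies by version):

```bash
# One server (control plane) and two agents should report Ready
k3d cluster list
kubectl get nodes -o wide
```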

## 1.2 Image Registry Lifecycle

Since our `severed-blog` image is local, we side-load it directly into the cluster's internal image store rather than pushing to Docker Hub.

```bash
docker build -t severed-blog:v0.3 .
k3d image import severed-blog:v0.3 -c severed-cluster
```
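
Because the Deployment in Step 2 uses `imagePullPolicy: Never`, every rebuild has to be re-imported. A minimal iteration loop might look like this (`v0.4` is a hypothetical next tag; `kubectl set image` is one way to roll the Deployment onto it):

```bash
# Rebuild, side-load, and roll the Deployment onto the new tag (hypothetical)
docker build -t severed-blog:v0.4 .
k3d image import severed-blog:v0.4 -c severed-cluster
kubectl set image deployment/severed-blog web=severed-blog:v0.4 -n severed-apps
```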

## 1.3 Namespaces & Storage

We partition the cluster into logical domains. We also install **OpenEBS** to provide dynamic storage provisioning (PersistentVolumes) for our databases.

**`namespaces.yaml`**

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: severed-apps
---
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
---
apiVersion: v1
kind: Namespace
metadata:
  name: kubernetes-dashboard
---
apiVersion: v1
kind: Namespace
metadata:
  name: openebs
```

**`infra/storage/openebs-sc.yaml`**

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: severed-storage
provisioner: openebs.io/local
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
```
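
Applying both manifests and verifying is straightforward (the repo's `deploy-all.sh` presumably wraps steps like these):

```bash
kubectl apply -f namespaces.yaml
kubectl apply -f infra/storage/openebs-sc.yaml

# All four namespaces and the new StorageClass should be listed
kubectl get namespaces
kubectl get storageclass
```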

---

## 1.4 Infrastructure Concepts Cheat Sheet

| Object      | Docker Equivalent              | Kubernetes Purpose                                                |
| ----------- | ------------------------------ | ----------------------------------------------------------------- |
| Node        | The Host Machine               | A physical or virtual server in the cluster.                      |
| Pod         | A Container                    | The smallest deployable unit (can contain multiple containers).   |
| Deployments | `docker-compose up`            | Manages the lifecycle and scaling of Pods.                        |
| Services    | Network Aliases                | Provides a stable DNS name/IP for a group of Pods.                |
| HPA         | Auto-Scaling Group             | Automatically scales replicas based on traffic/load.              |
| Ingress     | Nginx Proxy / Traefik          | Manages external access to Services via HTTP/HTTPS.               |
| ConfigMap   | `docker run -v config:/etc...` | Decouples configuration files from the container image.           |
| Secret      | Environment Variables (Secure) | Stores sensitive data (passwords, tokens) encoded in Base64.      |
| DaemonSet   | `mode: global` (Swarm)         | Ensures one copy of a Pod runs on _every_ Node (logs/monitoring). |
| StatefulSet | N/A                            | Manages apps requiring stable identities and storage (Databases). |

[[2025-12-27-part-2]]

_posts/blog_app/2025-12-27-part-2.md (new file)
@@ -0,0 +1,285 @@

---
layout: post
title: 'Step 2: The Application Engine & Auto-Scaling'
date: 2025-12-28 04:00:00 -0400
categories:
- blog_app
highlight: true
---

[[2025-12-27-part-1]]

# 2. The Application Engine & Auto-Scaling

## 2.1 Decoupling Configuration (ConfigMaps)

In Docker, if you need to update an Nginx `default.conf`, you typically `COPY` the file into the image and rebuild it. In Kubernetes, we use a **ConfigMap** to treat configuration as a separate object. With a ConfigMap, we can update these rules and simply restart the pods to apply the changes; no Docker build is required.

We use a **ConfigMap** to inject the Nginx configuration:

**`apps/severed-blog-config.yaml`**

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: severed-blog-config
  namespace: severed-apps
data:
  default.conf: |
    # 1. Define the custom log format
    log_format observability '$remote_addr - $remote_user [$time_local] "$request" '
                             '$status $body_bytes_sent "$http_referer" '
                             '"$http_user_agent" "$request_time"';

    server {
        listen 80;
        server_name localhost;
        root /usr/share/nginx/html;
        index index.html index.htm;

        # 2. Apply the format to stdout
        access_log /dev/stdout observability;
        error_log /dev/stderr;

        # gzip compression
        gzip on;
        gzip_types text/plain text/css application/json application/javascript text/xml application/xml application/xml+rss text/javascript;
        gzip_vary on;
        gzip_min_length 1000;

        # assets (images, fonts, favicons) - cache for 1 year
        location ~* \.(jpg|jpeg|gif|png|ico|svg|woff|woff2|ttf|eot)$ {
            expires 365d;
            add_header Cache-Control "public, no-transform";
            try_files $uri =404;
        }

        # code (css, js) - cache for 1 month
        location ~* \.(css|js)$ {
            expires 30d;
            add_header Cache-Control "public, no-transform";
            try_files $uri =404;
        }

        # standard routing
        location / {
            try_files $uri $uri/ $uri.html =404;
        }

        error_page 404 /404.html;
        location = /404.html {
            internal;
        }

        # logging / lb config
        real_ip_header X-Forwarded-For;
        set_real_ip_from 10.0.0.0/8;

        # metrics endpoint for Alloy/Prometheus
        location /metrics {
            stub_status on;
            access_log off; # Keep noise out of our main logs
            allow 127.0.0.1;
            allow 10.0.0.0/8;
            allow 172.16.0.0/12;
            deny all;
        }
    }
```

It is better practice to keep `default.conf` as a standalone file in our repo (e.g., `apps/config/default.conf`) and inject it like this:

```shell
kubectl create configmap severed-blog-config \
  -n severed-apps \
  --from-file=default.conf=apps/config/default.conf \
  --dry-run=client -o yaml | kubectl apply -f -
```
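
Note that running pods keep the old file until they are restarted; a minimal apply-and-roll sequence (assuming the Deployment from section 2.2 is already applied) is:

```bash
kubectl apply -f apps/severed-blog-config.yaml
kubectl rollout restart deployment/severed-blog -n severed-apps
kubectl rollout status deployment/severed-blog -n severed-apps
```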

## 2.2 Deploying the Workload: The Sidecar Pattern

The **Deployment** ensures the desired state is maintained. We request `replicas: 2`, meaning K8s will keep two instances of the blog running across our worker nodes.

**The Sidecar:** We add a second container (`nginx-prometheus-exporter`) to the same Pod.

1. **Web Container:** Serves the blog content.
2. **Exporter Container:** Scrapes the Web container's local `/metrics` endpoint and translates it into Prometheus format on port `9113`.

**`apps/severed-blog.yaml`**

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: severed-blog
  namespace: severed-apps
spec:
  replicas: 2
  selector:
    matchLabels:
      app: severed-blog
  template:
    metadata:
      labels:
        app: severed-blog
    spec:
      containers:
        - name: web
          image: severed-blog:v0.3
          imagePullPolicy: Never
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: '50m'
              memory: '64Mi'
            limits:
              cpu: '200m'
              memory: '128Mi'
          volumeMounts:
            - name: nginx-config-vol
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: default.conf

        - name: exporter
          image: nginx/nginx-prometheus-exporter:latest
          args:
            - -nginx.scrape-uri=http://localhost:80/metrics
          ports:
            - containerPort: 9113
              name: metrics
          resources:
            requests:
              cpu: '10m'
              memory: '32Mi'
            limits:
              cpu: '50m'
              memory: '64Mi'

      volumes:
        - name: nginx-config-vol
          configMap:
            name: severed-blog-config
```

The `spec.volumes` block references our ConfigMap, and `volumeMounts` places that data exactly where Nginx expects its configuration.

### 2.2.1 Internal Networking (Services)

Pods are ephemeral; they die and get new IP addresses. If we pointed our Ingress directly at a Pod IP, the site would break every time a pod restarted.

We use a **Service** to solve this. A Service provides a stable virtual IP (ClusterIP) and an internal DNS name (`severed-blog-service.severed-apps.svc.cluster.local`) that load-balances traffic to any Pod matching the selector `app: severed-blog`.

**`apps/severed-blog-service.yaml`**

```yaml
apiVersion: v1
kind: Service
metadata:
  name: severed-blog-service
  namespace: severed-apps
spec:
  selector:
    app: severed-blog
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80
  type: ClusterIP
```
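
A quick way to exercise the Service's DNS name from inside the cluster is a throwaway pod (the `busybox:1.36` image is an arbitrary choice here):

```bash
# Fetch the blog's index page through the Service's stable name
kubectl run dns-test --rm -it --restart=Never -n severed-apps \
  --image=busybox:1.36 -- wget -qO- http://severed-blog-service
```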

## 2.3 Traffic Routing (Ingress)

External users cannot talk to Pods directly. Traffic flows: **Internet → Ingress → Service → Pod**.

1. **The Service:** Acts as an internal load balancer with a stable DNS name.
2. **The Ingress:** Acts as a reverse proxy (Traefik) that reads the URL hostname.

**`apps/severed-ingress.yaml`**

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: severed-ingress
  namespace: severed-apps
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    # ONLY accept traffic for this specific hostname
    - host: blog.localhost
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: severed-blog-service
                port:
                  number: 80
```
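
Since Step 1 mapped host port `8080` to Traefik's port `80`, the route can be tested from the host (on most systems `*.localhost` resolves to `127.0.0.1`, so either form works):

```bash
curl -H "Host: blog.localhost" http://127.0.0.1:8080/
curl http://blog.localhost:8080/
```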

## 2.4 Auto-Scaling (HPA)

We implement a **Horizontal Pod Autoscaler (HPA)** that scales the blog based on three metrics:

1. **CPU:** Target 90% of _Requests_ (not Limits).
2. **Memory:** Target 80% of _Requests_.
3. **Traffic (RPS):** Target 500 requests per second per pod.

To avoid scaling up and down too fast, we add a **stabilization window** and a strict **scale-up limit** (at most one new pod per minute). This prevents the cluster from exploding due to one-second spikes.

**`apps/severed-blog-hpa.yaml`**

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: severed-blog-hpa
  namespace: severed-apps
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: severed-blog
  minReplicas: 2 # Never drop below 2 for HA
  maxReplicas: 6 # Maximum number of pods to prevent cluster exhaustion
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 90 # Scale up if CPU usage exceeds 90%
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80 # Scale up if RAM usage exceeds 80%
    - type: Pods
      pods:
        metric:
          name: nginx_http_requests_total
        target:
          type: AverageValue
          averageValue: '500' # Scale up if requests per second > 500 per pod
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 60 # Wait 60s before removing a pod
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
        - type: Pods
          value: 1
          periodSeconds: 60
```
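
Standard kubectl commands are enough to watch the autoscaler react:

```bash
# Live view of current metrics vs targets and the replica count
kubectl get hpa severed-blog-hpa -n severed-apps --watch
# Scaling events and per-metric readings
kubectl describe hpa severed-blog-hpa -n severed-apps
```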

[[2025-12-27-part-3]]

_posts/blog_app/2025-12-27-part-3.md (new file)
@@ -0,0 +1,663 @@

---
layout: post
title: 'Step 3: Observability (LGTM, KSM)'
date: 2025-12-28 05:00:00 -0400
categories:
- blog_app
highlight: true
---

[[2025-12-27-part-2]]

# 3. Observability: The LGTM Stack

In a distributed cluster, logs and metrics are scattered across different pods and nodes. We centralize them with the LGTM stack (Loki, Grafana, Prometheus), plus **Kube State Metrics** and the **Prometheus Adapter**.

## 3.1 The Databases (StatefulSets)

- **Prometheus:** Scrapes metrics. We updated the config to scrape **Kube State Metrics** via its internal DNS Service.
- **Loki:** Aggregates logs. Configured with a 168h (7-day) retention period.

**`infra/observer/prometheus.yaml`**

```yaml
# Configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
    storage:
      tsdb:
        out_of_order_time_window: 1m

    scrape_configs:
      # 1. Scrape Prometheus itself (Health Check)
      - job_name: 'prometheus'
        static_configs:
          - targets: ['localhost:9090']

      # 2. Scrape Kube State Metrics (KSM)
      # We use the internal DNS: service-name.namespace.svc.cluster.local:port
      - job_name: 'kube-state-metrics'
        static_configs:
          - targets: ['kube-state-metrics.monitoring.svc.cluster.local:8080']

---
# Service
apiVersion: v1
kind: Service
metadata:
  name: prometheus
  namespace: monitoring
spec:
  type: ClusterIP
  selector:
    app: prometheus
  ports:
    - port: 9090
      targetPort: 9090

---
# The Database (StatefulSet)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: prometheus
  namespace: monitoring
spec:
  serviceName: prometheus
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
        - name: prometheus
          image: prom/prometheus:latest
          args:
            - '--config.file=/etc/prometheus/prometheus.yml'
            - '--web.enable-remote-write-receiver'
            - '--storage.tsdb.path=/prometheus'
            - '--web.console.libraries=/usr/share/prometheus/console_libraries'
            - '--web.console.templates=/usr/share/prometheus/consoles'
          ports:
            - containerPort: 9090
          volumeMounts:
            - name: config
              mountPath: /etc/prometheus
            - name: data
              mountPath: /prometheus
      volumes:
        - name: config
          configMap:
            name: prometheus-config
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ['ReadWriteOnce']
        storageClassName: 'openebs-hostpath'
        resources:
          requests:
            storage: 5Gi
```
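
A port-forward is the quickest way to confirm both scrape jobs are healthy:

```bash
kubectl port-forward -n monitoring svc/prometheus 9090:9090
# then open http://localhost:9090/targets - both jobs should show state UP
```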

**`infra/observer/loki.yaml`**

```yaml
# --- Configuration ---
apiVersion: v1
kind: ConfigMap
metadata:
  name: loki-config
  namespace: monitoring
data:
  local-config.yaml: |
    auth_enabled: false
    server:
      http_listen_port: 3100
    common:
      path_prefix: /loki
      storage:
        filesystem:
          chunks_directory: /loki/chunks
          rules_directory: /loki/rules
      replication_factor: 1
      ring:
        instance_addr: 127.0.0.1
        kvstore:
          store: inmemory
    schema_config:
      configs:
        - from: 2020-10-24
          store: tsdb
          object_store: filesystem
          schema: v13
          index:
            prefix: index_
            period: 24h

---
# --- Storage Service ---
# Gives the StatefulSet a stable DNS entry. (A true headless Service would set
# clusterIP: None; a plain ClusterIP is sufficient for a single replica.)
apiVersion: v1
kind: Service
metadata:
  name: loki
  namespace: monitoring
spec:
  type: ClusterIP
  selector:
    app: loki
  ports:
    - port: 3100
      targetPort: 3100
      name: http-metrics

---
# --- The Database (StatefulSet) ---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: loki
  namespace: monitoring
spec:
  serviceName: loki
  replicas: 1
  selector:
    matchLabels:
      app: loki
  template:
    metadata:
      labels:
        app: loki
    spec:
      containers:
        - name: loki
          image: grafana/loki:latest
          args:
            - -config.file=/etc/loki/local-config.yaml
          ports:
            - containerPort: 3100
              name: http-metrics
          volumeMounts:
            - name: config
              mountPath: /etc/loki
            - name: data
              mountPath: /loki
      volumes:
        - name: config
          configMap:
            name: loki-config
  # Persistent Storage
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ['ReadWriteOnce']
        storageClassName: 'openebs-hostpath'
        resources:
          requests:
            storage: 5Gi
```
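
Once Alloy (section 3.3) starts shipping logs, Loki's standard query API can be smoke-tested through a port-forward (this assumes the `namespace` label set by the Alloy pipeline, and uses `jq` only for readability):

```bash
kubectl port-forward -n monitoring svc/loki 3100:3100 &
# Ask Loki for recent blog logs via LogQL
curl -G -s "http://localhost:3100/loki/api/v1/query_range" \
  --data-urlencode 'query={namespace="severed-apps"}' | jq '.data.result | length'
```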

## 3.2 The Bridge: Prometheus Adapter & KSM

Standard HPA only understands CPU and Memory. To scale on **Requests Per Second**, we need two extra components.

**Helm (Package Manager)**
You will notice `kube-state-metrics` and `prometheus-adapter` are missing from our file tree. That is because we install them using **Helm**. Helm lets us install complex, pre-packaged applications ("Charts") without writing thousands of lines of YAML. We only provide a `values.yaml` file to override specific settings.

1. **Kube State Metrics (KSM):** A service that listens to the Kubernetes API and generates metrics about the state of objects (e.g., `kube_pod_created`).
2. **Prometheus Adapter:** Installed via Helm (see the sketch below). We use `infra/observer/adapter-values.yaml` to configure how it translates Prometheus queries into Kubernetes metrics.
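
The exact install commands are not part of this repo's tree; a minimal sketch using the community charts (the `prometheus-community` repo URL and chart names are assumptions here, though they are the conventional ones) could be:

```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Assumed chart names; the values file is the one from our repo
helm install kube-state-metrics prometheus-community/kube-state-metrics -n monitoring
helm install prometheus-adapter prometheus-community/prometheus-adapter \
  -n monitoring -f infra/observer/adapter-values.yaml
```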

**`infra/observer/adapter-values.yaml`**

```yaml
prometheus:
  url: http://prometheus.monitoring.svc.cluster.local
  port: 9090

rules:
  custom:
    - seriesQuery: 'nginx_http_requests_total{pod!="",namespace!=""}'
      resources:
        overrides:
          namespace: { resource: 'namespace' }
          pod: { resource: 'pod' }
      name:
        matches: '^(.*)_total'
        as: 'nginx_http_requests_total'
      metricsQuery: 'rate(<<.Series>>{<<.LabelMatchers>>}[1m])'
```
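
If the adapter is wired up correctly, the custom metrics API should list and serve the renamed metric (using `jq` for readability):

```bash
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .
kubectl get --raw \
  "/apis/custom.metrics.k8s.io/v1beta1/namespaces/severed-apps/pods/*/nginx_http_requests_total" | jq .
```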

## 3.3 The Agent: Grafana Alloy (DaemonSets)

We need to collect logs from every node in the cluster.

- **DaemonSet vs. Deployment:** A Deployment ensures _n_ replicas exist somewhere. A **DaemonSet** ensures exactly **one** Pod runs on **every** Node. This is perfect for infrastructure agents (logging, networking, monitoring).
- **Downward API:** We need to inject the Pod's own name and namespace into its environment variables so it knows "who it is."

**`infra/alloy-env.yaml`**

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: monitoring-env
  namespace: monitoring
data:
  LOKI_URL: 'http://loki.monitoring.svc:3100/loki/api/v1/push'
  PROM_URL: 'http://prometheus.monitoring.svc:9090/api/v1/write'
```

**`infra/alloy-setup.yaml`**

```yaml
# --- RBAC configuration ---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: alloy-sa
  namespace: monitoring

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: alloy-cluster-role
rules:
  # 1. Standard API Access
  - apiGroups: ['']
    resources: ['nodes', 'nodes/proxy', 'services', 'endpoints', 'pods']
    verbs: ['get', 'list', 'watch']
  # 2. ALLOW METRICS ACCESS (Crucial for cAdvisor/Kubelet)
  - apiGroups: ['']
    resources: ['nodes/stats', 'nodes/metrics']
    verbs: ['get']
  # 3. Log Access
  - apiGroups: ['']
    resources: ['pods/log']
    verbs: ['get', 'list', 'watch']
  # 4. Non-Resource URLs (Sometimes needed for /metrics endpoints)
  - nonResourceURLs: ['/metrics']
    verbs: ['get']

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: alloy-cluster-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: alloy-cluster-role
subjects:
  - kind: ServiceAccount
    name: alloy-sa
    namespace: monitoring

---
# --- Alloy pipeline configuration ---
apiVersion: v1
kind: ConfigMap
metadata:
  name: alloy-config
  namespace: monitoring
data:
  config.alloy: |
    // 1. Discovery: Find all pods
    discovery.kubernetes "k8s_pods" {
      role = "pod"
    }

    // 2. Relabeling: Filter and Label "severed-blog" pods
    discovery.relabel "blog_pods" {
      targets = discovery.kubernetes.k8s_pods.targets

      rule {
        action        = "keep"
        source_labels = ["__meta_kubernetes_pod_label_app"]
        regex         = "severed-blog"
      }

      // Explicitly set 'pod' and 'namespace' labels for the Adapter
      rule {
        action        = "replace"
        source_labels = ["__meta_kubernetes_pod_name"]
        target_label  = "pod"
      }

      rule {
        action        = "replace"
        source_labels = ["__meta_kubernetes_namespace"]
        target_label  = "namespace"
      }

      // Route to the sidecar exporter port
      rule {
        action        = "replace"
        source_labels = ["__address__"]
        target_label  = "__address__"
        regex         = "([^:]+)(?::\\d+)?"
        replacement   = "$1:9113"
      }
    }

    // 3. Direct Nginx Scraper
    prometheus.scrape "nginx_scraper" {
      targets    = discovery.relabel.blog_pods.output
      forward_to = [prometheus.remote_write.metrics_service.receiver]
      job_name   = "integrations/nginx"
    }

    // 4. Host Metrics (Unix Exporter)
    prometheus.exporter.unix "host" {
      rootfs_path = "/host/root"
      sysfs_path  = "/host/sys"
      procfs_path = "/host/proc"
    }

    prometheus.scrape "host_scraper" {
      targets    = prometheus.exporter.unix.host.targets
      forward_to = [prometheus.remote_write.metrics_service.receiver]
    }

    // 5. Remote Write: Send to Prometheus
    prometheus.remote_write "metrics_service" {
      endpoint {
        url = sys.env("PROM_URL")
      }
    }

    // 6. Logs Pipeline: Send to Loki
    loki.source.kubernetes "pod_logs" {
      targets    = discovery.relabel.blog_pods.output
      forward_to = [loki.write.default.receiver]
    }

    loki.write "default" {
      endpoint {
        url = sys.env("LOKI_URL")
      }
    }

    // 7. Kubelet Scraper (cAdvisor for Container Metrics)
    discovery.kubernetes "k8s_nodes" {
      role = "node"
    }

    prometheus.scrape "kubelet_cadvisor" {
      targets      = discovery.kubernetes.k8s_nodes.targets
      scheme       = "https"
      metrics_path = "/metrics/cadvisor"
      job_name     = "integrations/kubernetes/cadvisor"

      tls_config {
        insecure_skip_verify = true
      }
      bearer_token_file = "/var/run/secrets/kubernetes.io/serviceaccount/token"

      forward_to = [prometheus.remote_write.metrics_service.receiver]
    }

---
# --- Agent Deployment (DaemonSet) ---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: alloy
  namespace: monitoring
spec:
  selector:
    matchLabels:
      name: alloy
  template:
    metadata:
      labels:
        name: alloy
    spec:
      serviceAccountName: alloy-sa
      hostNetwork: true
      hostPID: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
        - name: alloy
          image: grafana/alloy:latest
          args:
            - run
            - --server.http.listen-addr=0.0.0.0:12345
            - --storage.path=/var/lib/alloy/data
            - /etc/alloy/config.alloy
          envFrom:
            - configMapRef:
                name: monitoring-env
                optional: false
          volumeMounts:
            - name: config
              mountPath: /etc/alloy
            - name: logs
              mountPath: /var/log
            - name: proc
              mountPath: /host/proc
              readOnly: true
            - name: sys
              mountPath: /host/sys
              readOnly: true
            - name: root
              mountPath: /host/root
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: alloy-config
        - name: logs
          hostPath:
            path: /var/log
        - name: proc
          hostPath:
            path: /proc
        - name: sys
          hostPath:
            path: /sys
        - name: root
          hostPath:
            path: /
```
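
Since this is a DaemonSet, one Alloy pod should land on every node (three in our cluster):

```bash
kubectl get daemonset alloy -n monitoring
kubectl get pods -n monitoring -l name=alloy -o wide
# Tail the agent to confirm the pipeline started cleanly
kubectl logs ds/alloy -n monitoring --tail=50
```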

## 3.4 Visualization: Grafana

We deploy Grafana with pre-loaded dashboards via ConfigMaps.

**Key Dashboards Created:**

1. **Cluster Health:** CPU/Memory saturation.
2. **HPA Live Status:** A custom table showing the _real_ scaling drivers (RPS, CPU Request %) vs. the HPA's reaction.

**`infra/observer/grafana.yaml`**

```yaml
# 1. Datasources (Connection to Loki/Prom)
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-datasources
  namespace: monitoring
data:
  datasources.yaml: |
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        access: proxy
        url: http://prometheus.monitoring.svc:9090
        isDefault: false
      - name: Loki
        type: loki
        access: proxy
        url: http://loki.monitoring.svc:3100
        isDefault: true

---
# 2. Dashboard Provider (Tells Grafana to load from /var/lib/grafana/dashboards)
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboard-provider
  namespace: monitoring
data:
  dashboard-provider.yaml: |
    apiVersion: 1
    providers:
      - name: 'Severed Dashboards'
        orgId: 1
        folder: ''
        type: file
        disableDeletion: false
        updateIntervalSeconds: 10 # Allow editing in UI, but it resets on restart
        options:
          path: /var/lib/grafana/dashboards

---
# 3. Service
apiVersion: v1
kind: Service
metadata:
  name: grafana-service
  namespace: monitoring
spec:
  type: LoadBalancer
  selector:
    app: grafana
  ports:
    - protocol: TCP
      port: 3000
      targetPort: 3000

---
# 4. Deployment (The App)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
        - name: grafana
          image: grafana/grafana:latest
          ports:
            - containerPort: 3000

          env:
            - name: GF_SECURITY_ADMIN_USER
              valueFrom:
                secretKeyRef:
                  name: grafana-secrets
                  key: admin-user
            - name: GF_SECURITY_ADMIN_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: grafana-secrets
                  key: admin-password

            - name: GF_AUTH_ANONYMOUS_ENABLED
              value: 'true'
            - name: GF_AUTH_ANONYMOUS_ORG_ROLE
              value: 'Viewer'
            - name: GF_AUTH_ANONYMOUS_ORG_NAME
              value: 'Main Org.'

          volumeMounts:
            - name: grafana-datasources
              mountPath: /etc/grafana/provisioning/datasources
            - name: grafana-dashboard-provider
              mountPath: /etc/grafana/provisioning/dashboards
            - name: grafana-dashboards-json
              mountPath: /var/lib/grafana/dashboards
            - name: grafana-storage
              mountPath: /var/lib/grafana
      volumes:
        - name: grafana-datasources
          configMap:
            name: grafana-datasources
        - name: grafana-dashboard-provider
          configMap:
            name: grafana-dashboard-provider
        - name: grafana-dashboards-json
          configMap:
            name: grafana-dashboards-json
        - name: grafana-storage
          emptyDir: {}
```

In the Deployment above, you see references to `grafana-secrets`. This Secret is **not** in our git repository.

```yaml
- name: GF_SECURITY_ADMIN_PASSWORD
  valueFrom:
    secretKeyRef:
      name: grafana-secrets # <--- where is this?
      key: admin-password
```

We don't commit it to version control. In our `deploy-all.sh` script, we generate this secret imperatively using `kubectl create secret generic`. In a real production environment, we would use tools like **ExternalSecrets** or **SealedSecrets** to inject these safely.
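
As a rough illustration of what such an imperative step could look like (the actual `deploy-all.sh` is not reproduced here; the keys match what the Deployment expects, the values are placeholders):

```bash
# Hypothetical sketch - the real script may differ
kubectl create secret generic grafana-secrets -n monitoring \
  --from-literal=admin-user=admin \
  --from-literal=admin-password="$(openssl rand -base64 24)"
```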

**`dashboard-json.yaml`**

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboards-json
  namespace: monitoring
data:
  severed-health.json: |
    ...
```

Just like our blog, we need an Ingress to access Grafana. Notice we map a different hostname (`grafana.localhost`) to the Grafana service port (`3000`).

**`infra/observer/grafana-ingress.yaml`**

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana-ingress
  namespace: monitoring
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: grafana.localhost
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grafana-service # ...send them to Grafana
                port:
                  number: 3000
```

[[2025-12-27-part-4]]

_posts/blog_app/2025-12-27-part-4.md (new file)
@@ -0,0 +1,94 @@

---
layout: post
title: 'Step 4: RBAC & Security'
date: 2025-12-28 06:00:00 -0400
categories:
- blog_app
highlight: true
---

[[2025-12-27-part-3]]

# 4. Cluster Management & Security

## 4.1 RBAC: Admin user

In Kubernetes, a **ServiceAccount** is an identity for a process or a human to talk to the API. We create an `admin-user`, but identities have no power by default. We must link them to a **ClusterRole** (a set of permissions) using a **ClusterRoleBinding**.

- **ServiceAccount:** Creates the `admin-user` identity in the dashboard namespace.
- **ClusterRoleBinding:** Grants this specific user the `cluster-admin` role (full access to the entire cluster).

**`infra/dashboard/dashboard-admin.yaml`**:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kubernetes-dashboard
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: admin-user
    namespace: kubernetes-dashboard
```

## 4.2 Authentication: Permanent Tokens

Modern Kubernetes no longer generates tokens automatically for ServiceAccounts. To log into the UI, we need a static, long-lived credential.

**`infra/dashboard/permanent-token.yaml`**:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: admin-user-token
  namespace: kubernetes-dashboard
  annotations:
    kubernetes.io/service-account.name: 'admin-user'
type: kubernetes.io/service-account-token
```

This creates a **Secret** of type `kubernetes.io/service-account-token`. By adding the annotation `kubernetes.io/service-account.name: "admin-user"`, K8s automatically populates the Secret with a signed JWT that we can present at the dashboard login screen.
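
Retrieving the token for the login screen is then one command:

```bash
kubectl -n kubernetes-dashboard get secret admin-user-token \
  -o jsonpath='{.data.token}' | base64 -d
```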

## 4.3 Localhost: Ingress & Cookies

The Kubernetes Dashboard requires HTTPS, which creates issues with self-signed certificates on `localhost`. We need to reconfigure **Traefik** (the internal reverse proxy bundled with K3s) to allow insecure backends.

**Helm & CRDs**
K3s installs Traefik using **Helm** (the Kubernetes package manager). Usually, you manage Helm via the CLI (`helm install`). However, K3s includes a **Helm Controller** that lets us manage charts using YAML files called **HelmChartConfigs** (a Custom Resource Definition, or CRD).

This allows us to reconfigure a complex Helm deployment using a simple declarative file.

**`infra/dashboard/traefik-config.yaml`**

```yaml
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    additionalArguments:
      # Tell Traefik to ignore SSL errors when talking to internal services
      - "--serversTransport.insecureSkipVerify=true"
```

## 4.4 Stress Testing & Verification

We use **Apache Bench (`ab`)** to generate enough concurrency to trigger the HPA. The test produces tens of thousands of requests, which trips the RPS rule in our HPA configuration.

```bash
# Generate 50 concurrent users for 5 minutes
ab -k -c 50 -t 300 -H "Host: blog.localhost" http://127.0.0.1:8080/
```

@@ -7,9 +7,7 @@ categories:
 highlight: true
 ---

-This blog serves as the public documentation for **Severed**. While the main site provides the high-level vision,
-this space is dedicated to the technical source-of-truth for the experiments, infrastructure-as-code, and proprietary
-tooling that are used within the cluster.
+This blog serves as the public documentation for **Severed**. This space is dedicated to the technical source-of-truth for the experiments, infrastructure-as-code, and proprietary tooling that are used.

 ### Ecosystem

@@ -23,31 +21,24 @@ The following services are currently active within the `severed.ink` network:

 ### Core Infrastructure

-The ecosystem is powered by a **Home Server Cluster** managed via a **Kubernetes (k3s)** distribution. This setup
-prioritizes local sovereignty and GitOps principles.
+The ecosystem is powered by a hybrid **Home Server Cluster** managed via a **Kubernetes (k3s)** distribution and AWS services. We prioritize local sovereignty and GitOps principles.

-- **CI Pipeline:** Automated build and test suites are orchestrated by a private Jenkins server utilizing self-hosted
-  runners.
+- **CI Pipeline:** Automated build and test suites are orchestrated by a private Jenkins server utilizing self-hosted runners.
 - **GitOps & Deployment:** Automated synchronization and state enforcement via **ArgoCD**.
 - **Data Layer:** Persistent storage managed by **PostgreSQL**.
 - **Telemetry:** Full-stack observability provided by **Prometheus** (metrics) and **Loki** (logs) via **Grafana**.
-- **Security Layer:** Push/Pull GitOps operations require an active connection to a **WireGuard (VPN)** for remote
-  access.
+- **Security Layer:** Push/Pull GitOps operations require an active connection to a **WireGuard (VPN)** for remote access.

 ### Roadmap

-Engineering efforts are currently focused on the following milestones:
+Efforts are currently focused on the following milestones:

 1. **OSS Strategy:** Transitioning from a hybrid of AWS managed services toward a ~100% Open Source Software (OSS) stack.
-2. **High Availability (HA):** Implementing a "Cloud RAID-1" failover mechanism. In the event of home cluster
-   instability, traffic automatically routes to a secondary cloud-instantiated Kubernetes cluster as a temporary
-   failover.
-3. **Data Resilience:** Automating PostgreSQL backup strategies to ensure parity between the primary cluster and the
-   cloud-based failover.
-4. **Storage Infrastructure:** Integrating a dedicated **TrueNAS** node to move from local SATA/NVMe reliance to a
-   centralized, redundant storage architecture.
+2. **High Availability (HA):** Implementing a "Cloud RAID-1" failover mechanism. In the event of home cluster instability, traffic automatically routes to a secondary cloud-instantiated Kubernetes cluster as a temporary failover.
+3. **Data Resilience:** Automating PostgreSQL backup strategies to ensure parity between the primary cluster and the cloud-based failover.
+4. **Storage Infrastructure:** Integrating a dedicated **TrueNAS** node to move from local SATA/NVMe reliance to a centralized, redundant storage architecture.

-### Terminal Redirect
+### Redirect

 For the full technical portfolio and expertise highlights, visit the main site: