NaviauxLab is where I document infrastructure experiments, homelab builds, and the lessons learned when things don't go according to plan. Mostly Kubernetes, networking, and automation. This site runs on the same cluster I'm writing about.
The things I'm spending my free time building, breaking, and thinking about.
Running a production-grade K8s cluster on a single laptop. Talos Linux, Cilium, Longhorn, Flux — the full stack, no managed services. Figuring out what breaks when you don't have a cloud provider safety net.
Everything in Git, everything declarative. Flux watches the repo, SOPS encrypts the secrets, and the cluster converges to the desired state. The goal: change anything with a commit and a push.
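A minimal sketch of that convergence loop in Flux. The repository URL, paths, and secret name here are illustrative placeholders, not the actual repo layout:

```yaml
# Hypothetical GitRepository + Kustomization pair. Flux polls the repo,
# decrypts SOPS-encrypted manifests with the referenced key, and
# reconciles the cluster toward whatever is committed.
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: homelab
  namespace: flux-system
spec:
  interval: 1m
  url: https://github.com/example/homelab-gitops  # placeholder URL
  ref:
    branch: main
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: cluster
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: GitRepository
    name: homelab
  path: ./clusters/homelab  # placeholder path
  prune: true               # delete resources removed from Git
  decryption:
    provider: sops
    secretRef:
      name: sops-age        # age key mounted as a Secret
```

With `prune: true`, deleting a manifest from the repo deletes the resource from the cluster, which is what makes "change anything with a commit and a push" actually hold.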
Default-deny everything. CiliumNetworkPolicy for egress, standard NetworkPolicy for ingress, and the hard lesson that Cilium evaluates ports after DNAT. Every pod earns its network access.
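The default-deny baseline can be sketched as a pair of policies. Namespace and selector values are illustrative; the pattern is: deny everything with a standard NetworkPolicy, then grant DNS back with a CiliumNetworkPolicy, since nothing works after default-deny until pods can resolve names again:

```yaml
# Hypothetical default-deny for one namespace: empty podSelector
# matches every pod, and listing both policyTypes blocks all
# ingress and egress not allowed elsewhere.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: my-app        # placeholder namespace
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
---
# Cilium egress policy letting those same pods reach kube-dns,
# the first access every workload has to earn back.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-dns
  namespace: my-app
spec:
  endpointSelector: {}
  egress:
    - toEndpoints:
        - matchLabels:
            k8s:io.kubernetes.pod.namespace: kube-system
            k8s-app: kube-dns
      toPorts:
        - ports:
            - port: "53"
              protocol: UDP
```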
Prometheus, Grafana, Alertmanager, and Uptime Kuma. Building dashboards that actually tell you something useful, writing PromQL that doesn't lie, and monitoring the monitoring stack itself.
Running my own RSS reader, uptime monitor, and this portfolio site — all on the homelab cluster. Cloudflare Tunnel for public access with zero open ports. Owning the stack from DNS to disk.
Every stage gets a build log, every mistake becomes a lesson. 52 documented lessons so far. The goal isn't just to build — it's to build in a way someone else (or future me) could reproduce.
What I'm actively using in the lab. Not a wish list — these are the tools behind the projects below.
Things I've built, broken, and documented. Each one started with "how hard can it be?"
A production-grade Kubernetes cluster running on a single Dell laptop. Talos Linux as the immutable OS, Cilium for CNI and L2 load balancing, Longhorn for persistent storage, Flux for GitOps, and cert-manager with Let's Encrypt DNS-01 for real TLS certificates. Cloudflare Tunnel for zero-port public access. Every component deployed, debugged, and documented — 16 build stages, 52 lessons learned, full disaster recovery runbook. This site runs on it.
Deployed 14 NetworkPolicies at once. Everything broke. Helm upgrades failed, monitoring went dark, and it took two hours to figure out that Cilium's eBPF datapath evaluates ipBlock rules differently from upstream Kubernetes NetworkPolicy. Wrote a full root cause analysis and a three-level explainer (8th grade, high school, and college level). The fix: CiliumNetworkPolicy with toEntities for API access instead of ipBlock CIDR matching.
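The fix described above, sketched as a policy. The namespace is a placeholder; the key move is matching the API server as an entity rather than by CIDR, so Cilium tracks the real backend addresses instead of you guessing what the traffic looks like after translation:

```yaml
# Hypothetical egress allowance using the kube-apiserver entity.
# Note the port: after DNAT, traffic to the API service hits the
# backend port (typically 6443), not the service's 443.
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-kube-apiserver
  namespace: monitoring     # placeholder namespace
spec:
  endpointSelector: {}
  egress:
    - toEntities:
        - kube-apiserver
      toPorts:
        - ports:
            - port: "6443"
              protocol: TCP
```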
Full kube-prometheus-stack deployment with custom Grafana dashboards, PromQL queries that actually work, and the discovery that the monitoring stack itself was about to OOM at 282Mi against a 256Mi limit. Learned that kube_pod_status_phase emits a zero-value series for every phase a pod is not in — and that "141 pods not ready" was actually a query bug, not a cluster problem.
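The query-bug lesson in miniature. These are illustrative queries, not the actual dashboard's, but they show the trap: kube-state-metrics emits one kube_pod_status_phase series per pod per phase, valued 1 for the pod's current phase and 0 for the rest:

```promql
# Misleading: counts one series per (pod, phase) pair, zero-valued
# phases included, so it "finds" failing pods that don't exist.
count(kube_pod_status_phase{phase!="Running"})

# Better: filter to series whose value is 1 — the phase each pod
# is actually in — before counting.
count(kube_pod_status_phase{phase!="Running"} == 1)
```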
Exposing homelab services to the internet without opening a single router port. Cloudflare Tunnel connects outbound via QUIC, all traffic routes through Traefik, and new services only need a new Ingress resource. DNS-01 validation for Let's Encrypt certificates via the Cloudflare API. Four public services running with production TLS and no inbound firewall rules.
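A sketch of what "new services only need a new Ingress resource" might look like. Hostname, service name, and issuer name are all placeholders; the point is that Traefik routing and a cert-manager TLS certificate both hang off this one object:

```yaml
# Hypothetical Ingress: Traefik routes the hostname to the backend
# Service, and the cert-manager annotation triggers a DNS-01
# Let's Encrypt certificate stored in the referenced Secret.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-service
  namespace: my-app
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-dns01  # placeholder issuer
spec:
  ingressClassName: traefik
  rules:
    - host: service.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-service
                port:
                  number: 80
  tls:
    - hosts:
        - service.example.com
      secretName: my-service-tls
```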
Lessons learned the hard way. Posts coming soon.
Everything that went wrong, why it went wrong, and what I'd do differently. From "MetalLB + Talos = pain" to "Cilium evaluates ports after DNAT." A field guide for anyone building a K8s homelab on bare metal.
Nobody writes a blog post about the migration that went perfectly. But maybe they should. A defense of unglamorous infrastructure decisions and the engineers who make them.
Running Flux, SOPS, Kustomize, and a full CI/CD pipeline for a single-node homelab. Overkill or the only sane way to manage infrastructure? A cost-benefit analysis after 16 stages of building.
Interested in the homelab build, have a question about something I wrote, or just want to talk infrastructure? I'm here for it.
Or find me elsewhere.