MultiHub Forum

Full Version: How did Kubernetes production management surprise you beyond tutorials?
Kubernetes is powerful but complex, and most tutorials focus on getting it running. What's a specific, non-obvious operational lesson you learned the hard way about managing it in production that you wish you'd known from the start?
The first hard lesson in production is that requests and limits are not optional. Set them per container and monitor them in real time. A spike can exhaust memory on every node and trigger evictions that drag every service down.
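For anyone who wants to enforce this before a manifest ever reaches the cluster, here is a minimal sketch of a pre-deploy check. It assumes the pod spec has already been loaded into a plain Python dict; the function name `missing_resources` and the example images are hypothetical, not from any library.

```python
def missing_resources(pod_spec):
    """Return (container name, missing field) pairs for a pod spec dict."""
    problems = []
    for container in pod_spec.get("containers", []):
        resources = container.get("resources", {})
        for section in ("requests", "limits"):
            for resource in ("cpu", "memory"):
                if resource not in resources.get(section, {}):
                    problems.append((container["name"], f"{section}.{resource}"))
    return problems


# Hypothetical example spec: "api" is fully specified, "sidecar" is not.
pod_spec = {
    "containers": [
        {
            "name": "api",
            "image": "example/api:1.0",
            "resources": {
                "requests": {"cpu": "250m", "memory": "256Mi"},
                "limits": {"cpu": "500m", "memory": "512Mi"},
            },
        },
        {
            "name": "sidecar",
            "image": "example/sidecar:1.0",
            "resources": {"requests": {"cpu": "50m"}},
        },
    ]
}

for name, field in missing_resources(pod_spec):
    print(f"{name}: missing resources.{field}")
```

Running a check like this in CI fails the build on the sidecar above instead of letting it reach a node unbounded. It is deliberately strict about both CPU and memory; relax it if your team has a different policy.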
Never rely on dashboards alone. Test upgrades canary-style and keep a simple rollback plan, so you actually know how to undo a bad change.
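One way to make "canary-style" concrete is to write the promote-or-rollback rule down as code instead of eyeballing a graph. This is only a sketch under assumptions: the function name, the 1.5x threshold, the 0.1% floor, and the minimum sample size are all hypothetical numbers picked for illustration.

```python
def canary_verdict(baseline_errors, baseline_total,
                   canary_errors, canary_total,
                   max_ratio=1.5, min_requests=100):
    """Decide whether a canary should be promoted, rolled back, or given more time."""
    if canary_total < min_requests:
        return "wait"  # not enough traffic to judge yet
    baseline_rate = baseline_errors / baseline_total if baseline_total else 0.0
    canary_rate = canary_errors / canary_total
    # Allow the canary some headroom over baseline, with a small floor so a
    # zero-error baseline does not turn a single failure into a rollback.
    allowed = max(baseline_rate * max_ratio, 0.001)
    return "promote" if canary_rate <= allowed else "rollback"


print(canary_verdict(10, 10000, 1, 50))     # too little canary traffic
print(canary_verdict(10, 10000, 1, 1000))   # error rate in line with baseline
print(canary_verdict(10, 10000, 20, 1000))  # error rate far above baseline
```

The point is less the exact thresholds than having the rule agreed on in advance, so "rollback" is a decision the pipeline can make without a meeting.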
Backups for etcd and PV data are tempting to skip until the moment you need them. At that point a quick restore is your only option, so we drill recovery and document the steps.
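As a starting point for the etcd half, here is a small sketch that builds a timestamped `etcdctl snapshot save` command. The helper name is hypothetical, and the endpoint and certificate paths in the example are assumptions (they follow common kubeadm defaults); check them against your own control plane.

```python
from datetime import datetime, timezone


def etcd_snapshot_command(backup_dir, endpoint, cacert, cert, key, now=None):
    """Build the argv list for a timestamped `etcdctl snapshot save`."""
    now = now or datetime.now(timezone.utc)
    target = f"{backup_dir}/etcd-{now:%Y%m%dT%H%M%SZ}.db"
    return [
        "etcdctl",
        f"--endpoints={endpoint}",
        f"--cacert={cacert}",
        f"--cert={cert}",
        f"--key={key}",
        "snapshot", "save", target,
    ]


# Assumed paths; adjust for your cluster.
cmd = etcd_snapshot_command(
    backup_dir="/var/backups/etcd",
    endpoint="https://127.0.0.1:2379",
    cacert="/etc/kubernetes/pki/etcd/ca.crt",
    cert="/etc/kubernetes/pki/etcd/server.crt",
    key="/etc/kubernetes/pki/etcd/server.key",
)
print(" ".join(cmd))
```

The list can be handed to `subprocess.run(cmd, check=True)` from a scheduled job. The backup only counts once someone has restored one of these files into a scratch cluster, which is the drill part.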
Security is not optional in prod. Start with image signing and enforce a zero-trust policy by default. Also wire up proper network policies so containment is real. You cannot out-code a bad config.
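For the network policy part, the usual zero-trust baseline is a default-deny policy per namespace, with explicit allow rules layered on top. A minimal sketch that generates one as JSON (which `kubectl apply -f` accepts as well as YAML); the function name, policy name, and the `payments` namespace are hypothetical.

```python
import json


def default_deny_policy(namespace):
    """NetworkPolicy that selects every pod in the namespace and allows no traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # empty selector matches all pods in the namespace
            "policyTypes": ["Ingress", "Egress"],  # no rules listed, so both are denied
        },
    }


print(json.dumps(default_deny_policy("payments"), indent=2))
```

Two caveats: this only has an effect if your CNI plugin enforces NetworkPolicy, and denying egress also blocks DNS until you add an allow rule for it.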
Drift happens when you rely on one-off scripts to push configs. So we built a small tool that snapshots the intended manifest state, pins versions, and runs tests before applying. This keeps deployments predictable and cuts firefighting. The tip in Kubernetes deployment best practices 2025
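The poster's tool isn't shown, so the following is not their implementation, just a rough sketch of the snapshot-and-compare idea: fingerprint each intended manifest and flag anything whose live counterpart differs. The function names and example manifests are hypothetical, and it assumes both sides are already loaded as dicts keyed by resource name.

```python
import hashlib
import json


def fingerprint(manifest):
    """Stable SHA-256 of a manifest dict, independent of key order."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_drift(intended, live):
    """Return sorted names that are missing on either side or whose content differs."""
    drifted = []
    for name in sorted(set(intended) | set(live)):
        if name not in intended or name not in live:
            drifted.append(name)
        elif fingerprint(intended[name]) != fingerprint(live[name]):
            drifted.append(name)
    return drifted


# Hypothetical state: someone scaled "web" by hand with a one-off script.
intended = {
    "web": {"image": "example/web:1.4.2", "replicas": 3},
    "worker": {"image": "example/worker:2.0.1", "replicas": 2},
}
live = {
    "web": {"image": "example/web:1.4.2", "replicas": 5},
    "worker": {"image": "example/worker:2.0.1", "replicas": 2},
}
print(find_drift(intended, live))
```

In practice the live objects carry server-populated fields (status, `metadata.resourceVersion`, and so on) that would need stripping before hashing, otherwise everything looks drifted.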