I was recently reading this post telling everyone (strenuously) to turn off CPU limits in #Kubernetes. I could not disagree more for most production environments.

There are a few cases where I do think it makes sense. If all of your teams:

  • Are extremely aware of the performance characteristics of all of their services.
  • Have benchmarks that will surface a CPU performance degradation before it ships.
  • Have monitoring and alerting for when their services are regularly running over their requested CPU, and have processes in place to act on that.

I doubt that most people are in that situation. Maybe in a large company where a team has 1-2 services to manage. 

The problem for 95% of everyone else is that allowing services to use "free CPU" means you don't really have any forcing function when a change hogs up a bunch of CPU. You also end up with Heisenbugs that only happen when a certain set of services happen to be co-deployed on the same nodes and a certain situation occurs.

27 years in the industry, many of them in ops, tell me that the money saved from widely over-subscribing CPUs is not worth the developer and ops time required to debug and support the fallout. And most organizations don't have the maturity for that trade-off to make sense. CPU is expensive. But dev time is much more so.

So, "for the love of God" as they say in the post, please use both requests and limits unless you can check off all those points above.
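For anyone unsure what that means in practice: it's just the resources stanza on each container spec. The numbers below are placeholders to tune per service, not recommendations:

```yaml
resources:
  requests:
    cpu: "250m"       # what the scheduler reserves for this container on the node
    memory: "256Mi"
  limits:
    cpu: "500m"       # hard ceiling; the container is throttled above this
    memory: "256Mi"   # setting memory request == limit avoids OOM surprises
```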

These arrived today from O'Reilly. 5 print samples of the new edition!
#Docker #Kubernetes

Docker Up and Running 3rd Edition print samples

One major win is that we've always maintained our own deployment config format in a centralized repo. That has made it really lightweight to build tooling that deploys to either #Mesos or #Kubernetes during the migration.

We also have service discovery that spans both clusters, with instances of the same app available from either one. We can roll back and forth between Mesos and Kubernetes without any changes to configs. That really lowers the risk of the migration.

There are about 125 services to move. Most of them will be a simple redeploy. 

We’ve been moving all of our infrastructure from the rock-solid #Mesos (with Singularity) platform that others and I built up over the years, to #Kubernetes. It’s going well and all of our stuff works.

But you sort of forget all of the operational improvements and optimizations you have made until you move to a new platform. Lots of little things to resolve: rough edges and bad ergonomics that we’ll need to address. Shout out to HubSpot’s Singularity scheduler (RIP) on Mesos for being so great to operate. It set a high bar.

We wanted to have a service that would optionally tail logs from #Kubernetes for apps we deploy and report them over UDP syslog—in an existing JSON log format that we use from #Mesos.

  • It should make the log scrape/relay decision based on Annotations on the Pods.
  • It should rate limit by *pod* and not by host/node, so that we don't overrun our log provider (e.g. when someone forgets to turn off debug logging) or prevent other apps on the same node from sending their logs.
  • It should report rate limiting to our metrics system so we can track which pods are getting limited.
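The annotation-driven opt-in looks something like this on a Pod. The annotation keys here are hypothetical, not the ones we actually use:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  annotations:
    logs.example.com/relay: "true"      # opt this pod's logs into the relay
    logs.example.com/rate-limit: "500"  # max log lines/sec for this pod
```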

There was nothing that we could find that was able to do all of that. So I spent the last two days writing it in #Golang and we're doing initial deployment of that as a DaemonSet. Seems to work nicely 🎉
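The per-pod (rather than per-node) rate limiting boils down to keeping one token bucket per pod. A minimal sketch in Go — the names and numbers are mine, not the actual service, and the real thing would also need locking and bucket expiry:

```go
package main

import "fmt"

// bucket is a minimal token bucket. Tokens are spent by allow and
// refilled externally (in a real service, by a ticker goroutine).
type bucket struct {
	tokens, capacity int
}

func (b *bucket) allow() bool {
	if b.tokens > 0 {
		b.tokens--
		return true
	}
	return false
}

func (b *bucket) refill(n int) {
	b.tokens += n
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
}

// limiter keeps one bucket per pod so a noisy pod can't starve its neighbors.
type limiter struct {
	capacity int
	buckets  map[string]*bucket
}

func newLimiter(capacity int) *limiter {
	return &limiter{capacity: capacity, buckets: make(map[string]*bucket)}
}

func (l *limiter) allow(pod string) bool {
	b, ok := l.buckets[pod]
	if !ok {
		b = &bucket{tokens: l.capacity, capacity: l.capacity}
		l.buckets[pod] = b
	}
	return b.allow()
}

// refillAll tops up every pod's bucket; call it once per interval.
func (l *limiter) refillAll(n int) {
	for _, b := range l.buckets {
		b.refill(n)
	}
}

func main() {
	l := newLimiter(2)
	// A chatty pod burns through its own budget...
	fmt.Println(l.allow("chatty"), l.allow("chatty"), l.allow("chatty")) // true true false
	// ...but a quiet pod on the same node is unaffected.
	fmt.Println(l.allow("quiet")) // true: buckets are per pod, not per node
	l.refillAll(1)                // in the real service, a ticker does this
	fmt.Println(l.allow("chatty")) // true again after the refill
}
```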

I've been adding a basic #Kubernetes API discovery mode to my long-lived Sidecar service discovery system. This will enable us at Community.com to span the Sidecar cluster across Kubernetes and #Mesos, and allow us to migrate services one at a time.


Want to host your own #Kubernetes cluster in the cloud? Here's how I set up MicroK8s and run it for €9/month. Hope it helps someone out.

#Linux #Hosting #DevOps

I will shortly be doing a write-up on my long-form technical blog https://relistan.com about getting a single-node MicroK8s #Kubernetes cluster up and running, and deploying a simple app to it with full HTTPS support. I just did this for my own edification and to get http://nameifier.apps.k.matthias.org and https://k.matthias.org up and running.