We've been using Kubernetes at Buffer for the last two years to power our migration from a monolithic architecture to microservices. Here are some high-level lessons we've learned along our journey and where we're headed next!
2. What we’ll cover
- Why we chose Kubernetes
- A timeline of our journey with Kubernetes so far
- What’s next for us
- Hopes & Dreams
- Quick advice
3. Who is this guy talking right now?
- Dan Farrelly
- Director of Engineering & Technical Lead @ Buffer
- Leading our move from monolith to microservices since 2016
- @djfarrelly
- dan@buffer.com
5. Why Kubernetes
- Moving from a monolithic application to a service-oriented architecture
- Great community & project velocity (v1.3 in July 2016)
- Enable our Self-service DevOps/NoOps vision for our remote team
- Increase Developer velocity
- Reliability
7. Baby steps - Mid 2016
- Kubernetes 1.3 cluster created via kube-up.sh
- One massive shared private repo holding all the YAML manifests, for collaboration & backup
- Everyone had to have k8s credentials and access to this YAML repo
- This was slow
- We learned a lot
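The early workflow on that slide can be sketched roughly like this (the repo and service names are hypothetical, for illustration only):

```shell
# Hypothetical sketch of the early workflow: every engineer cloned the
# shared YAML repo and applied a service's manifests by hand.
git clone git@github.com:example-org/k8s-yamls.git   # repo name is illustrative
cd k8s-yamls

# Apply every manifest for one service against the shared cluster
kubectl apply -f services/my-service/
```

With everyone applying manifests by hand from one repo, access control and coordination became the bottleneck — hence "this was slow."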
8. First real attempt - Fall 2016
- Created and shared service YAML manifests, still using the massive repo
- kubectl patch -p '{ "spec": ... }' to automate deployments
- Repo got out of date after deploys
- Built a deploy service to handle Slackbot commands & trigger a deploy pipeline on Jenkins
- Accelerated development of services on k8s → massive learning
- Started developing our in-house best practices
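The patch-based deploy step above can be sketched like this (the deployment name, container name, and image tag are hypothetical):

```shell
# Hypothetical sketch: bump a Deployment's image via a strategic-merge
# patch, roughly what the Jenkins deploy pipeline automated.
kubectl patch deployment my-service -p '{
  "spec": {
    "template": {
      "spec": {
        "containers": [
          { "name": "my-service", "image": "example-org/my-service:abc1234" }
        ]
      }
    }
  }
}'
```

Because `kubectl patch` mutates the live object without touching the source manifests, the YAML repo drifted after every deploy — exactly the "repo got out of date" problem on this slide.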
10. The Helm Era - 2017
- Helm 2 released in late 2016
- Charts in each service’s repo fixed our issues w/ apply vs. patch and the mono-repo
- Extended our deploy service
- Continuous deployment
- On-demand per branch staging deployments using Helm & Ingress
- New clusters managed via kops
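The per-branch staging deploys mentioned above can be sketched with Helm (the chart values `image.tag` and `ingress.host` are illustrative assumptions, not our actual chart interface):

```shell
# Hypothetical sketch: one Helm release per branch, so each branch gets
# its own staging environment behind its own Ingress hostname.
BRANCH=feature-login

helm upgrade --install "my-service-${BRANCH}" ./chart \
  --namespace staging \
  --set image.tag="${BRANCH}" \
  --set ingress.host="${BRANCH}.staging.example.com"
```

`helm upgrade --install` is idempotent, which is what makes continuous deployment from a bot or CI pipeline practical: the same command works for the first deploy and every subsequent one.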
12. Where we are now
- Stats
- 3 Clusters (v1.8.x main production & dev, v1.3 legacy, data jobs)
- 140-160+ services running 700-800 pods on ~30 nodes
- Focus on moving to Helm fully
- Using Heptio’s Ark for backups
- Deprecating our yaml-repo!
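A minimal sketch of what using Ark for backups looks like, assuming a working Ark install in the cluster (backup and schedule names are hypothetical):

```shell
# Hypothetical sketch using Heptio Ark's CLI: an ad-hoc cluster backup
# plus a recurring daily one.
ark backup create pre-upgrade-backup

# Daily backup, excluding kube-system, retained for 30 days
ark schedule create daily-backup \
  --schedule "0 7 * * *" \
  --exclude-namespaces kube-system \
  --ttl 720h
```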
13. Where we’re headed - Charts
- Challenges with charts living in each service’s repo:
- Fragmentation, hard to keep best practices consistent, slow to roll out changes across many services
- Creating our own company-standard, extensible charts (web service, worker, cron, etc.) w/ health-check side-cars and other best practices baked in
- Experimenting with Chartmuseum (Helm Chart Repository)
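Publishing a shared company chart to ChartMuseum and consuming it from a service repo could look roughly like this (the repo URL and chart names are illustrative; ChartMuseum accepts plain HTTP uploads of packaged charts):

```shell
# Hypothetical sketch: publish a standard "web-service" chart to an
# internal ChartMuseum instance, then install it from any service repo.
helm repo add company-charts https://charts.internal.example.com

# Package the shared chart and upload it to ChartMuseum's chart API
helm package ./web-service
curl --data-binary "@web-service-0.1.0.tgz" \
  https://charts.internal.example.com/api/charts

# A service now only needs a values.yaml, not its own chart
helm upgrade --install my-service company-charts/web-service -f values.yaml
```

Centralizing charts this way is what addresses the fragmentation problem: a best-practice change lands once in the shared chart instead of in 140+ service repos.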
14. Where we’re headed - Cluster
- Adopting Istio to secure our service-to-service traffic
- Increase adoption of request tracing w/ Jaeger
- Moving to a single production/dev cluster (est. 50-60 nodes Fall ‘18)
- Improving our strategy for disaster recovery using Heptio’s Ark
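Adopting Istio for a single existing service can be sketched with manual sidecar injection (the manifest filename is hypothetical):

```shell
# Hypothetical sketch: add Istio's sidecar proxy to a service's pods so
# its service-to-service traffic can be secured (e.g. with mutual TLS).
istioctl kube-inject -f my-service-deployment.yaml | kubectl apply -f -
```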
15. Our long term Kubernetes outlook
- Developers shouldn’t need cluster access to do their jobs
- Developers configure via well documented Helm charts & values.yaml files
- Enable initial deploys via tooling, not manual CLI commands
- Providing more visibility into services for developers:
- Traces, Failed deploy & crash loop notifications, smarter monitoring
16. Quick advice for those early in their k8s journey
- Learn vanilla k8s first... but then definitely use Helm, Istio, etc.
- Pay attention to kube-dns: monitoring & auto-scaling
- Practice cluster upgrades early and do them often
- Resource limits, resource limits, resource limits
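Two of the advice items above can be sketched as one-liners (the deployment name is hypothetical):

```shell
# Hypothetical sketch: keep an eye on kube-dns — if it's unhealthy or
# under-scaled, everything in the cluster suffers.
kubectl get pods -n kube-system -l k8s-app=kube-dns

# Set requests & limits on an existing Deployment; without limits a
# single runaway pod can starve a whole node.
kubectl set resources deployment my-service \
  --requests=cpu=100m,memory=128Mi \
  --limits=cpu=500m,memory=256Mi
```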