SlideShare a Scribd company logo
1 of 36
Running Cassandra on Apache Mesos
across multiple datacenters at Uber
Abhishek Verma (verma@uber.com)
About me
● MS (2010) and PhD (2012) in Computer Science from University of Illinois
at Urbana-Champaign
● 2 years at Google, worked on Borg and Omega and first author of the
Borg paper
● ~ 1 year at TCS Research, Mumbai
● Currently at Uber working on running Cassandra on Mesos
© DataStax, All Rights Reserved. 2
“Transportation as reliable as running water,
everywhere, for everyone”
“Transportation as reliable as running
water, everywhere, for everyone”
99.99%
“Transportation as reliable as running water,
everywhere, for everyone”
efficient
Cluster Management @ Uber
● Statically partitioned machines across different services
● Move from custom deployment system to everything running on Mesos
● Gain efficiency by increasing machine utilization
○ Co-locate services on the same machine
○ Can lead to 30% fewer machines1
● Build stateful service frameworks to run on Mesos
© DataStax, All Rights Reserved. 6
“Large-scale cluster management at Google with Borg”, EuroSys 2015
Apache Mesos
7
● Mesos abstracts CPU, memory, storage away from machines
○ program like it’s a single pool of resources
● Linear scalability
● High availability
● Native support for launching containers
● Pluggable resource isolation
● Two level scheduling
Apache Cassandra
8
● Horizontal scalability
○ Scales reads and writes linearly as new nodes are added
● High availability
○ Fault tolerant with tunable consistency levels
● Low latency, solid performance
● Operational simplicity
○ Homogeneous cluster, no SPOF
● Rich data model
Uber
● Abhishek Verma
● Karthik Gandhi
● Matthias Eichstaedt
● Varun Gupta
● Zhitao Li
DC/OS Cassandra Service
9
Mesosphere
● Chris Lambert
● Gabriel Hartmann
● Keith Chambers
● Kenneth Owens
● Mohit Soni
https://github.com/mesosphere/dcos-cassandra-service
Cassandra service architecture
10
Framework
dcos-cassandra-service
Mesos agent
Mesos master
(Leader)
Web interface
Control plane API
C*Cluster 1 C*Cluster 2
Aurora (DC1)
Mesos master
(Standby)
C*Node
1a
C*Node
2a
Mesos agent
C*Node
1b
C*Node
2b
Mesos agent
C*Node
1c
Aurora (DC2)
Deployment system
DC2
ZK ZK
ZK
ZooKeeper
quorum
Client App
uses CQL
interface
CQL CQL CQL CQL CQL
. . .
Cassandra Mesos primitives
11
● Mesos containerizer
● Override 5 ports in configuration (storage_port,
ssl_storage_port, native_transport_port, rpc_port, jmx_port)
● Use persistent volumes
○ Data stored outside of the sandbox directory
○ Offered to the same task if it crashes and restarts
● Use dynamic reservation
Custom seed provider
12
Node 1
10.0.0.1
http://scheduler/seeds
{
isSeed: true
seeds: [ ]
}
Node 1
10.0.0.1
Node 2
10.0.0.2
Node 3
10.0.0.3
Node 2
10.0.0.2
{
isSeed: true
seeds: [ 10.0.0.1]
}
{
isSeed: false
seeds: [ 10.0.0.1,
10.0.0.2]
}
Node 3
10.0.0.3
Number of Nodes = 3
Number of Seeds = 2
Cassandra Service: Features
13
● Custom seed provider
● Increasing cluster size
● Changing Cassandra configuration
● Replacing a dead node
● Backup/Restore
● Cleanup
● Repair
Plan, Phases and Blocks
14
● Plan
○ Phases
■ Reconciliation
■ Deployment
■ Backup
■ Restore
■ Cleanup
■ Repair
Spinning up a new Cassandra cluster
15
https://www.youtube.com/watch?v=gbYmjtDKSzs
Automate Cassandra operations
16
● Repair
○ Synchronize all data across replicas
■ Last write wins
○ Anti-entropy mechanism
○ Repair primary key range node-by-node
● Cleanup
○ Remove data whose ownership has changed
■ Because of addition or removal of nodes
Cleanup operation
17
https://www.youtube.com/watch?v=VxRLSl8MpYI
Failure scenarios
18
● Executor failure
○ Restarted automatically
● Cassandra daemon failure
○ Restarted automatically
● Node failure
○ Manual REST endpoint to replace node
● Scheduling framework failure
○ Existing nodes keep running, new nodes cannot be added
Experiments
19
Cluster startup
20
For each node in the cluster:
1.Receive and accept offer
2.Launch task
3.Fetch executor, JRE, Cassandra binaries from S3/HDFS
4.Launch executor
5.Launch Cassandra daemon
6.Wait for it’s mode to transition STARTING -> JOINING -> NORMAL
Cluster startup time
21
Framework can start ~ one new node per minute
Tuning JVM Garbage collection
22
Changed from CMS to G1 garbage collector
Left: https://github.com/apache/cassandra/blob/cassandra-2.2/conf/cassandra-env.sh#L213
Right: https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_tune_jvm_c.html?scroll=concept_ds_sv5_k4w_dk__tuning-java-garbage-collection
Tuning JVM Garbage collection
23
Metric CMS G1
G1 : CMS
Factor
op rate 1951 13765 7.06
latency mean (ms) 3.6 0.4 9.00
latency median (ms) 0.3 0.3 1.00
latency 95th percentile (ms) 0.6 0.4 1.50
latency 99th percentile (ms) 1 0.5 2.00
latency 99.9th percentile (ms) 11.6 0.7 16.57
latency max (ms) 13496.9 4626.9 2.92
G1 garbage collector is much better without any tuning
Using cassandra-stress, 32 threads client
Cluster Setup
24
● 3 nodes
● Local DC
● 24 cores, 128 GB RAM, 2TB SAS drives
● Cassandra running on bare metal
● Cassandra running in a Mesos container
Bare metal Mesos
Read Latency
25
Mean: 0.38 ms
P95: 0.74 ms
P99: 0.91 ms
Mean: 0.44 ms
P95: 0.76 ms
P99: 0.98 ms
Bare metal Mesos
Read Throughput
26
Bare metal Mesos
Write Latency
27
Mean: 0.43 ms
P95: 0.94 ms
P99: 1.05 ms
Mean: 0.48 ms
P95: 0.93 ms
P99: 1.26 ms
Bare metal Mesos
Write Throughput
28
Running across datacenters
29
● Four datacenters
○ Each running dcos-cassandra-service instance
○ Sync datacenter phase
■ Periodically exchange seeds with external dcs
● Cassandra nodes gossip topology
○ Discover nodes in other datacenters
Asynchronous cross-dc replication latency
30
● Write a row to dc1 using consistency level LOCAL_ONE
○ Write timestamp to a file when operation completed
● Spin in a loop to read the same row using consistency LOCAL_ONE in dc2
○ Write timestamp to a file when operation completed
● Difference between the two gives asynchronous replication latency
○ p50 : 44.69ms, p95 : 46.38ms, p99:47.44ms
● Round trip ping latency
○ 77.8ms
Cassandra on Mesos in Production
31
● ~20 clusters replicating across two datacenters (west and east coast)
● ~300 machines across two datacenters
● Largest 2 clusters: more than a million writes/sec and ~100k reads/sec
● Mean read latency: 13ms and write latency: 25ms
● Mostly use LOCAL_QUORUM consistency level
Questions?
32
verma@uber.com
Cluster startup
33
For each node in the cluster:
1.Receive and accept offer
2.Launch task
3.Fetch executor, JRE, Cassandra binaries from S3/HDFS
4.Launch executor
5.Launch Cassandra daemon
6.Wait for it’s mode to transition STARTING -> JOINING -> NORMAL
Aurora hogging offers
Aurora hogs offers
34
● Aurora designed to be the only framework running on Mesos and
controlling all the machines
● Holds on to all received offers
○ Does not accept or reject them
● Mesos waits for --offer_timeout time duration and rescinds offer
● --offer_timeout config
○ Duration of time before an offer is rescinded from a framework. This helps fairness when
running frameworks that hold on to offers, or frameworks that accidentally drop offers. If
not set, offers do not timeout.
Long term solution: dynamic reservations
35
● Dynamically reserve all the machines resources to the “cassandra”
role
● Resources are offered only to cassandra frameworks
● Improves node startup time: 30s/node
● Node failure replacement or updates are much faster
Using the Cassandra cluster
36
https://www.youtube.com/watch?v=qgqO39DteHo

More Related Content

What's hot

From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity PlanningFrom Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planningconfluent
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache KafkaChhavi Parasher
 
A visual introduction to Apache Kafka
A visual introduction to Apache KafkaA visual introduction to Apache Kafka
A visual introduction to Apache KafkaPaul Brebner
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka IntroductionAmita Mirajkar
 
Introduction to Kafka
Introduction to KafkaIntroduction to Kafka
Introduction to KafkaAkash Vacher
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Jean-Paul Azar
 
How to Lock Down Apache Kafka and Keep Your Streams Safe
How to Lock Down Apache Kafka and Keep Your Streams SafeHow to Lock Down Apache Kafka and Keep Your Streams Safe
How to Lock Down Apache Kafka and Keep Your Streams Safeconfluent
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explainedconfluent
 
Kafka as Message Broker
Kafka as Message BrokerKafka as Message Broker
Kafka as Message BrokerHaluan Irsad
 
Hardening Kafka Replication
Hardening Kafka Replication Hardening Kafka Replication
Hardening Kafka Replication confluent
 
Cassandra under the hood
Cassandra under the hoodCassandra under the hood
Cassandra under the hoodAndriy Rymar
 
Data Pipelines with Kafka Connect
Data Pipelines with Kafka ConnectData Pipelines with Kafka Connect
Data Pipelines with Kafka ConnectKaufman Ng
 
Securing Kafka
Securing Kafka Securing Kafka
Securing Kafka confluent
 
Kafka 101 and Developer Best Practices
Kafka 101 and Developer Best PracticesKafka 101 and Developer Best Practices
Kafka 101 and Developer Best Practicesconfluent
 

What's hot (20)

Kafka 101
Kafka 101Kafka 101
Kafka 101
 
Apache Kafka
Apache KafkaApache Kafka
Apache Kafka
 
Kafka/SMM Crash Course
Kafka/SMM Crash CourseKafka/SMM Crash Course
Kafka/SMM Crash Course
 
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity PlanningFrom Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
From Message to Cluster: A Realworld Introduction to Kafka Capacity Planning
 
Kafka presentation
Kafka presentationKafka presentation
Kafka presentation
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache Kafka
 
A visual introduction to Apache Kafka
A visual introduction to Apache KafkaA visual introduction to Apache Kafka
A visual introduction to Apache Kafka
 
Apache Kafka Introduction
Apache Kafka IntroductionApache Kafka Introduction
Apache Kafka Introduction
 
Introduction to Kafka
Introduction to KafkaIntroduction to Kafka
Introduction to Kafka
 
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)Kafka Tutorial - Introduction to Apache Kafka (Part 1)
Kafka Tutorial - Introduction to Apache Kafka (Part 1)
 
How to Lock Down Apache Kafka and Keep Your Streams Safe
How to Lock Down Apache Kafka and Keep Your Streams SafeHow to Lock Down Apache Kafka and Keep Your Streams Safe
How to Lock Down Apache Kafka and Keep Your Streams Safe
 
Apache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals ExplainedApache Kafka Architecture & Fundamentals Explained
Apache Kafka Architecture & Fundamentals Explained
 
kafka
kafkakafka
kafka
 
Kafka as Message Broker
Kafka as Message BrokerKafka as Message Broker
Kafka as Message Broker
 
Hardening Kafka Replication
Hardening Kafka Replication Hardening Kafka Replication
Hardening Kafka Replication
 
Cassandra under the hood
Cassandra under the hoodCassandra under the hood
Cassandra under the hood
 
Data Pipelines with Kafka Connect
Data Pipelines with Kafka ConnectData Pipelines with Kafka Connect
Data Pipelines with Kafka Connect
 
Securing Kafka
Securing Kafka Securing Kafka
Securing Kafka
 
Kafka 101 and Developer Best Practices
Kafka 101 and Developer Best PracticesKafka 101 and Developer Best Practices
Kafka 101 and Developer Best Practices
 
Apache Kafka Best Practices
Apache Kafka Best PracticesApache Kafka Best Practices
Apache Kafka Best Practices
 

Viewers also liked

Building Real-Time Applications with Android and WebSockets
Building Real-Time Applications with Android and WebSocketsBuilding Real-Time Applications with Android and WebSockets
Building Real-Time Applications with Android and WebSocketsSergi Almar i Graupera
 
Just Add Reality: Managing Logistics with the Uber Developer Platform
Just Add Reality: Managing Logistics with the Uber Developer PlatformJust Add Reality: Managing Logistics with the Uber Developer Platform
Just Add Reality: Managing Logistics with the Uber Developer PlatformApigee | Google Cloud
 
"Building Data Foundations and Analytics Tools Across The Product" by Crystal...
"Building Data Foundations and Analytics Tools Across The Product" by Crystal..."Building Data Foundations and Analytics Tools Across The Product" by Crystal...
"Building Data Foundations and Analytics Tools Across The Product" by Crystal...Tech in Asia ID
 
Taxi Startup Presentation for Taxi Company
Taxi Startup Presentation for Taxi CompanyTaxi Startup Presentation for Taxi Company
Taxi Startup Presentation for Taxi CompanyEugene Suslo
 
Open-source Infrastructure at Lyft
Open-source Infrastructure at LyftOpen-source Infrastructure at Lyft
Open-source Infrastructure at LyftDaniel Hochman
 
Uber's new mobile architecture
Uber's new mobile architectureUber's new mobile architecture
Uber's new mobile architectureDhaval Patel
 
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...Daniel Hochman
 
Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkrout
Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron SchildkroutKafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkrout
Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkroutconfluent
 
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...DataStax
 

Viewers also liked (11)

Building Real-Time Applications with Android and WebSockets
Building Real-Time Applications with Android and WebSocketsBuilding Real-Time Applications with Android and WebSockets
Building Real-Time Applications with Android and WebSockets
 
Just Add Reality: Managing Logistics with the Uber Developer Platform
Just Add Reality: Managing Logistics with the Uber Developer PlatformJust Add Reality: Managing Logistics with the Uber Developer Platform
Just Add Reality: Managing Logistics with the Uber Developer Platform
 
"Building Data Foundations and Analytics Tools Across The Product" by Crystal...
"Building Data Foundations and Analytics Tools Across The Product" by Crystal..."Building Data Foundations and Analytics Tools Across The Product" by Crystal...
"Building Data Foundations and Analytics Tools Across The Product" by Crystal...
 
Taxi Startup Presentation for Taxi Company
Taxi Startup Presentation for Taxi CompanyTaxi Startup Presentation for Taxi Company
Taxi Startup Presentation for Taxi Company
 
Open-source Infrastructure at Lyft
Open-source Infrastructure at LyftOpen-source Infrastructure at Lyft
Open-source Infrastructure at Lyft
 
Uber's new mobile architecture
Uber's new mobile architectureUber's new mobile architecture
Uber's new mobile architecture
 
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...
Geospatial Indexing at Scale: The 15 Million QPS Redis Architecture Powering ...
 
Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkrout
Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron SchildkroutKafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkrout
Kafka + Uber- The World’s Realtime Transit Infrastructure, Aaron Schildkrout
 
31 - IDNOG03 - Bergas Bimo Branarto (GOJEK) - Scaling Gojek
31 - IDNOG03 - Bergas Bimo Branarto (GOJEK) - Scaling Gojek31 - IDNOG03 - Bergas Bimo Branarto (GOJEK) - Scaling Gojek
31 - IDNOG03 - Bergas Bimo Branarto (GOJEK) - Scaling Gojek
 
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
Maintaining Consistency Across Data Centers (Randy Fradin, BlackRock) | Cassa...
 
Culture
CultureCulture
Culture
 

Similar to Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* Summit 2016

From swarm to swam-mode in the CERN container service
From swarm to swam-mode in the CERN container serviceFrom swarm to swam-mode in the CERN container service
From swarm to swam-mode in the CERN container serviceSpyros Trigazis
 
OSDC 2015: Bernd Mathiske | Why the Datacenter Needs an Operating System
OSDC 2015: Bernd Mathiske | Why the Datacenter Needs an Operating SystemOSDC 2015: Bernd Mathiske | Why the Datacenter Needs an Operating System
OSDC 2015: Bernd Mathiske | Why the Datacenter Needs an Operating SystemNETWAYS
 
Fully fault tolerant real time data pipeline with docker and mesos
Fully fault tolerant real time data pipeline with docker and mesos Fully fault tolerant real time data pipeline with docker and mesos
Fully fault tolerant real time data pipeline with docker and mesos Rahul Kumar
 
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...Joe Stein
 
Docker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platformsDocker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platformsFederico Michele Facca
 
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE PlatformsFIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE PlatformsFIWARE
 
Leverage Mesos for running Spark Streaming production jobs by Iulian Dragos a...
Leverage Mesos for running Spark Streaming production jobs by Iulian Dragos a...Leverage Mesos for running Spark Streaming production jobs by Iulian Dragos a...
Leverage Mesos for running Spark Streaming production jobs by Iulian Dragos a...Spark Summit
 
Containerization - The DevOps Revolution
Containerization - The DevOps RevolutionContainerization - The DevOps Revolution
Containerization - The DevOps RevolutionYulian Slobodyan
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...Amazon Web Services
 
Container orchestration in geo-distributed cloud computing platforms
Container orchestration in geo-distributed cloud computing platformsContainer orchestration in geo-distributed cloud computing platforms
Container orchestration in geo-distributed cloud computing platformsFogGuru MSCA Project
 
Dataservices based on mesos and kafka kostiantyn bokhan dataconf 21 04 18
Dataservices based on mesos and kafka kostiantyn bokhan dataconf 21 04 18Dataservices based on mesos and kafka kostiantyn bokhan dataconf 21 04 18
Dataservices based on mesos and kafka kostiantyn bokhan dataconf 21 04 18Olga Zinkevych
 
Mosix Cluster
Mosix ClusterMosix Cluster
Mosix ClusterAbhay Pai
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageMayaData Inc
 
SMACK Stack 1.1
SMACK Stack 1.1SMACK Stack 1.1
SMACK Stack 1.1Joe Stein
 
Jörg Schad - NO ONE PUTS Java IN THE CONTAINER - Codemotion Milan 2017
Jörg Schad - NO ONE PUTS Java IN THE CONTAINER - Codemotion Milan 2017Jörg Schad - NO ONE PUTS Java IN THE CONTAINER - Codemotion Milan 2017
Jörg Schad - NO ONE PUTS Java IN THE CONTAINER - Codemotion Milan 2017Codemotion
 
Introduction to mesos
Introduction to mesosIntroduction to mesos
Introduction to mesosOmid Vahdaty
 
MANTL Data Platform, Microservices and BigData Services
MANTL Data Platform, Microservices and BigData ServicesMANTL Data Platform, Microservices and BigData Services
MANTL Data Platform, Microservices and BigData ServicesCisco DevNet
 
Containerized Data Persistence on Mesos
Containerized Data Persistence on MesosContainerized Data Persistence on Mesos
Containerized Data Persistence on MesosJoe Stein
 
The Apache Cassandra ecosystem
The Apache Cassandra ecosystemThe Apache Cassandra ecosystem
The Apache Cassandra ecosystemAlex Thompson
 
Enhancing and Preparing TIMES for High Performance Computing
Enhancing and Preparing TIMES for High Performance ComputingEnhancing and Preparing TIMES for High Performance Computing
Enhancing and Preparing TIMES for High Performance ComputingIEA-ETSAP
 

Similar to Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* Summit 2016 (20)

From swarm to swam-mode in the CERN container service
From swarm to swam-mode in the CERN container serviceFrom swarm to swam-mode in the CERN container service
From swarm to swam-mode in the CERN container service
 
OSDC 2015: Bernd Mathiske | Why the Datacenter Needs an Operating System
OSDC 2015: Bernd Mathiske | Why the Datacenter Needs an Operating SystemOSDC 2015: Bernd Mathiske | Why the Datacenter Needs an Operating System
OSDC 2015: Bernd Mathiske | Why the Datacenter Needs an Operating System
 
Fully fault tolerant real time data pipeline with docker and mesos
Fully fault tolerant real time data pipeline with docker and mesos Fully fault tolerant real time data pipeline with docker and mesos
Fully fault tolerant real time data pipeline with docker and mesos
 
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
Making Distributed Data Persistent Services Elastic (Without Losing All Your ...
 
Docker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platformsDocker Swarm secrets for creating great FIWARE platforms
Docker Swarm secrets for creating great FIWARE platforms
 
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE PlatformsFIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
FIWARE Tech Summit - Docker Swarm Secrets for Creating Great FIWARE Platforms
 
Leverage Mesos for running Spark Streaming production jobs by Iulian Dragos a...
Leverage Mesos for running Spark Streaming production jobs by Iulian Dragos a...Leverage Mesos for running Spark Streaming production jobs by Iulian Dragos a...
Leverage Mesos for running Spark Streaming production jobs by Iulian Dragos a...
 
Containerization - The DevOps Revolution
Containerization - The DevOps RevolutionContainerization - The DevOps Revolution
Containerization - The DevOps Revolution
 
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
SRV402 Deep Dive on Amazon EC2 Instances, Featuring Performance Optimization ...
 
Container orchestration in geo-distributed cloud computing platforms
Container orchestration in geo-distributed cloud computing platformsContainer orchestration in geo-distributed cloud computing platforms
Container orchestration in geo-distributed cloud computing platforms
 
Dataservices based on mesos and kafka kostiantyn bokhan dataconf 21 04 18
Dataservices based on mesos and kafka kostiantyn bokhan dataconf 21 04 18Dataservices based on mesos and kafka kostiantyn bokhan dataconf 21 04 18
Dataservices based on mesos and kafka kostiantyn bokhan dataconf 21 04 18
 
Mosix Cluster
Mosix ClusterMosix Cluster
Mosix Cluster
 
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storageWebinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
Webinar: OpenEBS - Still Free and now FASTEST Kubernetes storage
 
SMACK Stack 1.1
SMACK Stack 1.1SMACK Stack 1.1
SMACK Stack 1.1
 
Jörg Schad - NO ONE PUTS Java IN THE CONTAINER - Codemotion Milan 2017
Jörg Schad - NO ONE PUTS Java IN THE CONTAINER - Codemotion Milan 2017Jörg Schad - NO ONE PUTS Java IN THE CONTAINER - Codemotion Milan 2017
Jörg Schad - NO ONE PUTS Java IN THE CONTAINER - Codemotion Milan 2017
 
Introduction to mesos
Introduction to mesosIntroduction to mesos
Introduction to mesos
 
MANTL Data Platform, Microservices and BigData Services
MANTL Data Platform, Microservices and BigData ServicesMANTL Data Platform, Microservices and BigData Services
MANTL Data Platform, Microservices and BigData Services
 
Containerized Data Persistence on Mesos
Containerized Data Persistence on MesosContainerized Data Persistence on Mesos
Containerized Data Persistence on Mesos
 
The Apache Cassandra ecosystem
The Apache Cassandra ecosystemThe Apache Cassandra ecosystem
The Apache Cassandra ecosystem
 
Enhancing and Preparing TIMES for High Performance Computing
Enhancing and Preparing TIMES for High Performance ComputingEnhancing and Preparing TIMES for High Performance Computing
Enhancing and Preparing TIMES for High Performance Computing
 

More from DataStax

Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?DataStax
 
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...DataStax
 
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid EnvironmentsRunning DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid EnvironmentsDataStax
 
Best Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise GraphBest Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise GraphDataStax
 
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyWebinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyDataStax
 
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...DataStax
 
Webinar | Better Together: Apache Cassandra and Apache Kafka
Webinar  |  Better Together: Apache Cassandra and Apache KafkaWebinar  |  Better Together: Apache Cassandra and Apache Kafka
Webinar | Better Together: Apache Cassandra and Apache KafkaDataStax
 
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseTop 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseDataStax
 
Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0DataStax
 
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...DataStax
 
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud RealitiesWebinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud RealitiesDataStax
 
Designing a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDesigning a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDataStax
 
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid CloudHow to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid CloudDataStax
 
How to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerceHow to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerceDataStax
 
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...DataStax
 
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...DataStax
 
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...DataStax
 
Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)DataStax
 
An Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsAn Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsDataStax
 
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design ThinkingBecoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design ThinkingDataStax
 

More from DataStax (20)

Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?Is Your Enterprise Ready to Shine This Holiday Season?
Is Your Enterprise Ready to Shine This Holiday Season?
 
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
Designing Fault-Tolerant Applications with DataStax Enterprise and Apache Cas...
 
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid EnvironmentsRunning DataStax Enterprise in VMware Cloud and Hybrid Environments
Running DataStax Enterprise in VMware Cloud and Hybrid Environments
 
Best Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise GraphBest Practices for Getting to Production with DataStax Enterprise Graph
Best Practices for Getting to Production with DataStax Enterprise Graph
 
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step JourneyWebinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
Webinar | Data Management for Hybrid and Multi-Cloud: A Four-Step Journey
 
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...Webinar  |  How to Understand Apache Cassandra™ Performance Through Read/Writ...
Webinar | How to Understand Apache Cassandra™ Performance Through Read/Writ...
 
Webinar | Better Together: Apache Cassandra and Apache Kafka
Webinar  |  Better Together: Apache Cassandra and Apache KafkaWebinar  |  Better Together: Apache Cassandra and Apache Kafka
Webinar | Better Together: Apache Cassandra and Apache Kafka
 
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseTop 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
 
Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0Introduction to Apache Cassandra™ + What’s New in 4.0
Introduction to Apache Cassandra™ + What’s New in 4.0
 
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
Webinar: How Active Everywhere Database Architecture Accelerates Hybrid Cloud...
 
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud RealitiesWebinar  |  Aligning GDPR Requirements with Today's Hybrid Cloud Realities
Webinar | Aligning GDPR Requirements with Today's Hybrid Cloud Realities
 
Designing a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for DummiesDesigning a Distributed Cloud Database for Dummies
Designing a Distributed Cloud Database for Dummies
 
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid CloudHow to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
How to Power Innovation with Geo-Distributed Data Management in Hybrid Cloud
 
How to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerceHow to Evaluate Cloud Databases for eCommerce
How to Evaluate Cloud Databases for eCommerce
 
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
Webinar: DataStax Enterprise 6: 10 Ways to Multiply the Power of Apache Cassa...
 
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
Webinar: DataStax and Microsoft Azure: Empowering the Right-Now Enterprise wi...
 
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
Webinar - Real-Time Customer Experience for the Right-Now Enterprise featurin...
 
Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)Datastax - The Architect's guide to customer experience (CX)
Datastax - The Architect's guide to customer experience (CX)
 
An Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking ApplicationsAn Operational Data Layer is Critical for Transformative Banking Applications
An Operational Data Layer is Critical for Transformative Banking Applications
 
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design ThinkingBecoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
Becoming a Customer-Centric Enterprise Via Real-Time Data and Design Thinking
 

Recently uploaded

Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...OnePlan Solutions
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingShane Coughlan
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprisepreethippts
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfDrew Moseley
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringHironori Washizaki
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecturerahul_net
 
Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024Anthony Dahanne
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsSafe Software
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogueitservices996
 
Osi security architecture in network.pptx
Osi security architecture in network.pptxOsi security architecture in network.pptx
Osi security architecture in network.pptxVinzoCenzo
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf31events.com
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorTier1 app
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxRTS corp
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfkalichargn70th171
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsJean Silva
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesVictoriaMetrics
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shardsChristopher Curtin
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonApplitools
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 

Recently uploaded (20)

Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full RecordingOpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
OpenChain Education Work Group Monthly Meeting - 2024-04-10 - Full Recording
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
Comparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdfComparing Linux OS Image Update Models - EOSS 2024.pdf
Comparing Linux OS Image Update Models - EOSS 2024.pdf
 
Machine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their EngineeringMachine Learning Software Engineering Patterns and Their Engineering
Machine Learning Software Engineering Patterns and Their Engineering
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 
Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024
 
Powering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data StreamsPowering Real-Time Decisions with Continuous Data Streams
Powering Real-Time Decisions with Continuous Data Streams
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogue
 
Osi security architecture in network.pptx
Osi security architecture in network.pptxOsi security architecture in network.pptx
Osi security architecture in network.pptx
 
Sending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdfSending Calendar Invites on SES and Calendarsnack.pdf
Sending Calendar Invites on SES and Calendarsnack.pdf
 
Effectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryErrorEffectively Troubleshoot 9 Types of OutOfMemoryError
Effectively Troubleshoot 9 Types of OutOfMemoryError
 
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptxReal-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
Real-time Tracking and Monitoring with Cargo Cloud Solutions.pptx
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
 
Strategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero resultsStrategies for using alternative queries to mitigate zero results
Strategies for using alternative queries to mitigate zero results
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 Updates
 
2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards2024 DevNexus Patterns for Resiliency: Shuffle shards
2024 DevNexus Patterns for Resiliency: Shuffle shards
 
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + KobitonLeveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
Leveraging AI for Mobile App Testing on Real Devices | Applitools + Kobiton
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 

Cassandra on Mesos Across Multiple Datacenters at Uber (Abhishek Verma) | C* Summit 2016

  • 1. Running Cassandra on Apache Mesos across multiple datacenters at Uber Abhishek Verma (verma@uber.com)
  • 2. About me ● MS (2010) and PhD (2012) in Computer Science from University of Illinois at Urbana-Champaign ● 2 years at Google, worked on Borg and Omega and first author of the Borg paper ● ~ 1 year at TCS Research, Mumbai ● Currently at Uber working on running Cassandra on Mesos © DataStax, All Rights Reserved. 2
  • 3. “Transportation as reliable as running water, everywhere, for everyone”
  • 4. “Transportation as reliable as running water, everywhere, for everyone” 99.99%
  • 5. “Transportation as reliable as running water, everywhere, for everyone” efficient
  • 6. Cluster Management @ Uber ● Statically partitioned machines across different services ● Move from custom deployment system to everything running on Mesos ● Gain efficiency by increasing machine utilization ○ Co-locate services on the same machine ○ Can lead to 30% fewer machines1 ● Build stateful service frameworks to run on Mesos © DataStax, All Rights Reserved. 6 “Large-scale cluster management at Google with Borg”, EuroSys 2015
  • 7. Apache Mesos 7 ● Mesos abstracts CPU, memory, storage away from machines ○ program like it’s a single pool of resources ● Linear scalability ● High availability ● Native support for launching containers ● Pluggable resource isolation ● Two level scheduling
  • 8. Apache Cassandra 8 ● Horizontal scalability ○ Scales reads and writes linearly as new nodes are added ● High availability ○ Fault tolerant with tunable consistency levels ● Low latency, solid performance ● Operational simplicity ○ Homogeneous cluster, no SPOF ● Rich data model
  • 9. Uber ● Abhishek Verma ● Karthik Gandhi ● Matthias Eichstaedt ● Varun Gupta ● Zhitao Li DC/OS Cassandra Service 9 Mesosphere ● Chris Lambert ● Gabriel Hartmann ● Keith Chambers ● Kenneth Owens ● Mohit Soni https://github.com/mesosphere/dcos-cassandra-service
  • 10. Cassandra service architecture 10 Framework dcos-cassandra-service Mesos agent Mesos master (Leader) Web interface Control plane API C*Cluster 1 C*Cluster 2 Aurora (DC1) Mesos master (Standby) C*Node 1a C*Node 2a Mesos agent C*Node 1b C*Node 2b Mesos agent C*Node 1c Aurora (DC2) Deployment system DC2 ZK ZK ZK ZooKeeper quorum Client App uses CQL interface CQL CQL CQL CQL CQL . . .
  • 11. Cassandra Mesos primitives 11 ● Mesos containerizer ● Override 5 ports in configuration (storage_port, ssl_storage_port, native_transport_port, rpc_port, jmx_port) ● Use persistent volumes ○ Data stored outside of the sandbox directory ○ Offered to the same task if it crashes and restarts ● Use dynamic reservation
  • 12. Custom seed provider 12 Node 1 10.0.0.1 http://scheduler/seeds { isSeed: true seeds: [ ] } Node 1 10.0.0.1 Node 2 10.0.0.2 Node 3 10.0.0.3 Node 2 10.0.0.2 { isSeed: true seeds: [ 10.0.0.1] } { isSeed: false seeds: [ 10.0.0.1, 10.0.0.2] } Node 3 10.0.0.3 Number of Nodes = 3 Number of Seeds = 2
  • 13. Cassandra Service: Features 13 ● Custom seed provider ● Increasing cluster size ● Changing Cassandra configuration ● Replacing a dead node ● Backup/Restore ● Cleanup ● Repair
  • 14. Plan, Phases and Blocks 14 ● Plan ○ Phases ■ Reconciliation ■ Deployment ■ Backup ■ Restore ■ Cleanup ■ Repair
  • 15. Spinning up a new Cassandra cluster 15 https://www.youtube.com/watch?v=gbYmjtDKSzs
  • 16. Automate Cassandra operations 16 ● Repair ○ Synchronize all data across replicas ■ Last write wins ○ Anti-entropy mechanism ○ Repair primary key range node-by-node ● Cleanup ○ Remove data whose ownership has changed ■ Because of addition or removal of nodes
  • 18. Failure scenarios 18 ● Executor failure ○ Restarted automatically ● Cassandra daemon failure ○ Restarted automatically ● Node failure ○ Manual REST endpoint to replace node ● Scheduling framework failure ○ Existing nodes keep running, new nodes cannot be added
  • 20. Cluster startup 20 For each node in the cluster: 1.Receive and accept offer 2.Launch task 3.Fetch executor, JRE, Cassandra binaries from S3/HDFS 4.Launch executor 5.Launch Cassandra daemon 6.Wait for it’s mode to transition STARTING -> JOINING -> NORMAL
  • 21. Cluster startup time 21 Framework can start ~ one new node per minute
  • 22. Tuning JVM Garbage collection 22 Changed from CMS to G1 garbage collector Left: https://github.com/apache/cassandra/blob/cassandra-2.2/conf/cassandra-env.sh#L213 Right: https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_tune_jvm_c.html?scroll=concept_ds_sv5_k4w_dk__tuning-java-garbage-collection
  • 23. Tuning JVM Garbage collection 23 Metric CMS G1 G1 : CMS Factor op rate 1951 13765 7.06 latency mean (ms) 3.6 0.4 9.00 latency median (ms) 0.3 0.3 1.00 latency 95th percentile (ms) 0.6 0.4 1.50 latency 99th percentile (ms) 1 0.5 2.00 latency 99.9th percentile (ms) 11.6 0.7 16.57 latency max (ms) 13496.9 4626.9 2.92 G1 garbage collector is much better without any tuning Using cassandra-stress, 32 threads client
  • 24. Cluster Setup 24 ● 3 nodes ● Local DC ● 24 cores, 128 GB RAM, 2TB SAS drives ● Cassandra running on bare metal ● Cassandra running in a Mesos container
  • 25. Bare metal Mesos Read Latency 25 Mean: 0.38 ms P95: 0.74 ms P99: 0.91 ms Mean: 0.44 ms P95: 0.76 ms P99: 0.98 ms
  • 26. Bare metal Mesos Read Throughput 26
  • 27. Bare metal Mesos Write Latency 27 Mean: 0.43 ms P95: 0.94 ms P99: 1.05 ms Mean: 0.48 ms P95: 0.93 ms P99: 1.26 ms
  • 28. Bare metal Mesos Write Throughput 28
  • 29. Running across datacenters 29 ● Four datacenters ○ Each running dcos-cassandra-service instance ○ Sync datacenter phase ■ Periodically exchange seeds with external dcs ● Cassandra nodes gossip topology ○ Discover nodes in other datacenters
  • 30. Asynchronous cross-dc replication latency 30 ● Write a row to dc1 using consistency level LOCAL_ONE ○ Write timestamp to a file when operation completed ● Spin in a loop to read the same row using consistency LOCAL_ONE in dc2 ○ Write timestamp to a file when operation completed ● Difference between the two gives asynchronous replication latency ○ p50 : 44.69ms, p95 : 46.38ms, p99:47.44ms ● Round trip ping latency ○ 77.8ms
  • 31. Cassandra on Mesos in Production 31 ● ~20 clusters replicating across two datacenters (west and east coast) ● ~300 machines across two datacenters ● Largest 2 clusters: more than a million writes/sec and ~100k reads/sec ● Mean read latency: 13ms and write latency: 25ms ● Mostly use LOCAL_QUORUM consistency level
  • 33. Cluster startup 33 For each node in the cluster: 1.Receive and accept offer 2.Launch task 3.Fetch executor, JRE, Cassandra binaries from S3/HDFS 4.Launch executor 5.Launch Cassandra daemon 6.Wait for it’s mode to transition STARTING -> JOINING -> NORMAL Aurora hogging offers
  • 34. Aurora hogs offers 34 ● Aurora designed to be the only framework running on Mesos and controlling all the machines ● Holds on to all received offers ○ Does not accept or reject them ● Mesos waits for --offer_timeout time duration and rescinds offer ● --offer_timeout config ○ Duration of time before an offer is rescinded from a framework. This helps fairness when running frameworks that hold on to offers, or frameworks that accidentally drop offers. If not set, offers do not timeout.
  • 35. Long term solution: dynamic reservations 35 ● Dynamically reserve all the machines resources to the “cassandra” role ● Resources are offered only to cassandra frameworks ● Improves node startup time: 30s/node ● Node failure replacement or updates are much faster
  • 36. Using the Cassandra cluster 36 https://www.youtube.com/watch?v=qgqO39DteHo