Scaling 40x on the ObjectRocket MongoDB Platform
Jon Hyman & Kenny Gorman
MongoDB World, June 25, 2014
NYC
@appboy @objectrocket @jon_hyman @kennygorman
A LITTLE BIT ABOUT JON & APPBOY
Jon Hyman
CIO :: @jon_hyman
Appboy is a marketing automation platform for apps
Harvard
Bridgewater
A LITTLE BIT ABOUT KENNY & OBJECTROCKET
Kenny Gorman
Co-Founder & Chief Architect :: @kennygorman
ObjectRocket is a highly available, sharded, unbelievably fast MongoDB as a service
ObjectRocket
eBay
Shutterfly
Agenda
• Evolution of Appboy’s MongoDB installation as we grew to handle billions of data points per month
• Operational MongoDB issues we worked through
MongoDB Evolution: March, 2013
[Timeline, March 2013 to March 2014]
What did Appboy look like in March, 2013?
• ~2.5 million events per day tracking 8 million users
• Event storage: every data point as a new document
• Single, unsharded replica set on AWS (m2.xlarge)
• Mostly long-tail customers; biggest app had 2M users
Growing a lot on disk. :-(
Started running into locking issues (30-40%). :-(
MongoDB Evolution: April, 2013
[Timeline, March 2013 to March 2014. Milestones so far: scaled vertically]
What happened in April, 2013?
• First enterprise client signs
• More than 50 million users
• They estimated sending us over 1 billion data points per month
“Btw, we’re going live next month”
MongoDB Evolution:
April, 2013: holy crap!
ObjectRocket: Getting Started
• The landscape of a simple configuration
• It’s all about choosing shard keys
• Locks - you know you love them
What are we going to do?
• Contain growth from data points:
  • Shifted to Amazon Redshift for “raw data”
  • Moved MongoDB to storing pre-aggregated analytics for time series data
• Figure out sharding ASAP
  • Moved to ObjectRocket, worked on shard key selection
• Sharding was hard:
  • Tough to figure out the right shard key, make tradeoffs
  • Rewrite a lot of application code to include shard keys in queries, inserts, adjust to life without unique indexes
Shard key selections
• Users
  • Had multiple ways to identify a user
    • Device identifier, “external user id”, BSON ID
  • Often performed large scans of user bases
{_id: "hashed"}
• Cache secondary identifiers to BSON ID to reduce scatter-gather queries
• Doing scatter-gathers goes against conventional wisdom
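The caching idea above can be sketched as follows. This is a minimal in-memory illustration, not Appboy's implementation: the `ShardedUsers` class, the `shard_for` routing function (MD5 modulo the shard count, standing in for hashed shard key routing), and the dictionary cache are all assumptions made for the example.

```python
import hashlib


def shard_for(user_id: str, num_shards: int) -> int:
    """Stand-in for hashed-key routing: hash the _id, pick a shard."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


class ShardedUsers:
    """Toy cluster: each shard is a dict of _id -> user document."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]
        self.shards_touched = 0  # counts shard lookups, to compare strategies

    def insert(self, user: dict) -> None:
        self.shards[shard_for(user["_id"], len(self.shards))][user["_id"]] = user

    def find_by_id(self, user_id: str):
        """Targeted query: the shard key is known, so one shard is consulted."""
        self.shards_touched += 1
        return self.shards[shard_for(user_id, len(self.shards))].get(user_id)

    def find_by_device(self, device_id: str):
        """Scatter-gather: no shard key in the query, so every shard is asked."""
        found = None
        for shard in self.shards:
            self.shards_touched += 1
            for user in shard.values():
                if user["device_id"] == device_id:
                    found = user
        return found


class CachedLookup:
    """Maps a secondary identifier to the _id so later reads are targeted."""

    def __init__(self, users: ShardedUsers) -> None:
        self.users = users
        self.device_to_id = {}  # hypothetical cache; any key-value store would do

    def find_by_device(self, device_id: str):
        user_id = self.device_to_id.get(device_id)
        if user_id is not None:
            return self.users.find_by_id(user_id)
        user = self.users.find_by_device(device_id)  # cache miss: scatter-gather once
        if user is not None:
            self.device_to_id[device_id] = user["_id"]
        return user
```

With four shards, the first lookup by device identifier asks all four shards; a repeat lookup for the same device asks only the one shard that owns the cached `_id`.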
Shard key selections
• Pre-aggregated analytics
  • Always query history for a single app
  • 1 document per day per app per metric
{app_id: 1}
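A pre-aggregated layout along these lines might look like the sketch below. The field names (`app_id`, `metric`, `day`, `count`) and the in-memory `store` dictionary are assumptions for illustration; the slides only state that there is one document per day per app per metric.

```python
from datetime import datetime, timezone


def doc_key(app_id: str, metric: str, ts: datetime) -> tuple:
    """One document per day per app per metric."""
    return (app_id, metric, ts.strftime("%Y-%m-%d"))


def record_event(store: dict, app_id: str, metric: str, ts: datetime, n: int = 1) -> None:
    """Increment the daily counter instead of inserting a new document per event."""
    key = doc_key(app_id, metric, ts)
    doc = store.setdefault(
        key, {"app_id": app_id, "metric": metric, "day": key[2], "count": 0}
    )
    doc["count"] += n


def history(store: dict, app_id: str, metric: str) -> list:
    """Query the history for a single app: every matching document shares one app_id."""
    docs = [d for d in store.values() if d["app_id"] == app_id and d["metric"] == metric]
    return sorted(docs, key=lambda d: d["day"])


store = {}
for ts in (
    datetime(2013, 5, 1, 9, 0, tzinfo=timezone.utc),
    datetime(2013, 5, 1, 17, 30, tzinfo=timezone.utc),
    datetime(2013, 5, 2, 8, 0, tzinfo=timezone.utc),
):
    record_event(store, "app-1", "sessions", ts)
```

Three events collapse into two daily documents. In MongoDB itself the increment would typically be an upsert using the `$inc` update operator, so the document count grows with days, apps, and metrics rather than with raw events.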
MongoDB Evolution: May - October, 2013
[Timeline, March 2013 to March 2014. Milestones so far: scaled vertically; start sharding; everything sharded]
What did Appboy look like in May - October, 2013?
• textPlus goes live, as do other customers
• > 1 billion events per month, doing great!
• 4 shards of 100GB each on ObjectRocket
MongoDB Evolution: November, 2013
[Timeline, March 2013 to March 2014. Milestones so far: scaled vertically; start sharding; everything sharded; various customer launches]
What happened in November, 2013?
• One of the largest European soccer apps
• Soccer games crushed us: 15 million data points per hour just from this app!
• Lock percentage ran high, a single shard was pegged
• Real-time analytics processing got severely delayed, adding more servers did not help (in fact, it made things worse)
Why a single shard?
Shard key selections
• Pre-aggregated analytics
  • Always query history for a single app
  • 1 document per day per app per metric
{app_id: 1}
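To see why one shard took all the load, consider the toy model below. It assumes a ranged shard key on `app_id`, so every document for a given app lives on the same shard. The app names, the shard assignment table, and all volumes except the 15 million figure are made up for illustration.

```python
from collections import Counter

# Hypothetical placement: with a ranged key on app_id, each app maps to one shard.
APP_TO_SHARD = {"soccer-app": 0, "app-b": 1, "app-c": 2, "app-d": 3}


def writes_per_shard(events_per_app: dict) -> Counter:
    """Sum incoming writes by the shard that owns each app's documents."""
    load = Counter()
    for app_id, events in events_per_app.items():
        load[APP_TO_SHARD[app_id]] += events
    return load


# Illustrative hourly volumes; only the 15 million figure comes from the slides.
hourly = {"soccer-app": 15_000_000, "app-b": 200_000, "app-c": 150_000, "app-d": 100_000}
load = writes_per_shard(hourly)
hottest_shard, hottest_writes = load.most_common(1)[0]
share = hottest_writes / sum(load.values())
print(f"shard {hottest_shard} receives {share:.0%} of writes")
```

Adding shards would not change this picture on its own, since all documents that share one shard key value stay in the same chunk.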
ObjectRocket: Capacity, Growth
• Concurrency
• Did I mention locks?
• Cache management
• Compaction
• The shell game
• Indexing at scale
How to fix this?
• Fundamentally, all updates are going to a single document
• Can’t shard out a single document
• Asked ObjectRocket for their suggestions
Introduce write buffering
Write buffering
• Buffer writes to something that can be sharded out, then flush to MongoDB
• Need something transactional, so MongoDB was out for this
• Decided on multiple Redis instances:
  • Redis has native hash data structure with atomic hash increments, works nicely with MongoDB in this use-case
Write buffering
[Diagram labels: Incoming data; Flush to MongoDB]
Write buffering
• Wrote write buffering over a weekend to buffer writes and flush them to MongoDB every 3 seconds
Pre-aggregated analytics bottleneck was solved!
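The write-buffering approach described above can be sketched roughly as follows. Everything here is an in-memory stand-in so the example runs on its own: `BufferShard` imitates one Redis instance (its `hincrby` method mirrors the Redis HINCRBY command), the `store` dictionary imitates the MongoDB collection, and spreading increments across buffers at random is an assumed placement rule, not something the talk specifies.

```python
import random
import threading
from collections import defaultdict


class BufferShard:
    """In-memory stand-in for one Redis instance holding a hash of counters."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counters = defaultdict(int)

    def hincrby(self, field: str, amount: int) -> None:
        """Mimics Redis HINCRBY: atomically add to one field of the hash."""
        with self._lock:
            self._counters[field] += amount

    def drain(self) -> dict:
        """Atomically take the buffered counters and reset the hash."""
        with self._lock:
            taken, self._counters = dict(self._counters), defaultdict(int)
        return taken


class WriteBuffer:
    """Spreads increments over several buffers, then flushes them in bulk."""

    def __init__(self, num_buffers: int, store: dict) -> None:
        self.buffers = [BufferShard() for _ in range(num_buffers)]
        self.store = store      # stand-in for the MongoDB collection
        self.flush_writes = 0   # number of writes that reached the store

    def incr(self, doc_id: str, amount: int = 1) -> None:
        # Assumed placement rule: increments commute, so any buffer will do,
        # which lets even a single hot counter be spread across instances.
        random.choice(self.buffers).hincrby(doc_id, amount)

    def flush(self) -> None:
        """One $inc-style write per buffered document per buffer."""
        for buf in self.buffers:
            for doc_id, total in buf.drain().items():
                self.store[doc_id] = self.store.get(doc_id, 0) + total
                self.flush_writes += 1

    def run(self, stop: threading.Event, interval: float = 3.0) -> None:
        """Flush every `interval` seconds until `stop` is set, then once more."""
        while not stop.wait(interval):
            self.flush()
        self.flush()
```

A thousand calls to `incr` for the same document turn into at most one store write per buffer on the next flush. With real Redis the drain step would also need to be atomic, for example by renaming the hash key before reading it.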
MongoDB Evolution: January, 2014
[Timeline, March 2013 to March 2014. Milestones so far: scaled vertically; start sharding; everything sharded; various customer launches; bad shard key hit upper limit; added write buffering]
What did Appboy look like in January, 2014?
• > 3 billion events per month
• 4 shards of 100GB each on ObjectRocket
• Performance started to show really bad bursty behavior: sometimes the user experience would slow down to a level we thought was unacceptable for our customers
Why was performance getting worse?
• Appboy customers send millions of messages in a single campaign; most send hundreds of thousands to millions of messages each week
• Campaign times tend to cluster together across all Appboy customers: evenings, Saturday/Sunday afternoons, etc.

A lot of enormous read activity
Reads and writes and more reads start conflicting :-(
• Users visiting our dashboard during simultaneous large campaign sends would have sporadic poor performance
ObjectRocket: Splits
• Split out collections to different MongoDB clusters
[Diagram: Before / After]
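One way to picture a split is a routing table from collection name to cluster. The sketch below is illustrative only: the collection names, the `appboy` database name, and the connection strings are hypothetical and not taken from the talk.

```python
# Hypothetical cluster URIs; real ones would come from configuration.
DEFAULT_CLUSTER = "mongodb://cluster-main.example.com/appboy"
COLLECTION_CLUSTERS = {
    "users": "mongodb://cluster-users.example.com/appboy",
    "daily_stats": "mongodb://cluster-analytics.example.com/appboy",
}


def cluster_for(collection: str) -> str:
    """Return the cluster that owns a collection, falling back to the default."""
    return COLLECTION_CLUSTERS.get(collection, DEFAULT_CLUSTER)
```

Because heavy campaign reads on one collection and dashboard reads on another now resolve to different clusters, they no longer compete for the same locks.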
What did Appboy look like in February, 2014?
• Splits helped
• > 4 billion events per month
• We needed more
Isolation
ObjectRocket: Isolation
• Isolate large enterprise customers on their own MongoDB databases/clusters
• Appboy built this in March, 2014
[Diagram labels: Enterprise customer; Long-tail customer]
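Isolation can be thought of as a per-customer placement lookup. The following is a minimal sketch under assumed names: the `Placement` type, the customer identifiers, and the URIs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Where one customer's data lives."""
    cluster_uri: str
    database: str


# Long-tail customers share one placement; large customers get their own.
SHARED = Placement("mongodb://shared.example.com", "longtail")
DEDICATED = {
    "enterprise-co": Placement("mongodb://enterprise-co.example.com", "enterprise_co"),
}


def placement_for(customer_id: str) -> Placement:
    """Dedicated placement if one is configured, otherwise the shared one."""
    return DEDICATED.get(customer_id, SHARED)
```

A burst from an isolated enterprise customer then lands on its own cluster and leaves the shared long-tail cluster unaffected.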
[Timeline, March 2013 to March 2014. Milestones: scaled vertically; start sharding; everything sharded; various customer launches; bad shard key hit upper limit; added write buffering; start splitting DBs; isolation]
Summary
What’s next?
• Figure out capacity planning
• Continue down isolation path
Thanks!
jon@appboy.com
kgorman@objectrocket.com
@appboy @objectrocket @jon_hyman @kennygorman

How AI, OpenAI, and ChatGPT impact business and software.
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
Enhancing User Experience - Exploring the Latest Features of Tallyman Axis Lo...
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 

How Appboy’s Marketing Automation for Apps Platform Grew 40x on the ObjectRocket MongoDB Platform

  • 1. Scaling 40x on the ObjectRocket MongoDB Platform Jon Hyman & Kenny Gorman MongoDB World, June 25, 2014 NYC @appboy @objectrocket @jon_hyman @kennygorman
  • 2. A LITTLE BIT ABOUT JON & APPBOY Jon Hyman CIO :: @jon_hyman Appboy is a marketing automation platform for apps Harvard Bridgewater
  • 3. A LITTLE BIT ABOUT KENNY & OBJECTROCKET Kenny Gorman Co-Founder & Chief Architect :: @kennygorman ObjectRocket is a highly available, sharded, unbelievably fast MongoDB as a service ObjectRocket eBay Shutterfly
  • 4. Agenda • Evolution of Appboy’s MongoDB installation as we grew to handle billions of data points per month • Operational MongoDB issues we worked through
  • 5. MongoDB Evolution: March, 2013 [Timeline: March 2013 to March 2014]
  • 6. What did Appboy look like in March, 2013? • ~2.5 million events per day tracking 8 million users • Event storage: every data point as a new document • Single, unsharded replica set on AWS (m2.xlarge) • Mostly long-tail customers; biggest app had 2M users
  • 7. What did Appboy look like in March, 2013? • ~2.5 million events per day tracking 8 million users • Event storage: every data point as a new document • Single, unsharded replica set on AWS (m2.xlarge) • Mostly long-tail customers; biggest app had 2M users Growing a lot on disk. :-( Started running into locking issues (30-40%). :-(
  • 8. MongoDB Evolution: April, 2013 [Timeline: March 2013 to March 2014. Milestones so far: Scaled vertically]
  • 9. What happened in April, 2013? • First enterprise client signs • More than 50 million users • They estimated sending us over 1 billion data points per month
  • 10. What happened in April, 2013? • First enterprise client signs • More than 50 million users • They estimated sending us over 1 billion data points per month “Btw, we’re going live next month”
  • 12. ObjectRocket: Getting Started • The landscape of a simple configuration • It’s all about choosing shard keys • Locks - you know you love them 20% 80%
  • 13. What are we going to do? • Contain growth from data points: • Shifted to Amazon Redshift for “raw data” • Moved MongoDB to storing pre-aggregated analytics for time series data • Figure out sharding ASAP • Moved to ObjectRocket, worked on shard key selection • Sharding was hard: • Tough to figure out the right shard key, make tradeoffs • Rewrite a lot of application code to include shard keys in queries, inserts, adjust to life without unique indexes
  • 14. Shard key selections • Users • Had multiple ways to identify a user • Device identifier, “external user id”, BSON ID • Often performed large scans of user bases
  • 15. Shard key selections • Users • Had multiple ways to identify a user • Device identifier, “external user id”, BSON ID • Often performed large scans of user bases {_id: “hashed”} • Cache secondary identifiers to BSON ID to reduce scatter-gather queries • Doing scatter gathers goes against conventional wisdom
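One way to read the identifier cache on slide 15: once a secondary identifier (device identifier or external user id) has been resolved to the user's BSON ID, later lookups can query by _id and be routed to a single shard instead of being broadcast to all of them. The Python sketch below is only an illustration under stated assumptions, not Appboy's code. `UserIdCache`, `resolve`, and the `scatter_gather_lookup` callback are hypothetical names, and a plain dict stands in for whatever cache store was actually used.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical signature: (id_type, value) -> primary _id as a string, or None.
# In a real system this would be the expensive query that hits every shard.
Lookup = Callable[[str, str], Optional[str]]


class UserIdCache:
    """Remembers secondary identifier -> primary _id mappings."""

    def __init__(self, scatter_gather_lookup: Lookup) -> None:
        self._lookup = scatter_gather_lookup
        self._cache: Dict[Tuple[str, str], str] = {}

    def resolve(self, id_type: str, value: str) -> Optional[str]:
        key = (id_type, value)
        cached = self._cache.get(key)
        if cached is not None:
            # Cache hit: the caller can now query by _id (the shard key).
            return cached
        # Cache miss: fall back to the scatter-gather query once.
        primary_id = self._lookup(id_type, value)
        if primary_id is not None:
            self._cache[key] = primary_id
        return primary_id
```

Unknown identifiers are deliberately not cached in this sketch, so a user created later is still found on the next lookup.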
  • 16. Shard key selections • Pre-aggregated analytics • Always query history for a single app • 1 document per day per app per metric {app_id: 1}
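The "1 document per day per app per metric" layout on slide 16 can be pictured as a counter document that each incoming event increments. The Python sketch below builds the filter and update documents for such an upsert. The field names (`app_id`, `metric`, `date`, `count`) and the function name are assumptions for illustration; the slides do not show the actual schema. `$inc` is MongoDB's standard increment operator.

```python
from datetime import date
from typing import Any, Dict, Tuple


def build_counter_upsert(
    app_id: str, metric: str, day: date, amount: int = 1
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (filter, update) for one pre-aggregated counter document."""
    # One document per (app, metric, day). Field names are hypothetical.
    filter_doc = {"app_id": app_id, "metric": metric, "date": day.isoformat()}
    # $inc adds to the counter atomically and creates the field if missing.
    update_doc = {"$inc": {"count": amount}}
    return filter_doc, update_doc
```

With pymongo, the pair would typically be passed as `collection.update_one(filter_doc, update_doc, upsert=True)`. Note that with {app_id: 1} as the shard key, every counter document for a given app sits on the same shard.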
  • 17. MongoDB Evolution: May - October, 2013 [Timeline: March 2013 to March 2014. Milestones so far: Scaled vertically, Start sharding, Everything sharded]
  • 18. What did Appboy look like in May - October, 2013? • textPlus goes live, as do other customers • > 1 billion events per month, doing great! • 4, 100GB shards on ObjectRocket
  • 19. MongoDB Evolution: November, 2013 [Timeline: March 2013 to March 2014. Milestones so far: Scaled vertically, Start sharding, Everything sharded, Various customer launches]
  • 20. What happened in November, 2013? • One of the largest European soccer apps
  • 21. What happened in November, 2013? • One of the largest European soccer apps • Soccer games crushed us: 15 million data points per hour just from this app! • Lock percentage ran high, a single shard was pegged • Real-time analytics processing got severely delayed, adding more servers did not help (in fact, it made things worse)
  • 22. What happened in November, 2013? • One of the largest European soccer apps • Soccer games crushed us: 15 million data points per hour just from this app! • Lock percentage ran high, a single shard was pegged • Real-time analytics processing got severely delayed, adding more servers did not help (in fact, it made things worse) Why a single shard?
  • 23. Shard key selections • Pre-aggregated analytics • Always query history for a single app • 1 document per day per app per metric {app_id: 1}
  • 24. Shard key selections • Pre-aggregated analytics • Always query history for a single app • 1 document per day per app per metric {app_id: 1}
  • 25. ObjectRocket: Capacity, Growth • Concurrency • Did I mention locks? • Cache management • Compaction • The shell game • Indexing at scale
  • 26. How to fix this? • Fundamentally, all updates are going to a single document • Can’t shard out a single document • Asked ObjectRocket for their suggestions
  • 27. How to fix this? • Fundamentally, all updates are going to a single document • Can’t shard out a single document • Asked ObjectRocket for their suggestions Introduce write buffering
  • 28. Write buffering • Buffer writes to something that can be sharded out, then flush to MongoDB • Need something transactional, so MongoDB was out for this • Decided on multiple Redis instances: • Redis has native hash data structure with atomic hash increments, works nicely with MongoDB in this use-case
  • 29. Write buffering Incoming data Flush to MongoDB
  • 30. Write buffering • Wrote write buffering over a weekend to flush buffered writes to MongoDB every 3 seconds Pre-aggregated analytics bottleneck was solved!
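The write-buffering idea from slides 28-30 is to absorb many small increments in a fast store and apply them to MongoDB as one combined update per counter on each flush. The Python sketch below shows only that coalescing logic and rests on several assumptions. An in-process dict guarded by a lock stands in for the Redis hashes the slides mention (where the increment would be Redis's HINCRBY), `flush_fn` is a hypothetical callback standing in for the MongoDB update, and the 3-second scheduling is left to the caller.

```python
import threading
from collections import defaultdict
from typing import Callable, DefaultDict, Tuple

# Hypothetical callback: (document key, field, summed amount) -> None.
FlushFn = Callable[[str, str, int], None]


class WriteBuffer:
    """Coalesces counter increments between flushes."""

    def __init__(self, flush_fn: FlushFn) -> None:
        self._flush_fn = flush_fn
        self._lock = threading.Lock()
        self._counts: DefaultDict[Tuple[str, str], int] = defaultdict(int)

    def incr(self, key: str, field: str, amount: int = 1) -> None:
        # Stand-in for an atomic hash increment in the buffer store.
        with self._lock:
            self._counts[(key, field)] += amount

    def flush(self) -> int:
        """Send one combined increment per counter; return how many were sent."""
        # Swap the buffer under the lock so writers are blocked only briefly.
        with self._lock:
            pending, self._counts = self._counts, defaultdict(int)
        for (key, field), amount in pending.items():
            self._flush_fn(key, field, amount)
        return len(pending)
```

A scheduler calling `flush()` every 3 seconds would turn thousands of per-event updates to a hot document into a single update per interval. This sketch drops pending counts if `flush_fn` raises; a production version would need a retry or re-queue policy.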
  • 31. MongoDB Evolution: January, 2014 [Timeline: March 2013 to March 2014. Milestones so far: Scaled vertically, Start sharding, Everything sharded, Various customer launches, Bad shard key hit upper limit, Added write buffering]
  • 32. What did Appboy look like in January, 2014? • > 3 billion events per month • 4, 100GB shards on ObjectRocket • Performance started to have really bad bursty behavior: sometimes user experience would slow down to what we thought was unacceptable for our customers
  • 33. Why was performance getting worse? • Appboy customers send millions of messages in a single campaign, most are sending hundreds of thousands to millions of messages each week • Campaign times tend to cluster together across all Appboy customers: evenings, Saturday/Sunday afternoons, etc. A lot of enormous read activity
  • 34. Why was performance getting worse? • Appboy customers send millions of messages in a single campaign, most are sending hundreds of thousands to millions of messages each week • Campaign times tend to cluster together across all Appboy customers: evenings, Saturday/Sunday afternoons, etc. A lot of enormous read activity. Reads and writes and more reads start conflicting :-( • Users visiting our dashboard during simultaneous large campaign sends would have sporadic poor performance
  • 35. ObjectRocket: Splits • Split out collections to different MongoDB clusters [Diagram: Before / After]
  • 36. What did Appboy look like in February, 2014? • Splits helped • > 4 billion events per month • We needed more
  • 37. What did Appboy look like in February, 2014? • Splits helped • > 4 billion events per month • We needed more: Isolation
  • 38. ObjectRocket: Isolation • Isolate large enterprise customers on their own MongoDB databases/clusters • Appboy built this in March, 2014 Enterprise customer Long-tail customer
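At the application level, the isolation on slide 38 implies choosing a MongoDB cluster per customer before issuing any query. The Python sketch below is a minimal illustration of that routing decision, assuming a simple lookup table. `ClusterRouter`, its method names, and the connection strings used with it are hypothetical, and the slides do not describe how Appboy implemented this.

```python
from typing import Dict


class ClusterRouter:
    """Maps an app to its dedicated cluster, or to the shared one."""

    def __init__(self, shared_uri: str, dedicated_uris: Dict[str, str]) -> None:
        self._shared_uri = shared_uri
        # Copy so later changes to the caller's dict do not alter routing.
        self._dedicated_uris = dict(dedicated_uris)

    def is_isolated(self, app_id: str) -> bool:
        return app_id in self._dedicated_uris

    def uri_for(self, app_id: str) -> str:
        # Enterprise apps get their own cluster; long-tail apps share one.
        return self._dedicated_uris.get(app_id, self._shared_uri)
```

For example, `ClusterRouter("mongodb://shared.example.invalid", {"big-app": "mongodb://big-app.example.invalid"})` would send `big-app` traffic to its own cluster and everything else to the shared one, so one customer's campaign load cannot slow down the rest.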
  • 39. Summary [Timeline: March 2013 to March 2014. Milestones: Scaled vertically, Start sharding, Everything sharded, Various customer launches, Bad shard key hit upper limit, Added write buffering, Start splitting DBs, Isolation]
  • 40. What’s next? • Figure out capacity planning • Continue down isolation path [Chart: axis from 0 to 60,000,000]