Kannan Muthukkaruppan & Karthik Ranganathan
Jun/20/2013
How Big Data Technologies Power Facebook
Karthik Ranganathan
September, 2013
2
Introduction
Email: karthik@nutanix.com
Twitter: @KarthikR
Current: Member of Technical Staff, Nutanix
Background: Technical Engineering Lead at Facebook. Co-built
Cassandra for Facebook Inbox Search and improved the
performance and resiliency of HBase for Facebook
Messages and Search Indexing.
3
Agenda
▪ Big data at Facebook
▪ HBase use cases
• OLTP
• Analytics
▪ Operating at scale
▪ The Nutanix solution
4
Big Data at Facebook
▪ OLTP
• User databases (MySQL)
• Photos (Haystack)
• Facebook Messages, Operational Data Store (HBase)
▪ Warehouse
• Hive Analytics
• Graph Search Indexing
5
HBase in a nutshell
▪ Apache project, modeled after BigTable
▪ Distributed, large-scale data store
▪ Built on top of Hadoop DFS (HDFS)
▪ Efficient at random reads and writes
6
FB’s Largest HBase Application
Facebook Messages
7
The New Facebook Messages
8
Why HBase?
▪ Evaluated a bunch of different options
• MySQL, Cassandra, building a custom storage system for messages
▪ Horizontal scalability
▪ Automatic failover and load balancing
▪ Optimized for write-heavy workloads
▪ HDFS already battle-tested at Facebook
▪ HBase’s strong consistency model
9
Quick stats (as of Nov 2011)
▪ Traffic to HBase
• Billions of messages per day
• 75B+ RPCs per day
▪ Usage pattern
• 55% reads, 45% writes
• Average write: 16 KVs (key-values) to multiple CFs (column families)
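To put the daily figures above in per-second terms, here is a back-of-the-envelope sketch. It assumes the 55/45 read/write split applies to RPC counts and that traffic is spread evenly over the day (real peaks will sit above the average). The function and variable names are illustrative and do not come from the slides.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_per_second(ops_per_day: float) -> float:
    """Convert a daily operation count into an average per-second rate."""
    return ops_per_day / SECONDS_PER_DAY


rpcs_per_day = 75e9  # lower bound quoted on the slide ("75B+")
avg_rps = average_rate_per_second(rpcs_per_day)

# Assumption: the 55% / 45% split is measured in RPCs, not bytes.
read_rps = avg_rps * 0.55
write_rps = avg_rps * 0.45

print(f"average RPCs/sec:   {avg_rps:,.0f}")
print(f"average reads/sec:  {read_rps:,.0f}")
print(f"average writes/sec: {write_rps:,.0f}")
```

Even as a flat average, 75B+ RPCs per day works out to several hundred thousand RPCs every second.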
10
Data Sizes
▪ 7PB+ online data
• ~21PB with replication
• LZO compressed
• Excludes backups
▪ Growth rate
• 500TB+ per month
• ~20PB of raw disk per year!
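The relationship between these numbers can be checked with a short calculation. This is a sketch under two assumptions that the slide implies but does not state: a replication factor of 3 (7PB online versus ~21PB replicated) and a monthly growth figure quoted before replication. Decimal units (1PB = 1000TB) are also assumed.

```python
TB_PER_PB = 1000.0  # assumption: decimal units

online_pb = 7.0       # "7PB+ online data"
replicated_pb = 21.0  # "~21PB with replication"
replication_factor = replicated_pb / online_pb

growth_tb_per_month = 500.0  # "500TB+ per month", assumed pre-replication
logical_growth_pb_per_year = growth_tb_per_month * 12 / TB_PER_PB
raw_growth_pb_per_year = logical_growth_pb_per_year * replication_factor

print(f"implied replication factor: {replication_factor:.0f}x")
print(f"logical growth per year:    {logical_growth_pb_per_year:.0f} PB")
print(f"raw disk growth per year:   {raw_growth_pb_per_year:.0f} PB")
```

Using the lower bounds gives 18PB of raw disk per year, which is consistent with the slide's "~20PB" once the "+" on the monthly figure is taken into account.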
11
Growing with size
▪ Constant need for new features as the system grows
▪ Read and write path improvements
• Performance optimizations
• IOPS reduction
• New database file format
▪ Intelligent data and compute placement
• Shard-level block placement
• Locality-based load balancing
12
Other OLTP use cases of HBase
▪ Operational Data Store
▪ Multi-tenant key-value store
▪ Site integrity – fighting spam
13
Warehouse use cases of HBase
▪ Graph Search Indexing
• Complex application logic
• Multiple verticals
▪ Hive over HBase
• Real-time data ingest
• Enables real-time analytics
14
Real-time monitoring and anomaly detection
Operational Data Store
15
ODS: Facebook’s #1 Debugging Tool
▪ Collects metrics from production servers
▪ Supports complex aggregations and transformations
▪ Really well-designed UI
16
Quick stats
▪ Traffic to HBase
• 150B+ ops per day
▪ Usage pattern
• Heavy reads of recent data
• Frequent MR jobs for rollups
• TTL to expire older data
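The usage pattern above (reads of recent data, periodic rollups, TTL-based expiry) can be illustrated with a toy in-memory store. This is only a sketch: `MetricStore` and its methods are hypothetical names, and the code does not use HBase or MapReduce. In HBase itself, expiry is configured as a TTL on a column family rather than run by application code.

```python
from collections import defaultdict


class MetricStore:
    """Toy in-memory time-series store; not the ODS or HBase API."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self.points = defaultdict(list)  # metric name -> [(timestamp, value)]

    def write(self, metric, timestamp, value):
        self.points[metric].append((timestamp, value))

    def expire(self, now):
        """Drop every point older than the TTL."""
        cutoff = now - self.ttl_seconds
        for metric in list(self.points):
            self.points[metric] = [
                (t, v) for t, v in self.points[metric] if t >= cutoff
            ]

    def read_recent(self, metric, now, window_seconds):
        """Return the points that fall inside the most recent window."""
        start = now - window_seconds
        return [(t, v) for t, v in self.points[metric] if start <= t <= now]

    def rollup(self, metric, bucket_seconds):
        """Average raw points into fixed-width time buckets."""
        buckets = defaultdict(list)
        for t, v in self.points[metric]:
            buckets[t - t % bucket_seconds].append(v)
        return {start: sum(vs) / len(vs) for start, vs in sorted(buckets.items())}


store = MetricStore(ttl_seconds=3600)
for ts, value in [(0, 10.0), (30, 20.0), (60, 30.0), (4000, 40.0)]:
    store.write("cpu", ts, value)

rollups = store.rollup("cpu", bucket_seconds=60)  # coarse view for older data
store.expire(now=4000)                            # points older than 1h are dropped
recent = store.read_recent("cpu", now=4000, window_seconds=300)
```

Rolling up before expiring keeps a coarse history while the raw, high-resolution points age out.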
17
Real-time Analytics
Facebook Insights
18
Real-time URL/Domain Insights
▪ Deep analytics for websites
• Facebook widgets
▪ Massive scale
• Billions of URLs
• Millions of increments/sec
19
Detailed Insights
▪ Tracks many metrics
• Clicks, likes, shares, impressions
• Referral traffic
▪ Detailed breakdown
• Age buckets, gender, location
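One way to picture per-URL metrics with demographic breakdowns is a set of counters keyed by (URL, metric, dimension). The sketch below is an assumption-laden illustration: `InsightCounters`, the key layout, and the idea of buffering increments in memory before writing them out are all hypothetical and are not described on the slides.

```python
from collections import Counter


class InsightCounters:
    """Toy counter store keyed by (url, metric, dimension)."""

    def __init__(self):
        self.pending = Counter()  # increments buffered in memory
        self.stored = Counter()   # stand-in for the backing store

    def increment(self, url, metric, dimension="all", amount=1):
        self.pending[(url, metric, dimension)] += amount

    def flush(self):
        """Apply buffered increments; return how many store writes were needed."""
        writes = len(self.pending)
        self.stored.update(self.pending)
        self.pending.clear()
        return writes

    def total(self, url, metric, dimension="all"):
        return self.stored[(url, metric, dimension)]


counters = InsightCounters()
events = [
    ("example.com/a", "clicks", "age:18-24"),
    ("example.com/a", "clicks", "age:25-34"),
    ("example.com/a", "likes", "age:18-24"),
    ("example.com/a", "clicks", "age:18-24"),
]
for url, metric, dimension in events:
    counters.increment(url, metric)             # overall count
    counters.increment(url, metric, dimension)  # per-bucket breakdown

writes = counters.flush()
```

Here eight increments collapse into five writes, which hints at why coalescing matters when the incoming rate is in the millions per second.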
20
Controlled Multi-tenancy
Generic KeyValue Store
21
A Multi-tenant solution on HBase
▪ Generic Key-Value store
• Multiple apps on the same cluster
• Transparent schema design
• Simple API
put(appid, key, value)
value = get(appid, key)
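A minimal sketch of the `put`/`get` API above, with a plain dictionary standing in for the shared table. The composite row-key layout (a length-prefixed app id followed by the key) is an assumption made for illustration; the slides do not say how app ids and keys are actually encoded.

```python
class MultiTenantKV:
    """Toy multi-tenant key-value store backed by a single dict."""

    def __init__(self):
        self._rows = {}  # composite row key -> value (stand-in for one shared table)

    @staticmethod
    def _row_key(appid, key):
        # Length-prefix the app id so ("ab", "c") and ("a", "bc") never collide.
        return f"{len(appid)}:{appid}:{key}"

    def put(self, appid, key, value):
        self._rows[self._row_key(appid, key)] = value

    def get(self, appid, key):
        return self._rows.get(self._row_key(appid, key))


kv = MultiTenantKV()
kv.put("chat", "user:1", "hello")
kv.put("ads", "user:1", "campaign-7")

chat_value = kv.get("chat", "user:1")
ads_value = kv.get("ads", "user:1")
missing = kv.get("chat", "user:2")
```

Because the app id is part of every row key, two apps can use the same key without seeing each other's data.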
22
Architecture
[Diagram: writes use put(appid, key, value) and reads use get(appid, key); the components shown are HBase and Memcache]
23
Multi-tenancy Issues
▪ Not a self-service model
• Each app is reviewed
▪ Global and per-app metrics
• Monitor RPCs by type, latencies, errors
• Friendly names for apps
▪ If things went wrong
• Per-app kill switch
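The controls listed above (reviewed apps only, per-app metrics under friendly names, a per-app kill switch) could be wrapped around every request roughly as follows. `TenantGuard` and `AppDisabledError` are hypothetical names, and the metrics kept here are deliberately minimal.

```python
import time
from collections import defaultdict


class AppDisabledError(Exception):
    """Raised when a request arrives for an app whose kill switch is on."""


class TenantGuard:
    def __init__(self, friendly_names):
        self.friendly_names = dict(friendly_names)  # appid -> name; reviewed apps only
        self.disabled = set()
        self.metrics = defaultdict(
            lambda: {"calls": 0, "errors": 0, "latency_ms": 0.0}
        )

    def kill(self, appid):
        self.disabled.add(appid)

    def revive(self, appid):
        self.disabled.discard(appid)

    def call(self, appid, rpc_type, func, *args):
        if appid not in self.friendly_names:
            raise KeyError(f"app {appid!r} has not been reviewed")
        if appid in self.disabled:
            raise AppDisabledError(self.friendly_names[appid])
        stats = self.metrics[(self.friendly_names[appid], rpc_type)]
        stats["calls"] += 1
        start = time.perf_counter()
        try:
            return func(*args)
        except Exception:
            stats["errors"] += 1
            raise
        finally:
            stats["latency_ms"] += (time.perf_counter() - start) * 1000.0


guard = TenantGuard({"app1": "Chat"})
result = guard.call("app1", "get", str.upper, "x")

guard.kill("app1")
try:
    guard.call("app1", "get", str.upper, "x")
    blocked = False
except AppDisabledError:
    blocked = True
```

Checking the kill switch before touching the store means a misbehaving app can be cut off without affecting its neighbors on the cluster.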
24
Powering FB’s Semantic Search Engine
Graph Search Indexing
25
Framework to build search indexes
▪ Multiple, independent input sources
▪ HBase stores document info
▪ Output is the search index image
rowKey = document id
value = terms, document data
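Given rows shaped like the schema above (document id as the row key, terms plus document data as the value), building an index amounts to inverting the mapping from documents to terms. The sketch below uses a plain dictionary in place of HBase and returns an in-memory structure instead of index image files; `build_index` and the sample rows are illustrative only.

```python
from collections import defaultdict


def build_index(document_rows):
    """Invert {doc_id: {"terms": [...], "data": ...}} into {term: [doc_ids]}."""
    postings = defaultdict(set)
    for doc_id, row in document_rows.items():
        for term in row["terms"]:
            postings[term].add(doc_id)
    # Sort terms and posting lists so the output is deterministic.
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


# Stand-in for rows read from the document store.
document_rows = {
    "doc1": {"terms": ["hbase", "facebook"], "data": {"title": "Messages on HBase"}},
    "doc2": {"terms": ["facebook", "search"], "data": {"title": "Graph Search"}},
}

index = build_index(document_rows)
```

Keeping the documents in a store keyed by document id lets several independent sources update the same row before the index is rebuilt.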
26
Architecture
[Diagram: document sources 1, 2, …; HBase cluster; MR cluster; image files]
27
Dos and Don’ts From Experience
Operating at Scale
28
Design for failures(!)
▪ Architect for failures and manageability
▪ No single point of failure
• Killing any process is legit
▪ Minimize manual intervention
• Especially for frequent failures
▪ Uptime is important
• Rolling upgrades are the norm
• Need to survive rack failures
29
Dashboard and Metrics
▪ Single place to graph/report everything
▪ RPC calls
▪ SLA misses
• Latencies, p99, errors
• Per-request profiling
▪ Cluster and node health
▪ Network utilization
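As a small illustration of the latency metrics named above, the sketch below computes a p99 with the nearest-rank method and counts SLA misses against a threshold. The threshold value and the sample latencies are made up, and nearest-rank is only one of several common percentile definitions.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("percentile of an empty sample is undefined")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def sla_misses(latencies_ms, threshold_ms):
    """Count requests that took longer than the SLA threshold."""
    return sum(1 for latency in latencies_ms if latency > threshold_ms)


latencies_ms = list(range(1, 101))  # made-up sample: 1ms .. 100ms
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
misses = sla_misses(latencies_ms, threshold_ms=95)
```

Tail percentiles such as p99 expose slow requests that an average would hide.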
30
Health Checks
▪ Constantly monitor nodes
▪ Auto-exclude nodes on failure
• Machine not ssh-able
• Hardware failures (HDD failure, etc.)
• Do NOT exclude on rack failures
▪ Auto-include nodes once repaired
▪ Rate limit remediation of nodes
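The exclusion rules above can be sketched as a single planning function. Everything quantitative here is an assumption: a rack is treated as "failed" when at least 75% of its nodes are unhealthy, and rate limiting is modeled as a cap on exclusions per pass. `plan_exclusions` and its parameters are hypothetical names.

```python
from collections import Counter


def plan_exclusions(node_rack, unhealthy, rack_failure_fraction=0.75, max_exclusions=2):
    """Return (nodes to exclude in this pass, racks treated as rack-level failures)."""
    rack_sizes = Counter(node_rack.values())
    unhealthy_per_rack = Counter(node_rack[node] for node in unhealthy)

    # If most of a rack is unhealthy, assume a rack-level problem and leave it alone.
    failed_racks = {
        rack
        for rack, bad in unhealthy_per_rack.items()
        if bad / rack_sizes[rack] >= rack_failure_fraction
    }

    candidates = sorted(n for n in unhealthy if node_rack[n] not in failed_racks)
    # Rate limit: exclude at most max_exclusions nodes per pass.
    return candidates[:max_exclusions], failed_racks


node_rack = {f"n{i}": f"r{(i - 1) // 4 + 1}" for i in range(1, 13)}  # 3 racks x 4 nodes
unhealthy = {"n1", "n2", "n5", "n6", "n7", "n8", "n9"}

to_exclude, failed_racks = plan_exclusions(node_rack, unhealthy)
```

In this example rack r2 is entirely unhealthy and is skipped, and only two of the three remaining candidates are excluded in this pass, so a monitoring bug cannot drain the cluster all at once.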
31
In a nutshell…
▪ Use commodity hardware
▪ Scaling out is #1
▪ Efficiency is #2
• though pretty close behind scale-out
▪ Design for failures
• Frequent failures must be handled automatically
▪ Metrics, Metrics, Metrics!
32
Overview through comparison
The Nutanix Solution
33
Nutanix compared with HBase
▪ Evaluated a bunch of different options
• MySQL, Cassandra, building a custom storage system for messages
▪ Horizontal scalability
• Just add more nodes to scale out
▪ Automatic failover and load balancing
• When a node goes down, others take its place automatically
• Load of the node that went down is distributed to many others
34
Nutanix compared with HBase philosophy
▪ Optimized for write-heavy workloads
• Optimized for virtualized environments
• Read- and write-heavy workloads
• Transparent use of flash to boost performance
▪ HDFS already battle-tested at Facebook
• Nutanix is also quite battle-tested
▪ HBase’s strong consistency model
• Nutanix is also strongly consistent
35
Other aspects of Nutanix
▪ Architected for failures and manageability
• No single point of failure
• Minimal manual intervention for frequent failures
▪ Uptime is important
• Rolling upgrades are the norm
• Need to survive rack failures
▪ Single place to graph/report everything
• Prism UI to report and manage the entire cluster
▪ Constantly monitor nodes
• Auto-exclude nodes on failure
36
In a nutshell about Nutanix…
▪ Runs on commodity hardware
▪ Scaling out is #1
• Drop-in scale-out for nodes
▪ Efficiency is #2
• Constant work on performance improvements
▪ Design for failures
• Frequent failures are handled automatically
• Alerts in UI for many other states
▪ Metrics, Metrics, Metrics!
• Prism UI gives insights into the cluster health
37
Questions?

More Related Content

What's hot

Database Camp 2016 @ United Nations, NYC - Michael Glukhovsky, Co-Founder, Re...
Database Camp 2016 @ United Nations, NYC - Michael Glukhovsky, Co-Founder, Re...Database Camp 2016 @ United Nations, NYC - Michael Glukhovsky, Co-Founder, Re...
Database Camp 2016 @ United Nations, NYC - Michael Glukhovsky, Co-Founder, Re...✔ Eric David Benari, PMP
 
HP Discover: Real Time Insights from Big Data
HP Discover: Real Time Insights from Big DataHP Discover: Real Time Insights from Big Data
HP Discover: Real Time Insights from Big DataRob Winters
 
AdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenAdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenChristoph Adler
 
Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph
Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, BlazegraphDatabase Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph
Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph✔ Eric David Benari, PMP
 
(ATS3-PLAT08) Optimizing Protocol Performance
(ATS3-PLAT08) Optimizing Protocol Performance(ATS3-PLAT08) Optimizing Protocol Performance
(ATS3-PLAT08) Optimizing Protocol PerformanceBIOVIA
 
The Holy Grail of Data Analytics
The Holy Grail of Data AnalyticsThe Holy Grail of Data Analytics
The Holy Grail of Data AnalyticsDan Lynn
 
Data Caching Evolution - the SafePeak deck from webcast 2014-04-24
Data Caching Evolution - the SafePeak deck from webcast 2014-04-24Data Caching Evolution - the SafePeak deck from webcast 2014-04-24
Data Caching Evolution - the SafePeak deck from webcast 2014-04-24Vladi Vexler
 
Free Servers to Build Big Data System on: Bing’s Approach
Free Servers to Build Big Data System on: Bing’s ApproachFree Servers to Build Big Data System on: Bing’s Approach
Free Servers to Build Big Data System on: Bing’s ApproachDataWorks Summit
 
Presentacion day f-core v1.2.1.2-technical - english
Presentacion day f-core v1.2.1.2-technical - englishPresentacion day f-core v1.2.1.2-technical - english
Presentacion day f-core v1.2.1.2-technical - englishJose Luis Sanchez del Coso
 
Yellowbrick Webcast with DBTA for Real-Time Analytics
Yellowbrick Webcast with DBTA for Real-Time AnalyticsYellowbrick Webcast with DBTA for Real-Time Analytics
Yellowbrick Webcast with DBTA for Real-Time AnalyticsYellowbrick Data
 
Solution Brief: Commvault & Red Hat Storage
Solution Brief: Commvault & Red Hat StorageSolution Brief: Commvault & Red Hat Storage
Solution Brief: Commvault & Red Hat StorageMarcel Hergaarden
 
Telco analytics at scale
Telco analytics at scaleTelco analytics at scale
Telco analytics at scaledatamantra
 
Reducing the Risks of Migrating Off Oracle
Reducing the Risks of Migrating Off OracleReducing the Risks of Migrating Off Oracle
Reducing the Risks of Migrating Off OracleEDB
 
Customer Education Webcast: New Features in Data Integration and Streaming CDC
Customer Education Webcast: New Features in Data Integration and Streaming CDCCustomer Education Webcast: New Features in Data Integration and Streaming CDC
Customer Education Webcast: New Features in Data Integration and Streaming CDCPrecisely
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Anton Nazaruk
 
Engineering patterns for implementing data science models on big data platforms
Engineering patterns for implementing data science models on big data platformsEngineering patterns for implementing data science models on big data platforms
Engineering patterns for implementing data science models on big data platformsHisham Arafat
 
Database Camp 2016 @ United Nations, NYC - Bob Wiederhold, CEO, Couchbase
Database Camp 2016 @ United Nations, NYC - Bob Wiederhold, CEO, CouchbaseDatabase Camp 2016 @ United Nations, NYC - Bob Wiederhold, CEO, Couchbase
Database Camp 2016 @ United Nations, NYC - Bob Wiederhold, CEO, Couchbase✔ Eric David Benari, PMP
 

What's hot (20)

Database Camp 2016 @ United Nations, NYC - Michael Glukhovsky, Co-Founder, Re...
Database Camp 2016 @ United Nations, NYC - Michael Glukhovsky, Co-Founder, Re...Database Camp 2016 @ United Nations, NYC - Michael Glukhovsky, Co-Founder, Re...
Database Camp 2016 @ United Nations, NYC - Michael Glukhovsky, Co-Founder, Re...
 
A deep dive into neuton
A deep dive into neutonA deep dive into neuton
A deep dive into neuton
 
HP Discover: Real Time Insights from Big Data
HP Discover: Real Time Insights from Big DataHP Discover: Real Time Insights from Big Data
HP Discover: Real Time Insights from Big Data
 
AdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für AdministratorenAdminCamp 2018 - ApplicationInsights für Administratoren
AdminCamp 2018 - ApplicationInsights für Administratoren
 
Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph
Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, BlazegraphDatabase Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph
Database Camp 2016 @ United Nations, NYC - Brad Bebee, CEO, Blazegraph
 
(ATS3-PLAT08) Optimizing Protocol Performance
(ATS3-PLAT08) Optimizing Protocol Performance(ATS3-PLAT08) Optimizing Protocol Performance
(ATS3-PLAT08) Optimizing Protocol Performance
 
sitMAI, Helping a Friend
sitMAI, Helping a FriendsitMAI, Helping a Friend
sitMAI, Helping a Friend
 
The Holy Grail of Data Analytics
The Holy Grail of Data AnalyticsThe Holy Grail of Data Analytics
The Holy Grail of Data Analytics
 
Data Caching Evolution - the SafePeak deck from webcast 2014-04-24
Data Caching Evolution - the SafePeak deck from webcast 2014-04-24Data Caching Evolution - the SafePeak deck from webcast 2014-04-24
Data Caching Evolution - the SafePeak deck from webcast 2014-04-24
 
Free Servers to Build Big Data System on: Bing’s Approach
Free Servers to Build Big Data System on: Bing’s ApproachFree Servers to Build Big Data System on: Bing’s Approach
Free Servers to Build Big Data System on: Bing’s Approach
 
Presentacion day f-core v1.2.1.2-technical - english
Presentacion day f-core v1.2.1.2-technical - englishPresentacion day f-core v1.2.1.2-technical - english
Presentacion day f-core v1.2.1.2-technical - english
 
Yellowbrick Webcast with DBTA for Real-Time Analytics
Yellowbrick Webcast with DBTA for Real-Time AnalyticsYellowbrick Webcast with DBTA for Real-Time Analytics
Yellowbrick Webcast with DBTA for Real-Time Analytics
 
Solution Brief: Commvault & Red Hat Storage
Solution Brief: Commvault & Red Hat StorageSolution Brief: Commvault & Red Hat Storage
Solution Brief: Commvault & Red Hat Storage
 
Telco analytics at scale
Telco analytics at scaleTelco analytics at scale
Telco analytics at scale
 
Reducing the Risks of Migrating Off Oracle
Reducing the Risks of Migrating Off OracleReducing the Risks of Migrating Off Oracle
Reducing the Risks of Migrating Off Oracle
 
SAP HANA Overview
SAP HANA OverviewSAP HANA Overview
SAP HANA Overview
 
Customer Education Webcast: New Features in Data Integration and Streaming CDC
Customer Education Webcast: New Features in Data Integration and Streaming CDCCustomer Education Webcast: New Features in Data Integration and Streaming CDC
Customer Education Webcast: New Features in Data Integration and Streaming CDC
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?
 
Engineering patterns for implementing data science models on big data platforms
Engineering patterns for implementing data science models on big data platformsEngineering patterns for implementing data science models on big data platforms
Engineering patterns for implementing data science models on big data platforms
 
Database Camp 2016 @ United Nations, NYC - Bob Wiederhold, CEO, Couchbase
Database Camp 2016 @ United Nations, NYC - Bob Wiederhold, CEO, CouchbaseDatabase Camp 2016 @ United Nations, NYC - Bob Wiederhold, CEO, Couchbase
Database Camp 2016 @ United Nations, NYC - Bob Wiederhold, CEO, Couchbase
 

Viewers also liked

Highly-Engaged-a-Quantitative-Study-of-Facebook-and-News-Usage-in-the-Pacific...
Highly-Engaged-a-Quantitative-Study-of-Facebook-and-News-Usage-in-the-Pacific...Highly-Engaged-a-Quantitative-Study-of-Facebook-and-News-Usage-in-the-Pacific...
Highly-Engaged-a-Quantitative-Study-of-Facebook-and-News-Usage-in-the-Pacific...Nick Howlett
 
Facebook marketing event - Big data & social
Facebook marketing event - Big data & socialFacebook marketing event - Big data & social
Facebook marketing event - Big data & socialIskander Smit
 
Big data luiss Facebook and epistemology
Big data luiss Facebook and epistemologyBig data luiss Facebook and epistemology
Big data luiss Facebook and epistemologyTeresa Numerico
 
You are not Facebook or Google? Why you should still care about Big Data and ...
You are not Facebook or Google? Why you should still care about Big Data and ...You are not Facebook or Google? Why you should still care about Big Data and ...
You are not Facebook or Google? Why you should still care about Big Data and ...Kai Wähner
 
「大數據」時代的「小問題」-- 以數據分析的手法處理虛擬歌手聲源參數
「大數據」時代的「小問題」-- 以數據分析的手法處理虛擬歌手聲源參數「大數據」時代的「小問題」-- 以數據分析的手法處理虛擬歌手聲源參數
「大數據」時代的「小問題」-- 以數據分析的手法處理虛擬歌手聲源參數Yuan CHAO
 
巨量資料分析輕鬆上手_教您玩大強子對撞機公開數據
巨量資料分析輕鬆上手_教您玩大強子對撞機公開數據巨量資料分析輕鬆上手_教您玩大強子對撞機公開數據
巨量資料分析輕鬆上手_教您玩大強子對撞機公開數據Yuan CHAO
 
Big Data Taiwan 2014 Track2-3: QlikView 與 Big Data ─ 從 Big Data 裡獲取重要信息
Big Data Taiwan 2014 Track2-3: QlikView 與 Big Data ─ 從 Big Data 裡獲取重要信息Big Data Taiwan 2014 Track2-3: QlikView 與 Big Data ─ 從 Big Data 裡獲取重要信息
Big Data Taiwan 2014 Track2-3: QlikView 與 Big Data ─ 從 Big Data 裡獲取重要信息Etu Solution
 
資料科學計劃的成果與展望
資料科學計劃的成果與展望資料科學計劃的成果與展望
資料科學計劃的成果與展望Johnson Hsieh
 
豆瓣数据架构实践
豆瓣数据架构实践豆瓣数据架构实践
豆瓣数据架构实践Xupeng Yun
 
優化宅的日常-數據分析篇
優化宅的日常-數據分析篇優化宅的日常-數據分析篇
優化宅的日常-數據分析篇Wanju Wang
 
Big Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case StudyBig Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case StudyNati Shalom
 
那些你知道的,但還沒看過的 Big Data 風景 ─ 致 Hadooper
那些你知道的,但還沒看過的 Big Data 風景 ─ 致 Hadooper那些你知道的,但還沒看過的 Big Data 風景 ─ 致 Hadooper
那些你知道的,但還沒看過的 Big Data 風景 ─ 致 HadooperFred Chiang
 
Facebook Marketing Intelligence
Facebook Marketing IntelligenceFacebook Marketing Intelligence
Facebook Marketing IntelligenceGuido Picus
 
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika AldabaLightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldabaux singapore
 

Viewers also liked (14)

Highly-Engaged-a-Quantitative-Study-of-Facebook-and-News-Usage-in-the-Pacific...
Highly-Engaged-a-Quantitative-Study-of-Facebook-and-News-Usage-in-the-Pacific...Highly-Engaged-a-Quantitative-Study-of-Facebook-and-News-Usage-in-the-Pacific...
Highly-Engaged-a-Quantitative-Study-of-Facebook-and-News-Usage-in-the-Pacific...
 
Facebook marketing event - Big data & social
Facebook marketing event - Big data & socialFacebook marketing event - Big data & social
Facebook marketing event - Big data & social
 
Big data luiss Facebook and epistemology
Big data luiss Facebook and epistemologyBig data luiss Facebook and epistemology
Big data luiss Facebook and epistemology
 
You are not Facebook or Google? Why you should still care about Big Data and ...
You are not Facebook or Google? Why you should still care about Big Data and ...You are not Facebook or Google? Why you should still care about Big Data and ...
You are not Facebook or Google? Why you should still care about Big Data and ...
 
「大數據」時代的「小問題」-- 以數據分析的手法處理虛擬歌手聲源參數
「大數據」時代的「小問題」-- 以數據分析的手法處理虛擬歌手聲源參數「大數據」時代的「小問題」-- 以數據分析的手法處理虛擬歌手聲源參數
「大數據」時代的「小問題」-- 以數據分析的手法處理虛擬歌手聲源參數
 
巨量資料分析輕鬆上手_教您玩大強子對撞機公開數據
巨量資料分析輕鬆上手_教您玩大強子對撞機公開數據巨量資料分析輕鬆上手_教您玩大強子對撞機公開數據
巨量資料分析輕鬆上手_教您玩大強子對撞機公開數據
 
Big Data Taiwan 2014 Track2-3: QlikView 與 Big Data ─ 從 Big Data 裡獲取重要信息
Big Data Taiwan 2014 Track2-3: QlikView 與 Big Data ─ 從 Big Data 裡獲取重要信息Big Data Taiwan 2014 Track2-3: QlikView 與 Big Data ─ 從 Big Data 裡獲取重要信息
Big Data Taiwan 2014 Track2-3: QlikView 與 Big Data ─ 從 Big Data 裡獲取重要信息
 
資料科學計劃的成果與展望
資料科學計劃的成果與展望資料科學計劃的成果與展望
資料科學計劃的成果與展望
 
豆瓣数据架构实践
豆瓣数据架构实践豆瓣数据架构实践
豆瓣数据架构实践
 
優化宅的日常-數據分析篇
優化宅的日常-數據分析篇優化宅的日常-數據分析篇
優化宅的日常-數據分析篇
 
Big Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case StudyBig Data Real Time Analytics - A Facebook Case Study
Big Data Real Time Analytics - A Facebook Case Study
 
那些你知道的,但還沒看過的 Big Data 風景 ─ 致 Hadooper
那些你知道的,但還沒看過的 Big Data 風景 ─ 致 Hadooper那些你知道的,但還沒看過的 Big Data 風景 ─ 致 Hadooper
那些你知道的,但還沒看過的 Big Data 風景 ─ 致 Hadooper
 
Facebook Marketing Intelligence
Facebook Marketing IntelligenceFacebook Marketing Intelligence
Facebook Marketing Intelligence
 
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika AldabaLightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
Lightning Talk #9: How UX and Data Storytelling Can Shape Policy by Mika Aldaba
 

Similar to Datacenter@Night: How Big Data Technologies Power Facebook

Pacemaker hadoop infrastructure and soft serve experience
Pacemaker   hadoop infrastructure and soft serve experiencePacemaker   hadoop infrastructure and soft serve experience
Pacemaker hadoop infrastructure and soft serve experienceVitaliy Bashun
 
Hadoop Infrastructure and SoftServe Experience by Vitaliy Bashun, Data Architect
Hadoop Infrastructure and SoftServe Experience by Vitaliy Bashun, Data ArchitectHadoop Infrastructure and SoftServe Experience by Vitaliy Bashun, Data Architect
Hadoop Infrastructure and SoftServe Experience by Vitaliy Bashun, Data ArchitectSoftServe
 
Foxvalley bigdata
Foxvalley bigdataFoxvalley bigdata
Foxvalley bigdataTom Rogers
 
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...Perficient, Inc.
 
Simple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSimple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSatish Mohan
 
Advanced Analytics and Big Data (August 2014)
Advanced Analytics and Big Data (August 2014)Advanced Analytics and Big Data (August 2014)
Advanced Analytics and Big Data (August 2014)Thomas W. Dinsmore
 
Data Driving Yahoo Mail Growth and Evolution with a 50 PB Hadoop Warehouse
Data Driving Yahoo Mail Growth and Evolution with a 50 PB Hadoop WarehouseData Driving Yahoo Mail Growth and Evolution with a 50 PB Hadoop Warehouse
Data Driving Yahoo Mail Growth and Evolution with a 50 PB Hadoop WarehouseDataWorks Summit
 
Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which DataWorks Summit
 
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...Altan Khendup
 
Microservices - Is it time to breakup?
Microservices - Is it time to breakup? Microservices - Is it time to breakup?
Microservices - Is it time to breakup? Dave Nielsen
 
Technologies for Data Analytics Platform
Technologies for Data Analytics PlatformTechnologies for Data Analytics Platform
Technologies for Data Analytics PlatformN Masahiro
 
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Precisely
 
CouchbasetoHadoop_Matt_Michael_Justin v4
CouchbasetoHadoop_Matt_Michael_Justin v4CouchbasetoHadoop_Matt_Michael_Justin v4
CouchbasetoHadoop_Matt_Michael_Justin v4Michael Kehoe
 
high performance databases
high performance databaseshigh performance databases
high performance databasesmahdi_92
 
Birst for SAP HANA
Birst for SAP HANABirst for SAP HANA
Birst for SAP HANABirst
 
Trafodion overview
Trafodion overviewTrafodion overview
Trafodion overviewRohit Jain
 

Similar to Datacenter@Night: How Big Data Technologies Power Facebook (20)

Pacemaker hadoop infrastructure and soft serve experience
Pacemaker   hadoop infrastructure and soft serve experiencePacemaker   hadoop infrastructure and soft serve experience
Pacemaker hadoop infrastructure and soft serve experience
 
Hadoop Infrastructure and SoftServe Experience by Vitaliy Bashun, Data Architect
Hadoop Infrastructure and SoftServe Experience by Vitaliy Bashun, Data ArchitectHadoop Infrastructure and SoftServe Experience by Vitaliy Bashun, Data Architect
Hadoop Infrastructure and SoftServe Experience by Vitaliy Bashun, Data Architect
 
Apache drill
Apache drillApache drill
Apache drill
 
Foxvalley bigdata
Foxvalley bigdataFoxvalley bigdata
Foxvalley bigdata
 
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
Big Data Open Source Tools and Trends: Enable Real-Time Business Intelligence...
 
Simple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform ConceptSimple, Modular and Extensible Big Data Platform Concept
Simple, Modular and Extensible Big Data Platform Concept
 
Advanced Analytics and Big Data (August 2014)
Advanced Analytics and Big Data (August 2014)Advanced Analytics and Big Data (August 2014)
Advanced Analytics and Big Data (August 2014)
 
Architecting Your First Big Data Implementation
Architecting Your First Big Data ImplementationArchitecting Your First Big Data Implementation
Architecting Your First Big Data Implementation
 
Data Driving Yahoo Mail Growth and Evolution with a 50 PB Hadoop Warehouse
Data Driving Yahoo Mail Growth and Evolution with a 50 PB Hadoop WarehouseData Driving Yahoo Mail Growth and Evolution with a 50 PB Hadoop Warehouse
Data Driving Yahoo Mail Growth and Evolution with a 50 PB Hadoop Warehouse
 
Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which Hadoop and the Data Warehouse: When to Use Which
Hadoop and the Data Warehouse: When to Use Which
 
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
Data Apps with the Lambda Architecture - with Real Work Examples on Merging B...
 
Microservices - Is it time to breakup?
Microservices - Is it time to breakup? Microservices - Is it time to breakup?
Microservices - Is it time to breakup?
 
Intro to Big Data
Intro to Big DataIntro to Big Data
Intro to Big Data
 
Technologies for Data Analytics Platform
Technologies for Data Analytics PlatformTechnologies for Data Analytics Platform
Technologies for Data Analytics Platform
 
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
Engineering Machine Learning Data Pipelines Series: Streaming New Data as It ...
 
CouchbasetoHadoop_Matt_Michael_Justin v4
CouchbasetoHadoop_Matt_Michael_Justin v4CouchbasetoHadoop_Matt_Michael_Justin v4
CouchbasetoHadoop_Matt_Michael_Justin v4
 
high performance databases
high performance databaseshigh performance databases
high performance databases
 
Graph Day 2017 Spring Boot
Graph Day 2017 Spring BootGraph Day 2017 Spring Boot
Graph Day 2017 Spring Boot
 
Birst for SAP HANA
Birst for SAP HANABirst for SAP HANA
Birst for SAP HANA
 
Trafodion overview
Trafodion overviewTrafodion overview
Trafodion overview
 

More from Digicomp Academy AG

Becoming Agile von Christian Botta – Personal Swiss Vortrag 2019
Becoming Agile von Christian Botta – Personal Swiss Vortrag 2019Becoming Agile von Christian Botta – Personal Swiss Vortrag 2019
Becoming Agile von Christian Botta – Personal Swiss Vortrag 2019Digicomp Academy AG
 
Swiss IPv6 Council – Case Study - Deployment von IPv6 in einer Container Plat...
Swiss IPv6 Council – Case Study - Deployment von IPv6 in einer Container Plat...Swiss IPv6 Council – Case Study - Deployment von IPv6 in einer Container Plat...
Swiss IPv6 Council – Case Study - Deployment von IPv6 in einer Container Plat...Digicomp Academy AG
 
Innovation durch kollaboration gennex 2018
Innovation durch kollaboration gennex 2018Innovation durch kollaboration gennex 2018
Innovation durch kollaboration gennex 2018Digicomp Academy AG
 
Roger basler meetup_digitale-geschaeftsmodelle-entwickeln_handout
Roger basler meetup_digitale-geschaeftsmodelle-entwickeln_handoutRoger basler meetup_digitale-geschaeftsmodelle-entwickeln_handout
Roger basler meetup_digitale-geschaeftsmodelle-entwickeln_handoutDigicomp Academy AG
 
Roger basler meetup_21082018_work-smarter-not-harder_handout
Roger basler meetup_21082018_work-smarter-not-harder_handoutRoger basler meetup_21082018_work-smarter-not-harder_handout
Roger basler meetup_21082018_work-smarter-not-harder_handoutDigicomp Academy AG
 
Xing expertendialog zu nudge unit x
Xing expertendialog zu nudge unit xXing expertendialog zu nudge unit x
Xing expertendialog zu nudge unit xDigicomp Academy AG
 
Responsive Organisation auf Basis der Holacracy – nur ein Hype oder die Zukunft?
Responsive Organisation auf Basis der Holacracy – nur ein Hype oder die Zukunft?Responsive Organisation auf Basis der Holacracy – nur ein Hype oder die Zukunft?
Responsive Organisation auf Basis der Holacracy – nur ein Hype oder die Zukunft?Digicomp Academy AG
 
IPv6 Security Talk mit Joe Klein
IPv6 Security Talk mit Joe KleinIPv6 Security Talk mit Joe Klein
IPv6 Security Talk mit Joe KleinDigicomp Academy AG
 
Agiles Management - Wie geht das?
Agiles Management - Wie geht das?Agiles Management - Wie geht das?
Agiles Management - Wie geht das?Digicomp Academy AG
 
Gewinnen Sie Menschen und Ziele - Referat von Andi Odermatt
Gewinnen Sie Menschen und Ziele - Referat von Andi OdermattGewinnen Sie Menschen und Ziele - Referat von Andi Odermatt
Gewinnen Sie Menschen und Ziele - Referat von Andi OdermattDigicomp Academy AG
 
Querdenken mit Kreativitätsmethoden – XING Expertendialog
Querdenken mit Kreativitätsmethoden – XING ExpertendialogQuerdenken mit Kreativitätsmethoden – XING Expertendialog
Querdenken mit Kreativitätsmethoden – XING ExpertendialogDigicomp Academy AG
 
Xing LearningZ: Digitale Geschäftsmodelle entwickeln
Xing LearningZ: Digitale Geschäftsmodelle entwickelnXing LearningZ: Digitale Geschäftsmodelle entwickeln
Xing LearningZ: Digitale Geschäftsmodelle entwickelnDigicomp Academy AG
 
Swiss IPv6 Council: The Cisco-Journey to an IPv6-only Building
Swiss IPv6 Council: The Cisco-Journey to an IPv6-only BuildingSwiss IPv6 Council: The Cisco-Journey to an IPv6-only Building
Swiss IPv6 Council: The Cisco-Journey to an IPv6-only BuildingDigicomp Academy AG
 
UX – Schlüssel zum Erfolg im Digital Business
UX – Schlüssel zum Erfolg im Digital BusinessUX – Schlüssel zum Erfolg im Digital Business
UX – Schlüssel zum Erfolg im Digital BusinessDigicomp Academy AG
 
Die IPv6 Journey der ETH Zürich
Die IPv6 Journey der ETH Zürich Die IPv6 Journey der ETH Zürich
Die IPv6 Journey der ETH Zürich Digicomp Academy AG
 
Xing LearningZ: Die 10 + 1 Trends im (E-)Commerce
Xing LearningZ: Die 10 + 1 Trends im (E-)CommerceXing LearningZ: Die 10 + 1 Trends im (E-)Commerce
Xing LearningZ: Die 10 + 1 Trends im (E-)CommerceDigicomp Academy AG
 
Zahlen Battle: klassische werbung vs.online-werbung-somexcloud
Zahlen Battle: klassische werbung vs.online-werbung-somexcloudZahlen Battle: klassische werbung vs.online-werbung-somexcloud
Zahlen Battle: klassische werbung vs.online-werbung-somexcloudDigicomp Academy AG
 
General data protection regulation-slides
General data protection regulation-slidesGeneral data protection regulation-slides
General data protection regulation-slidesDigicomp Academy AG
 


Datacenter@Night: How Big Data Technologies Power Facebook

  • 1. Kannan Muthukkaruppan & Karthik Ranganathan Jun/20/2013 How Big Data Technologies Power Facebook Karthik Ranganathan September, 2013
  • 2. Introduction Email: karthik@nutanix.com Twitter: @KarthikR Current: Member of Technical Staff, Nutanix Background: Technical Engineering Lead at Facebook. Co-built Cassandra for Facebook Inbox Search and improved performance and resiliency of HBase for Facebook Messages and Search Indexing.
  • 3. Agenda ▪ Big data at Facebook ▪ HBase use cases • OLTP • Analytics ▪ Operating at scale ▪ The Nutanix solution
  • 4. Big Data at Facebook ▪ OLTP • User databases (MySQL) • Photos (Haystack) • Facebook Messages, Operational Data Store (HBase) ▪ Warehouse • Hive Analytics • Graph Search Indexing
  • 5. HBase in a nutshell ▪ Apache project, modeled after BigTable ▪ Distributed, large scale data store ▪ Built on top of Hadoop DFS (HDFS) ▪ Efficient at random reads and writes
  • 6. FB’s Largest HBase Application Facebook Messages
  • 8. Why HBase? ▪ Evaluated a bunch of different options • MySQL, Cassandra, building a custom storage system for messages ▪ Horizontal Scalability ▪ Automatic failover and load balancing ▪ Optimized for write-heavy workloads ▪ HDFS already battle-tested at Facebook ▪ HBase’s strong consistency model
  • 9. Quick stats (as of Nov 2011) ▪ Traffic to HBase • Billions of messages per day • 75B+ RPCs per day ▪ Usage pattern • 55% reads, 45% writes • Average write: 16 KVs to multiple CFs
  • 10. Data Sizes ▪ 7PB+ online data • ~21PB with replication • LZO compressed • Excludes backups ▪ Growth rate • 500TB+ per month • ~20PB of raw disk per year!
  • 11. Growing with size ▪ Constant need of features with growth ▪ Read and write path improvements • Performance optimizations • IOPS reduction • New database file format ▪ Intelligent data and compute placement • Shard level block placement • Locality based load-balancing
  • 12. Other OLTP use cases of HBase ▪ Operational Data Store ▪ Multi-tenant KeyValue store ▪ Site integrity – fighting spam
  • 13. Warehouse use cases of HBase ▪ Graph Search Indexing • Complex application logic • Multiple verticals ▪ Hive over HBase • Realtime data ingest • Enables real-time analytics
  • 14. Real-time monitoring and anomaly detection Operational Data Store
  • 15. ODS: Facebook’s #1 Debugging Tool ▪ Collects metrics from production servers ▪ Supports complex aggregations and transformations ▪ Really well-designed UI
  • 16. Quick stats ▪ Traffic to HBase • 150B+ ops per day ▪ Usage pattern • Heavy reads of recent data • Frequent MR jobs for rollups • TTL to expire older data
  • 18. Real-time URL/Domain Insights ▪ Deep analytics for websites • Facebook widgets ▪ Massive scale • Billions of URLs • Millions of increments/sec
  • 19. Detailed Insights ▪ Tracks many metrics • Clicks, likes, shares, impressions • Referral traffic ▪ Detailed breakdown • Age buckets, gender, location
  • 21. A Multi-tenant solution on HBase ▪ Generic Key-Value store • Multiple apps on the same cluster • Transparent schema design • Simple API put(appid, key, value) value = get(appid, key)
  • 23. Multi-tenancy Issues ▪ Not a self-service model • Each app is reviewed ▪ Global and per-app metrics • Monitor RPCs by type, latencies, errors • Friendly names for apps ▪ If things went wrong • Per-app kill switch
  • 24. Powering FB’s Semantic Search Engine Graph Search Indexing
  • 25. Framework to build search indexes ▪ Multiple, independent input sources ▪ HBase stores document info ▪ Output is the search index image rowKey = document id value = terms, document data
  • 27. Do’s and Do-Not’s From Experience Operating at Scale
  • 28. Design for failures(!) ▪ Architect for failures and manageability ▪ No single point of failure • Killing any process is legit ▪ Minimize manual intervention • Especially for frequent failures ▪ Uptime is important • Rolling upgrades are the norm • Need to survive rack failures
  • 29. Dashboard and Metrics ▪ Single place to graph/report everything ▪ RPC calls ▪ SLA misses • Latencies, p99, Errors • Per-request profiling ▪ Cluster and node health ▪ Network Utilization
  • 30. Health Checks ▪ Constantly monitor nodes ▪ Auto-exclude nodes on failure • Machine not ssh-able • Hardware failures (HDD failure, etc) • Do NOT exclude on rack failures ▪ Auto-include nodes once repaired ▪ Rate limit remediation of nodes
  • 31. In a nutshell… ▪ Use commodity hardware ▪ Scaling out is #1 ▪ Efficiency is #2 • though pretty close behind scale-out ▪ Design for failures • Frequent failures must be auto handled ▪ Metrics, Metrics, Metrics!
  • 33. Nutanix compared with HBase ▪ Evaluated a bunch of different options • MySQL, Cassandra, building a custom storage system for messages ▪ Horizontal Scalability ▪ Just add more nodes to scale out ▪ Automatic failover and load balancing ▪ When a node goes down, others take its place automatically ▪ Load of node that went down is distributed to many others
  • 34. Nutanix compared with HBase philosophy ▪ Optimized for write-heavy workloads ▪ Optimized for virtualized environments ▪ Read and write heavy workloads ▪ Transparent use of flash to boost perf ▪ HDFS already battle-tested at Facebook ▪ Nutanix is also quite battle-tested ▪ HBase’s strong consistency model ▪ Nutanix is also strongly consistent
  • 35. Other aspects of Nutanix ▪ Architected for failures and manageability ▪ No single point of failure ▪ Minimal manual intervention for frequent failures ▪ Uptime is important ▪ Rolling upgrades are the norm • Need to survive rack failures ▪ Single place to graph/report everything ▪ Prism UI to report and manage the entire cluster ▪ Constantly monitor nodes ▪ Auto-exclude nodes on failure
  • 36. In a nutshell about Nutanix… ▪ Runs on commodity hardware ▪ Scaling out is #1 ▪ Drop in scale out for nodes ▪ Efficiency is #2 ▪ Constant work on perf improvements ▪ Design for failures ▪ Frequent failures auto handled ▪ Alerts in UI for many other states ▪ Metrics, Metrics, Metrics! ▪ Prism UI gives insights into the cluster health
  • 38. NUTANIX INC. – CONFIDENTIAL AND PROPRIETARY
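
The multi-tenant key-value store on slides 21 and 23 exposes put(appid, key, value) and get(appid, key), tracks per-app metrics, and has a per-app kill switch. A minimal in-memory sketch of that interface follows. It is not Facebook's implementation and does not use HBase: the class name `MultiTenantKV`, the `kill` method, the `rpc_counts` field, and the choice to key rows by an (appid, key) pair are all assumptions made for illustration.

```python
from collections import defaultdict


class AppDisabledError(Exception):
    """Raised when a request arrives for an app whose kill switch is on."""


class MultiTenantKV:
    """In-memory stand-in for a shared key-value cluster (hypothetical)."""

    def __init__(self):
        self._rows = {}                     # (appid, key) -> value
        self._disabled = set()              # app ids with the kill switch on
        self.rpc_counts = defaultdict(int)  # (appid, op) -> accepted requests

    def _admit(self, appid, op):
        # Reject requests for killed apps before they touch shared storage.
        if appid in self._disabled:
            raise AppDisabledError(appid)
        # Per-app, per-operation counter: the simplest form of per-app metrics.
        self.rpc_counts[(appid, op)] += 1

    def put(self, appid, key, value):
        self._admit(appid, "put")
        # Keying by (appid, key) lets two apps reuse the same key safely.
        self._rows[(appid, key)] = value

    def get(self, appid, key):
        self._admit(appid, "get")
        return self._rows.get((appid, key))

    def kill(self, appid):
        """Per-app kill switch: later requests for this app are rejected."""
        self._disabled.add(appid)


store = MultiTenantKV()
store.put("photos", "user:1", "a.jpg")
store.put("ads", "user:1", "campaign-7")
print(store.get("photos", "user:1"))  # each app sees only its own value
store.kill("ads")                     # "ads" is now rejected; "photos" is unaffected
```

Scoping every key by app id is what lets several applications share one cluster without coordinating key names, and checking the kill switch before any work is done means a misbehaving app can be cut off without affecting the others.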