National Engineering
& Technical Operations
How Comcast Turns Big Data into Real-Time
Operational Insights
Patrick Shumate
CDN Engineer
VSS CDN Engineering
Patrick Shumate CDN Engineering @ Comcast
–  Data nerd supporting Content Delivery
–  Avid cyclist
–  Home brewer
Brett Sheppard Big Data @ Splunk
–  Data nerd supporting Big Data Enterprise Architectures
–  Avid runner
–  Home drinker
How Comcast Turns Big Data Into Real-Time Operational Insights | Strata | February 2014
Speakers
Methods and Process (operating on data)
CDN Operations
Sochi Winter Olympic Games
Agenda
Methods
Experimentation / Inquisition
Define KPI
Model Steady State
Predict Capacity
Effect without Causation
Procedures
Track
Alarm (real time)
Report (coffee time)
Visualize
Paper-cuts vs. Antennas
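One way to read "Model Steady State" together with "Alarm (real time)" is as a baseline-and-deviation check on a tracked KPI. The sketch below is only an illustration under that assumption: the function names, the sample values, and the three-sigma threshold are hypothetical and are not taken from the talk.

```python
from statistics import mean, stdev

def steady_state(history):
    """Model the steady state of a KPI as (mean, standard deviation).

    `history` needs at least two samples, otherwise stdev() raises.
    """
    return mean(history), stdev(history)

def should_alarm(value, history, sigmas=3.0):
    """Return True when `value` strays more than `sigmas` standard
    deviations from the steady-state baseline (hypothetical rule)."""
    baseline, spread = steady_state(history)
    return abs(value - baseline) > sigmas * spread

# Illustrative, made-up KPI samples (for example, requests per second).
recent = [100, 102, 98, 101, 99]
print(should_alarm(101, recent))  # within the normal band
print(should_alarm(120, recent))  # far outside the normal band
```

A fixed sigma threshold is the simplest choice; a production alarm would likely account for daily and weekly seasonality as well.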
Comcast IPCDN Summary
●  Comcast Content Router
–  Stateless
–  DNS Round Robin
●  Rascal Health Monitoring
●  12 Monkeys Configuration Management
●  ATS Caches
●  Splunk Machine Data (Log) Collection and Analytics
The Comcast Content Router (CCR)
●  Tomcat Java application built in-house
●  Multiple VMs around the country in DNS Round Robin
●  Routes “by” DNS, HTTP 302, or REST
●  Can route based on:
–  Regexp on URL host name (DNS and HTTP 302 redirect)
–  Regexp on URL Path and headers (HTTP 302 redirect)
–  Client location
●  Coverage Zone File from network
●  Geo IP lookup
–  Edge cache health
–  Edge cache load
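The routing criteria listed above can be sketched roughly as follows. This is a minimal illustration in Python, not the CCR's actual (Java) implementation: the `Cache` fields, the `DELIVERY_SERVICES` patterns, the zone names, and the `pick_cache` function are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Cache:
    name: str
    zone: str       # hypothetical coverage-zone label for the cache
    healthy: bool   # state as reported by health monitoring
    load: float     # 0.0 (idle) to 1.0 (saturated)

# Hypothetical delivery services, each matched by a regexp on the URL host name.
DELIVERY_SERVICES = {
    "vod": re.compile(r"^vod\..*\.example\.net$"),
    "live": re.compile(r"^live\..*\.example\.net$"),
}

def match_service(host: str) -> Optional[str]:
    """Return the name of the first delivery service whose regexp matches."""
    for name, pattern in DELIVERY_SERVICES.items():
        if pattern.match(host):
            return name
    return None

def pick_cache(host: str, client_zone: str, caches: List[Cache]) -> Optional[Cache]:
    """Choose an edge cache by host regexp, client location, health, and load."""
    if match_service(host) is None:
        return None                      # not a host this router serves
    healthy = [c for c in caches if c.healthy]
    if not healthy:
        return None                      # nothing safe to route to
    # Prefer caches in the client's zone; fall back to any healthy cache.
    local = [c for c in healthy if c.zone == client_zone]
    candidates = local or healthy
    return min(candidates, key=lambda c: c.load)
```

In this sketch the client zone is passed in directly; in practice it would come from a coverage zone lookup or a Geo IP lookup, as the slide describes.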
Rascal
●  HTTP GETs vital stats from each cache every 5 seconds
–  Modified stats_over_http plugin on caches exposes app & system stats
●  Determines and exposes state of caches to CRs
●  Can allow for real time monitoring / graphing of CDN
●  Can expose 5 min avg/min/max to NE&TO Service Performance DB
●  Redundant by having 2 instances running independent of each other
–  CRs pick one randomly
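The polling and roll-up described above can be sketched as below. This is a hedged illustration only: the `CacheStats` class, the `cache_state` rule, and the threshold are hypothetical, and the HTTP GET against the stats plugin is left out so the example stays self-contained.

```python
from collections import deque
from statistics import mean

POLL_INTERVAL_S = 5                                 # one sample every 5 seconds
WINDOW_S = 300                                      # 5-minute reporting window
SAMPLES_PER_WINDOW = WINDOW_S // POLL_INTERVAL_S    # 60 samples per window

class CacheStats:
    """Keeps the most recent 5 minutes of samples for one stat on one cache."""

    def __init__(self):
        # Older samples fall off automatically once the window is full.
        self.samples = deque(maxlen=SAMPLES_PER_WINDOW)

    def record(self, value):
        self.samples.append(value)

    def summary(self):
        """Return avg/min/max over the current window, or None if empty."""
        if not self.samples:
            return None
        return {
            "avg": mean(self.samples),
            "min": min(self.samples),
            "max": max(self.samples),
        }

def cache_state(latest_value, threshold):
    """Hypothetical health rule: a failed poll (None) or a value at or above
    the threshold marks the cache unavailable to the content routers."""
    if latest_value is None:
        return "unavailable"
    return "available" if latest_value < threshold else "unavailable"
```

A bounded `deque` keeps memory constant per cache regardless of how long the monitor runs.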
Configuration Management
●  Twelve Monkeys tool built in-house
●  Web based jQuery UI
●  Mojolicious Perl framework
●  MySQL database
●  REST interfaces
●  Integrated into standard Ops methods and best practices from day one
●  Monitoring from Health Protocol through Rascal server
The Caches - Software
●  Any HTTP 1.1 Compliant cache will work
●  We chose Apache Traffic Server (ATS)
–  Top Level Apache project (NOT httpd!)
–  Extremely scalable and proven
–  Very good with our VOD load
–  Efficient storage subsystem uses raw disks
–  Extensible through plugin API
–  Vibrant development community
–  Added handful of plugins for specific use cases
Machine Data Files and Reporting
●  Splunk>
●  The only commercial product we use
●  Well defined interfaces - No vendor lock-in possible
●  ipCDN usage metrics by delivery service
Demos
Splunk is a Different Approach for Raw Unstructured Big Data
•  Built by IT pros for IT pros: It’s all about the technical and business user from novice to guru
•  One code base: Laptop to datacenter, agent to server, native to virtual indexes
•  Open architecture: Files versus database, REST API, scriptable, SDKs
•  Flexible and extensible: Any data, any format, different views, built to be extended
•  Scales to big data: Not filtered, not “dumbed” down, not locked into a fixed schema
•  Transparent support: Public documentation, public roadmap, real engineers on IRC
Inside Search-time Knowledge Extraction
Automatically discovered fields and user-defined fields enable statistics and precise search on specific fields.
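The idea of extracting fields at search time, rather than when data is stored, can be sketched as follows. This is a generic Python illustration and not Splunk's implementation: the sample events are made up, and `KV_PATTERN`, `USER_FIELDS`, `extract_fields`, and `count_by` are hypothetical names.

```python
import re
from collections import Counter

# Made-up raw events, stored exactly as received (no schema applied up front).
RAW_EVENTS = [
    "2014-02-11T18:00:01Z host=cache01 status=200 bytes=5120 GET /vod/a.ts",
    "2014-02-11T18:00:02Z host=cache02 status=404 bytes=0 GET /vod/b.ts",
    "2014-02-11T18:00:03Z host=cache01 status=200 bytes=2048 GET /live/c.ts",
]

# "Automatically discovered" fields: any key=value pair found in the raw text.
KV_PATTERN = re.compile(r"(\w+)=(\S+)")

# "User-defined" fields: named regexps supplied by whoever runs the search.
USER_FIELDS = {"service": re.compile(r"GET /(?P<service>\w+)/")}

def extract_fields(raw):
    """Build a field dictionary for one raw event at search time."""
    fields = dict(KV_PATTERN.findall(raw))
    for name, pattern in USER_FIELDS.items():
        match = pattern.search(raw)
        if match:
            fields[name] = match.group(name)
    return fields

def count_by(events, field):
    """Simple statistic over an extracted field: count events per value."""
    extracted = (extract_fields(raw) for raw in events)
    return Counter(f[field] for f in extracted if field in f)

print(count_by(RAW_EVENTS, "status"))
print(count_by(RAW_EVENTS, "service"))
```

Because the raw text is kept untouched, a new field definition can be added later and applied to events that were collected before it existed.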
Real-time Analytics with Managed Forwarders
Data flow (diagram):
Inputs: Monitor Input, TCP/UDP Input, Scripted Input
Parsing Queue
Parsing Pipeline
•  Source, event typing
•  Character set normalization
•  Line breaking
•  Timestamp identification
•  Regex transforms
Index Queue
Indexing Pipeline
Real-time Buffer
Real-time Search Process
Splunk Index: Raw data, Index Files
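The parsing steps named on this slide (character set normalization, line breaking, timestamp identification, regex transforms) can be sketched as a small pipeline. This is a generic Python illustration, not Splunk's code: the timestamp format, the IP-masking transform, and every function name here are assumptions made for the example.

```python
import re
from datetime import datetime, timezone

# Assumed timestamp format for the example: ISO 8601 in UTC, e.g. 2014-02-11T18:00:01Z.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z")

# Hypothetical regex transform: mask IPv4 addresses before indexing.
MASK_IP = (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "x.x.x.x")

def normalize_charset(data: bytes) -> str:
    """Character set normalization: decode to text, replacing bad bytes."""
    return data.decode("utf-8", errors="replace")

def break_lines(text: str):
    """Line breaking: one event per non-empty line."""
    return [line for line in text.splitlines() if line.strip()]

def identify_timestamp(line: str):
    """Timestamp identification: return a UTC datetime, or None if absent."""
    match = TIMESTAMP.search(line)
    if match is None:
        return None
    parsed = datetime.strptime(match.group(0), "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)

def apply_transforms(line: str) -> str:
    """Regex transforms: rewrite the raw text before it is indexed."""
    pattern, replacement = MASK_IP
    return pattern.sub(replacement, line)

def parse(data: bytes, source: str):
    """Run the stages in order and tag each event with its source."""
    events = []
    for line in break_lines(normalize_charset(data)):
        events.append({
            "source": source,
            "time": identify_timestamp(line),
            "raw": apply_transforms(line),
        })
    return events
```

Each stage is a plain function so the order of the pipeline is visible in `parse`; the queues between stages in the diagram are omitted here.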
Data Models and Pivot
•  Describe how underlying data is represented and accessed
•  Drag-and-drop interface for non-specialists to analyze raw, unstructured data
•  Click to visualize any chart type; reports dynamically update when fields change
Screenshot callouts:
•  Data models: hierarchical object view of underlying data
•  Select fields from data model
•  Add constraints to filter out events
•  Time window
•  All chart types available in the chart toolbox
•  Save report to share
Integration Methods
Dashboards and Views
•  Simple XML, JavaScript, Django
•  REST API
•  iframe embed
User Interface (UI) Extensibility
•  Interactive dashboards and user workflows
•  Custom styling, behavior & visuals
•  Integrate charts, dashboards and query results into other applications
•  Workflows can trigger an action in an external system or use REST endpoints
•  ODBC driver to integrate with Tableau and other 3rd-party visualization software
Winter Olympic Games 2014 in Sochi
Sports! Wait, how many time zones?
Events - on-demand
How quickly can we get it “on menu”?
How do we track, troubleshoot, and triage?
A Good Day in Content
Credit: Flickr User DVIDSHUB, via CC
Credit: defense.gov
Credit: hotlightsandcoldsteel.com
What it Feels Like to Broadcast the Olympics
Ingesting Data from Sochi
Working with Multiple Providers for Sports Programming
High-Definition and Standard-Definition Content Receipt Status
Ingest Tracking
Demos
The Nouns
Splunk Forwarders
Flume (Kafka)
Hadoop / Hive
scripted inputs / outputs
ETL to time series > Charts > wikis = dashboards
API mining
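The "ETL to time series" step can be illustrated by bucketing timestamped events into fixed intervals, which yields the series a chart or dashboard would plot. This is a minimal sketch under assumed inputs: the `(timestamp, payload)` event shape, the one-minute bucket, and the `to_time_series` name are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone

def to_time_series(events, bucket_seconds=60):
    """Count events per fixed-width time bucket.

    `events` is an iterable of (timezone-aware datetime, payload) pairs.
    Returns a list of (bucket_start_epoch_seconds, count), sorted by time.
    """
    counts = Counter()
    for timestamp, _payload in events:
        epoch = int(timestamp.timestamp())
        # Round down to the start of the bucket this event falls in.
        counts[epoch - epoch % bucket_seconds] += 1
    return sorted(counts.items())

# Made-up events for illustration.
events = [
    (datetime(2014, 2, 11, 18, 0, 1, tzinfo=timezone.utc), "ingest ok"),
    (datetime(2014, 2, 11, 18, 0, 30, tzinfo=timezone.utc), "ingest ok"),
    (datetime(2014, 2, 11, 18, 1, 5, tzinfo=timezone.utc), "ingest failed"),
]
for bucket_start, count in to_time_series(events):
    print(bucket_start, count)
```

The sorted list of pairs is deliberately plain so it can be handed to any charting tool or written out for a wiki page.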
Turn Diverse Raw Unstructured Data into Operational Intelligence
Search Commands and Graphing
Operational Dashboards
Demos
Cost / Benefit
MTTR
Automation
Reduction in skillset
Fewer admins
More SME
National Engineering 

& Technical Operations
How Comcast Turns Big Data into Real Time Operational Insights: Winter Olympics Content Broadcasting

More Related Content

What's hot

Money Heist - A Stream Processing Original! | Meha Pandey and Shengze Yu, Net...
Money Heist - A Stream Processing Original! | Meha Pandey and Shengze Yu, Net...Money Heist - A Stream Processing Original! | Meha Pandey and Shengze Yu, Net...
Money Heist - A Stream Processing Original! | Meha Pandey and Shengze Yu, Net...HostedbyConfluent
 
How Spark Enables the Internet of Things: Efficient Integration of Multiple ...
How Spark Enables the Internet of Things: Efficient Integration of Multiple ...How Spark Enables the Internet of Things: Efficient Integration of Multiple ...
How Spark Enables the Internet of Things: Efficient Integration of Multiple ...sparktc
 
You Must Construct Additional Pipelines: Pub-Sub on Kafka at Blizzard
You Must Construct Additional Pipelines: Pub-Sub on Kafka at Blizzard You Must Construct Additional Pipelines: Pub-Sub on Kafka at Blizzard
You Must Construct Additional Pipelines: Pub-Sub on Kafka at Blizzard confluent
 
Cloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive ApplicationsCloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive ApplicationsVMware Tanzu
 
Cloud Experience: Data-driven Applications Made Simple and Fast
Cloud Experience: Data-driven Applications Made Simple and FastCloud Experience: Data-driven Applications Made Simple and Fast
Cloud Experience: Data-driven Applications Made Simple and FastDatabricks
 
SIEM Modernization: Build a Situationally Aware Organization with Apache Kafka®
SIEM Modernization: Build a Situationally Aware Organization with Apache Kafka®SIEM Modernization: Build a Situationally Aware Organization with Apache Kafka®
SIEM Modernization: Build a Situationally Aware Organization with Apache Kafka®confluent
 
Building Pinterest Real-Time Ads Platform Using Kafka Streams
Building Pinterest Real-Time Ads Platform Using Kafka Streams Building Pinterest Real-Time Ads Platform Using Kafka Streams
Building Pinterest Real-Time Ads Platform Using Kafka Streams confluent
 
Real-Time Health Score Application using Apache Spark on Kubernetes
Real-Time Health Score Application using Apache Spark on KubernetesReal-Time Health Score Application using Apache Spark on Kubernetes
Real-Time Health Score Application using Apache Spark on KubernetesDatabricks
 
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...HostedbyConfluent
 
Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Ana...
Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Ana...Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Ana...
Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Ana...Big Data Spain
 
Real time big data stream processing
Real time big data stream processing Real time big data stream processing
Real time big data stream processing Luay AL-Assadi
 
Regulatory Reporting of Asset Trading Using Apache Spark-(Sudipto Shankar Das...
Regulatory Reporting of Asset Trading Using Apache Spark-(Sudipto Shankar Das...Regulatory Reporting of Asset Trading Using Apache Spark-(Sudipto Shankar Das...
Regulatory Reporting of Asset Trading Using Apache Spark-(Sudipto Shankar Das...Spark Summit
 
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...HostedbyConfluent
 
Chris D'Agostino | Kafka Summit 2018 Keynote (Building an Enterprise Streamin...
Chris D'Agostino | Kafka Summit 2018 Keynote (Building an Enterprise Streamin...Chris D'Agostino | Kafka Summit 2018 Keynote (Building an Enterprise Streamin...
Chris D'Agostino | Kafka Summit 2018 Keynote (Building an Enterprise Streamin...confluent
 
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)Spark Summit
 
Data Warehousing Patterns for Hadoop
Data Warehousing Patterns for HadoopData Warehousing Patterns for Hadoop
Data Warehousing Patterns for HadoopMichelle Ufford
 
Building Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineBuilding Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineTrieu Nguyen
 
One Kubernetes to rule them all (ZEUS 2019 Keynote)
One Kubernetes to rule them all (ZEUS 2019 Keynote)One Kubernetes to rule them all (ZEUS 2019 Keynote)
One Kubernetes to rule them all (ZEUS 2019 Keynote)Simon Harrer
 
Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Code...
Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Code...Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Code...
Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Code...Codemotion
 

What's hot (20)

Money Heist - A Stream Processing Original! | Meha Pandey and Shengze Yu, Net...
Money Heist - A Stream Processing Original! | Meha Pandey and Shengze Yu, Net...Money Heist - A Stream Processing Original! | Meha Pandey and Shengze Yu, Net...
Money Heist - A Stream Processing Original! | Meha Pandey and Shengze Yu, Net...
 
How Spark Enables the Internet of Things: Efficient Integration of Multiple ...
How Spark Enables the Internet of Things: Efficient Integration of Multiple ...How Spark Enables the Internet of Things: Efficient Integration of Multiple ...
How Spark Enables the Internet of Things: Efficient Integration of Multiple ...
 
You Must Construct Additional Pipelines: Pub-Sub on Kafka at Blizzard
You Must Construct Additional Pipelines: Pub-Sub on Kafka at Blizzard You Must Construct Additional Pipelines: Pub-Sub on Kafka at Blizzard
You Must Construct Additional Pipelines: Pub-Sub on Kafka at Blizzard
 
Cloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive ApplicationsCloud-Native Patterns for Data-Intensive Applications
Cloud-Native Patterns for Data-Intensive Applications
 
Cloud Experience: Data-driven Applications Made Simple and Fast
Cloud Experience: Data-driven Applications Made Simple and FastCloud Experience: Data-driven Applications Made Simple and Fast
Cloud Experience: Data-driven Applications Made Simple and Fast
 
SIEM Modernization: Build a Situationally Aware Organization with Apache Kafka®
SIEM Modernization: Build a Situationally Aware Organization with Apache Kafka®SIEM Modernization: Build a Situationally Aware Organization with Apache Kafka®
SIEM Modernization: Build a Situationally Aware Organization with Apache Kafka®
 
Building Pinterest Real-Time Ads Platform Using Kafka Streams
Building Pinterest Real-Time Ads Platform Using Kafka Streams Building Pinterest Real-Time Ads Platform Using Kafka Streams
Building Pinterest Real-Time Ads Platform Using Kafka Streams
 
Real-Time Health Score Application using Apache Spark on Kubernetes
Real-Time Health Score Application using Apache Spark on KubernetesReal-Time Health Score Application using Apache Spark on Kubernetes
Real-Time Health Score Application using Apache Spark on Kubernetes
 
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
Apache Pinot Case Study: Building Distributed Analytics Systems Using Apache ...
 
Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Ana...
Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Ana...Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Ana...
Apache Flink for IoT: How Event-Time Processing Enables Easy and Accurate Ana...
 
Real time big data stream processing
Real time big data stream processing Real time big data stream processing
Real time big data stream processing
 
Regulatory Reporting of Asset Trading Using Apache Spark-(Sudipto Shankar Das...
Regulatory Reporting of Asset Trading Using Apache Spark-(Sudipto Shankar Das...Regulatory Reporting of Asset Trading Using Apache Spark-(Sudipto Shankar Das...
Regulatory Reporting of Asset Trading Using Apache Spark-(Sudipto Shankar Das...
 
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
Real-time Analytics with Upsert Using Apache Kafka and Apache Pinot | Yupeng ...
 
Chris D'Agostino | Kafka Summit 2018 Keynote (Building an Enterprise Streamin...
Chris D'Agostino | Kafka Summit 2018 Keynote (Building an Enterprise Streamin...Chris D'Agostino | Kafka Summit 2018 Keynote (Building an Enterprise Streamin...
Chris D'Agostino | Kafka Summit 2018 Keynote (Building an Enterprise Streamin...
 
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
 
Data Warehousing Patterns for Hadoop
Data Warehousing Patterns for HadoopData Warehousing Patterns for Hadoop
Data Warehousing Patterns for Hadoop
 
Building Reactive Real-time Data Pipeline
Building Reactive Real-time Data PipelineBuilding Reactive Real-time Data Pipeline
Building Reactive Real-time Data Pipeline
 
One Kubernetes to rule them all (ZEUS 2019 Keynote)
One Kubernetes to rule them all (ZEUS 2019 Keynote)One Kubernetes to rule them all (ZEUS 2019 Keynote)
One Kubernetes to rule them all (ZEUS 2019 Keynote)
 
Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Code...
Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Code...Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Code...
Fast Cars, Big Data - How Streaming Can Help Formula 1 - Tugdual Grall - Code...
 
The Evolution of Big Data Pipelines at Intuit
The Evolution of Big Data Pipelines at Intuit The Evolution of Big Data Pipelines at Intuit
The Evolution of Big Data Pipelines at Intuit
 

Similar to How Comcast Turns Big Data into Real Time Operational Insights: Winter Olympics Content Broadcasting

Big and fast data strategy 2017 jr
Big and fast data strategy 2017 jrBig and fast data strategy 2017 jr
Big and fast data strategy 2017 jrJonathan Raspaud
 
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainEvolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainMapR Technologies
 
ACDKOCHI19 - Next Generation Data Analytics Platform on AWS
ACDKOCHI19 - Next Generation Data Analytics Platform on AWSACDKOCHI19 - Next Generation Data Analytics Platform on AWS
ACDKOCHI19 - Next Generation Data Analytics Platform on AWSAWS User Group Kochi
 
StreamCentral for the IT Professional
StreamCentral for the IT ProfessionalStreamCentral for the IT Professional
StreamCentral for the IT ProfessionalRaheel Retiwalla
 
Horses for Courses: Database Roundtable
Horses for Courses: Database RoundtableHorses for Courses: Database Roundtable
Horses for Courses: Database RoundtableEric Kavanagh
 
Big Data 2.0: ETL & Analytics: Implementing a next generation platform
Big Data 2.0: ETL & Analytics: Implementing a next generation platformBig Data 2.0: ETL & Analytics: Implementing a next generation platform
Big Data 2.0: ETL & Analytics: Implementing a next generation platformCaserta
 
Cloud-Scale BGP and NetFlow Analysis
Cloud-Scale BGP and NetFlow AnalysisCloud-Scale BGP and NetFlow Analysis
Cloud-Scale BGP and NetFlow AnalysisAlex Henthorn-Iwane
 
Pivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream AnalyticsPivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream Analyticskgshukla
 
IS-4082, Real-Time insight in Big Data – Even faster using HSA, by Norbert He...
IS-4082, Real-Time insight in Big Data – Even faster using HSA, by Norbert He...IS-4082, Real-Time insight in Big Data – Even faster using HSA, by Norbert He...
IS-4082, Real-Time insight in Big Data – Even faster using HSA, by Norbert He...AMD Developer Central
 
Real-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQLReal-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQLSingleStore
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationDenodo
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsStreamsets Inc.
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduCloudera, Inc.
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization Denodo
 
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025Nicola Sandoli
 
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...Amazon Web Services
 
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio JourneyModernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio JourneyAlluxio, Inc.
 

Similar to How Comcast Turns Big Data into Real Time Operational Insights: Winter Olympics Content Broadcasting (20)

Big and fast data strategy 2017 jr
Big and fast data strategy 2017 jrBig and fast data strategy 2017 jr
Big and fast data strategy 2017 jr
 
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainEvolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and Rain
 
ACDKOCHI19 - Next Generation Data Analytics Platform on AWS
ACDKOCHI19 - Next Generation Data Analytics Platform on AWSACDKOCHI19 - Next Generation Data Analytics Platform on AWS
ACDKOCHI19 - Next Generation Data Analytics Platform on AWS
 
StreamCentral for the IT Professional
StreamCentral for the IT ProfessionalStreamCentral for the IT Professional
StreamCentral for the IT Professional
 
Horses for Courses: Database Roundtable
Horses for Courses: Database RoundtableHorses for Courses: Database Roundtable
Horses for Courses: Database Roundtable
 
Big Data 2.0: ETL & Analytics: Implementing a next generation platform
Big Data 2.0: ETL & Analytics: Implementing a next generation platformBig Data 2.0: ETL & Analytics: Implementing a next generation platform
Big Data 2.0: ETL & Analytics: Implementing a next generation platform
 
Cloud-Scale BGP and NetFlow Analysis
Cloud-Scale BGP and NetFlow AnalysisCloud-Scale BGP and NetFlow Analysis
Cloud-Scale BGP and NetFlow Analysis
 
Pivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream AnalyticsPivotal Real Time Data Stream Analytics
Pivotal Real Time Data Stream Analytics
 
IS-4082, Real-Time insight in Big Data – Even faster using HSA, by Norbert He...
IS-4082, Real-Time insight in Big Data – Even faster using HSA, by Norbert He...IS-4082, Real-Time insight in Big Data – Even faster using HSA, by Norbert He...
IS-4082, Real-Time insight in Big Data – Even faster using HSA, by Norbert He...
 
Big Data Ready Enterprise
Big Data Ready Enterprise Big Data Ready Enterprise
Big Data Ready Enterprise
 
Real-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQLReal-Time Analytics with Confluent and MemSQL
Real-Time Analytics with Confluent and MemSQL
 
Druid @ branch
Druid @ branch Druid @ branch
Druid @ branch
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data Virtualization
 
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSetsEnabling Next Gen Analytics with Azure Data Lake and StreamSets
Enabling Next Gen Analytics with Azure Data Lake and StreamSets
 
Simplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache KuduSimplifying Real-Time Architectures for IoT with Apache Kudu
Simplifying Real-Time Architectures for IoT with Apache Kudu
 
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
DAMA & Denodo Webinar: Modernizing Data Architecture Using Data Virtualization
 
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025
Tibco Augmented Intelligence - Analytics, IoT, Big Data, Streaming 20161025
 
Hadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data WarehouseHadoop and Your Enterprise Data Warehouse
Hadoop and Your Enterprise Data Warehouse
 
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
AWS Partner Webcast - Analyze Big Data for Consumer Applications with Looker ...
 
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio JourneyModernizing Global Shared Data Analytics Platform and our Alluxio Journey
Modernizing Global Shared Data Analytics Platform and our Alluxio Journey
 

More from Brett Sheppard

5 ways-to-improve-your-security-with-splunk
5 ways-to-improve-your-security-with-splunk5 ways-to-improve-your-security-with-splunk
5 ways-to-improve-your-security-with-splunkBrett Sheppard
 
Sample Google Paid campaign results
Sample Google Paid campaign resultsSample Google Paid campaign results
Sample Google Paid campaign resultsBrett Sheppard
 
Summary of Made to Stick book
Summary of Made to Stick bookSummary of Made to Stick book
Summary of Made to Stick bookBrett Sheppard
 
Shift from manual to interactive reporting
Shift from manual to interactive reportingShift from manual to interactive reporting
Shift from manual to interactive reportingBrett Sheppard
 
Brett sheppard references
Brett sheppard referencesBrett sheppard references
Brett sheppard referencesBrett Sheppard
 
Datadog APM Product Launch
Datadog APM Product LaunchDatadog APM Product Launch
Datadog APM Product LaunchBrett Sheppard
 
Brett Sheppard Sample Portfolio
Brett Sheppard Sample PortfolioBrett Sheppard Sample Portfolio
Brett Sheppard Sample PortfolioBrett Sheppard
 
Idc datadog-expands-into-apm
Idc datadog-expands-into-apmIdc datadog-expands-into-apm
Idc datadog-expands-into-apmBrett Sheppard
 
Tdwi brett-sheppard-interview-april-2014
Tdwi brett-sheppard-interview-april-2014Tdwi brett-sheppard-interview-april-2014
Tdwi brett-sheppard-interview-april-2014Brett Sheppard
 
SEO Checklist For Rapid-growth Startups
SEO Checklist For Rapid-growth StartupsSEO Checklist For Rapid-growth Startups
SEO Checklist For Rapid-growth StartupsBrett Sheppard
 
GigaOM Putting Big Data to Work by Brett Sheppard
GigaOM Putting Big Data to Work by Brett SheppardGigaOM Putting Big Data to Work by Brett Sheppard
GigaOM Putting Big Data to Work by Brett SheppardBrett Sheppard
 
DxContinuum Forrester Webinar
DxContinuum Forrester WebinarDxContinuum Forrester Webinar
DxContinuum Forrester WebinarBrett Sheppard
 
Yahoo Enabling Exploratory Analytics of Data in Shared-service Hadoop Clusters
Yahoo Enabling Exploratory Analytics of Data in Shared-service Hadoop ClustersYahoo Enabling Exploratory Analytics of Data in Shared-service Hadoop Clusters
Yahoo Enabling Exploratory Analytics of Data in Shared-service Hadoop ClustersBrett Sheppard
 

More from Brett Sheppard (16)

5 ways-to-improve-your-security-with-splunk
5 ways-to-improve-your-security-with-splunk5 ways-to-improve-your-security-with-splunk
How Comcast Turns Big Data into Real Time Operational Insights: Winter Olympics Content Broadcasting

  • 1. National Engineering & Technical Operations How Comcast Turns Big Data into Real-Time Operational Insights Patrick Shumate CDN Engineer VSS CDN Engineering
  • 2. Speakers Patrick Shumate CDN Engineering @ Comcast –  Data nerd supporting Content Delivery –  Avid cyclist –  Home brewer Brett Sheppard Big Data @ Splunk –  Data nerd supporting Big Data Enterprise Architectures –  Avid runner –  Home drinker How Comcast Turns Big Data Into Real-Time Operational Insights | Strata | February 2014
  • 3. Agenda: Methods and Process (operating on data); CDN Operations; Sochi Winter Olympic Games
  • 4. Methods: Experimentation / Inquisition; Define KPI; Model Steady State; Predict Capacity; Effect without Causation
  • 5. Procedures: Track; Alarm (real time); Report (coffee time); Visualize; Paper-cuts vs. Antennas
  • 6. Comcast IPCDN Summary ●  Comcast Content Router –  Stateless –  DNS Round Robin ●  Rascal Health Monitoring ●  12 Monkeys Configuration Management ●  ATS Caches ●  Splunk Machine Data (Log) Collection and Analytics
  • 7. The Comcast Content Router (CCR) ●  Tomcat Java application built in-house ●  Multiple VMs around the country in DNS Round Robin ●  Routes “by” DNS, HTTP 302, or REST ●  Can route based on: –  Regexp on URL host name (DNS and HTTP 302 redirect) –  Regexp on URL Path and headers (HTTP 302 redirect) –  Client location ●  Coverage Zone File from network ●  Geo IP lookup –  Edge cache health –  Edge cache load
  • 8. Rascal ●  HTTP GETs vital stats from each cache every 5 seconds –  Modified stats_over_http plugin on caches exposes app & system stats ●  Determines and exposes state of caches to CRs ●  Can allow for real time monitoring / graphing of CDN ●  Can Expose 5 min avg/min/max to NE&TO Service Performance DB ●  Redundant by having 2 instances running independent of each other –  CRs pick one randomly
  • 9. Configuration Management ●  Twelve Monkeys tool built in-house ●  Web based jQuery UI ●  Mojolicious Perl framework ●  MySQL database ●  REST interfaces ●  Integrated into standard Ops methods and best practices from day one ●  Monitoring from Health Protocol through Rascal server
  • 10. The Caches - Software ●  Any HTTP 1.1 Compliant cache will work ●  We chose Apache Traffic Server (ATS) –  Top Level Apache project (NOT httpd!) –  Extremely scalable and proven –  Very good with our VOD load –  Efficient storage subsystem uses raw disks –  Extensible through plugin API –  Vibrant development community –  Added handful of plugins for specific use cases
  • 11. Machine Data Files and Reporting ●  Splunk> ●  The only commercial product we use ●  Well defined interfaces - No vendor lock-in possible ●  ipCDN usage metrics by delivery service
  • 12. Demos
  • 13. Splunk is a Different Approach for Raw Unstructured Big Data ●  Built by IT pros for IT pros: It’s all about the technical and business user from novice to guru ●  One code base: Laptop to datacenter, agent to server, native to virtual indexes ●  Open architecture: Files versus database, REST API, scriptable, SDKs ●  Flexible and extensible: Any data, any format, different views, built to be extended ●  Scales to big data: Not filtered, not “dumbed” down, not locked into a fixed schema ●  Transparent support: Public documentation, public roadmap, real engineers on IRC
  • 14. Inside Search-time Knowledge Extraction Automatically discovered fields and user-defined fields ... enable statistics and precise search on specific fields
  • 15. Real-time Analytics with Managed Forwarders Data Parsing Queue Parsing Pipeline •  Source, event typing •  Character set normalization •  Line breaking •  Timestamp identification •  Regex transforms Indexing Pipeline Real-time Buffer Raw data Index Files Real-time Search Process Monitor Input Index Queue TCP/UDP Input Scripted Input Splunk Index
  • 16. Data Models and Pivot •  Describe how underlying data is represented and accessed •  Drag-and-drop interface for non-specialists to analyze raw, unstructured data •  Click to visualize any chart type; reports dynamically update when fields change Select fields from data model Time window All chart types available in the chart toolbox Save report to share Data models: hierarchical object view of underlying data Add constraints to filter out events
  • 17. Integration Methods Dashboards and Views •  Simple XML, JavaScript, Django •  REST API •  iframe embed User Interface (UI) Extensibility •  Interactive dashboards and user workflows •  Custom styling, behavior & visuals •  Integrate charts, dashboards and query results into other applications •  Workflows can trigger an action in an external system or use REST endpoints •  ODBC driver to integrate with Tableau and other 3rd-party visualization software
  • 18. Winter Olympic Games 2014 in Sochi Sports! Wait how many time zones? Events - on-demand How quick can we get it “on menu” How do we track, troubleshoot, and triage
  • 19. A Good Day in Content
  • 20. What it Feels Like to Broadcast the Olympics Credit: Flickr User DVIDSHUB, via CC Credit: defense.gov Credit: hotlightsandcoldsteel.com
  • 21. Ingesting Data from Sochi
  • 22. Working with Multiple Providers for Sports Programming
  • 23. High-Definition and Standard-Definition Content Receipt Status
  • 24. Ingest Tracking
  • 25. Demos
  • 26. The Nouns Splunk Forwarders Flume (Kafka) Hadoop / Hive scripted inputs / outputs ETL to time series > Charts > wikis = dashboards API mining
  • 27. Turn Diverse Raw Unstructured Data into Operational Intelligence
  • 28. Search Commands and Graphing
  • 29. Operational Dashboards
  • 30. Demos
  • 31. Costs / Benefit MTTR Automation Reduction in skillset Fewer admins More SME
  • 32. National Engineering & Technical Operations
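
Slide 8 says Rascal polls each cache every 5 seconds and can expose 5-minute avg/min/max values. The rollup that implies might look roughly like the sketch below. This is an illustration only, not Comcast's implementation: the class name `RollingStats`, the constants, and the sample values are hypothetical, and the HTTP polling of the caches is left out.

```python
from collections import deque
from statistics import mean

# Assumed values taken from the slide text: one sample every 5 seconds,
# summarized over a 5-minute window.
POLL_INTERVAL_S = 5
WINDOW_S = 5 * 60
WINDOW_SAMPLES = WINDOW_S // POLL_INTERVAL_S  # 60 samples per window


class RollingStats:
    """Keeps the most recent samples of one metric and summarizes them."""

    def __init__(self, size=WINDOW_SAMPLES):
        # A bounded deque silently drops the oldest sample once it is full.
        self.samples = deque(maxlen=size)

    def add(self, value):
        """Record one polled value (hypothetical metric, e.g. bandwidth)."""
        self.samples.append(value)

    def summary(self):
        """Return avg/min/max over the current window, or None if empty."""
        if not self.samples:
            return None
        return {
            "avg": mean(self.samples),
            "min": min(self.samples),
            "max": max(self.samples),
        }


if __name__ == "__main__":
    stats = RollingStats()
    # Simulate 70 polls; only the last 60 remain in the window.
    for value in range(70):
        stats.add(float(value))
    print(stats.summary())
```

A bounded deque keeps memory constant per metric, which matters when many caches are each polled every few seconds.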