Submit Search
Upload
Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)
•
2 likes
•
2,291 views
Spark Summit
Follow
Presentation at Spark Summit 2015
Read less
Read more
Data & Analytics
Report
Share
Report
Share
1 of 9
Download now
Download to read offline
Recommended
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
Spark Summit
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
Spark Summit
Apache Spark in Scientific Applciations
Apache Spark in Scientific Applciations
Dr. Mirko Kämpf
Spark Summit East 2015 Keynote -- Databricks CEO Ion Stoica
Spark Summit East 2015 Keynote -- Databricks CEO Ion Stoica
Databricks
Data Science lifecycle with Apache Zeppelin and Spark by Moonsoo Lee
Data Science lifecycle with Apache Zeppelin and Spark by Moonsoo Lee
Spark Summit
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
Spark Summit
Spark in the Enterprise - 2 Years Later by Alan Saldich
Spark in the Enterprise - 2 Years Later by Alan Saldich
Spark Summit
The key to unlocking the Value in the IoT? Managing the Data!
The key to unlocking the Value in the IoT? Managing the Data!
DataWorks Summit/Hadoop Summit
Recommended
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
Hadoop and Spark-Perfect Together-(Arun C. Murthy, Hortonworks)
Spark Summit
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
Spark and Hadoop at Production Scale-(Anil Gadre, MapR)
Spark Summit
Apache Spark in Scientific Applciations
Apache Spark in Scientific Applciations
Dr. Mirko Kämpf
Spark Summit East 2015 Keynote -- Databricks CEO Ion Stoica
Spark Summit East 2015 Keynote -- Databricks CEO Ion Stoica
Databricks
Data Science lifecycle with Apache Zeppelin and Spark by Moonsoo Lee
Data Science lifecycle with Apache Zeppelin and Spark by Moonsoo Lee
Spark Summit
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
R&D to Product Pipeline Using Apache Spark in AdTech: Spark Summit East talk ...
Spark Summit
Spark in the Enterprise - 2 Years Later by Alan Saldich
Spark in the Enterprise - 2 Years Later by Alan Saldich
Spark Summit
The key to unlocking the Value in the IoT? Managing the Data!
The key to unlocking the Value in the IoT? Managing the Data!
DataWorks Summit/Hadoop Summit
Supercharge Splunk with Cloudera
Supercharge Splunk with Cloudera
Cloudera, Inc.
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Spark Summit
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Databricks
Pandas UDF: Scalable Analysis with Python and PySpark
Pandas UDF: Scalable Analysis with Python and PySpark
Li Jin
Data Platform at Twitter: Enabling Real-time & Batch Analytics at Scale
Data Platform at Twitter: Enabling Real-time & Batch Analytics at Scale
Sriram Krishnan
Auto Scaling Systems With Elastic Spark Streaming: Spark Summit East talk by ...
Auto Scaling Systems With Elastic Spark Streaming: Spark Summit East talk by ...
Spark Summit
Real World Use Cases: Hadoop and NoSQL in Production
Real World Use Cases: Hadoop and NoSQL in Production
Codemotion
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
Spark Summit
A Community Approach to Fighting Cyber Threats
A Community Approach to Fighting Cyber Threats
Cloudera, Inc.
GPU 101: The Beast In Data Centers
GPU 101: The Beast In Data Centers
Rommel Garcia
Announcing Databricks Cloud (Spark Summit 2014)
Announcing Databricks Cloud (Spark Summit 2014)
Databricks
Keynote – From MapReduce to Spark: An Ecosystem Evolves by Doug Cutting, Chie...
Keynote – From MapReduce to Spark: An Ecosystem Evolves by Doug Cutting, Chie...
Cloudera, Inc.
Machine learning at scale challenges and solutions
Machine learning at scale challenges and solutions
Stavros Kontopoulos
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
DataWorks Summit
Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...
DataWorks Summit/Hadoop Summit
State of the Art Robot Predictive Maintenance with Real-time Sensor Data
State of the Art Robot Predictive Maintenance with Real-time Sensor Data
Mathieu Dumoulin
Admiral Group
Admiral Group
DataWorks Summit/Hadoop Summit
Simplify and Scale Data Engineering Pipelines with Delta Lake
Simplify and Scale Data Engineering Pipelines with Delta Lake
Databricks
Apache solr performance and scalability effort update palo alto 2017%2 f7
Apache solr performance and scalability effort update palo alto 2017%2 f7
Cloudera, Inc.
Hadoop Everywhere
Hadoop Everywhere
DataWorks Summit/Hadoop Summit
Intro to Apache Spark by Marco Vasquez
Intro to Apache Spark by Marco Vasquez
MapR Technologies
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Cloudera, Inc.
More Related Content
What's hot
Supercharge Splunk with Cloudera
Supercharge Splunk with Cloudera
Cloudera, Inc.
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Spark Summit
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Databricks
Pandas UDF: Scalable Analysis with Python and PySpark
Pandas UDF: Scalable Analysis with Python and PySpark
Li Jin
Data Platform at Twitter: Enabling Real-time & Batch Analytics at Scale
Data Platform at Twitter: Enabling Real-time & Batch Analytics at Scale
Sriram Krishnan
Auto Scaling Systems With Elastic Spark Streaming: Spark Summit East talk by ...
Auto Scaling Systems With Elastic Spark Streaming: Spark Summit East talk by ...
Spark Summit
Real World Use Cases: Hadoop and NoSQL in Production
Real World Use Cases: Hadoop and NoSQL in Production
Codemotion
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
Spark Summit
A Community Approach to Fighting Cyber Threats
A Community Approach to Fighting Cyber Threats
Cloudera, Inc.
GPU 101: The Beast In Data Centers
GPU 101: The Beast In Data Centers
Rommel Garcia
Announcing Databricks Cloud (Spark Summit 2014)
Announcing Databricks Cloud (Spark Summit 2014)
Databricks
Keynote – From MapReduce to Spark: An Ecosystem Evolves by Doug Cutting, Chie...
Keynote – From MapReduce to Spark: An Ecosystem Evolves by Doug Cutting, Chie...
Cloudera, Inc.
Machine learning at scale challenges and solutions
Machine learning at scale challenges and solutions
Stavros Kontopoulos
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
DataWorks Summit
Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...
DataWorks Summit/Hadoop Summit
State of the Art Robot Predictive Maintenance with Real-time Sensor Data
State of the Art Robot Predictive Maintenance with Real-time Sensor Data
Mathieu Dumoulin
Admiral Group
Admiral Group
DataWorks Summit/Hadoop Summit
Simplify and Scale Data Engineering Pipelines with Delta Lake
Simplify and Scale Data Engineering Pipelines with Delta Lake
Databricks
Apache solr performance and scalability effort update palo alto 2017%2 f7
Apache solr performance and scalability effort update palo alto 2017%2 f7
Cloudera, Inc.
Hadoop Everywhere
Hadoop Everywhere
DataWorks Summit/Hadoop Summit
What's hot
(20)
Supercharge Splunk with Cloudera
Supercharge Splunk with Cloudera
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Learnings Using Spark Streaming and DataFrames for Walmart Search: Spark Summ...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Deep Learning on Apache Spark at CERN’s Large Hadron Collider with Intel Tech...
Pandas UDF: Scalable Analysis with Python and PySpark
Pandas UDF: Scalable Analysis with Python and PySpark
Data Platform at Twitter: Enabling Real-time & Batch Analytics at Scale
Data Platform at Twitter: Enabling Real-time & Batch Analytics at Scale
Auto Scaling Systems With Elastic Spark Streaming: Spark Summit East talk by ...
Auto Scaling Systems With Elastic Spark Streaming: Spark Summit East talk by ...
Real World Use Cases: Hadoop and NoSQL in Production
Real World Use Cases: Hadoop and NoSQL in Production
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
High Resolution Energy Modeling that Scales with Apache Spark 2.0 Spark Summi...
A Community Approach to Fighting Cyber Threats
A Community Approach to Fighting Cyber Threats
GPU 101: The Beast In Data Centers
GPU 101: The Beast In Data Centers
Announcing Databricks Cloud (Spark Summit 2014)
Announcing Databricks Cloud (Spark Summit 2014)
Keynote – From MapReduce to Spark: An Ecosystem Evolves by Doug Cutting, Chie...
Keynote – From MapReduce to Spark: An Ecosystem Evolves by Doug Cutting, Chie...
Machine learning at scale challenges and solutions
Machine learning at scale challenges and solutions
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
MaaS (Model as a Service): Modern Streaming Data Science with Apache Metron (...
Data infrastructure architecture for medium size organization: tips for colle...
Data infrastructure architecture for medium size organization: tips for colle...
State of the Art Robot Predictive Maintenance with Real-time Sensor Data
State of the Art Robot Predictive Maintenance with Real-time Sensor Data
Admiral Group
Admiral Group
Simplify and Scale Data Engineering Pipelines with Delta Lake
Simplify and Scale Data Engineering Pipelines with Delta Lake
Apache solr performance and scalability effort update palo alto 2017%2 f7
Apache solr performance and scalability effort update palo alto 2017%2 f7
Hadoop Everywhere
Hadoop Everywhere
Viewers also liked
Intro to Apache Spark by Marco Vasquez
Intro to Apache Spark by Marco Vasquez
MapR Technologies
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Cloudera, Inc.
Apache Spark at Viadeo
Apache Spark at Viadeo
Cepoi Eugen
Intro to Apache Spark
Intro to Apache Spark
Cloudera, Inc.
Recommendation Techn
Recommendation Techn
Ted Dunning
Big Data Paris
Big Data Paris
MapR Technologies
The Evolution of Data Analysis with Hadoop - StampedeCon 2014
The Evolution of Data Analysis with Hadoop - StampedeCon 2014
StampedeCon
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
Cloudera, Inc.
Intro to Spark development
Intro to Spark development
Spark Summit
Anatomy of Spark SQL Catalyst - Part 2
Anatomy of Spark SQL Catalyst - Part 2
datamantra
Moving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache Kudu
Cloudera, Inc.
Practical Machine Learning: Innovations in Recommendation Workshop
Practical Machine Learning: Innovations in Recommendation Workshop
MapR Technologies
Deep Dive : Spark Data Frames, SQL and Catalyst Optimizer
Deep Dive : Spark Data Frames, SQL and Catalyst Optimizer
Sachin Aggarwal
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Ted Dunning
Why Apache Spark is the Heir to MapReduce in the Hadoop Ecosystem
Why Apache Spark is the Heir to MapReduce in the Hadoop Ecosystem
Cloudera, Inc.
Spark Application for Time Series Analysis
Spark Application for Time Series Analysis
MapR Technologies
Hadoop and Hive in Enterprises
Hadoop and Hive in Enterprises
markgrover
Advanced Apache Spark Meetup Spark SQL + DataFrames + Catalyst Optimizer + Da...
Advanced Apache Spark Meetup Spark SQL + DataFrames + Catalyst Optimizer + Da...
Chris Fregly
Introduction to Spark (Intern Event Presentation)
Introduction to Spark (Intern Event Presentation)
Databricks
Real Time and Big Data – It’s About Time
Real Time and Big Data – It’s About Time
MapR Technologies
Viewers also liked
(20)
Intro to Apache Spark by Marco Vasquez
Intro to Apache Spark by Marco Vasquez
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Hive, Impala, and Spark, Oh My: SQL-on-Hadoop in Cloudera 5.5
Apache Spark at Viadeo
Apache Spark at Viadeo
Intro to Apache Spark
Intro to Apache Spark
Recommendation Techn
Recommendation Techn
Big Data Paris
Big Data Paris
The Evolution of Data Analysis with Hadoop - StampedeCon 2014
The Evolution of Data Analysis with Hadoop - StampedeCon 2014
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
Apache Kudu (Incubating): New Hadoop Storage for Fast Analytics on Fast Data ...
Intro to Spark development
Intro to Spark development
Anatomy of Spark SQL Catalyst - Part 2
Anatomy of Spark SQL Catalyst - Part 2
Moving Beyond Lambda Architectures with Apache Kudu
Moving Beyond Lambda Architectures with Apache Kudu
Practical Machine Learning: Innovations in Recommendation Workshop
Practical Machine Learning: Innovations in Recommendation Workshop
Deep Dive : Spark Data Frames, SQL and Catalyst Optimizer
Deep Dive : Spark Data Frames, SQL and Catalyst Optimizer
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Real-time Puppies and Ponies - Evolving Indicator Recommendations in Real-time
Why Apache Spark is the Heir to MapReduce in the Hadoop Ecosystem
Why Apache Spark is the Heir to MapReduce in the Hadoop Ecosystem
Spark Application for Time Series Analysis
Spark Application for Time Series Analysis
Hadoop and Hive in Enterprises
Hadoop and Hive in Enterprises
Advanced Apache Spark Meetup Spark SQL + DataFrames + Catalyst Optimizer + Da...
Advanced Apache Spark Meetup Spark SQL + DataFrames + Catalyst Optimizer + Da...
Introduction to Spark (Intern Event Presentation)
Introduction to Spark (Intern Event Presentation)
Real Time and Big Data – It’s About Time
Real Time and Big Data – It’s About Time
Similar to Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Stefan Lipp
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Stefan Lipp
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
Cloudera, Inc.
Analytics, Everywhere. Keys to Effective Analytics and Data Discovery
Analytics, Everywhere. Keys to Effective Analytics and Data Discovery
DLT Solutions
Cloud expo 10 myths rex wang oracle ss
Cloud expo 10 myths rex wang oracle ss
Rex Wang
Stl meetup cloudera platform - january 2020
Stl meetup cloudera platform - january 2020
Adam Doyle
Big Data Fundamentals 6.6.18
Big Data Fundamentals 6.6.18
Cloudera, Inc.
Big Data Fundamentals
Big Data Fundamentals
Cloudera, Inc.
Seeking Cybersecurity--Strategies to Protect the Data
Seeking Cybersecurity--Strategies to Protect the Data
Cloudera, Inc.
Part 2: A Visual Dive into Machine Learning and Deep Learning
Part 2: A Visual Dive into Machine Learning and Deep Learning
Cloudera, Inc.
NOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science Workbench
NOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science Workbench
NOVA DATASCIENCE
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
DataWorks Summit
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
Cloudera, Inc.
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
Timothy Spann
Get Started with Cloudera’s Cyber Solution
Get Started with Cloudera’s Cyber Solution
Cloudera, Inc.
Optimize your cloud strategy for machine learning and analytics
Optimize your cloud strategy for machine learning and analytics
Cloudera, Inc.
Data Science in the Enterprise
Data Science in the Enterprise
The Hive
The Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine Learning
Cloudera, Inc.
Datameer6 for prospects - june 2016_v2
Datameer6 for prospects - june 2016_v2
Datameer
Oracle presentation at Tech Summit PR 2014
Oracle presentation at Tech Summit PR 2014
Tech Summit PR 2014
Similar to Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)
(20)
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Big Data Integration Speedpitch at TDWI Munich June 2017
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Cloudera Analytics and Machine Learning Platform - Optimized for Cloud
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
Standing Up an Effective Enterprise Data Hub -- Technology and Beyond
Analytics, Everywhere. Keys to Effective Analytics and Data Discovery
Analytics, Everywhere. Keys to Effective Analytics and Data Discovery
Cloud expo 10 myths rex wang oracle ss
Cloud expo 10 myths rex wang oracle ss
Stl meetup cloudera platform - january 2020
Stl meetup cloudera platform - january 2020
Big Data Fundamentals 6.6.18
Big Data Fundamentals 6.6.18
Big Data Fundamentals
Big Data Fundamentals
Seeking Cybersecurity--Strategies to Protect the Data
Seeking Cybersecurity--Strategies to Protect the Data
Part 2: A Visual Dive into Machine Learning and Deep Learning
Part 2: A Visual Dive into Machine Learning and Deep Learning
NOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science Workbench
NOVA Data Science Meetup 2-21-2018 Presentation Cloudera Data Science Workbench
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Cloud-Native Machine Learning: Emerging Trends and the Road Ahead
Introducing the data science sandbox as a service 8.30.18
Introducing the data science sandbox as a service 8.30.18
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
April 2024 - NLIT Cloudera Real-Time LLM Streaming 2024
Get Started with Cloudera’s Cyber Solution
Get Started with Cloudera’s Cyber Solution
Optimize your cloud strategy for machine learning and analytics
Optimize your cloud strategy for machine learning and analytics
Data Science in the Enterprise
Data Science in the Enterprise
The Vision & Challenge of Applied Machine Learning
The Vision & Challenge of Applied Machine Learning
Datameer6 for prospects - june 2016_v2
Datameer6 for prospects - june 2016_v2
Oracle presentation at Tech Summit PR 2014
Oracle presentation at Tech Summit PR 2014
More from Spark Summit
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
Spark Summit
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
Spark Summit
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Spark Summit
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Spark Summit
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
Spark Summit
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
Spark Summit
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
Spark Summit
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
Spark Summit
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
Spark Summit
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub Wozniak
Spark Summit
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin Kim
Spark Summit
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Spark Summit
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Spark Summit
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
Spark Summit
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spark Summit
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim Simeonov
Spark Summit
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Spark Summit
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Spark Summit
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Spark Summit
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
Spark Summit
More from Spark Summit
(20)
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
FPGA-Based Acceleration Architecture for Spark SQL Qi Xie and Quanfu Wang
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
VEGAS: The Missing Matplotlib for Scala/Apache Spark with DB Tsai and Roger M...
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Apache Spark Structured Streaming Helps Smart Manufacturing with Xiaochang Wu
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
Improving Traffic Prediction Using Weather Data with Ramya Raghavendra
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
A Tale of Two Graph Frameworks on Spark: GraphFrames and Tinkerpop OLAP Artem...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
No More Cumbersomeness: Automatic Predictive Modeling on Apache Spark Marcin ...
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
Apache Spark and Tensorflow as a Service with Jim Dowling
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
MMLSpark: Lessons from Building a SparkML-Compatible Machine Learning Library...
Next CERN Accelerator Logging Service with Jakub Wozniak
Next CERN Accelerator Logging Service with Jakub Wozniak
Powering a Startup with Apache Spark with Kevin Kim
Powering a Startup with Apache Spark with Kevin Kim
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Improving Traffic Prediction Using Weather Datawith Ramya Raghavendra
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
Hiding Apache Spark Complexity for Fast Prototyping of Big Data Applications—...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
How Nielsen Utilized Databricks for Large-Scale Research and Development with...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Spline: Apache Spark Lineage not Only for the Banking Industry with Marek Nov...
Goal Based Data Production with Sim Simeonov
Goal Based Data Production with Sim Simeonov
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Preventing Revenue Leakage and Monitoring Distributed Systems with Machine Le...
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Getting Ready to Use Redis with Apache Spark with Dvir Volk
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
MatFast: In-Memory Distributed Matrix Computation Processing and Optimization...
Recently uploaded
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
voginip
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
chwongval
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
e4aez8ss
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
Seán Kennedy
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
yuu sss
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
17djon017
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
Boston Institute of Analytics
Real-Time AI Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
Timothy Spann
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Boston Institute of Analytics
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
Jeremy Anderson
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
jennyeacort
How we prevented account sharing with MFA
How we prevented account sharing with MFA
Andrei Kaleshka
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
📊 Markus Baersch
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
yuu sss
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
Seán Kennedy
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
aleedritatuxx
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
Human37
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
John Sterrett
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
MYRABACSAFRA2
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
jennyeacort
Recently uploaded
(20)
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
Multiple time frame trading analysis -brianshannon.pdf
Multiple time frame trading analysis -brianshannon.pdf
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
科罗拉多大学波尔得分校毕业证学位证成绩单-可办理
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
Real-Time AI Streaming - AI Max Princeton
Real-Time AI Streaming - AI Max Princeton
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Predictive Analysis for Loan Default Presentation : Data Analysis Project PPT
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
How we prevented account sharing with MFA
How we prevented account sharing with MFA
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
办美国阿肯色大学小石城分校毕业证成绩单pdf电子版制作修改#真实留信入库#永久存档#真实可查#diploma#degree
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
modul pembelajaran robotic Workshop _ by Slidesgo.pptx
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
DBA Basics: Getting Started with Performance Tuning.pdf
Identifying Appropriate Test Statistics Involving Population Mean
Identifying Appropriate Test Statistics Involving Population Mean
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Call Us ➥97111√47426🤳Call Girls in Aerocity (Delhi NCR)
Spark in the Hadoop Ecosystem-(Mike Olson, Cloudera)
1.
1 © Cloudera,
Inc. All rights reserved. Mike Olson | co-‐founder and chief strategy officer Spark in the Hadoop Ecosystem
2.
2 © Cloudera,
Inc. All rights reserved. Hadoop: From MapReduce to an Enterprise Data Hub Hadoop delivers: • One place for unlimited data • Unified, mulM-‐framework data access Enterprises require: • Leading Performance • Open Source, Open Standards • Enterprise Security • Data Governance • Complete Management Security and AdministraMon Unlimited Storage Process Discover Model Serve Deployment Flexibility On-‐Premises Appliances Engineered Systems Public Cloud Private Cloud Hybrid Cloud A modern data plaSorm plus what the enterprise requires.
3.
3 © Cloudera,
Inc. All rights reserved. Where Spark Fits in the Hadoop Ecosystem YARN: Shared resource management HDFS and HBase: Shared storage Impala Hive Pig MapReduce2 Search Spark Spark Streaming Hive (beta) Pig (beta) … With common • Security • Data governance • ConfiguraMon, deployment and operaMons across all components in the stack
4.
4 © Cloudera,
Inc. All rights reserved. Process millions of equity and bond market posiMons, and evaluate against future scenarios in minutes, versus days with MapReduce. Major Global Financial InsMtuMon
5.
5 © Cloudera,
Inc. All rights reserved. Monitor on-‐line user acMvity and opMmize content delivery and search results in real Mme. Large Consumer Company
6.
6 © Cloudera,
Inc. All rights reserved. Ingest and analyze complex data from a variety of sources conMnually, building new risk and value models in real Mme
7.
7 © Cloudera,
Inc. All rights reserved. Combine genomic and phenotype data with other data sources to understand disease onset and progression
8.
8 © Cloudera,
Inc. All rights reserved. Spark extends the Hadoop ecosystem with new analyMc and processing capabiliMes. 8 © Cloudera, Inc. All rights reserved.
9.
9 © Cloudera,
Inc. All rights reserved. Thank you! Mike Olson, chief strategy officer mike.olson@cloudera.com @mikeolson
Download now