Accelerating Spark Workloads in a Mesos Environment with Alluxio
Alluxio, Inc.
Presented at MesosCon EU (Prague) 2017 by Gene Pang, Software Engineer, Alluxio, Inc.
1.
Accelerating Spark Workloads in a Mesos Environment with Alluxio
Gene Pang, Software Engineer, Alluxio, Inc.
©2017 Alluxio, Inc. All Rights Reserved
2.
About Me
Gene Pang
• Software Engineer @ Alluxio, Inc.
• Alluxio Open Source PMC Member
• Ph.D. from AMPLab @ UC Berkeley
• Worked at Google before UC Berkeley
• Twitter: @unityxx
• GitHub: @gpang
3.
Outline
1) Alluxio Overview
2) Alluxio + Spark + Mesos Use Cases
3) Using Spark with Alluxio on Mesos
4) Deployment with Mesos
5) Demo
4.
Data Ecosystem Yesterday
• One Compute Framework
• Single Storage System
• Co-located
5.
Data Ecosystem Today
• Many Compute Frameworks
• Multiple Storage Systems
• Most not co-located
6.
Data Ecosystem Issues
• Each application manages multiple data sources
• Adding or removing data sources requires application changes
• Storage optimizations require application changes
• Lower performance due to lack of locality
7.
Data Ecosystem with Alluxio
• Apps only talk to Alluxio
• Simple Add/Remove
• No App Changes
• Memory Performance
8.
Next Gen Analytics with Alluxio
Apps, Data & Storage at Memory Speed
✓ Big Data/IoT
✓ AI/ML
✓ Deep Learning
✓ Cloud Migration
✓ Multi Platform
✓ Autonomous
Interfaces shown in the diagram:
• Native File System
• Hadoop Compatible File System
• Native Key-Value Interface
• Fuse Compatible File System
• HDFS Interface
• Amazon S3 Interface
• Swift Interface
• GlusterFS Interface
9.
Enabling Next Gen Analytics
1) Unify your Data
2) Architecture Flexibility
3) Improved I/O Performance
10.
Fastest Growing Big Data Open Source Project
• Fastest-growing open-source project in the big data ecosystem
• Running world’s largest production clusters
• 600+ contributors from 100+ organizations
11.
Outline
1) Alluxio Overview
2) Alluxio + Spark + Mesos Use Cases
3) Using Spark with Alluxio on Mesos
4) Deployment with Mesos
5) Demo
12.
Big Data Case Study
Challenge: Gain an end-to-end view of the business with a large volume of data for a $5B travel site. Queries were slow / not interactive, resulting in operational inefficiency.
Solution: With Alluxio, 300x improvement in performance
Impact: Increased revenue from immediate response to user behavior
Use case: http://bit.ly/2pDJdrq
[Diagram: Spark, Flink, HDFS, Ceph, Mesos]
13.
Machine Learning Case Study
Challenge: Disparate data both on-prem and in the cloud. Heterogeneous types of data. Scaling of exabyte-size data. Slow due to disk-based approach.
Solution: Using Alluxio to prevent I/O bottlenecks
Impact: Orders of magnitude higher performance than before
http://bit.ly/2p18ds3
[Diagram: Spark, HDFS, Minio, Mesos]
14.
Outline
1) Alluxio Overview
2) Alluxio + Spark + Mesos Use Cases
3) Using Spark with Alluxio on Mesos
4) Deployment with Mesos
5) Demo
15.
Sharing Data via Memory
Storage Engine & Execution Engine Same Process
• Two copies of data in memory (double the memory used)
• Inter-process sharing slowed down by network / disk I/O
[Diagram: two Spark processes on Mesos, each with Spark Compute and Spark Storage (block 1, block 3); HDFS / Amazon S3 (blocks 1 to 4)]
16.
Sharing Data via Memory
Storage Engine & Execution Engine Different Process
• Half the memory used
• Inter-process sharing happens at memory speed
[Diagram: two Spark processes (Spark Compute, Spark Storage) on Mesos; Alluxio (block 1, block 3, block 4); HDFS / Amazon S3 (blocks 1 to 4); HDFS disk (blocks 1 to 4)]
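The memory accounting behind the two "Sharing Data via Memory" slides can be sketched in a few lines. This is only an illustrative model, not Spark or Alluxio code: the job names and block ids are made up, and every block is assumed to be the same size.

```python
# Toy model: count in-memory block copies under the two layouts on the slides.
# Job names and block ids are hypothetical; blocks are assumed equal in size.

def copies_in_process(jobs):
    """Each job caches its own blocks inside its own process."""
    return sum(len(set(blocks)) for blocks in jobs.values())


def copies_shared(jobs):
    """All jobs read from one shared cache, so each block is stored once."""
    return len(set().union(*jobs.values()))


jobs = {"spark_job_a": [1, 3], "spark_job_b": [1, 3]}
print(copies_in_process(jobs))  # 4
print(copies_shared(jobs))      # 2
```

With two jobs reading the same two blocks, the shared layout holds half as many copies, which matches the "half the memory used" point on the slide.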
17.
Data Resilience During Crash
Storage Engine & Execution Engine Same Process
[Diagram: Spark Compute; Spark Storage (block 1, block 3); HDFS / Amazon S3 (blocks 1 to 4); Mesos]
18.
Data Resilience During Crash
Storage Engine & Execution Engine Same Process
• Process crash requires network and/or disk I/O to re-read data
[Diagram: CRASH; Spark Storage (block 1, block 3); HDFS / Amazon S3 (blocks 1 to 4); Mesos]
19.
Data Resilience During Crash
Storage Engine & Execution Engine Same Process
• Process crash requires network and/or disk I/O to re-read data
[Diagram: CRASH; HDFS / Amazon S3 (blocks 1 to 4); Mesos]
20.
Data Resilience During Crash
Storage Engine & Execution Engine Different Process
[Diagram: Spark Compute; Spark Storage; Alluxio (block 1, block 3, block 4); HDFS / Amazon S3 (blocks 1 to 4); HDFS disk (blocks 1 to 4); Mesos]
21.
Data Resilience During Crash
Storage Engine & Execution Engine Different Process
• Process crash: data is re-read at memory speed
[Diagram: CRASH; Alluxio (block 1, block 3, block 4); HDFS / Amazon S3 (blocks 1 to 4); HDFS disk (blocks 1 to 4); Mesos]
22.
Alluxio Architecture
[Diagram: Application with Alluxio Client; Alluxio Master; Alluxio Worker, Alluxio Worker, …; Storage, Storage, …]
23.
Alluxio Client
Applications interact with Alluxio via the Alluxio client
● Native Alluxio Filesystem Client
  • Alluxio-specific operations like [un]pin, [un]mount, [un]set TTL
● HDFS-Compatible Filesystem Client
  • No code change necessary
● S3 API
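As a rough illustration of what "no code change necessary" can look like with the HDFS-compatible client, the sketch below rewrites an `hdfs://` path into an `alluxio://` URI, so an application would only need a different input path. The host name `alluxio-master` and the helper `to_alluxio_uri` are hypothetical, port 19998 is assumed to be the master's RPC port, and the HDFS namespace is assumed to be mounted at the Alluxio root.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical Alluxio master address; 19998 is assumed as the RPC port.
ALLUXIO_AUTHORITY = "alluxio-master:19998"


def to_alluxio_uri(path, authority=ALLUXIO_AUTHORITY):
    """Rewrite an hdfs:// URI (or a bare absolute path) to an alluxio:// URI.

    Assumes the HDFS namespace is mounted at the Alluxio root, so the path
    component carries over unchanged.
    """
    parts = urlsplit(path)
    if parts.scheme not in ("", "hdfs"):
        raise ValueError(f"unsupported scheme: {parts.scheme!r}")
    if not parts.path.startswith("/"):
        raise ValueError("expected an absolute path")
    return urlunsplit(("alluxio", authority, parts.path, "", ""))


print(to_alluxio_uri("hdfs://namenode:8020/data/events.txt"))
# alluxio://alluxio-master:19998/data/events.txt
```

A Spark job would then pass the resulting URI wherever it previously passed the HDFS path, provided the Alluxio client jar is on the Spark classpath.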
24.
Alluxio Master
Master is responsible for managing metadata
● Filesystem namespace metadata
● Blocks / workers metadata
Primary master writes journal for durable operations
● Secondary masters replay journal entries
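The journal idea on this slide can be sketched with a toy model: the primary appends each metadata change to a journal, and a secondary rebuilds the same state by replaying the entries it has not yet applied. This is not Alluxio's implementation; the entry format and all names below are assumptions made for illustration.

```python
# Toy model of journal-based metadata replication (not Alluxio's actual code).
# The ("create" | "delete", path) entry format is an assumption.

class MasterState:
    def __init__(self):
        self.paths = set()   # stands in for filesystem namespace metadata
        self.applied = 0     # number of journal entries applied so far

    def apply(self, entry):
        op, path = entry
        if op == "create":
            self.paths.add(path)
        elif op == "delete":
            self.paths.discard(path)
        else:
            raise ValueError(f"unknown journal op: {op}")
        self.applied += 1


def primary_write(journal, state, entry):
    """Primary appends the entry to the journal, then applies it."""
    journal.append(entry)
    state.apply(entry)


def secondary_replay(journal, state):
    """Secondary applies every journal entry it has not seen yet."""
    for entry in journal[state.applied:]:
        state.apply(entry)


journal = []
primary, secondary = MasterState(), MasterState()
primary_write(journal, primary, ("create", "/data/a"))
primary_write(journal, primary, ("create", "/data/b"))
primary_write(journal, primary, ("delete", "/data/a"))
secondary_replay(journal, secondary)
print(secondary.paths == primary.paths)  # True
```

Because the secondary tracks how many entries it has applied, calling `secondary_replay` again after further writes only processes the new entries.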
25.
Alluxio Worker
Worker is responsible for managing block data
Worker stores block data on various storage media
● HDD, SSD, Memory
Reads and writes data to underlying storage systems
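A minimal sketch of the worker's role, assuming a simple fastest-first lookup across memory, SSD, and HDD with a fallback to the underlying storage system. The tier order, the cache-on-read policy, and the write-through behavior are assumptions for illustration only, not a description of Alluxio's actual policies.

```python
# Toy model of a worker with tiered block storage (not Alluxio's actual code).
# Tier order, cache-on-read, and write-through are assumptions.

TIERS = ("MEM", "SSD", "HDD")  # fastest to slowest


class Worker:
    def __init__(self, under_storage):
        self.tiers = {name: {} for name in TIERS}
        self.under_storage = under_storage  # dict standing in for HDFS, S3, etc.

    def read(self, block_id):
        """Return (data, source), checking the fastest tier first."""
        for name in TIERS:
            if block_id in self.tiers[name]:
                return self.tiers[name][block_id], name
        data = self.under_storage[block_id]   # miss: read from under storage
        self.tiers["MEM"][block_id] = data    # keep a copy for later reads
        return data, "UNDER_STORAGE"

    def write(self, block_id, data):
        """Store the block in memory and persist it to the under storage."""
        self.tiers["MEM"][block_id] = data
        self.under_storage[block_id] = data


worker = Worker(under_storage={1: b"block-1"})
print(worker.read(1)[1])  # UNDER_STORAGE
print(worker.read(1)[1])  # MEM
```

The second read is served from the memory tier, which is the effect the earlier crash-resilience slides rely on.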
26.
Outline
1) Alluxio Overview
2) Alluxio + Spark + Mesos Use Cases
3) Using Spark with Alluxio on Mesos
4) Deployment with Mesos
5) Demo
27.
Alluxio on DC/OS
28.
Alluxio on DC/OS
Alluxio brings
• A unified view of data across disparate storage systems
• High performance & predictable SLA for analytics workloads
DC/OS makes provisioning infrastructure easy
• Automates provisioning, management & elastic scaling
Benefits include:
• Faster analytics with Spark and other frameworks
• Process data from hybrid cloud storage systems (HDFS, S3, etc.)
29.
Outline
1) Alluxio Overview
2) Alluxio + Spark + Mesos Use Cases
3) Using Spark with Alluxio on Mesos
4) Deployment with Mesos
5) Demo
30.
Demo Environment
[Diagram: Spark, Alluxio, Mesos]
31.
Demo Setup
• Alluxio 1.5.0
• DC/OS 1.9.4
• Spark 2.0.2
• Amazon EC2 (m3.xlarge)
32.
Results
8x improvement
33.
Conclusion
• Easy to use Alluxio with Spark in a Mesos environment
• Predictable and improved performance
• Easily connect to various storage systems
34.
Thank you!
Gene Pang, Software Engineer
gene@alluxio.com
Social Media: Twitter.com/alluxio, Linkedin.com/alluxio
Website: www.alluxio.com
E-mail: info@alluxio.com