4. About Alluxio
• Team
– Alluxio Creators and Top Developers/Committers (all top 8 committers)
• Investors
5. Performance Trend: Memory is Fast
• RAM throughput increasing exponentially
• Disk throughput increasing slowly
• Memory locality is key to interactive response times
11. Open Source Alluxio System
• The fastest-growing open source project in big data
• Over 250 contributors from over 100 organizations
12. Alluxio Benefits
• Flexibility
– Enable new workloads across any storage system
– Unified namespace enables applications to access data in any storage system
• Agility
– Work with the framework of your choice
– Work with the storage of your choice
• Performance
– High performance data access
• Cost
– Grow Storage and Compute independently
• Any application can access any data from any storage at memory speed.
13. New Features and Improvements in Alluxio 1.0 and 1.1
Gene Pang @ Alluxio, Inc.
June 15, 2016 @ Alluxio Meetup (hosted by Intel)
14. About Me
• Gene Pang - Software Engineer @ Alluxio, Inc.
• One of the core maintainers of Alluxio Open Source Project
• Ph.D. @ AMPLab, UC Berkeley
• Worked at Google before UC Berkeley
• Twitter: @unityxx
21. New Integrations
• Native OpenStack Swift Driver
– Improve performance, reduce complexity
• Alluxio to FUSE Connector
– Mount Alluxio to the local file system
• Google Cloud Storage
– Manage data on Google Cloud Platform
• Aliyun Object Storage Service
– Manage data on Alibaba Cloud
• Google Compute Engine
– Deploy Alluxio on Google Cloud Platform
22. Access Control (Alpha)
• User/Group Support
– Similar to POSIX permission model
• Command-line Permission Tools
– chown, chgrp, chmod
• Configuration Parameter
– alluxio.security.authorization.permission.enabled
• File System Permissions
– Similar to POSIX permission model
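As an illustration of what a POSIX-like permission model implies, the sketch below checks a request against owner, group, and other permission bits. It is a minimal sketch, not Alluxio's implementation: `FileMeta`, `is_allowed`, and the sample users and groups are hypothetical names chosen for the example.

```python
from dataclasses import dataclass

# Permission bits for one class (owner, group, or other), as in POSIX modes.
READ, WRITE, EXECUTE = 4, 2, 1


@dataclass
class FileMeta:
    """Hypothetical per-file metadata: owner, group, and an octal mode."""
    owner: str
    group: str
    mode: int  # e.g. 0o640 = rw- for owner, r-- for group, --- for other


def is_allowed(meta, user, groups, want):
    """Return True if `user` (a member of `groups`) holds all bits in `want`.

    Exactly one class applies, checked in POSIX order: owner, group, other.
    """
    if user == meta.owner:
        bits = (meta.mode >> 6) & 0o7
    elif meta.group in groups:
        bits = (meta.mode >> 3) & 0o7
    else:
        bits = meta.mode & 0o7
    return (bits & want) == want


report = FileMeta(owner="alice", group="analysts", mode=0o640)
print(is_allowed(report, "alice", set(), READ | WRITE))   # owner bits apply
print(is_allowed(report, "bob", {"analysts"}, WRITE))     # group bits apply
```

In this model, chown and chgrp change the `owner` and `group` fields, and chmod changes `mode`.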
23. Usability Improvements
• Write Location Policies
– Configure how to write data to Alluxio
• Simplified Configuration
– Customize with properties
• Automatic Metadata Loading
– Load metadata automatically
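To make the idea of a write location policy concrete, the sketch below shows two ways a client might pick a worker for a new block: choose the worker with the most free space, or rotate through workers in turn. This is an illustrative sketch only; `Worker`, `most_available_first`, and `RoundRobin` are hypothetical names and do not represent Alluxio's client API.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """Hypothetical view of a worker: its host name and free capacity."""
    host: str
    free_bytes: int


def most_available_first(workers, block_size):
    """Pick the worker with the most free space that can hold the block."""
    candidates = [w for w in workers if w.free_bytes >= block_size]
    if not candidates:
        return None
    return max(candidates, key=lambda w: w.free_bytes)


class RoundRobin:
    """Rotate through workers, skipping any that cannot hold the block."""

    def __init__(self, workers):
        self._workers = list(workers)
        self._next = 0

    def choose(self, block_size):
        # Try each worker at most once per call, starting where we left off.
        for _ in range(len(self._workers)):
            worker = self._workers[self._next]
            self._next = (self._next + 1) % len(self._workers)
            if worker.free_bytes >= block_size:
                return worker
        return None


workers = [Worker("a", 100), Worker("b", 500), Worker("c", 50)]
print(most_available_first(workers, 64).host)
```

The first policy favors balanced capacity usage, while the second spreads writes evenly across workers regardless of how full each one is.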
24. Performance Improvements
• Improved Alluxio Master Scalability
– Fine-grained locking, efficient journaling
• Better Support for Random I/O Workloads
– Cache blocks during random I/O (e.g., Parquet files)
• Improved Alluxio Worker Scalability
– Improved data structures, improved locking
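The sketch below illustrates why caching whole blocks helps random I/O: a small read at an arbitrary offset pulls the enclosing block from the slower source once, and later reads that land in the same block are served from memory. It is a minimal sketch under stated assumptions, not Alluxio's implementation; `BlockCachingReader`, `read_at`, and the unbounded in-memory cache are hypothetical simplifications.

```python
import io


class BlockCachingReader:
    """Serve positioned reads from fixed-size blocks cached in memory."""

    def __init__(self, source, block_size):
        self._source = source          # any seekable binary file-like object
        self._block_size = block_size
        self._cache = {}               # block index -> bytes (never evicted here)
        self.source_reads = 0          # how many times the source was read

    def _block(self, index):
        # Fetch the whole block on first touch, then reuse the cached copy.
        if index not in self._cache:
            self._source.seek(index * self._block_size)
            self._cache[index] = self._source.read(self._block_size)
            self.source_reads += 1
        return self._cache[index]

    def read_at(self, offset, length):
        """Read up to `length` bytes starting at `offset`."""
        out = bytearray()
        end = offset + length
        while offset < end:
            index, start = divmod(offset, self._block_size)
            chunk = self._block(index)[start:start + (end - offset)]
            if not chunk:
                break                  # reached the end of the source
            out += chunk
            offset += len(chunk)
        return bytes(out)


data = bytes(range(256)) * 4
reader = BlockCachingReader(io.BytesIO(data), block_size=256)
reader.read_at(300, 10)   # first touch of block 1: one source read
reader.read_at(260, 20)   # same block: served from the cache
print(reader.source_reads)
```

Columnar formats such as Parquet tend to issue many small reads at scattered offsets, which is the access pattern this kind of block caching targets. A real system would also bound the cache and evict blocks.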