Cassandra Summit 2015 - Building a multi-tenant API PaaS with DataStax Enterprise Search

Agenda
1. Introduction
2. Persistence needs of an API PaaS
3. Selecting DataStax Enterprise Search
4. Main challenges and solutions
5. Conclusion
6. Q&A

About the Speakers
● Jérôme Louvel
○ founder & CTO of Restlet, Web API platform vendor
○ created Restlet Framework, first REST framework in 2004
○ contributor to “RESTful Web Services” (O’Reilly, 2007)
○ member of the JAX-RS 1.0 expert group (2007 - 2009)
○ co-author of “Restlet in Action” (Manning, 2012)
○ InfoQ editor covering Web APIs since 2014
● Guillaume Blondeau
○ DevOps engineer at Restlet
○ working on APISpark cloud platform
○ Cassandra Administrator certified by DataStax

About APISpark
● Key features
○ visual creation & deployment of data APIs
○ operation of APIs & their local data sources
○ management of any API
● Benefits
○ accessible via web browser, no technical expertise required
○ companies of any size can become API providers
○ get started for free, then pay when the API generates traffic

Persistence Needs of an API PaaS

High Availability of APIs and their Data Stores

Low Latency for Users Across the Globe
Rugby World Cup Data

High Scalability & Elasticity
● For API traffic
○ concurrent calls
○ workload types
○ peaks handling
● For data storage
○ number of stores
○ volume of data ...

Rich Query Capabilities
● Filtering on properties
● Pagination
● Sorting

High Multi-tenant Density
● Balance between
○ data isolation
○ low cost
● Many customers & projects
○ sharing persistence infrastructure
○ isolated data stores
● Many users & groups
○ personal data
○ shared group data

Selecting DataStax Enterprise Search

Step 1: Prototyping with AWS NoSQL
● Started with SimpleDB
○ zero ops, highly available & low latency
○ mono-region & limited query capabilities
● Upgraded to DynamoDB
○ better scalability & predictability
○ not really for multi-tenant use cases (soft limits)
○ not very elastic (provisioned throughput)
● Other limitations
○ unable to develop and test locally (MySQL mode)
○ strong AWS lock-in

Step 2: Moving to Apache Cassandra
● For APISpark beta version
○ increasing multi-tenancy needs
○ increasing cost concerns
● Benefits
○ fully open source & free (vendor support)
○ on-premise deployments possible
○ proven scalability on AWS (Netflix)
○ richer query capabilities
○ natively multi-region

Step 3: Upgrading to DataStax Enterprise
● For APISpark GA
○ DataStax certified stack
○ production support
● Improved capabilities
○ much richer query capabilities with Solr integration
○ administration console
○ command line tooling
○ comprehensive documentation
● Still open source foundation
○ limited vendor lock-in
○ mature open source components

Current Persistence Design
[Diagram labels: Entity Store, Entity, Property, Primary Key]

7 Main Challenges & Solutions
DataStax Enterprise Search 4.6.7 (Cassandra 2.0.14, Solr 4.6.0)

I. Deploying Across Multiple Regions
● Using Ec2MultiRegionSnitch
● 1 Entity Store = 1 Keyspace
○ Each keyspace can set its own replication policy
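
As an illustration of "1 Entity Store = 1 Keyspace" with a per-keyspace replication policy, the sketch below builds a CREATE KEYSPACE statement that uses NetworkTopologyStrategy. The helper name, the keyspace name, and the datacenter names (`us-east`, `eu-west`) are hypothetical and do not come from the deck; actual datacenter names depend on how the snitch maps EC2 regions in a given cluster.

```python
import re

# Conservative identifier check (an assumption of this sketch, not a DSE rule).
_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def create_keyspace_cql(keyspace, replication):
    """Build a CREATE KEYSPACE statement for one Entity Store.

    `replication` maps a datacenter name to its replication factor,
    so each store can choose the regions in which its data lives.
    """
    if not _IDENTIFIER.fullmatch(keyspace):
        raise ValueError(f"invalid keyspace name: {keyspace!r}")
    if not replication:
        raise ValueError("at least one datacenter is required")
    # Sort the datacenters so the generated statement is deterministic.
    dc_options = ", ".join(
        f"'{dc}': {int(rf)}" for dc, rf in sorted(replication.items())
    )
    return (
        f"CREATE KEYSPACE IF NOT EXISTS {keyspace} "
        f"WITH replication = {{'class': 'NetworkTopologyStrategy', {dc_options}}};"
    )


# Hypothetical store replicated in two regions with different factors.
print(create_keyspace_cql("store_42", {"us-east": 3, "eu-west": 2}))
```

A store whose users are all in one region could pass a single datacenter, which keeps its data out of the other regions entirely.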

II. Isolating Customer Data & Keeping Cost Low
● 1 Entity Store = 1 Keyspace
○ data isolated on the file system and in memory
● Complementary benefit
○ ACL per keyspace
[Diagram labels: Keyspace, Table]

III. Supporting Complex Data Models
Composite property
List property
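
The deck later notes that adding or deleting a property inside a composite is done by altering JSON for all rows, which suggests composite values are stored as JSON text. The sketch below shows that kind of rewrite on in-memory rows. It is written in Python for brevity (the deck mentions Java), and the function names, the row layout, and the `address` column are all hypothetical.

```python
import json


def add_composite_property(json_text, name, default=None):
    """Return the composite JSON with `name` added if it is missing."""
    data = json.loads(json_text) if json_text else {}
    data.setdefault(name, default)
    return json.dumps(data, sort_keys=True)


def drop_composite_property(json_text, name):
    """Return the composite JSON with `name` removed if it is present."""
    data = json.loads(json_text) if json_text else {}
    data.pop(name, None)
    return json.dumps(data, sort_keys=True)


def rewrite_all_rows(rows, column, transform):
    """Apply `transform` to the composite column of every row.

    `rows` is a list of dicts standing in for rows read from the store;
    a real migration would read and write them back in batches.
    """
    return [{**row, column: transform(row.get(column))} for row in rows]


rows = [
    {"id": 1, "address": '{"city": "Paris"}'},
    {"id": 2, "address": None},  # composite never set on this row
]
updated = rewrite_all_rows(
    rows, "address", lambda text: add_composite_property(text, "zip", "")
)
for row in updated:
    print(row)
```

Because every row has to be rewritten, the cost of this kind of change grows with the size of the table, unlike a column-level ALTER TABLE.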

IV. Dealing with Dynamic Schema Changes (1/3)
ALTER TABLE DROP
ALTER TABLE ADD

User Action on Entity Store | Action performed in DB
Create Entity | CQL: “CREATE TABLE <tableName>” + Solr Core creation
Delete Entity | CQL: “DROP TABLE <tableName>”
Create Property | CQL: “ALTER TABLE <tableName> ADD <columnName> <type>” + Solr Core schema update
Delete Property | CQL: “ALTER TABLE <tableName> DROP <columnName>” + Solr Core schema update
Add Property in composite | Java: Alter JSON for all rows
Delete Property in composite | Java: Alter JSON for all rows
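
The mapping from user actions to CQL statements can be sketched as a small dispatcher. This is only an illustration under stated assumptions: the action names, the `id text PRIMARY KEY` schema for new tables, and the list of allowed types are hypothetical, and the Solr core creation and schema updates named in the table are not covered.

```python
import re

_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")
# Small allow-list of CQL types, chosen for this sketch only.
_ALLOWED_TYPES = {"text", "int", "bigint", "boolean", "double", "timestamp"}


def _identifier(name):
    """Validate a table or column name before placing it in a statement."""
    if name is None or not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"invalid CQL identifier: {name!r}")
    return name


def ddl_for_action(action, table, column=None, cql_type=None):
    """Translate a user action on an Entity Store into one CQL statement."""
    table = _identifier(table)
    if action == "create_entity":
        # Assumed minimal schema: a single text primary key.
        return f"CREATE TABLE {table} (id text PRIMARY KEY);"
    if action == "delete_entity":
        return f"DROP TABLE {table};"
    if action == "create_property":
        if cql_type not in _ALLOWED_TYPES:
            raise ValueError(f"unsupported type: {cql_type!r}")
        return f"ALTER TABLE {table} ADD {_identifier(column)} {cql_type};"
    if action == "delete_property":
        return f"ALTER TABLE {table} DROP {_identifier(column)};"
    raise ValueError(f"unknown action: {action!r}")


print(ddl_for_action("create_property", "contact", "email", "text"))
print(ddl_for_action("delete_property", "contact", "email"))
```

Validating identifiers matters here because table and column names come from tenants and cannot be passed as bound parameters in DDL statements.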

● Advantages
○ flexibility compared to RDBMS
■ no lock
○ available actions
■ add / drop / rename column
■ change type of column
● Limitations
○ schema deployment can take time
○ in some edge cases can’t recreate columns

V. High Multi-tenant Density (1/2)
Schema deployment time with growing # of tables

V. High Multi-tenant Density (2/2)
● Challenge
○ large number of C* tables & Solr cores
○ memory usage (e.g. 1 C* table takes more than 1 MB of heap)
● Solutions
○ adjust JVM memory settings
○ create additional clusters
○ deprovision unused Entity Stores
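
The memory figure above lends itself to a rough capacity estimate. The sketch below treats 1 MB per table as a lower bound, so the heap figure is a minimum and the store count is an upper bound; Solr core overhead is ignored. The function names and the sample numbers (500 stores, 4 tables per store, a 4096 MB budget) are hypothetical.

```python
def min_heap_mb(stores, tables_per_store, mb_per_table=1.0):
    """Lower bound on the heap consumed by table metadata alone."""
    if stores < 0 or tables_per_store < 0:
        raise ValueError("counts must be non-negative")
    return stores * tables_per_store * mb_per_table


def max_stores(heap_budget_mb, tables_per_store, mb_per_table=1.0):
    """Upper bound on the Entity Stores that fit in a heap budget."""
    if tables_per_store <= 0:
        raise ValueError("tables_per_store must be positive")
    return int(heap_budget_mb // (tables_per_store * mb_per_table))


# 500 stores with 4 tables each need at least 2000 MB of heap.
print(min_heap_mb(500, 4))
# A 4096 MB budget holds at most 1024 such stores.
print(max_stores(4096, 4))
```

An estimate like this gives a simple trigger for the listed solutions: once the projected store count approaches the bound, unused stores are deprovisioned or a new cluster is added.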

VI. Query Capabilities (1/2)
Search queries
Upsert / Delete / “Get by id” queries

VI. Query Capabilities (2/2)
Solr Queries
● Filtering on a property
● Pagination
● Sorting
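
The three capabilities map onto standard Solr request parameters: `fq` for filtering, `start` and `rows` for pagination, and `sort` for ordering. The deck does not show how APISpark issues its queries, so the sketch below only builds the parameter list; the function name, the 1-based page convention, and the field names are hypothetical.

```python
from urllib.parse import urlencode


def solr_params(filters, page, page_size, sort_by=None, descending=False):
    """Build Solr query parameters for filtering, pagination, and sorting.

    `filters` maps a field name to the exact value it must match.
    `page` is 1-based (an assumption of this sketch).
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    params = [("q", "*:*")]  # match everything, then narrow with filters
    for field, value in sorted(filters.items()):
        # Quote the value as a phrase and escape backslashes and quotes.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        params.append(("fq", f'{field}:"{escaped}"'))
    params.append(("start", str((page - 1) * page_size)))
    params.append(("rows", str(page_size)))
    if sort_by:
        direction = "desc" if descending else "asc"
        params.append(("sort", f"{sort_by} {direction}"))
    return params


params = solr_params({"country": "France"}, page=3, page_size=20, sort_by="name")
print(params)
print(urlencode(params))  # URL-encoded form, ready to append to a request
```

Offset-based paging with `start` gets more expensive on deep pages, so a large result set may call for a different paging strategy.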

VII. Analytics (1/2)
Provide analytics about API calls

VII. Analytics (2/2)
used for latest API calls
issue with wide rows (heavily used APIs)
1 table per report
use of C* counters
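
A counter-based report table can be sketched as follows. The deck only says "1 table per report" and "use of C* counters"; the table name `calls_per_day`, its columns, and the choice of a daily report are assumptions made for this example. The statement uses `?` placeholders as in a prepared statement.

```python
from datetime import datetime, timezone

# Hypothetical report table: one counter per (api_id, day).
COUNTER_UPDATE = (
    "UPDATE calls_per_day SET calls = calls + 1 WHERE api_id = ? AND day = ?"
)


def counter_increment(api_id, called_at):
    """Return the counter statement and bound values for one API call.

    `called_at` must be timezone-aware; it is bucketed by UTC day.
    """
    if called_at.tzinfo is None:
        raise ValueError("called_at must be timezone-aware")
    day = called_at.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return COUNTER_UPDATE, (api_id, day)


statement, values = counter_increment(
    "api-1", datetime(2015, 9, 23, 23, 30, tzinfo=timezone.utc)
)
print(statement)
print(values)
```

Keeping one small counter row per API and day avoids the ever-growing partition that a raw log of calls would produce for a heavily used API.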

Conclusion
● Special use case of DataStax Enterprise
○ not a lot of shared knowledge about it
○ great support from DataStax
○ DSE is a good fit despite some challenges
● Looking forward to DSE 4.8!
○ User Defined Types with Solr indexing
○ live indexing of C* data into Solr
○ improved overall performance
