C* Summit 2013: The State of CQL
Sylvain Lebresne (@pcmanus)
June 12, 2013
Why CQL?
(Rationale and goals behind CQL)
What is CQL?
(How do you model an application with CQL)
The native protocol
(Transporting CQL queries)
What's next?
(Cassandra 2.0 and beyond)
Disclaimer
· This presentation focuses exclusively on CQL version 3. Many things do not apply to CQL versions 1 and 2.
· Unless explicitly stated otherwise, the terms rows and columns mean CQL3 rows and CQL3 columns, which do not map directly to the notion of rows and columns in Thrift (or the internal C* implementation).
Why?
Rationale and goals behind CQL
Cassandra has often been regarded as hard to develop against.
The Thrift API is:
· Not user friendly, hard to use.
· Low level.
· Very little abstraction.
· Hard to evolve (in a backward compatible way).
It doesn't have to be that way!
Why the hell a SQL look-alike query language?!
· Very easy to read.
· Programming language independent.
· Ubiquitous, widely known.
· Copy/paste friendly.
· Easy to evolve.
· Does not imply slow.
· Doesn't force you to work with strings.
So why not?
Hence, CQL
· "Denormalized SQL"
· Strictly real-time oriented
  - No joins
  - No sub-queries
  - No aggregation
  - Limited ORDER BY
CQL: the 'C' stands for Cassandra
Goals:
· Provide a user friendly, productive API for C*.
· Make it easy to do the right thing, hard to do the wrong one.
· Provide higher level constructs for useful modeling patterns.
· Be a complete alternative to the Thrift API.
Not goals:
· Be SQL.
· Abstract C* (useful) specificities away (distribution awareness, C* storage engine, ...).
· Be slow.
What is CQL?
How do you model an application with CQL
Cassandra modeling 101
Efficient queries in Cassandra boil down to:
1. Data locality at the cluster level: a query should only hit one node.
2. Data locality at the node level: the C* storage engine allows data collocation on disk.
Denormalization is the technique that achieves this in practice.
But this implies the API should:
· expose how to collocate data in the same replica set.
· expose how to collocate data on disk (for a given replica).
· allow querying data that is collocated.
The Thrift API allows that. So does CQL.
A naive e-mailing application
We want to model:
· Users
· Emails
· User inboxes (all emails received by a user, in chronological order)
Storing user profiles
CREATE TABLE users (
    user_id uuid,
    name text,
    password text,
    email text,
    picture_profile blob,
    PRIMARY KEY (user_id)
);

-- This is really an UPSERT
INSERT INTO users (user_id, name, password, email, picture_profile)
VALUES (51b-23-ab8, 'Sylvain Lebresne', 'Hd3!ba', 'lebresne@gmail.com', 0xf8ac...);

-- This too is an UPSERT
UPDATE users SET email = 'sylvain@datastax.com', password = 'B9a1^' WHERE user_id = 51b-23-ab8;
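The UPSERT behaviour noted in the comments above can be illustrated with a small in-memory model. This is only a sketch, not how Cassandra is implemented: the `Table` class and its `insert`/`update` methods are hypothetical names, and a plain dict keyed by the primary key stands in for the storage engine.

```python
class Table:
    """Toy model of a CQL table keyed by its primary key (hypothetical helper)."""

    def __init__(self):
        self.rows = {}

    def _upsert(self, key, columns):
        # Create the row if it is missing, then overwrite only the given columns.
        self.rows.setdefault(key, {}).update(columns)

    def insert(self, key, **columns):
        # Like a CQL INSERT: does not fail if the row already exists.
        self._upsert(key, columns)

    def update(self, key, **columns):
        # Like a CQL UPDATE: does not fail if the row does not exist yet.
        self._upsert(key, columns)


users = Table()
users.update("51b-23-ab8", email="sylvain@datastax.com")  # row did not exist yet
users.insert("51b-23-ab8", name="Sylvain Lebresne")       # row already exists
print(users.rows["51b-23-ab8"])
```

In this model both statements follow the same write path, which is why neither needs to check whether the row is already there.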
· The first component of the PRIMARY KEY is called the partition key.
· All the data sharing the same partition key is stored on the same replica set.
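To make "same partition key, same replica set" concrete, here is a minimal sketch of one way a partition key could be mapped to a set of nodes. Everything in it is an assumption made for illustration: the node names, the replication factor, the use of MD5 as the hash, and the simplified ring with one token per node. It is not a description of Cassandra's actual partitioner or replication strategy.

```python
import bisect
import hashlib


def token(value: str) -> int:
    # Assumption: any stable hash works for the illustration; MD5 is used here.
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest(), "big")


class Ring:
    """Simplified token ring: each hypothetical node owns a single token."""

    def __init__(self, nodes, replication_factor):
        self.ring = sorted((token(name), name) for name in nodes)
        self.tokens = [t for t, _ in self.ring]
        self.rf = replication_factor

    def replicas(self, partition_key: str):
        # Find the first node whose token is >= the key's token (wrapping
        # around the ring), then take it and the next rf - 1 nodes.
        start = bisect.bisect_left(self.tokens, token(partition_key))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(self.rf)]


ring = Ring(["node-a", "node-b", "node-c", "node-d", "node-e"], replication_factor=3)
first = ring.replicas("51b-23-ab8")
second = ring.replicas("51b-23-ab8")
print(first == second)  # the same key always maps to the same replica set
```

Because the mapping depends only on the partition key, every row sharing that key lands on the same nodes, which is what lets a query for one key touch a single replica set.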
Allowing user defined properties
Say we want users to be able to add a set of custom properties to their own profile:

ALTER TABLE users ADD user_props map<text, text>;
UPDATE users SET user_props['myProperty'] = 'Whatever I want' WHERE user_id = 51b-23-ab8;
SELECT * FROM users;

 user_id    | email              | name             | password | picture_profile | user_props
------------+--------------------+------------------+----------+-----------------+--------------------------------------
 51b-23-ab8 | lebresne@gmail.com | Sylvain Lebresne | B9a1^    | 0xf8ac...       | { 'myProperty' : 'Whatever I want' }
Storing emails
CREATE TABLE emails (
    email_id timeuuid PRIMARY KEY,  -- Embeds the email creation date
    subject text,
    sender uuid,
    recipients set<uuid>,
    body text
);
-- Inserts emails...

Only "indexed" queries are allowed. You cannot do:

SELECT * FROM emails WHERE sender = 51b-23-ab8;

That is, unless you explicitly index the sender column using:

CREATE INDEX ON emails (sender);
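The comment on `email_id` says that a timeuuid embeds the email creation date. As a rough illustration of what that means, the sketch below recovers the timestamp from a version 1 (time-based) UUID using Python's standard `uuid` module. The `date_of` function is a hypothetical stand-in named after the CQL `dateOf` function; it is not the server-side implementation.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Version 1 UUID timestamps count 100-nanosecond intervals since 1582-10-15 (RFC 4122).
GREGORIAN_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def date_of(time_uuid: uuid.UUID) -> datetime:
    """Return the creation time embedded in a version 1 UUID."""
    if time_uuid.version != 1:
        raise ValueError("not a time-based (version 1) UUID")
    # UUID.time is in 100 ns units; integer-divide by 10 to get microseconds.
    return GREGORIAN_EPOCH + timedelta(microseconds=time_uuid.time // 10)


email_id = uuid.uuid1()          # generated "now"
created_at = date_of(email_id)   # approximately the current UTC time
print(created_at.isoformat())
```

Since the identifier already carries its own timestamp, the table needs no separate creation-date column, and ordering by `email_id` is ordering by time.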
Inboxes
· For each user, the inbox is the list of that user's emails, sorted chronologically.
· To display the inbox, we need, for each email, the subject plus the names and email addresses of the sender and recipients.
· In a traditional RDBMS, we could join the users and emails tables.
  - Good luck scaling that!
· In Cassandra, we denormalize. That is, we store the pre-computed result of the queries we care about (an always up-to-date materialized view).
  - Collocate all the data for an inbox on the same node.
  - Collocate all inbox emails on disk, in the order queried.
  - This is typically the time-series kind of model for which Cassandra shines.
Storing inboxes
CREATE TABLE inboxes (
    user_id uuid,
    email_id timeuuid,
    sender_email text,
    recipients_emails set<text>,
    subject text,
    is_read boolean,
    PRIMARY KEY (user_id, email_id)
) WITH CLUSTERING ORDER BY (email_id DESC);

CQL distinguishes 2 sub-parts in the PRIMARY KEY:
· partition key: decides the node on which the data is stored
· clustering columns: within the same partition key, (CQL3) rows are physically ordered following the clustering columns
In practice, we want emails stored in reverse chronological order, hence the CLUSTERING ORDER BY clause.
Storing inboxes cont'd
In this example, this allows efficient queries over a time range of emails for a given inbox.

-- Get all emails for user 51b-23-ab8 since Jan 01, 2013 in reverse chronological order.
SELECT email_id, dateOf(email_id), sender_email, recipients_emails, subject, is_read
FROM inboxes
WHERE user_id = 51b-23-ab8 AND email_id > minTimeuuid('2013-01-01 00:00+0000')
ORDER BY email_id DESC;

 email_id   | dateOf(email_id)      | sender_email                             | recipients_emails                        | subject                | is_read
------------+-----------------------+------------------------------------------+------------------------------------------+------------------------+---------
 d20-32-012 | 2013-06-24 00:42+0000 | Yuki Morishita <yuki@datastax.com>       | { 'Sylvain Lebresne' }                   | あなたに幸せな誕生日   | false
 17a-bf-65f | 2013-03-01 17:03+0000 | Aleksey Yeschenko <aleksey@datastax.com> | { 'Sylvain Lebresne' }                   | RE: What do you think? | true
 a9c-13-9da | 2013-02-10 04:12+0000 | Brandon Williams <brandon@datastax.com>  | { 'Jonathan Ellis', 'Sylvain Lebresne' } | dtests are broken!?@#  | true
 241-b4-ca0 | 2013-01-04 12:45+0000 | Jonathan Ellis <jbellis@datastax.com>    | { 'Sylvain Lebresne' }                   | Whatzz up?             | true
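Why a time-range query is cheap when rows are clustered in descending order can be sketched with a sorted list per partition. This is a toy model under stated assumptions: the `Inboxes` class is hypothetical, integer dates of the form YYYYMMDD stand in for timeuuid values, and the 2012 email is made up for the example. It says nothing about the real on-disk format.

```python
import bisect
from collections import defaultdict


class Inboxes:
    """Toy model: one sorted list of rows per partition key (hypothetical helper)."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, user_id, email_date, subject):
        # Storing the negated date keeps each partition sorted newest-first,
        # mimicking CLUSTERING ORDER BY (email_id DESC).
        bisect.insort(self.partitions[user_id], (-email_date, subject))

    def since(self, user_id, min_date):
        rows = self.partitions.get(user_id, [])
        # Rows strictly newer than min_date form a contiguous prefix of the list,
        # so one binary search finds where to stop reading.
        end = bisect.bisect_left(rows, (-min_date,))
        return [(-neg_date, subject) for neg_date, subject in rows[:end]]


inboxes = Inboxes()
inboxes.insert("51b-23-ab8", 20130210, "dtests are broken!?@#")
inboxes.insert("51b-23-ab8", 20121225, "older email (made up for this example)")
inboxes.insert("51b-23-ab8", 20130301, "RE: What do you think?")
inboxes.insert("51b-23-ab8", 20130104, "Whatzz up?")

recent = inboxes.since("51b-23-ab8", 20130101)
for email_date, subject in recent:
    print(email_date, subject)
```

The result comes back already in reverse chronological order and is read as one contiguous slice, which is the property the clustering columns are meant to provide.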
Handling huge inboxes
What if inboxes can become too big? The traditional solution is to shard each inbox into suitably sized time buckets (say, a year), so that a whole inbox is not stored on one node.
This can easily be done using a composite partition key:

CREATE TABLE inboxes (
    user_id uuid,
    year int,
    email_id timeuuid,
    sender_email text,
    recipients_names text,
    subject text,
    PRIMARY KEY ((user_id, year), email_id)
) WITH CLUSTERING ORDER BY (email_id DESC);
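With a composite partition key such as (user_id, year), a read has to name the full partition key, so a query that spans several years becomes one query per year bucket. The sketch below shows how an application might enumerate those buckets; `inbox_partitions` is a hypothetical helper, not part of any driver.

```python
from datetime import date


def inbox_partitions(user_id, start: date, end: date):
    """Return the (user_id, year) partition keys covering [start, end], newest first."""
    if start > end:
        raise ValueError("start must not be after end")
    return [(user_id, year) for year in range(end.year, start.year - 1, -1)]


keys = inbox_partitions("51b-23-ab8", date(2011, 11, 3), date(2013, 6, 12))
for user_id, year in keys:
    # One query per bucket, for example:
    #   SELECT ... FROM inboxes WHERE user_id = ? AND year = ?
    print(user_id, year)
```

Listing the newest bucket first matches the descending clustering order, so an application that pages through an inbox can stop as soon as it has enough emails.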
Upgrading from thrift
· CQL uses the same internal storage engine as Thrift.
· CQL can read your existing Thrift column families (no data migration needed):

    cqlsh> USE "<keyspace_name>";
    cqlsh> DESCRIBE "<column_family_name>";
    cqlsh> SELECT * FROM "<column_family_name>" LIMIT 20;

· You can read CQL3 tables from Thrift, but this is not easy in practice because some CQL3 metadata is not exposed through Thrift, for compatibility reasons.
· CQL is meant to be an alternative to Thrift, not a complement to it.
For more details on the relationship between Thrift and CQL:
· http://www.datastax.com/dev/blog/thrift-to-cql3
· http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows
The native protocol
Transporting CQL queries
The native protocol
A binary transport for CQL3:
· Asynchronous (allows multiple concurrent queries per connection)
· Server notifications (only for generic cluster events currently)
· Made for CQL3
Want to know more about drivers using this native protocol? Stay in the room for Michaël and Patrick's talk.
What's next?
Cassandra 2.0 and beyond
Cassandra 2.0: CQL3
· Compare-and-swap support

    UPDATE login SET password = 'fs3!c' WHERE username = 'pcmanus' IF NOT EXISTS;
    UPDATE users SET email = 'sylvain@datastax.com' WHERE user_id = 51b-23-ab8 IF email = 'slebresne@apache.org';

· Triggers
· Allow preparation of TIMESTAMP, TTL and LIMIT.
· Secondary indexing of primary key columns
· ALTER ... DROP
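The conditional `UPDATE ... IF email = ...` statement above is a compare-and-swap: the write is applied only if the current value matches the expected one. The single-process sketch below illustrates just those semantics. The `Row` class is hypothetical, and the lock is an assumption standing in for the coordination between replicas that a real cluster would need, which this sketch does not attempt to model.

```python
import threading


class Row:
    """Toy row supporting a conditional update (hypothetical helper)."""

    def __init__(self, **columns):
        self._columns = dict(columns)
        self._lock = threading.Lock()

    def update_if(self, expected: dict, new_values: dict) -> bool:
        """Apply new_values only if every expected column currently matches."""
        with self._lock:
            if any(self._columns.get(k) != v for k, v in expected.items()):
                return False  # condition failed, nothing is written
            self._columns.update(new_values)
            return True

    def get(self, column):
        with self._lock:
            return self._columns.get(column)


user = Row(email="slebresne@apache.org")
condition = {"email": "slebresne@apache.org"}
first = user.update_if(condition, {"email": "sylvain@datastax.com"})
second = user.update_if(condition, {"email": "someone-else@example.com"})
print(first, second, user.get("email"))
```

The second attempt is rejected because the value it expected has already been replaced, which is the guarantee that makes this kind of statement useful for concurrent writers.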
Cassandra 2.0: Native protocol
· One-shot prepare-and-execute message
· Batching of prepared statements
· SASL authentication
· Automatic query paging
After C* 2.0
Continue to improve the user experience by facilitating good data modeling, while respecting Cassandra's inherent specificities.
· Storage engine optimizations
· Secondary indexing of collections
· Aggregations within a partition
· User defined 'struct' types
· ...
Thank You!
(Questions?)

 
9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team9 Steps For Building Winning Founding Team
9 Steps For Building Winning Founding Team
 
Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1Secure your environment with UiPath and CyberArk technologies - Session 1
Secure your environment with UiPath and CyberArk technologies - Session 1
 
Babel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxBabel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptx
 

C* Summit 2013: The State of CQL by Sylvain Lebresne

  • 1. The State of CQL Sylvain Lebresne (@pcmanus) June 12, 2013
  • 2. Why CQL? (Rationale and goals behind CQL). What is CQL? (How do you model an application with CQL). The native protocol (Transporting CQL queries). What's next? (Cassandra 2.0 and beyond).
  • 3. Disclaimer
      · This presentation focuses exclusively on CQL version 3. Many things do not apply to CQL versions 1 and 2.
      · Unless explicitly stated otherwise, the terms rows and columns mean CQL3 rows and CQL3 columns, which do not map directly to the notion of rows and columns in Thrift (or the internal C* implementation).
  • 5. Cassandra has often been regarded as hard to develop against. The Thrift API is:
      · Not user friendly, hard to use.
      · Low level.
      · Very little abstraction.
      · Hard to evolve (in a backward compatible way).
      It doesn't have to be that way!
  • 6. Why the hell a SQL look-alike query language?!
      · Very easy to read.
      · Programming language independent.
      · Ubiquitous, widely known.
      · Copy/paste friendly.
      · Easy to evolve.
      · Does not imply slow.
      · Doesn't force you to work with strings.
      So why not?
  • 7. Hence, CQL: "Denormalized SQL", strictly real-time oriented: no joins, no sub-queries, no aggregation, limited ORDER BY.
  • 8. CQL: the 'C' stands for Cassandra
      Goals:
      · Provide a user friendly, productive API for C*.
      · Make it easy to do the right thing, hard to do the wrong one.
      · Provide higher level constructs for useful modeling patterns.
      · Be a complete alternative to the Thrift API.
      Not goals:
      · Be SQL.
      · Abstract C* (useful) specificities away (distribution awareness, C* storage engine, ...).
      · Be slow.
  • 9. What is CQL? How do you model an application with CQL
  • 10. Cassandra modeling 101
      Efficient queries in Cassandra boil down to:
      1. Data locality at the cluster level: a query should only hit one node.
      2. Data locality at the node level: the C* storage engine allows data collocation on disk.
      Denormalization is the technique that achieves this in practice. But this implies the API should:
      · expose how to collocate data in the same replica set.
      · expose how to collocate data on disk (for a given replica).
      · allow querying data that is collocated.
      The Thrift API allows that. So does CQL.
  • 11. A naive e-mailing application
      We want to model:
      · Users
      · Emails
      · Users' inboxes (all emails received by a user, in chronological order)
  • 13. Allowing user defined properties
      Say we want users to be able to add a set of custom properties to their own profile:
      CQL:
        ALTER TABLE users ADD user_props map<text, text>;
        UPDATE users SET user_props['myProperty'] = 'Whatever I want' WHERE user_id = 51b-23-ab8;
        SELECT * FROM users;
      Result:
        user_id    | email              | name             | password | picture_profile | user_props
        51b-23-ab8 | lebresne@gmail.com | Sylvain Lebresne | B9a1^    | 0xf8ac...       | { 'myProperty' : 'Whatever I want' }
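
As a rough illustration of what the map update on this slide does, the following Python sketch models one users row as a dictionary whose user_props field is itself a dictionary. The helper name set_user_prop, the in-memory row, and the second property are assumptions made for illustration; none of them come from CQL or from a driver.

```python
from typing import Dict


def set_user_prop(row: Dict[str, object], key: str, value: str) -> None:
    """Mimic `UPDATE users SET user_props[key] = value`: only one map entry changes."""
    props = row.setdefault("user_props", {})  # create the map on first use
    props[key] = value


# Hypothetical in-memory stand-in for one row of the users table.
user = {"user_id": "51b-23-ab8", "name": "Sylvain Lebresne"}
set_user_prop(user, "myProperty", "Whatever I want")
set_user_prop(user, "timezone", "Europe/Paris")  # hypothetical second property
```

Setting a second key leaves the first one untouched, which is the point of updating a single map entry rather than rewriting the whole column.
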
  • 14. Storing emails
      CQL:
        CREATE TABLE emails (
            email_id timeuuid PRIMARY KEY,  -- Embeds the email creation date
            subject text,
            sender uuid,
            recipients set<uuid>,
            body text
        );
        -- Inserts emails...
      Only "indexed" queries are allowed. You cannot do:
        SELECT * FROM emails WHERE sender = 51b-23-ab8;
      That is, unless you explicitly index sender using:
        CREATE INDEX ON emails(sender);
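
To make the "indexed queries only" rule concrete, here is a minimal Python sketch in which lookups by email_id use the primary-key dictionary and lookups by sender only work because a second dictionary is maintained on every write. The class EmailStore and its method names are hypothetical, and the sketch says nothing about how Cassandra implements its indexes internally.

```python
from collections import defaultdict


class EmailStore:
    """Toy table with a primary-key lookup and one explicitly maintained index."""

    def __init__(self):
        self.by_id = {}                    # primary key: email_id -> row
        self.by_sender = defaultdict(set)  # hypothetical index: sender -> email_ids

    def insert(self, email_id, sender, subject):
        self.by_id[email_id] = {"email_id": email_id, "sender": sender, "subject": subject}
        self.by_sender[sender].add(email_id)  # the index is updated on every write

    def select_by_sender(self, sender):
        # Without by_sender, this would require scanning every row.
        return [self.by_id[i] for i in sorted(self.by_sender.get(sender, ()))]


store = EmailStore()
store.insert(1, "51b-23-ab8", "Hello")
store.insert(2, "other-user", "Unrelated")
store.insert(3, "51b-23-ab8", "Hello again")
from_sylvain = store.select_by_sender("51b-23-ab8")
```
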
  • 15. Inboxes
      · For each user, its inbox is the list of its emails, sorted chronologically. To display the inbox, we need for each email the subject, plus the names and emails of the sender and recipients.
      · In a traditional RDBMS, we could join the users and emails tables.
          - Good luck scaling that!
      · In Cassandra, we denormalize. That is, we store the pre-computed result of the queries we care about (an always up-to-date materialized view).
          - Collocate all the data for an inbox on the same node.
          - Collocate all inbox emails on disk, in the order queried.
          - This is typically the time-series kind of model at which Cassandra shines.
  • 16. Storing inboxes
      CQL distinguishes 2 sub-parts in the PRIMARY KEY:
      · partition key: decides the node on which the data is stored
      · clustering columns: within the same partition key, (CQL3) rows are physically ordered following the clustering columns
      In practice, we are interested in having emails stored in reverse chronological order.
      CQL:
        CREATE TABLE inboxes (
            user_id uuid,
            email_id timeuuid,
            sender_email text,
            recipients_emails set<text>,
            subject text,
            is_read boolean,
            PRIMARY KEY (user_id, email_id)
        ) WITH CLUSTERING ORDER BY (email_id DESC);
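
The two roles of the PRIMARY KEY can be sketched in a few lines of Python: the partition key picks a node, and rows inside a partition are kept sorted by the clustering column. Everything here is an assumption made for illustration: the node names, the hash-modulo placement (Cassandra's actual placement scheme is not described on the slide), the integer timestamps standing in for timeuuids, and the class Inboxes.

```python
import hashlib
from bisect import insort

NODES = ["node-a", "node-b", "node-c"]  # hypothetical 3-node cluster


def node_for(partition_key: str) -> str:
    """Map a partition key to a node with a stable hash (simplified placement)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]


class Inboxes:
    """Rows grouped by partition key and kept sorted newest-first within a partition."""

    def __init__(self):
        self.partitions = {}  # user_id -> sorted list of (-email_ts, subject)

    def insert(self, user_id, email_ts, subject):
        rows = self.partitions.setdefault(user_id, [])
        insort(rows, (-email_ts, subject))  # negated timestamp gives DESC order

    def select(self, user_id):
        return [(-neg_ts, subject) for neg_ts, subject in self.partitions.get(user_id, [])]


inboxes = Inboxes()
inboxes.insert("51b-23-ab8", 100, "oldest")
inboxes.insert("51b-23-ab8", 300, "newest")
inboxes.insert("51b-23-ab8", 200, "middle")
newest_first = inboxes.select("51b-23-ab8")
```

Because the rows are stored already sorted, reading an inbox newest-first is a sequential read of one partition rather than a sort at query time.
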
  • 17. Storing inboxes cont'd
      In this example, this allows efficient queries over a time range of emails for a given inbox.
      CQL:
        -- Get all emails for user 51b-23-ab8 since Jan 01, 2013 in reverse chronological order.
        SELECT email_id, dateOf(email_id), sender_email, recipients_emails, subject, is_read
        FROM inboxes
        WHERE user_id = 51b-23-ab8 AND email_id > minTimeuuid('2013-01-01 00:00+0000')
        ORDER BY email_id DESC;
      Result:
        email_id | dateOf(email_id) | sender_email | recipients_emails | subject | is_read
        d20-32-012 | 2013-06-24 00:42+0000 | Yuki Morishita <yuki@datastax.com> | { 'Sylvain Lebresne' } | あなたに幸せな誕生日 | false
        17a-bf-65f | 2013-03-01 17:03+0000 | Aleksey Yeschenko <aleksey@datastax.com> | { 'Sylvain Lebresne' } | RE: What do you think? | true
        a9c-13-9da | 2013-02-10 04:12+0000 | Brandon Williams <brandon@datastax.com> | { 'Jonathan Ellis', 'Sylvain Lebresne' } | dtests are broken!?@# | true
        241-b4-ca0 | 2013-01-04 12:45+0000 | Jonathan Ellis <jbellis@datastax.com> | { 'Sylvain Lebresne' } | Whatzz up? | true
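
The query on this slide relies on a timeuuid embedding its creation time. The Python sketch below shows that idea with version-1 UUIDs from the standard uuid module. The helpers timeuuid_at and date_of are hypothetical names that only play the roles of minTimeuuid and dateOf; the UUIDs they build are not claimed to be byte-identical to what Cassandra generates.

```python
import uuid
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
GREGORIAN_OFFSET = 0x01B21DD213814000  # 100-ns ticks from 1582-10-15 to 1970-01-01


def timeuuid_at(dt: datetime) -> uuid.UUID:
    """Build a version-1 UUID carrying dt, with clock sequence and node zeroed."""
    delta = dt - EPOCH
    ticks = (delta.days * 86400 + delta.seconds) * 10_000_000 + delta.microseconds * 10
    ticks += GREGORIAN_OFFSET
    time_low = ticks & 0xFFFFFFFF
    time_mid = (ticks >> 32) & 0xFFFF
    time_hi_version = ((ticks >> 48) & 0x0FFF) | 0x1000  # version 1
    return uuid.UUID(fields=(time_low, time_mid, time_hi_version, 0x80, 0, 0))


def date_of(u: uuid.UUID) -> datetime:
    """Recover the embedded timestamp from a version-1 UUID."""
    return EPOCH + timedelta(microseconds=(u.time - GREGORIAN_OFFSET) // 10)


since = timeuuid_at(datetime(2013, 1, 1, tzinfo=timezone.utc))
inbox = [
    timeuuid_at(datetime(2012, 12, 31, 23, 59, tzinfo=timezone.utc)),
    timeuuid_at(datetime(2013, 6, 24, 0, 42, tzinfo=timezone.utc)),
    timeuuid_at(datetime(2013, 1, 4, 12, 45, tzinfo=timezone.utc)),
]
# Compare on the embedded time, not on the raw UUID value, whose leading bits
# hold the low part of the timestamp.
recent = sorted((u for u in inbox if u.time > since.time), key=lambda u: u.time, reverse=True)
```
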
  • 18. Handling huge inboxes
      What if inboxes can become too big? The traditional solution consists of sharding inboxes into suitable time shards (say, a year), to avoid storing a whole inbox on one node. This can easily be done using a composite partition key:
      CQL:
        CREATE TABLE inboxes (
            user_id uuid,
            year int,
            email_id timeuuid,
            sender_email text,
            recipients_names text,
            subject text,
            PRIMARY KEY ((user_id, year), email_id)
        ) WITH CLUSTERING ORDER BY (email_id DESC);
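
With a composite partition key, the application has to compute the shard when writing and enumerate shards when reading across years. A minimal sketch of both steps, assuming hypothetical helper names partition_key and partitions_to_query:

```python
from datetime import datetime, timezone
from typing import List, Tuple


def partition_key(user_id: str, received_at: datetime) -> Tuple[str, int]:
    """Shard an inbox by year: the pair (user_id, year) selects the partition."""
    return (user_id, received_at.year)


def partitions_to_query(user_id: str, start_year: int, end_year: int) -> List[Tuple[str, int]]:
    """List the partitions a reader must visit, newest year first."""
    return [(user_id, year) for year in range(end_year, start_year - 1, -1)]


key = partition_key("51b-23-ab8", datetime(2013, 6, 24, 0, 42, tzinfo=timezone.utc))
to_read = partitions_to_query("51b-23-ab8", 2011, 2013)
```

The trade-off is that a query spanning several years becomes several single-partition queries, one per shard.
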
  • 19. Upgrading from Thrift
      · CQL uses the same internal storage engine as Thrift.
      · CQL can read your existing Thrift column families (no data migration needed):
        CQL:
          cqlsh> USE "<keyspace_name>";
          cqlsh> DESCRIBE "<column_family_name>";
          cqlsh> SELECT * FROM "<column_family_name>" LIMIT 20;
      · You can read CQL3 tables from Thrift, but this is not easy in practice because some CQL3 metadata is not exposed through Thrift for compatibility reasons.
      · CQL is meant to be an alternative to Thrift, not a complement to it.
      For more details on the relationship between Thrift and CQL:
      · http://www.datastax.com/dev/blog/thrift-to-cql3
      · http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows
  • 21. The native protocol
      A binary transport for CQL3:
      · Asynchronous (allows multiple concurrent queries per connection)
      · Server notifications (only for generic cluster events currently)
      · Made for CQL3
      Want to know more about drivers using this native protocol? Stay in the room for Michaël and Patrick's talk.
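
"Multiple concurrent queries per connection" can be pictured as tagging each request with an identifier and matching responses to requests as they come back, in whatever order. The asyncio sketch below is a toy model under that assumption. The Connection class, the fake server, and the delays are all hypothetical, and no real wire format is implemented.

```python
import asyncio


class Connection:
    """Toy multiplexed connection: each request carries its own stream id."""

    def __init__(self):
        self._pending = {}  # stream id -> Future awaiting the response
        self._tasks = set()  # keep references so tasks are not garbage collected
        self._next_id = 0

    async def query(self, cql: str, delay: float) -> str:
        stream_id = self._next_id
        self._next_id += 1
        future = asyncio.get_running_loop().create_future()
        self._pending[stream_id] = future
        # Stand-in for the server: it answers after `delay` seconds.
        task = asyncio.create_task(self._fake_server(stream_id, cql, delay))
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return await future

    async def _fake_server(self, stream_id: int, cql: str, delay: float) -> None:
        await asyncio.sleep(delay)
        self._on_response(stream_id, f"result of: {cql}")

    def _on_response(self, stream_id: int, payload: str) -> None:
        # Responses are routed by stream id, so arrival order does not matter.
        self._pending.pop(stream_id).set_result(payload)


async def main():
    conn = Connection()
    # The second query finishes first, yet both share one connection object.
    return await asyncio.gather(
        conn.query("SELECT * FROM users", delay=0.02),
        conn.query("SELECT * FROM emails", delay=0.01),
    )


results = asyncio.run(main())
```
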
  • 23. Cassandra 2.0: CQL3
      · Compare-and-swap support
        CQL:
          UPDATE login SET password = 'fs3!c' WHERE username = 'pcmanus' IF NOT EXISTS;
          UPDATE users SET email = 'sylvain@datastax.com' WHERE user_id = 51b-23-ab8 IF email = 'slebresne@apache.org';
      · Triggers
      · Allow preparation of TIMESTAMP, TTL and LIMIT.
      · Secondary indexing of primary key columns
      · ALTER ... DROP
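
The semantics of the two conditional statements on this slide can be sketched in Python: a write is applied only if the stated condition holds at the moment of the write, and the caller learns whether it was applied. The class ConditionalTable is hypothetical, and the local lock merely stands in for whatever coordination the cluster performs, which the slide does not describe.

```python
import threading


class ConditionalTable:
    """Toy table supporting 'IF NOT EXISTS' and 'IF column = value' writes."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()  # stand-in for cluster-wide coordination

    def write_if_not_exists(self, key, row) -> bool:
        with self._lock:
            if key in self._rows:
                return False  # condition failed: nothing is written
            self._rows[key] = dict(row)
            return True

    def update_if(self, key, column, expected, new_value) -> bool:
        with self._lock:
            row = self._rows.get(key)
            if row is None or row.get(column) != expected:
                return False  # condition failed: the row is left untouched
            row[column] = new_value
            return True

    def get(self, key):
        return dict(self._rows[key])


users = ConditionalTable()
created = users.write_if_not_exists("51b-23-ab8", {"email": "slebresne@apache.org"})
duplicate = users.write_if_not_exists("51b-23-ab8", {"email": "someone@example.com"})
swapped = users.update_if("51b-23-ab8", "email", "slebresne@apache.org", "sylvain@datastax.com")
stale = users.update_if("51b-23-ab8", "email", "slebresne@apache.org", "other@example.com")
```
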
  • 24. Cassandra 2.0: Native protocol
      · One-shot prepare-and-execute message
      · Batching of prepared statements
      · SASL authentication
      · Automatic query paging
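
Automatic query paging can be understood as the client fetching a bounded page at a time and carrying an opaque "where to resume" state between requests, while the application simply iterates over rows. The sketch below assumes a hypothetical fetch_page function over an in-memory list and an integer offset as the paging state; the real protocol's paging state format is not described on the slide.

```python
from typing import Iterator, List, Optional, Tuple

Page = Tuple[List[int], Optional[int]]  # (rows, state for the next page, or None)


def fetch_page(data: List[int], page_size: int, state: Optional[int]) -> Page:
    """Return one page of rows plus the state needed to resume, if any rows remain."""
    start = state or 0
    rows = data[start:start + page_size]
    end = start + page_size
    return rows, (end if end < len(data) else None)


def paged(data: List[int], page_size: int) -> Iterator[int]:
    """Iterate over all rows while requesting only one page at a time."""
    state = None
    while True:
        rows, state = fetch_page(data, page_size, state)
        yield from rows
        if state is None:
            return


all_rows = list(paged(list(range(7)), page_size=3))
```
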
  • 25. After C* 2.0
      Continue to improve the user experience by facilitating good data modeling, while respecting Cassandra's inherent specificities.
      · Storage engine optimizations
      · Secondary indexing of collections
      · Aggregations within a partition
      · User defined 'struct' types
      · ...