SlideShare a Scribd company logo
1 of 42
Download to read offline
Cassandra : Introduction
Patrick McFadin
Chief Evangelist/Solution Architect - DataStax
@PatrickMcFadin
©2013 DataStax Confidential. Do not distribute without consent.
Who I am

• Patrick McFadin
• Solution Architect at DataStax
• Cassandra MVP
• User for years
• Follow me for more:

Dude.
Uptime == $$

@PatrickMcFadin

I talk about Cassandra and building scalable, resilient apps ALL THE TIME!
!2
Five Years of Cassandra

0.1
Jul-08

...

0.3
0

0.6
1

0.7

1.0
2

1.2
3

DSE

4

2.0
5
Why Cassandra?
The Best
Persistence
Tier
For Your
Application
!
!
!
!
!
!
!
!
Cassandra - An introduction
Cassandra - Roots
• Based on Amazon Dynamo and Google BigTable paper
• Shared nothing
• Data safe as possible
• Predictable scaling

Dynamo

BigTable
!7
Cassandra - More than one server
Each node owns
25% of the data

• All nodes participate in a cluster
• Shared nothing
• Add or remove as needed
25%

• More capacity? Add a server


25%

25%

25%

!8
Core Concepts Write path

<row,column>

Compacted later
Core Concepts Read Path

Real user story
• New app
• SSDs
• 2.5 m requests
• Client P99: 3.17ms!
Cassandra - Locally Distributed
• Client writes to any node
• Node coordinates with others
• Data replicated in parallel
• Replication factor: How many
copies of your data?
• RF = 3 here

!11
Cassandra - Consistency
• Consistency Level (CL)
• Client specifies per read or write

• ALL = All replicas ack
• QUORUM = > 51% of replicas ack
• LOCAL_QUORUM = > 51% in local DC ack
• ONE = Only one replica acks
!12
Cassandra - Transparent to the application
• A single node failure shouldn’t bring failure
• Replication Factor + Consistency Level = Success
• This example:
• RF = 3
• CL = QUORUM

>51% Ack so we are good!

!13
My favorite feature.

Ever!

!14
Cassandra - Geographically Distributed
• Client writes local
• Data syncs across WAN
• Replication Factor per DC

!15
Cassandra Applications - Drivers
• DataStax Drivers for Cassandra
• Java
• C#
• Python
• more on the way

!16
Cassandra Applications - Connecting
• Create a pool of local servers
• Client just uses session to interact with Cassandra
!
contactPoints = {“10.0.0.1”,”10.0.0.2”}!

!

keyspace = “videodb”!

!
!

public VideoDbBasicImpl(List<String> contactPoints, String keyspace) {!
cluster = Cluster!
.builder()!
.addContactPoints(!
contactPoints.toArray(new String[contactPoints.size()]))!
.withLoadBalancingPolicy(Policies.defaultLoadBalancingPolicy())!
.withRetryPolicy(Policies.defaultRetryPolicy())!
.build();!

!

!

session = cluster.connect(keyspace);!
}

!17
CQL Intro
• Cassandra Query Language
• SQL–like language to query Cassandra
• Limited predicates. Attempts to prevent bad queries
• But still offers enough leeway to get into trouble

!18
Data Model Logical containers
Cluster - Contains all nodes. Even across WAN
Keyspace - Contains all tables. Specifies replication
Table (Column Family) - Contains rows
CQL Intro
• CREATE / DROP / ALTER TABLE
• SELECT
!

• BUT
• INSERT AND UPDATE are similar to each other
• If a row doesn’t exist, UPDATE will insert it, and if it exists, INSERT will replace it.
• Think of it as an UPSERT
• Therefore we never get a key violation
• For updates, Cassandra never reads (no col = col + 1)

!20
Data Modeling Creating Tables
CREATE TABLE user (!
! username varchar,!
! firstname varchar,!
! lastname varchar,!
! shopping_carts set<varchar>,!
! PRIMARY KEY (username)!
);

Collection!
CREATE TABLE shopping_cart (!
! username varchar,!
! cart_name text!
! item_id int,!
! item_name varchar,!
description varchar,!
! price float,!
! item_detail map<varchar,varchar>!
! PRIMARY KEY
((username,cart_name),item_id)!
);

Creates compound partition row key
CQL Inserts
• Insert will always overwrite

INSERT INTO users (username, firstname, lastname, !
email, password, created_date)!
VALUES ('pmcfadin','Patrick','McFadin',!
['patrick@datastax.com'],'ba27e03fd95e507daf2937c937d499ab',!
'2011-06-20 13:50:00');!

!22
CQL Selects
• No joins
• Data is returned in row/column format
SELECT username, firstname, lastname, !
email, password, created_date!
FROM users!
WHERE username = 'pmcfadin';!

username | firstname | lastname | email
| password
| created_date!
----------+-----------+----------+--------------------------+----------------------------------+--------------------------!
pmcfadin |
Patrick | McFadin | ['patrick@datastax.com'] | ba27e03fd95e507daf2937c937d499ab | 2011-06-20 13:50:00-0700!

!23
Cassandra and Time Series
Time Series Taming the beast
• Peter Higgs and Francois Englert. Nobel prize for Physics
• Theorized the existence of the Higgs boson
!

• Found using ATLAS
!
!

• Data stored in P-BEAST
!
!

• Time series running on Cassandra
Use Cassandra for time series

Get a nobel prize
Time Series Why
• Storage model from BigTable is perfect
• One row key and tons of (variable)columns
• Single layout on disk

Row Key

Column Name

Column Name

Column Value

Column Value
Time Series Example
• Storing weather data
• One weather station
• Temperature measurements every minute

WeatherStation ID 2013-10-09 10:00 AM 2013-10-09 10:00 AM
72 Degrees

72 Degrees

2013-10-10 11:00 AM
65 Degrees
Time Series Example
• Query data
• Weather Station ID = Locality of single node
Date query

weatherStationID = 100 AND!
date = 2013-10-09 10:00 AM

WeatherStation ID
2013-10-09 10:00 AM 2013-10-09 10:00 AM
100
72 Degrees

72 Degrees

2013-10-10 11:00 AM
65 Degrees

OR
Date Range

weatherStationID = 100 AND!
date > 2013-10-09 10:00 AM AND!
date < 2013-10-10 11:01 AM
Time Series How
• CQL expresses this well
• Data partitioned by weather station ID and time
CREATE TABLE temperature (!
weatherstation_id text,!
event_time timestamp,!
temperature text,!
PRIMARY KEY (weatherstation_id,event_time)!
);

!
!
!

• Easy to insert data
INSERT

INTO temperature(weatherstation_id,event_time,temperature) !
VALUES ('1234ABCD','2013-04-03 07:01:00','72F');

!
!

• Easy to query

SELECT temperature !
FROM temperature !
WHERE weatherstation_id='1234ABCD'!
AND event_time > '2013-04-03 07:01:00'!
AND event_time < '2013-04-03 07:04:00';
Time Series Further partitioning
• At every minute you will eventually run out of rows
• 2 billion columns per storage row
• Data partitioned by weather station ID and time
• Use the partition key to split things up
CREATE TABLE temperature_by_day (!
weatherstation_id text,!
date text,!
event_time timestamp,!
temperature text,!
PRIMARY KEY ((weatherstation_id,date),event_time)!
);
Time Series Further Partitioning
• Still easy to insert
!
!

INSERT INTO temperature_by_day(weatherstation_id,date,event_time,temperature) !
VALUES ('1234ABCD','2013-04-03','2013-04-03 07:01:00','72F');

!
!

• Still easy to query
SELECT temperature !
FROM temperature_by_day !
WHERE weatherstation_id='1234ABCD' !
AND date='2013-04-03'!
AND event_time > '2013-04-03 07:01:00'!
AND event_time < '2013-04-03 07:04:00';
Time Series Use cases
• Logging
• Thing Tracking (IoT)
• Sensor Data
• User Tracking
• Fraud Detection
• Nobel prizes!
Application Example - Layout
• Active-Active
• Service based DNS routing

Cassandra Replication

!34
Application Example - Uptime
• Normal server maintenance
• Application is unaware

Cassandra Replication

!35
Application Example - Failure
• Data center failure

Another happy user!

• Data is safe. Route traffic.

33
!36
Cassandra Users and Use Cases
Netflix!
• If you haven’t heard their story… where have you been?
• 18B market cap — Runs on Cassandra
• User accounts
• Play lists
• Payments
• Statistics
Spotify
• Millions of songs. Millions of users.
• Playlists
• 1 billion playlists
• 30+ Cassandra clusters
• 50+ TB of data
• 40k req/sec peak
http://www.slideshare.net/noaresare/cassandra-nyc

!39
Instagram(Facebook)
• Loads and loads of photos. (Probably yours)
• All in AWS
• Security audits
• News feed
• 20k writes/sec. 15k reads/sec.

!40
DataStax Ac*demy for Apache Cassandra
Content
• First four sessions available with Weekly roll-out of 7 sessions total
• Based on DataStax Community Edition
• CQL, Schema Design and Data Modeling
• Introduction to Cassandra Objects
• First Java, then Python, C# and .NET

Goals
• 100,000 Registrations by the end of 2014
• 25,000 Certifications by the end of 2014
https://datastaxacademy.elogiclearning.com/
!41
©2013 DataStax Confidential. Do not distribute without consent.

!42

More Related Content

What's hot

Real data models of silicon valley
Real data models of silicon valleyReal data models of silicon valley
Real data models of silicon valleyPatrick McFadin
 
Spark Streaming with Cassandra
Spark Streaming with CassandraSpark Streaming with Cassandra
Spark Streaming with CassandraJacek Lewandowski
 
Monitoring Cassandra with Riemann
Monitoring Cassandra with RiemannMonitoring Cassandra with Riemann
Monitoring Cassandra with RiemannPatricia Gorla
 
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterpriseA Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterprisePatrick McFadin
 
An Introduction to time series with Team Apache
An Introduction to time series with Team ApacheAn Introduction to time series with Team Apache
An Introduction to time series with Team ApachePatrick McFadin
 
Apache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the fireApache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the firePatrick McFadin
 
Advanced Sharding Features in MongoDB 2.4
Advanced Sharding Features in MongoDB 2.4 Advanced Sharding Features in MongoDB 2.4
Advanced Sharding Features in MongoDB 2.4 MongoDB
 
MongoDB World 2018: Overnight to 60 Seconds: An IOT ETL Performance Case Study
MongoDB World 2018: Overnight to 60 Seconds: An IOT ETL Performance Case StudyMongoDB World 2018: Overnight to 60 Seconds: An IOT ETL Performance Case Study
MongoDB World 2018: Overnight to 60 Seconds: An IOT ETL Performance Case StudyMongoDB
 
Escape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Escape from Hadoop: Ultra Fast Data Analysis with Spark & CassandraEscape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Escape from Hadoop: Ultra Fast Data Analysis with Spark & CassandraPiotr Kolaczkowski
 
How We Used Cassandra/Solr to Build Real-Time Analytics Platform
How We Used Cassandra/Solr to Build Real-Time Analytics PlatformHow We Used Cassandra/Solr to Build Real-Time Analytics Platform
How We Used Cassandra/Solr to Build Real-Time Analytics PlatformDataStax Academy
 
Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingVassilis Bekiaris
 
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...DataStax
 
Laying down the smack on your data pipelines
Laying down the smack on your data pipelinesLaying down the smack on your data pipelines
Laying down the smack on your data pipelinesPatrick McFadin
 
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...DataStax
 
Nike Tech Talk: Double Down on Apache Cassandra and Spark
Nike Tech Talk:  Double Down on Apache Cassandra and SparkNike Tech Talk:  Double Down on Apache Cassandra and Spark
Nike Tech Talk: Double Down on Apache Cassandra and SparkPatrick McFadin
 
Spark with Cassandra by Christopher Batey
Spark with Cassandra by Christopher BateySpark with Cassandra by Christopher Batey
Spark with Cassandra by Christopher BateySpark Summit
 
Apache Cassandra and Drivers
Apache Cassandra and DriversApache Cassandra and Drivers
Apache Cassandra and DriversDataStax Academy
 
Webinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache CassandraWebinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache CassandraDataStax
 
Lightning fast analytics with Spark and Cassandra
Lightning fast analytics with Spark and CassandraLightning fast analytics with Spark and Cassandra
Lightning fast analytics with Spark and Cassandranickmbailey
 

What's hot (20)

Real data models of silicon valley
Real data models of silicon valleyReal data models of silicon valley
Real data models of silicon valley
 
Spark Streaming with Cassandra
Spark Streaming with CassandraSpark Streaming with Cassandra
Spark Streaming with Cassandra
 
Monitoring Cassandra with Riemann
Monitoring Cassandra with RiemannMonitoring Cassandra with Riemann
Monitoring Cassandra with Riemann
 
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax EnterpriseA Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
A Cassandra + Solr + Spark Love Triangle Using DataStax Enterprise
 
An Introduction to time series with Team Apache
An Introduction to time series with Team ApacheAn Introduction to time series with Team Apache
An Introduction to time series with Team Apache
 
Apache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the fireApache cassandra and spark. you got the the lighter, let's start the fire
Apache cassandra and spark. you got the the lighter, let's start the fire
 
Advanced Sharding Features in MongoDB 2.4
Advanced Sharding Features in MongoDB 2.4 Advanced Sharding Features in MongoDB 2.4
Advanced Sharding Features in MongoDB 2.4
 
MongoDB World 2018: Overnight to 60 Seconds: An IOT ETL Performance Case Study
MongoDB World 2018: Overnight to 60 Seconds: An IOT ETL Performance Case StudyMongoDB World 2018: Overnight to 60 Seconds: An IOT ETL Performance Case Study
MongoDB World 2018: Overnight to 60 Seconds: An IOT ETL Performance Case Study
 
Escape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Escape from Hadoop: Ultra Fast Data Analysis with Spark & CassandraEscape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
Escape from Hadoop: Ultra Fast Data Analysis with Spark & Cassandra
 
How We Used Cassandra/Solr to Build Real-Time Analytics Platform
How We Used Cassandra/Solr to Build Real-Time Analytics PlatformHow We Used Cassandra/Solr to Build Real-Time Analytics Platform
How We Used Cassandra/Solr to Build Real-Time Analytics Platform
 
Cassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series ModelingCassandra Basics, Counters and Time Series Modeling
Cassandra Basics, Counters and Time Series Modeling
 
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
Maximum Overdrive: Tuning the Spark Cassandra Connector (Russell Spitzer, Dat...
 
Laying down the smack on your data pipelines
Laying down the smack on your data pipelinesLaying down the smack on your data pipelines
Laying down the smack on your data pipelines
 
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
Lessons from Cassandra & Spark (Matthias Niehoff & Stephan Kepser, codecentri...
 
Nike Tech Talk: Double Down on Apache Cassandra and Spark
Nike Tech Talk:  Double Down on Apache Cassandra and SparkNike Tech Talk:  Double Down on Apache Cassandra and Spark
Nike Tech Talk: Double Down on Apache Cassandra and Spark
 
Spark with Cassandra by Christopher Batey
Spark with Cassandra by Christopher BateySpark with Cassandra by Christopher Batey
Spark with Cassandra by Christopher Batey
 
Apache Cassandra and Drivers
Apache Cassandra and DriversApache Cassandra and Drivers
Apache Cassandra and Drivers
 
Webinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache CassandraWebinar: Getting Started with Apache Cassandra
Webinar: Getting Started with Apache Cassandra
 
Cassandra 2.0 (Introduction)
Cassandra 2.0 (Introduction)Cassandra 2.0 (Introduction)
Cassandra 2.0 (Introduction)
 
Lightning fast analytics with Spark and Cassandra
Lightning fast analytics with Spark and CassandraLightning fast analytics with Spark and Cassandra
Lightning fast analytics with Spark and Cassandra
 

Viewers also liked

Cassandra Summit: C* Keys - Partitioning, Clustering, & Crossfit
Cassandra Summit: C* Keys - Partitioning, Clustering, & CrossfitCassandra Summit: C* Keys - Partitioning, Clustering, & Crossfit
Cassandra Summit: C* Keys - Partitioning, Clustering, & CrossfitAdam Hutson
 
Introduction to apache_cassandra_for_developers-lhg
Introduction to apache_cassandra_for_developers-lhgIntroduction to apache_cassandra_for_developers-lhg
Introduction to apache_cassandra_for_developers-lhgzznate
 
Introduction to apache_cassandra_for_develope
Introduction to apache_cassandra_for_developeIntroduction to apache_cassandra_for_develope
Introduction to apache_cassandra_for_developezznate
 
Cassandra Summit 2014: Cassandra at Instagram 2014
Cassandra Summit 2014: Cassandra at Instagram 2014Cassandra Summit 2014: Cassandra at Instagram 2014
Cassandra Summit 2014: Cassandra at Instagram 2014DataStax Academy
 
Cassandra Day Denver 2014: Introduction to Apache Cassandra
Cassandra Day Denver 2014: Introduction to Apache CassandraCassandra Day Denver 2014: Introduction to Apache Cassandra
Cassandra Day Denver 2014: Introduction to Apache CassandraDataStax Academy
 
Introduction to Cassandra Architecture
Introduction to Cassandra ArchitectureIntroduction to Cassandra Architecture
Introduction to Cassandra Architecturenickmbailey
 
Open source or proprietary, choose wisely!
Open source or proprietary,  choose wisely!Open source or proprietary,  choose wisely!
Open source or proprietary, choose wisely!Patrick McFadin
 
Introduction to Cassandra & Data model
Introduction to Cassandra & Data modelIntroduction to Cassandra & Data model
Introduction to Cassandra & Data modelDuyhai Doan
 
Cassandra Summit 2014: CQL Under the Hood
Cassandra Summit 2014: CQL Under the HoodCassandra Summit 2014: CQL Under the Hood
Cassandra Summit 2014: CQL Under the HoodDataStax Academy
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basicsnickmbailey
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache CassandraRobert Stupp
 
Overview of DataStax OpsCenter
Overview of DataStax OpsCenterOverview of DataStax OpsCenter
Overview of DataStax OpsCenterDataStax
 
DataStax: Backup and Restore in Cassandra and OpsCenter
DataStax: Backup and Restore in Cassandra and OpsCenterDataStax: Backup and Restore in Cassandra and OpsCenter
DataStax: Backup and Restore in Cassandra and OpsCenterDataStax Academy
 
Apache cassandra architecture internals
Apache cassandra architecture internalsApache cassandra architecture internals
Apache cassandra architecture internalsBhuvan Rawal
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for SysadminsNathan Milford
 

Viewers also liked (16)

Cassandra Summit: C* Keys - Partitioning, Clustering, & Crossfit
Cassandra Summit: C* Keys - Partitioning, Clustering, & CrossfitCassandra Summit: C* Keys - Partitioning, Clustering, & Crossfit
Cassandra Summit: C* Keys - Partitioning, Clustering, & Crossfit
 
Introduction to apache_cassandra_for_developers-lhg
Introduction to apache_cassandra_for_developers-lhgIntroduction to apache_cassandra_for_developers-lhg
Introduction to apache_cassandra_for_developers-lhg
 
Introduction to apache_cassandra_for_develope
Introduction to apache_cassandra_for_developeIntroduction to apache_cassandra_for_develope
Introduction to apache_cassandra_for_develope
 
Cassandra Summit 2014: Cassandra at Instagram 2014
Cassandra Summit 2014: Cassandra at Instagram 2014Cassandra Summit 2014: Cassandra at Instagram 2014
Cassandra Summit 2014: Cassandra at Instagram 2014
 
Cassandra Day Denver 2014: Introduction to Apache Cassandra
Cassandra Day Denver 2014: Introduction to Apache CassandraCassandra Day Denver 2014: Introduction to Apache Cassandra
Cassandra Day Denver 2014: Introduction to Apache Cassandra
 
Introduction to Cassandra Architecture
Introduction to Cassandra ArchitectureIntroduction to Cassandra Architecture
Introduction to Cassandra Architecture
 
NoSQL Essentials: Cassandra
NoSQL Essentials: CassandraNoSQL Essentials: Cassandra
NoSQL Essentials: Cassandra
 
Open source or proprietary, choose wisely!
Open source or proprietary,  choose wisely!Open source or proprietary,  choose wisely!
Open source or proprietary, choose wisely!
 
Introduction to Cassandra & Data model
Introduction to Cassandra & Data modelIntroduction to Cassandra & Data model
Introduction to Cassandra & Data model
 
Cassandra Summit 2014: CQL Under the Hood
Cassandra Summit 2014: CQL Under the HoodCassandra Summit 2014: CQL Under the Hood
Cassandra Summit 2014: CQL Under the Hood
 
Introduction to Cassandra Basics
Introduction to Cassandra BasicsIntroduction to Cassandra Basics
Introduction to Cassandra Basics
 
Introduction to Apache Cassandra
Introduction to Apache CassandraIntroduction to Apache Cassandra
Introduction to Apache Cassandra
 
Overview of DataStax OpsCenter
Overview of DataStax OpsCenterOverview of DataStax OpsCenter
Overview of DataStax OpsCenter
 
DataStax: Backup and Restore in Cassandra and OpsCenter
DataStax: Backup and Restore in Cassandra and OpsCenterDataStax: Backup and Restore in Cassandra and OpsCenter
DataStax: Backup and Restore in Cassandra and OpsCenter
 
Apache cassandra architecture internals
Apache cassandra architecture internalsApache cassandra architecture internals
Apache cassandra architecture internals
 
Cassandra for Sysadmins
Cassandra for SysadminsCassandra for Sysadmins
Cassandra for Sysadmins
 

Similar to Cassandra Introduction Overview

Apache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataApache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataPatrick McFadin
 
Jan 2015 - Cassandra101 Manchester Meetup
Jan 2015 - Cassandra101 Manchester MeetupJan 2015 - Cassandra101 Manchester Meetup
Jan 2015 - Cassandra101 Manchester MeetupChristopher Batey
 
Re-Engineering PostgreSQL as a Time-Series Database
Re-Engineering PostgreSQL as a Time-Series DatabaseRe-Engineering PostgreSQL as a Time-Series Database
Re-Engineering PostgreSQL as a Time-Series DatabaseAll Things Open
 
Managing Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyManaging Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyDataStax Academy
 
Cassandra and Spark
Cassandra and SparkCassandra and Spark
Cassandra and Sparknickmbailey
 
Data Science Lab Meetup: Cassandra and Spark
Data Science Lab Meetup: Cassandra and SparkData Science Lab Meetup: Cassandra and Spark
Data Science Lab Meetup: Cassandra and SparkChristopher Batey
 
Analytics with Cassandra & Spark
Analytics with Cassandra & SparkAnalytics with Cassandra & Spark
Analytics with Cassandra & SparkMatthias Niehoff
 
Big Data Analytics with Spark
Big Data Analytics with SparkBig Data Analytics with Spark
Big Data Analytics with SparkDataStax Academy
 
Spark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataSpark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataVictor Coustenoble
 
Getting started with Spark & Cassandra by Jon Haddad of Datastax
Getting started with Spark & Cassandra by Jon Haddad of DatastaxGetting started with Spark & Cassandra by Jon Haddad of Datastax
Getting started with Spark & Cassandra by Jon Haddad of DatastaxData Con LA
 
Data Modeling IoT and Time Series data in NoSQL
Data Modeling IoT and Time Series data in NoSQLData Modeling IoT and Time Series data in NoSQL
Data Modeling IoT and Time Series data in NoSQLBasho Technologies
 
Boundary Front end tech talk: how it works
Boundary Front end tech talk: how it worksBoundary Front end tech talk: how it works
Boundary Front end tech talk: how it worksBoundary
 
Using cassandra as a distributed logging to store pb data
Using cassandra as a distributed logging to store pb dataUsing cassandra as a distributed logging to store pb data
Using cassandra as a distributed logging to store pb dataRamesh Veeramani
 
Instaclustr webinar 2017 feb 08 japan
Instaclustr webinar 2017 feb 08   japanInstaclustr webinar 2017 feb 08   japan
Instaclustr webinar 2017 feb 08 japanHiromitsu Komatsu
 
Delivering Meaning In Near-Real Time At High Velocity In Massive Scale with A...
Delivering Meaning In Near-Real Time At High Velocity In Massive Scale with A...Delivering Meaning In Near-Real Time At High Velocity In Massive Scale with A...
Delivering Meaning In Near-Real Time At High Velocity In Massive Scale with A...Helena Edelson
 
Spark and cassandra (Hulu Talk)
Spark and cassandra (Hulu Talk)Spark and cassandra (Hulu Talk)
Spark and cassandra (Hulu Talk)Jon Haddad
 
Real time data processing with spark & cassandra @ NoSQLMatters 2015 Paris
Real time data processing with spark & cassandra @ NoSQLMatters 2015 ParisReal time data processing with spark & cassandra @ NoSQLMatters 2015 Paris
Real time data processing with spark & cassandra @ NoSQLMatters 2015 ParisDuyhai Doan
 
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...NoSQLmatters
 

Similar to Cassandra Introduction Overview (20)

1 Dundee - Cassandra 101
1 Dundee - Cassandra 1011 Dundee - Cassandra 101
1 Dundee - Cassandra 101
 
Apache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series dataApache cassandra & apache spark for time series data
Apache cassandra & apache spark for time series data
 
Jan 2015 - Cassandra101 Manchester Meetup
Jan 2015 - Cassandra101 Manchester MeetupJan 2015 - Cassandra101 Manchester Meetup
Jan 2015 - Cassandra101 Manchester Meetup
 
Re-Engineering PostgreSQL as a Time-Series Database
Re-Engineering PostgreSQL as a Time-Series DatabaseRe-Engineering PostgreSQL as a Time-Series Database
Re-Engineering PostgreSQL as a Time-Series Database
 
Managing Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al TobeyManaging Cassandra at Scale by Al Tobey
Managing Cassandra at Scale by Al Tobey
 
Cassandra and Spark
Cassandra and SparkCassandra and Spark
Cassandra and Spark
 
Data Science Lab Meetup: Cassandra and Spark
Data Science Lab Meetup: Cassandra and SparkData Science Lab Meetup: Cassandra and Spark
Data Science Lab Meetup: Cassandra and Spark
 
Analytics with Cassandra & Spark
Analytics with Cassandra & SparkAnalytics with Cassandra & Spark
Analytics with Cassandra & Spark
 
Big Data Analytics with Spark
Big Data Analytics with SparkBig Data Analytics with Spark
Big Data Analytics with Spark
 
Spark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational DataSpark + Cassandra = Real Time Analytics on Operational Data
Spark + Cassandra = Real Time Analytics on Operational Data
 
Getting started with Spark & Cassandra by Jon Haddad of Datastax
Getting started with Spark & Cassandra by Jon Haddad of DatastaxGetting started with Spark & Cassandra by Jon Haddad of Datastax
Getting started with Spark & Cassandra by Jon Haddad of Datastax
 
Data Modeling IoT and Time Series data in NoSQL
Data Modeling IoT and Time Series data in NoSQLData Modeling IoT and Time Series data in NoSQL
Data Modeling IoT and Time Series data in NoSQL
 
Boundary Front end tech talk: how it works
Boundary Front end tech talk: how it worksBoundary Front end tech talk: how it works
Boundary Front end tech talk: how it works
 
Using cassandra as a distributed logging to store pb data
Using cassandra as a distributed logging to store pb dataUsing cassandra as a distributed logging to store pb data
Using cassandra as a distributed logging to store pb data
 
Presentation
PresentationPresentation
Presentation
 
Instaclustr webinar 2017 feb 08 japan
Instaclustr webinar 2017 feb 08   japanInstaclustr webinar 2017 feb 08   japan
Instaclustr webinar 2017 feb 08 japan
 
Delivering Meaning In Near-Real Time At High Velocity In Massive Scale with A...
Delivering Meaning In Near-Real Time At High Velocity In Massive Scale with A...Delivering Meaning In Near-Real Time At High Velocity In Massive Scale with A...
Delivering Meaning In Near-Real Time At High Velocity In Massive Scale with A...
 
Spark and cassandra (Hulu Talk)
Spark and cassandra (Hulu Talk)Spark and cassandra (Hulu Talk)
Spark and cassandra (Hulu Talk)
 
Real time data processing with spark & cassandra @ NoSQLMatters 2015 Paris
Real time data processing with spark & cassandra @ NoSQLMatters 2015 ParisReal time data processing with spark & cassandra @ NoSQLMatters 2015 Paris
Real time data processing with spark & cassandra @ NoSQLMatters 2015 Paris
 
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
DuyHai DOAN - Real time analytics with Cassandra and Spark - NoSQL matters Pa...
 

More from DataStax Academy

Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftForrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftDataStax Academy
 
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseIntroduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseDataStax Academy
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraDataStax Academy
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsDataStax Academy
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingDataStax Academy
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackDataStax Academy
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache CassandraDataStax Academy
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready CassandraDataStax Academy
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonDataStax Academy
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1DataStax Academy
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2DataStax Academy
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First ClusterDataStax Academy
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with DseDataStax Academy
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraDataStax Academy
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraDataStax Academy
 
Getting Started with Graph Databases
Getting Started with Graph DatabasesGetting Started with Graph Databases
Getting Started with Graph DatabasesDataStax Academy
 

More from DataStax Academy (20)

Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craftForrester CXNYC 2017 - Delivering great real-time cx is a true craft
Forrester CXNYC 2017 - Delivering great real-time cx is a true craft
 
Introduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph DatabaseIntroduction to DataStax Enterprise Graph Database
Introduction to DataStax Enterprise Graph Database
 
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache CassandraIntroduction to DataStax Enterprise Advanced Replication with Apache Cassandra
Introduction to DataStax Enterprise Advanced Replication with Apache Cassandra
 
Cassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart LabsCassandra on Docker @ Walmart Labs
Cassandra on Docker @ Walmart Labs
 
Cassandra 3.0 Data Modeling
Cassandra 3.0 Data ModelingCassandra 3.0 Data Modeling
Cassandra 3.0 Data Modeling
 
Cassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stackCassandra Adoption on Cisco UCS & Open stack
Cassandra Adoption on Cisco UCS & Open stack
 
Data Modeling for Apache Cassandra
Data Modeling for Apache CassandraData Modeling for Apache Cassandra
Data Modeling for Apache Cassandra
 
Coursera Cassandra Driver
Coursera Cassandra DriverCoursera Cassandra Driver
Coursera Cassandra Driver
 
Production Ready Cassandra
Production Ready CassandraProduction Ready Cassandra
Production Ready Cassandra
 
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & PythonCassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
Cassandra @ Netflix: Monitoring C* at Scale, Gossip and Tickler & Python
 
Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1Cassandra @ Sony: The good, the bad, and the ugly part 1
Cassandra @ Sony: The good, the bad, and the ugly part 1
 
Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2Cassandra @ Sony: The good, the bad, and the ugly part 2
Cassandra @ Sony: The good, the bad, and the ugly part 2
 
Standing Up Your First Cluster
Standing Up Your First ClusterStanding Up Your First Cluster
Standing Up Your First Cluster
 
Real Time Analytics with Dse
Real Time Analytics with DseReal Time Analytics with Dse
Real Time Analytics with Dse
 
Introduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache CassandraIntroduction to Data Modeling with Apache Cassandra
Introduction to Data Modeling with Apache Cassandra
 
Cassandra Core Concepts
Cassandra Core ConceptsCassandra Core Concepts
Cassandra Core Concepts
 
Bad Habits Die Hard
Bad Habits Die Hard Bad Habits Die Hard
Bad Habits Die Hard
 
Advanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache CassandraAdvanced Data Modeling with Apache Cassandra
Advanced Data Modeling with Apache Cassandra
 
Advanced Cassandra
Advanced CassandraAdvanced Cassandra
Advanced Cassandra
 
Getting Started with Graph Databases
Getting Started with Graph DatabasesGetting Started with Graph Databases
Getting Started with Graph Databases
 

Recently uploaded

What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfMounikaPolabathina
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????blackmambaettijean
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embeddingZilliz
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 

Recently uploaded (20)

What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
What is Artificial Intelligence?????????
What is Artificial Intelligence?????????What is Artificial Intelligence?????????
What is Artificial Intelligence?????????
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 

Cassandra Introduction Overview

  • 1. Cassandra : Introduction Patrick McFadin Chief Evangelist/Solution Architect - DataStax @PatrickMcFadin ©2013 DataStax Confidential. Do not distribute without consent.
  • 2. Who I am • Patrick McFadin • Solution Architect at DataStax • Cassandra MVP • User for years • Follow me for more: Dude. Uptime == $$ @PatrickMcFadin I talk about Cassandra and building scalable, resilient apps ALL THE TIME! !2
  • 3. Five Years of Cassandra 0.1 Jul-08 ... 0.3 0 0.6 1 0.7 1.0 2 1.2 3 DSE 4 2.0 5
  • 6. Cassandra - An introduction
  • 7. Cassandra - Roots • Based on Amazon Dynamo and Google BigTable paper • Shared nothing • Data safe as possible • Predictable scaling Dynamo BigTable !7
  • 8. Cassandra - More than one server Each node owns 25% of the data • All nodes participate in a cluster • Shared nothing • Add or remove as needed 25% • More capacity? Add a server
 25% 25% 25% !8
  • 9. Core Concepts Write path <row,column> Compacted later
  • 10. Core Concepts Read Path Real user story • New app • SSDs • 2.5 m requests • Client P99: 3.17ms!
  • 11. Cassandra - Locally Distributed • Client writes to any node • Node coordinates with others • Data replicated in parallel • Replication factor: How many copies of your data? • RF = 3 here !11
  • 12. Cassandra - Consistency • Consistency Level (CL) • Client specifies per read or write • ALL = All replicas ack • QUORUM = > 51% of replicas ack • LOCAL_QUORUM = > 51% in local DC ack • ONE = Only one replica acks !12
  • 13. Cassandra - Transparent to the application • A single node failure shouldn’t bring failure • Replication Factor + Consistency Level = Success • This example: • RF = 3 • CL = QUORUM >51% Ack so we are good! !13
  • 15. Cassandra - Geographically Distributed • Client writes local • Data syncs across WAN • Replication Factor per DC !15
  • 16. Cassandra Applications - Drivers • DataStax Drivers for Cassandra • Java • C# • Python • more on the way !16
  • 17. Cassandra Applications - Connecting • Create a pool of local servers • Client just uses session to interact with Cassandra ! contactPoints = {“10.0.0.1”,”10.0.0.2”}! ! keyspace = “videodb”! ! ! public VideoDbBasicImpl(List<String> contactPoints, String keyspace) {! cluster = Cluster! .builder()! .addContactPoints(! contactPoints.toArray(new String[contactPoints.size()]))! .withLoadBalancingPolicy(Policies.defaultLoadBalancingPolicy())! .withRetryPolicy(Policies.defaultRetryPolicy())! .build();! ! ! session = cluster.connect(keyspace);! } !17
  • 18. CQL Intro • Cassandra Query Language • SQL–like language to query Cassandra • Limited predicates. Attempts to prevent bad queries • But still offers enough leeway to get into trouble !18
  • 19. Data Model Logical containers Cluster - Contains all nodes. Even across WAN Keyspace - Contains all tables. Specifies replication Table (Column Family) - Contains rows
  • 20. CQL Intro • CREATE / DROP / ALTER TABLE • SELECT ! • BUT • INSERT AND UPDATE are similar to each other • If a row doesn’t exist, UPDATE will insert it, and if it exists, INSERT will replace it. • Think of it as an UPSERT • Therefore we never get a key violation • For updates, Cassandra never reads (no col = col + 1) !20
  • 21. Data Modeling Creating Tables CREATE TABLE user (! ! username varchar,! ! firstname varchar,! ! lastname varchar,! ! shopping_carts set<varchar>,! ! PRIMARY KEY (username)! ); Collection! CREATE TABLE shopping_cart (! ! username varchar,! ! cart_name text! ! item_id int,! ! item_name varchar,! description varchar,! ! price float,! ! item_detail map<varchar,varchar>! ! PRIMARY KEY ((username,cart_name),item_id)! ); Creates compound partition row key
  • 22. CQL Inserts • Insert will always overwrite INSERT INTO users (username, firstname, lastname, ! email, password, created_date)! VALUES ('pmcfadin','Patrick','McFadin',! ['patrick@datastax.com'],'ba27e03fd95e507daf2937c937d499ab',! '2011-06-20 13:50:00');! !22
  • 23. CQL Selects • No joins • Data is returned in row/column format SELECT username, firstname, lastname, ! email, password, created_date! FROM users! WHERE username = 'pmcfadin';! username | firstname | lastname | email | password | created_date! ----------+-----------+----------+--------------------------+----------------------------------+--------------------------! pmcfadin | Patrick | McFadin | ['patrick@datastax.com'] | ba27e03fd95e507daf2937c937d499ab | 2011-06-20 13:50:00-0700! !23
  • 25. Time Series Taming the beast • Peter Higgs and Francois Englert. Nobel prize for Physics • Theorized the existence of the Higgs boson ! • Found using ATLAS ! ! • Data stored in P-BEAST ! ! • Time series running on Cassandra
  • 26. Use Cassandra for time series Get a nobel prize
  • 27. Time Series Why • Storage model from BigTable is perfect • One row key and tons of (variable)columns • Single layout on disk Row Key Column Name Column Name Column Value Column Value
  • 28. Time Series Example • Storing weather data • One weather station • Temperature measurements every minute WeatherStation ID 2013-10-09 10:00 AM 2013-10-09 10:00 AM 72 Degrees 72 Degrees 2013-10-10 11:00 AM 65 Degrees
  • 29. Time Series Example • Query data • Weather Station ID = Locality of single node Date query weatherStationID = 100 AND! date = 2013-10-09 10:00 AM WeatherStation ID 2013-10-09 10:00 AM 2013-10-09 10:00 AM 100 72 Degrees 72 Degrees 2013-10-10 11:00 AM 65 Degrees OR Date Range weatherStationID = 100 AND! date > 2013-10-09 10:00 AM AND! date < 2013-10-10 11:01 AM
  • 30. Time Series How • CQL expresses this well • Data partitioned by weather station ID and time CREATE TABLE temperature (! weatherstation_id text,! event_time timestamp,! temperature text,! PRIMARY KEY (weatherstation_id,event_time)! ); ! ! ! • Easy to insert data INSERT INTO temperature(weatherstation_id,event_time,temperature) ! VALUES ('1234ABCD','2013-04-03 07:01:00','72F'); ! ! • Easy to query SELECT temperature ! FROM temperature ! WHERE weatherstation_id='1234ABCD'! AND event_time > '2013-04-03 07:01:00'! AND event_time < '2013-04-03 07:04:00';
  • 31. Time Series Further partitioning • At every minute you will eventually run out of rows • 2 billion columns per storage row • Data partitioned by weather station ID and time • Use the partition key to split things up CREATE TABLE temperature_by_day (! weatherstation_id text,! date text,! event_time timestamp,! temperature text,! PRIMARY KEY ((weatherstation_id,date),event_time)! );
  • 32. Time Series Further Partitioning • Still easy to insert ! ! INSERT INTO temperature_by_day(weatherstation_id,date,event_time,temperature) ! VALUES ('1234ABCD','2013-04-03','2013-04-03 07:01:00','72F'); ! ! • Still easy to query SELECT temperature ! FROM temperature_by_day ! WHERE weatherstation_id='1234ABCD' ! AND date='2013-04-03'! AND event_time > '2013-04-03 07:01:00'! AND event_time < '2013-04-03 07:04:00';
  • 33. Time Series Use cases • Logging • Thing Tracking (IoT) • Sensor Data • User Tracking • Fraud Detection • Nobel prizes!
  • 34. Application Example - Layout • Active-Active • Service based DNS routing Cassandra Replication !34
  • 35. Application Example - Uptime • Normal server maintenance • Application is unaware Cassandra Replication !35
  • 36. Application Example - Failure • Data center failure Another happy user! • Data is safe. Route traffic. 33 !36
  • 37. Cassandra Users and Use Cases
  • 38. Netflix! • If you haven’t heard their story… where have you been? • 18B market cap — Runs on Cassandra • User accounts • Play lists • Payments • Statistics
  • 39. Spotify • Millions of songs. Millions of users. • Playlists • 1 billion playlists • 30+ Cassandra clusters • 50+ TB of data • 40k req/sec peak http://www.slideshare.net/noaresare/cassandra-nyc !39
  • 40. Instagram(Facebook) • Loads and loads of photos. (Probably yours) • All in AWS • Security audits • News feed • 20k writes/sec. 15k reads/sec. !40
  • 41. DataStax Ac*demy for Apache Cassandra Content • First four sessions available with Weekly roll-out of 7 sessions total • Based on DataStax Community Edition • CQL, Schema Design and Data Modeling • Introduction to Cassandra Objects • First Java, then Python, C# and .NET Goals • 100,000 Registrations by the end of 2014 • 25,000 Certifications by the end of 2014 https://datastaxacademy.elogiclearning.com/ !41
  • 42. ©2013 DataStax Confidential. Do not distribute without consent. !42