SlideShare a Scribd company logo
1 of 54
Download to read offline
Five questions
for your NoSQL solution!
Jonathan Ellis
CTO, DataStax
Project Chair, Apache Cassandra
how do I



 my application?
                 model

©2012 DataStax
Popular options
  • Key/value
  • Tabular
  • Document
  • Graph?




©2012 DataStax
Schema is your friend

{
         "id": "e451dd42-ece3-11e1-a0a3-34159e154f4c",
         "name": "jbellis",
         "state": "TX",
         "birthdate": "1/1/1976",
         "email_addresses": ["jbellis@gmail", "jbellis@datastax.com"],
}




    ©2012 DataStax
SQL can be your friend too

 CREATE TABLE users (
    id uuid PRIMARY KEY,
    name text,
    state text,
    birth_date date
 );



 CREATE INDEX ON users(state);

 SELECT * FROM users
 WHERE state=‘Texas’ AND birth_date > ‘1950-01-01’;




©2012 DataStax
Collections
 CREATE TABLE users (
    id uuid PRIMARY KEY,
    name text,
    state text,
    birth_date date
 );

 CREATE TABLE users_addresses (
    user_id uuid REFERENCES users,
    email text
 );

 SELECT *
 FROM users NATURAL JOIN users_addresses;




©2012 DataStax
Collections
 CREATE TABLE users (
    id uuid PRIMARY KEY,
    name text,
    state text,




                 X
    birth_date date
 );

 CREATE TABLE users_addresses (
    user_id uuid REFERENCES users,
    email text
 );

 SELECT *
 FROM users NATURAL JOIN users_addresses;




©2012 DataStax
Collections
 CREATE TABLE users (
    id uuid PRIMARY KEY,
    name text,
    state text,
    birth_date date,
    email_addresses set<text>
 );

 UPDATE users
 SET email_addresses = email_addresses + {‘jbellis@gmail.com’,
 ‘jbellis@datastax.com’};




©2012 DataStax
Joins don’t scale
  • No joins
  • No subqueries
  • No aggregation functions* or GROUP BY
  • ORDER BY?




©2012 DataStax
SELECT * FROM tweets
WHERE user_id IN (SELECT follower FROM followers
                  WHERE user_id = ’driftx’)

                       followers




                  ?




 ©2012 DataStax
                                    tweets
Clustering in Cassandra
CREATE TABLE timeline (     user_id   tweet_id   _author    _body
  user_id uuid,
  tweet_id timeuuid,        jbellis   3290f9da.. rbranson   lorem
  tweet_author uuid,        jbellis   3895411a..   tjake    ipsum
   tweet_body text,           ...         ...        ...
  PRIMARY KEY (user_id,
                tweet_id)   driftx    3290f9da.. rbranson   lorem
);
                            driftx    71b46a84.. yzhang     dolor
                              ...         ...       ...


                            yukim     3290f9da.. rbranson   lorem
                            yukim     e451dd42..   tjake     amet
                              ...         ...        ...



 ©2012 DataStax
Clustering in Cassandra
CREATE TABLE timeline (     user_id   tweet_id   _author    _body
  user_id uuid,
  tweet_id timeuuid,        jbellis   3290f9da.. rbranson   lorem
  tweet_author uuid,        jbellis   3895411a..   tjake    ipsum
   tweet_body text,           ...         ...        ...
  PRIMARY KEY (user_id,
                tweet_id)   driftx    3290f9da.. rbranson   lorem
);
                            driftx    71b46a84.. yzhang     dolor
                              ...         ...       ...
SELECT * FROM timeline
WHERE user_id = ’driftx’;   yukim     3290f9da.. rbranson   lorem
                            yukim     e451dd42..   tjake     amet
                              ...         ...        ...



 ©2012 DataStax
how does it

                 perform?

©2012 DataStax
VLDB benchmark




©2012 DataStax
Locking




©2012 DataStax
Efficiency




©2012 DataStax
UPDATE users
 SET email_addresses = email_addresses + {...}
 WHERE user_id = ‘jbellis’;




©2012 DataStax
Durability




©2012 DataStax
Log-structured storage engine
           write( k1 ,   c1:v1   )

                                                  Memory




                                     Memtable




        Commit log



©2012 DataStax                                  Hard drive
write( k1 , c1:v1 )

                                                         Memory
                                 k1 c1:v1




                                            Memtable



                 k1 c1:v1




        Commit log



©2012 DataStax                                         Hard drive
write( k1 , c2:v2 )

                                                    Memory
                                 k1 c1:v1 c2:v2




                 k1 c1:v1
                 k1 c2:v2




©2012 DataStax                                    Hard drive
write(        k2   ,   c1:v1 c2:v2   )
                                                                        Memory
                                                     k1 c1:v1 c2:v2

                                                     k2 c1:v1 c2:v2




                   k1 c1:v1
                   k1 c2:v2
                 k2 c1:v1 c2:v2




©2012 DataStax                                                        Hard drive
write(        k1   ,   c1:v4 c3:v3   )
                                                                              Memory
                                                     k1 c1:v4 c2:v2 c3:v3

                                                     k2 c1:v1 c2:v2




                   k1 c1:v1
                   k1 c2:v2
                 k2 c1:v1 c2:v2
             k1 c1:v4 c3:v3




©2012 DataStax                                                              Hard drive
Memory




                           flush




                                  index / BF
                 cleanup    k1 c1:v4 c2:v2 c3:v3

                            k2 c1:v1 c2:v2


                                                   SSTable




©2012 DataStax                                               Hard drive
No random writes




©2012 DataStax
The gory details




©2012 DataStax
Larger than memory datasets




©2012 DataStax
how does it handle

                 failure?

©2012 DataStax
Classic partitioning with SPOF
                 partition 1   partition 2      partition 3   partition 4




                                         router


                                             client
©2012 DataStax
Availability
  • “High availability implies that a single fault will
            not bring down your system. Not ‘we’ll recover
            quickly.’”
            -- Ben Coverston: DataStax

     • “The biggest problem with failover is that you're
            almost never using it until it really hurts. It's like
            backups that you never test.”
            -- Rick Branson: Instagram




©2012 DataStax
Fully distributed, no SPOF
                 client




                          p3
                                p6        p1
                           p1




                                     p1




©2012 DataStax
Multiple datacenters




©2012 DataStax
©2012 DataStax
Self-healing

                 request   1




 Client                         Coordinator        internal
                                                   request    2
                 response   4




                                    internal
                                   response    3
                                                    Replica

©2012 DataStax
Self-healing

                 request   1




 Client                         Coordinator        internal
                                                   request    2
                 response   4




                                    internal
                                   response    3
                                                    Replica

©2012 DataStax
Self-healing

                 request   1




 Client                            Coordinator   internal
                                                 request    2

                  timeout
                 response      4




                                                  Replica

©2012 DataStax                                   replica fails
Self-healing

                 request   1




 Client                            Coordinator   internal
                                                 request    2




                                                 X
                  timeout
                 response      4




                                                  Replica

©2012 DataStax                                   replica fails
Self-healing

                 request   1




 Client                            Coordinator   internal
                                                 request    2

                  timeout
                 response      4


                                      hint   3


                                                  Replica

©2012 DataStax                                   replica fails
Self-healing

                 request   1




 Client                            Coordinator   internal
                                                 request    2




                                                 X
                  timeout
                 response      4


                                      hint   3


                                                  Replica

©2012 DataStax                                   replica fails
Other healing modes
 • AntiEntropyService
 • Read repair




©2012 DataStax
Dynamic snitch
(dealing with partial failure)
                                     90% busy



  Client           Coordinator
                                            30% busy




                                 40% busy
 ©2012 DataStax
how does it

                 scale?

©2012 DataStax
VLDB benchmark




©2012 DataStax
Scaling antipatterns
  • Metadata servers
  • Router bottlenecks
  • Overloading existing nodes when adding
            capacity




©2012 DataStax
how


 is it?
                 flexible

©2012 DataStax
©2012 DataStax
Data model: Realtime
     LiveStocks      stock       last
                    GOOG        $95.52
                     AAPL      $186.10
                    AMZN       $112.98


       Portfolios    user       stock       shares
                    jbellis     GOOG          80
                    jbellis     LNKD          20
                    yukim       AMZN         100

      StockHist     stock        date       price
                    GOOG      2011-01-01    $8.23
                    GOOG      2011-01-02    $6.14
                    GOOG      2011-001-03   $7.78
©2012 DataStax
Data model: Analytics
 HistLoss                     worst_date    loss
                 Portfolio1   2011-07-23   -$34.81
                 Portfolio2   2011-03-11 -$11432.24
                 Portfolio3   2011-05-21 -$1476.93




©2012 DataStax
Data model: Analytics
  10dayreturns
          stock      rdate     return
          GOOG    2011-07-25   $8.23
          GOOG    2011-07-24   $6.14
          GOOG    2011-07-23   $7.78
          AAPL    2011-07-25   $15.32
          AAPL    2011-07-24   $12.68


     INSERT OVERWRITE TABLE 10dayreturns
     SELECT a.stock,
            b.date as rdate,
            b.price - a.price
     FROM StockHist a
     JOIN StockHist b
     ON (a.stock = b.stock
         AND date_add(a.date, 10) = b.date);

©2012 DataStax
Data model: Analytics
  portfolio_returns
            portfolio       rdate      preturn
            Portfolio1   2011-07-25    $118.21
            Portfolio1   2011-07-24     $60.78
            Portfolio1   2011-07-23    -$34.81
            Portfolio2   2011-07-25   $2143.92
            Portfolio3   2011-07-24    -$10.19


       INSERT OVERWRITE TABLE portfolio_returns
       SELECT portfolio,
              rdate,
              SUM(b.return)
       FROM portfolios a JOIN 10dayreturns b
       ON (a.stock = b.stock)
       GROUP BY portfolio, rdate;

©2012 DataStax
Data model: Analytics
  HistLoss
                       worst_date    loss
          Portfolio1   2011-07-23   -$34.81
          Portfolio2   2011-03-11 -$11432.24
          Portfolio3   2011-05-21 -$1476.93



    INSERT OVERWRITE TABLE HistLoss
    SELECT a.portfolio, rdate, minp
    FROM (
      SELECT portfolio, min(preturn) as minp
      FROM portfolio_returns
      GROUP BY portfolio
    ) a
    JOIN portfolio_returns b
    ON (a.portfolio = b.portfolio and a.minp = b.preturn);

©2012 DataStax
©2012 DataStax
Some Cassandra users




©2012 DataStax
Questions?

Image credits
 •   http://www.flickr.com/photos/26817893@N05/2573006312/

 •   http://www.flickr.com/photos/rowanbank/7686239548

 •   http://www.flickr.com/photos/mervtheswerve/6081933265

 •   http://www.flickr.com/photos/dg_pics/2526208830

 •   http://www.flickr.com/photos/wainwright/351684037

 •   http://www.flickr.com/photos/mikeneilson/1606662529

 •   http://www.flickr.com/photos/sbisson/3852905534

 •   http://www.flickr.com/photos/breadnbadger/2674928517

More Related Content

What's hot

Data Locality, Latency and Caching: JSR-107 and the new Java JCACHE Standard
Data Locality, Latency and Caching:  JSR-107 and the new Java JCACHE StandardData Locality, Latency and Caching:  JSR-107 and the new Java JCACHE Standard
Data Locality, Latency and Caching: JSR-107 and the new Java JCACHE StandardBen Cotton
 
Configuring a failover cluster on a Dell PowerEdge VRTX
Configuring a failover cluster on a Dell PowerEdge VRTXConfiguring a failover cluster on a Dell PowerEdge VRTX
Configuring a failover cluster on a Dell PowerEdge VRTXPrincipled Technologies
 
VDI performance comparison: Dell PowerEdge FX2 and FC430 servers with VMware ...
VDI performance comparison: Dell PowerEdge FX2 and FC430 servers with VMware ...VDI performance comparison: Dell PowerEdge FX2 and FC430 servers with VMware ...
VDI performance comparison: Dell PowerEdge FX2 and FC430 servers with VMware ...Principled Technologies
 
Comparing performance and cost: Dell PowerEdge VRTX with one Dell PowerEdge M...
Comparing performance and cost: Dell PowerEdge VRTX with one Dell PowerEdge M...Comparing performance and cost: Dell PowerEdge VRTX with one Dell PowerEdge M...
Comparing performance and cost: Dell PowerEdge VRTX with one Dell PowerEdge M...Principled Technologies
 
Converged architecture advantages: Dell PowerEdge FX2s and FC830 servers vs. ...
Converged architecture advantages: Dell PowerEdge FX2s and FC830 servers vs. ...Converged architecture advantages: Dell PowerEdge FX2s and FC830 servers vs. ...
Converged architecture advantages: Dell PowerEdge FX2s and FC830 servers vs. ...Principled Technologies
 
Introduction to NoSQL and Couchbase
Introduction to NoSQL and CouchbaseIntroduction to NoSQL and Couchbase
Introduction to NoSQL and CouchbaseDipti Borkar
 
MySQL Cluster 8.0 tutorial
MySQL Cluster 8.0 tutorialMySQL Cluster 8.0 tutorial
MySQL Cluster 8.0 tutorialFrazer Clement
 
What's new in SQL Server 2012 for philly code camp 2012.1
What's new in SQL Server 2012 for philly code camp 2012.1What's new in SQL Server 2012 for philly code camp 2012.1
What's new in SQL Server 2012 for philly code camp 2012.1Mark Kromer
 
Dell PowerEdge M520 server solution: Energy efficiency and database performance
Dell PowerEdge M520 server solution: Energy efficiency and database performanceDell PowerEdge M520 server solution: Energy efficiency and database performance
Dell PowerEdge M520 server solution: Energy efficiency and database performancePrincipled Technologies
 
Throughput comparison: Dell PowerEdge R720 drive options
Throughput comparison: Dell PowerEdge R720 drive optionsThroughput comparison: Dell PowerEdge R720 drive options
Throughput comparison: Dell PowerEdge R720 drive optionsPrincipled Technologies
 
Exchange 2010 ha ctd
Exchange 2010 ha ctdExchange 2010 ha ctd
Exchange 2010 ha ctdKaliyan S
 
Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012jbellis
 
Dell PowerEdge VRTX and M-series compute nodes configuration study
Dell PowerEdge VRTX and M-series compute nodes configuration studyDell PowerEdge VRTX and M-series compute nodes configuration study
Dell PowerEdge VRTX and M-series compute nodes configuration studyPrincipled Technologies
 
Deduplication without performance hits: Intel Xeon processor E5-2697v2-powere...
Deduplication without performance hits: Intel Xeon processor E5-2697v2-powere...Deduplication without performance hits: Intel Xeon processor E5-2697v2-powere...
Deduplication without performance hits: Intel Xeon processor E5-2697v2-powere...Principled Technologies
 
Workstation heat and power usage: Lenovo ThinkStation P500 vs. HP Z440 Workst...
Workstation heat and power usage: Lenovo ThinkStation P500 vs. HP Z440 Workst...Workstation heat and power usage: Lenovo ThinkStation P500 vs. HP Z440 Workst...
Workstation heat and power usage: Lenovo ThinkStation P500 vs. HP Z440 Workst...Principled Technologies
 
Compaction and Splitting in Apache Accumulo
Compaction and Splitting in Apache AccumuloCompaction and Splitting in Apache Accumulo
Compaction and Splitting in Apache AccumuloHortonworks
 

What's hot (20)

Data Locality, Latency and Caching: JSR-107 and the new Java JCACHE Standard
Data Locality, Latency and Caching:  JSR-107 and the new Java JCACHE StandardData Locality, Latency and Caching:  JSR-107 and the new Java JCACHE Standard
Data Locality, Latency and Caching: JSR-107 and the new Java JCACHE Standard
 
Configuring a failover cluster on a Dell PowerEdge VRTX
Configuring a failover cluster on a Dell PowerEdge VRTXConfiguring a failover cluster on a Dell PowerEdge VRTX
Configuring a failover cluster on a Dell PowerEdge VRTX
 
VDI performance comparison: Dell PowerEdge FX2 and FC430 servers with VMware ...
VDI performance comparison: Dell PowerEdge FX2 and FC430 servers with VMware ...VDI performance comparison: Dell PowerEdge FX2 and FC430 servers with VMware ...
VDI performance comparison: Dell PowerEdge FX2 and FC430 servers with VMware ...
 
Comparing performance and cost: Dell PowerEdge VRTX with one Dell PowerEdge M...
Comparing performance and cost: Dell PowerEdge VRTX with one Dell PowerEdge M...Comparing performance and cost: Dell PowerEdge VRTX with one Dell PowerEdge M...
Comparing performance and cost: Dell PowerEdge VRTX with one Dell PowerEdge M...
 
Converged architecture advantages: Dell PowerEdge FX2s and FC830 servers vs. ...
Converged architecture advantages: Dell PowerEdge FX2s and FC830 servers vs. ...Converged architecture advantages: Dell PowerEdge FX2s and FC830 servers vs. ...
Converged architecture advantages: Dell PowerEdge FX2s and FC830 servers vs. ...
 
DataStax 6 and Beyond
DataStax 6 and BeyondDataStax 6 and Beyond
DataStax 6 and Beyond
 
Introduction to NoSQL and Couchbase
Introduction to NoSQL and CouchbaseIntroduction to NoSQL and Couchbase
Introduction to NoSQL and Couchbase
 
MySQL Cluster 8.0 tutorial
MySQL Cluster 8.0 tutorialMySQL Cluster 8.0 tutorial
MySQL Cluster 8.0 tutorial
 
What's new in SQL Server 2012 for philly code camp 2012.1
What's new in SQL Server 2012 for philly code camp 2012.1What's new in SQL Server 2012 for philly code camp 2012.1
What's new in SQL Server 2012 for philly code camp 2012.1
 
Dell PowerEdge M520 server solution: Energy efficiency and database performance
Dell PowerEdge M520 server solution: Energy efficiency and database performanceDell PowerEdge M520 server solution: Energy efficiency and database performance
Dell PowerEdge M520 server solution: Energy efficiency and database performance
 
Throughput comparison: Dell PowerEdge R720 drive options
Throughput comparison: Dell PowerEdge R720 drive optionsThroughput comparison: Dell PowerEdge R720 drive options
Throughput comparison: Dell PowerEdge R720 drive options
 
Exchange 2010 ha ctd
Exchange 2010 ha ctdExchange 2010 ha ctd
Exchange 2010 ha ctd
 
Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012Cassandra at NoSql Matters 2012
Cassandra at NoSql Matters 2012
 
Dell PowerEdge VRTX and M-series compute nodes configuration study
Dell PowerEdge VRTX and M-series compute nodes configuration studyDell PowerEdge VRTX and M-series compute nodes configuration study
Dell PowerEdge VRTX and M-series compute nodes configuration study
 
Grails 2.0 Update
Grails 2.0 UpdateGrails 2.0 Update
Grails 2.0 Update
 
SQLFire at Strata 2012
SQLFire at Strata 2012SQLFire at Strata 2012
SQLFire at Strata 2012
 
Deduplication without performance hits: Intel Xeon processor E5-2697v2-powere...
Deduplication without performance hits: Intel Xeon processor E5-2697v2-powere...Deduplication without performance hits: Intel Xeon processor E5-2697v2-powere...
Deduplication without performance hits: Intel Xeon processor E5-2697v2-powere...
 
Workstation heat and power usage: Lenovo ThinkStation P500 vs. HP Z440 Workst...
Workstation heat and power usage: Lenovo ThinkStation P500 vs. HP Z440 Workst...Workstation heat and power usage: Lenovo ThinkStation P500 vs. HP Z440 Workst...
Workstation heat and power usage: Lenovo ThinkStation P500 vs. HP Z440 Workst...
 
Compaction and Splitting in Apache Accumulo
Compaction and Splitting in Apache AccumuloCompaction and Splitting in Apache Accumulo
Compaction and Splitting in Apache Accumulo
 
SQLFire lightning talk
SQLFire lightning talkSQLFire lightning talk
SQLFire lightning talk
 

Viewers also liked

Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
 Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ... Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...Big Data Spain
 
Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...Big Data Spain
 
Intro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conferenceIntro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conferenceBig Data Spain
 
Location analytics by Marc Planaguma at Big Data Spain 2014
 Location analytics by Marc Planaguma at Big Data Spain 2014 Location analytics by Marc Planaguma at Big Data Spain 2014
Location analytics by Marc Planaguma at Big Data Spain 2014Big Data Spain
 
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014Big Data Spain
 
Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...Big Data Spain
 
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...Big Data Spain
 
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
 Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data... Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...Big Data Spain
 
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...Big Data Spain
 
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...Big Data Spain
 
Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0Big Data Spain
 
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...Big Data Spain
 
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...Big Data Spain
 
A new streaming computation engine for real-time analytics by Michael Barton ...
A new streaming computation engine for real-time analytics by Michael Barton ...A new streaming computation engine for real-time analytics by Michael Barton ...
A new streaming computation engine for real-time analytics by Michael Barton ...Big Data Spain
 
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...Big Data Spain
 
Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...Big Data Spain
 
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...Big Data Spain
 
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015Big Data Spain
 
Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Analyzing organization e-mails in near real time using hadoop ecosystem tools...Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Analyzing organization e-mails in near real time using hadoop ecosystem tools...Big Data Spain
 
Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...
 Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at... Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...
Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...Big Data Spain
 

Viewers also liked (20)

Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
 Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ... Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
Dataflows: The abstraction that powers Big Data by Raul Castro Fernandez at ...
 
Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...Big Data the potential for data to improve service and business management by...
Big Data the potential for data to improve service and business management by...
 
Intro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conferenceIntro to the Big Data Spain 2014 conference
Intro to the Big Data Spain 2014 conference
 
Location analytics by Marc Planaguma at Big Data Spain 2014
 Location analytics by Marc Planaguma at Big Data Spain 2014 Location analytics by Marc Planaguma at Big Data Spain 2014
Location analytics by Marc Planaguma at Big Data Spain 2014
 
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014 Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
Data warehouse modernization programme by TOBY WOOLFE at Big Data Spain 2014
 
Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...Getting the best insights from your data using Apache Metamodel by Alberto Ro...
Getting the best insights from your data using Apache Metamodel by Alberto Ro...
 
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
ToroDB: Scaling PostgreSQL like MongoDB by Álvaro Hernández at Big Data Spain...
 
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
 Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data... Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
Big Data Web applications for Interactive Hadoop by ENRICO BERTI at Big Data...
 
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
State of Play. Data Science on Hadoop in 2015 by SEAN OWEN at Big Data Spain ...
 
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
CloudMC: A cloud computing map-reduce implementation for radiotherapy. RUBEN ...
 
Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0Convergent Replicated Data Types in Riak 2.0
Convergent Replicated Data Types in Riak 2.0
 
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
Big Data, analytics and 4th generation data warehousing by Martyn Jones at Bi...
 
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
How to integrate Big Data onto an analytical portal, Big Data benchmarking fo...
 
A new streaming computation engine for real-time analytics by Michael Barton ...
A new streaming computation engine for real-time analytics by Michael Barton ...A new streaming computation engine for real-time analytics by Michael Barton ...
A new streaming computation engine for real-time analytics by Michael Barton ...
 
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
IAd-learning: A new e-learning platform by José Antonio Omedes at Big Data Sp...
 
Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...Essential ingredients for real time stream processing @Scale by Kartik pParam...
Essential ingredients for real time stream processing @Scale by Kartik pParam...
 
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
Geospatial and bitemporal search in C* with pluggable Lucene index by Andrés ...
 
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
Data Science in 2016: Moving up by Paco Nathan at Big Data Spain 2015
 
Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Analyzing organization e-mails in near real time using hadoop ecosystem tools...Analyzing organization e-mails in near real time using hadoop ecosystem tools...
Analyzing organization e-mails in near real time using hadoop ecosystem tools...
 
Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...
 Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at... Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...
Processing large-scale graphs with Google(TM) Pregel by MICHAEL HACKSTEIN at...
 

Similar to The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012

Five Lessons in Distributed Databases
Five Lessons  in Distributed DatabasesFive Lessons  in Distributed Databases
Five Lessons in Distributed Databasesjbellis
 
Paris Cassandra Meetup - Overview of New Cassandra Drivers
Paris Cassandra Meetup - Overview of New Cassandra DriversParis Cassandra Meetup - Overview of New Cassandra Drivers
Paris Cassandra Meetup - Overview of New Cassandra DriversMichaël Figuière
 
Linked In Lessons Learned And Growth And Scalability
Linked In Lessons Learned And Growth And ScalabilityLinked In Lessons Learned And Growth And Scalability
Linked In Lessons Learned And Growth And ScalabilityConSanFrancisco123
 
On Cassandra Development: Past, Present and Future
On Cassandra Development: Past, Present and FutureOn Cassandra Development: Past, Present and Future
On Cassandra Development: Past, Present and Futurepcmanus
 
Paris Cassandra Meetup - Cassandra for Developers
Paris Cassandra Meetup - Cassandra for DevelopersParis Cassandra Meetup - Cassandra for Developers
Paris Cassandra Meetup - Cassandra for DevelopersMichaël Figuière
 
Oracle Active Data Guard 12c: Far Sync Instance, Real-Time Cascade and Other ...
Oracle Active Data Guard 12c: Far Sync Instance, Real-Time Cascade and Other ...Oracle Active Data Guard 12c: Far Sync Instance, Real-Time Cascade and Other ...
Oracle Active Data Guard 12c: Far Sync Instance, Real-Time Cascade and Other ...Ludovico Caldara
 
Scalability 09262012
Scalability 09262012Scalability 09262012
Scalability 09262012Mike Miller
 
NoSQL Matters 2012 - Real Time Big Data in practice with Cassandra
NoSQL Matters 2012 - Real Time Big Data in practice with CassandraNoSQL Matters 2012 - Real Time Big Data in practice with Cassandra
NoSQL Matters 2012 - Real Time Big Data in practice with CassandraMichaël Figuière
 
Apache Ignite: In-Memory Hammer for Your Data Science Toolkit
Apache Ignite: In-Memory Hammer for Your Data Science ToolkitApache Ignite: In-Memory Hammer for Your Data Science Toolkit
Apache Ignite: In-Memory Hammer for Your Data Science ToolkitDenis Magda
 
Scaling DataStax in Docker
Scaling DataStax in DockerScaling DataStax in Docker
Scaling DataStax in DockerDataStax
 
Getting Big Value from Big Data
Getting Big Value from Big DataGetting Big Value from Big Data
Getting Big Value from Big DataDataStax
 
Data day texas: Cassandra and the Cloud
Data day texas: Cassandra and the CloudData day texas: Cassandra and the Cloud
Data day texas: Cassandra and the Cloudjbellis
 
Big Data Uses with Distributed Asynchronous Object Storage
Big Data Uses with Distributed Asynchronous Object StorageBig Data Uses with Distributed Asynchronous Object Storage
Big Data Uses with Distributed Asynchronous Object StorageIntel® Software
 
ApacheCon Europe 2012 - Real Time Big Data in practice with Cassandra
ApacheCon Europe 2012 - Real Time Big Data in practice with CassandraApacheCon Europe 2012 - Real Time Big Data in practice with Cassandra
ApacheCon Europe 2012 - Real Time Big Data in practice with CassandraMichaël Figuière
 
Consolidated shared indexes in real time
Consolidated shared indexes in real timeConsolidated shared indexes in real time
Consolidated shared indexes in real timeJeff Mace
 
Implementing and Troubleshooting EdgeSight
Implementing and Troubleshooting EdgeSightImplementing and Troubleshooting EdgeSight
Implementing and Troubleshooting EdgeSightDavid McGeough
 

Similar to The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012 (20)

Five Lessons in Distributed Databases
Five Lessons  in Distributed DatabasesFive Lessons  in Distributed Databases
Five Lessons in Distributed Databases
 
Paris Cassandra Meetup - Overview of New Cassandra Drivers
Paris Cassandra Meetup - Overview of New Cassandra DriversParis Cassandra Meetup - Overview of New Cassandra Drivers
Paris Cassandra Meetup - Overview of New Cassandra Drivers
 
An Hour of DB2 Tips
An Hour of DB2 TipsAn Hour of DB2 Tips
An Hour of DB2 Tips
 
Linked In Lessons Learned And Growth And Scalability
Linked In Lessons Learned And Growth And ScalabilityLinked In Lessons Learned And Growth And Scalability
Linked In Lessons Learned And Growth And Scalability
 
On Cassandra Development: Past, Present and Future
On Cassandra Development: Past, Present and FutureOn Cassandra Development: Past, Present and Future
On Cassandra Development: Past, Present and Future
 
Paris Cassandra Meetup - Cassandra for Developers
Paris Cassandra Meetup - Cassandra for DevelopersParis Cassandra Meetup - Cassandra for Developers
Paris Cassandra Meetup - Cassandra for Developers
 
ITE - Chapter 12
ITE - Chapter 12ITE - Chapter 12
ITE - Chapter 12
 
Oracle Active Data Guard 12c: Far Sync Instance, Real-Time Cascade and Other ...
Oracle Active Data Guard 12c: Far Sync Instance, Real-Time Cascade and Other ...Oracle Active Data Guard 12c: Far Sync Instance, Real-Time Cascade and Other ...
Oracle Active Data Guard 12c: Far Sync Instance, Real-Time Cascade and Other ...
 
Scalability 09262012
Scalability 09262012Scalability 09262012
Scalability 09262012
 
CouchDB
CouchDBCouchDB
CouchDB
 
NoSQL Matters 2012 - Real Time Big Data in practice with Cassandra
NoSQL Matters 2012 - Real Time Big Data in practice with CassandraNoSQL Matters 2012 - Real Time Big Data in practice with Cassandra
NoSQL Matters 2012 - Real Time Big Data in practice with Cassandra
 
Apache Ignite: In-Memory Hammer for Your Data Science Toolkit
Apache Ignite: In-Memory Hammer for Your Data Science ToolkitApache Ignite: In-Memory Hammer for Your Data Science Toolkit
Apache Ignite: In-Memory Hammer for Your Data Science Toolkit
 
Scaling DataStax in Docker
Scaling DataStax in DockerScaling DataStax in Docker
Scaling DataStax in Docker
 
Getting Big Value from Big Data
Getting Big Value from Big DataGetting Big Value from Big Data
Getting Big Value from Big Data
 
Data day texas: Cassandra and the Cloud
Data day texas: Cassandra and the CloudData day texas: Cassandra and the Cloud
Data day texas: Cassandra and the Cloud
 
Big Data Uses with Distributed Asynchronous Object Storage
Big Data Uses with Distributed Asynchronous Object StorageBig Data Uses with Distributed Asynchronous Object Storage
Big Data Uses with Distributed Asynchronous Object Storage
 
ApacheCon Europe 2012 - Real Time Big Data in practice with Cassandra
ApacheCon Europe 2012 - Real Time Big Data in practice with CassandraApacheCon Europe 2012 - Real Time Big Data in practice with Cassandra
ApacheCon Europe 2012 - Real Time Big Data in practice with Cassandra
 
Consolidated shared indexes in real time
Consolidated shared indexes in real timeConsolidated shared indexes in real time
Consolidated shared indexes in real time
 
Implementing and Troubleshooting EdgeSight
Implementing and Troubleshooting EdgeSightImplementing and Troubleshooting EdgeSight
Implementing and Troubleshooting EdgeSight
 
Neethu_Abraham
Neethu_AbrahamNeethu_Abraham
Neethu_Abraham
 

More from Big Data Spain

Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017Big Data Spain
 
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Big Data Spain
 
AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017Big Data Spain
 
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Big Data Spain
 
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...Big Data Spain
 
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...Big Data Spain
 
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Big Data Spain
 
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...Big Data Spain
 
State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...Big Data Spain
 
Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...Big Data Spain
 
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...Big Data Spain
 
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
 The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a... The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...Big Data Spain
 
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...Big Data Spain
 
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017Big Data Spain
 
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...Big Data Spain
 
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...Big Data Spain
 
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...Big Data Spain
 
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...Big Data Spain
 
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...Big Data Spain
 
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017Big Data Spain
 

More from Big Data Spain (20)

Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
Big Data, Big Quality? by Irene Gonzálvez at Big Data Spain 2017
 
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
Scaling a backend for a big data and blockchain environment by Rafael Ríos at...
 
AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017AI: The next frontier by Amparo Alonso at Big Data Spain 2017
AI: The next frontier by Amparo Alonso at Big Data Spain 2017
 
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
Disaster Recovery for Big Data by Carlos Izquierdo at Big Data Spain 2017
 
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
Presentation: Boost Hadoop and Spark with in-memory technologies by Akmal Cha...
 
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
Data science for lazy people, Automated Machine Learning by Diego Hueltes at ...
 
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
Training Deep Learning Models on Multiple GPUs in the Cloud by Enrique Otero ...
 
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
Unbalanced data: Same algorithms different techniques by Eric Martín at Big D...
 
State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...
 
Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...Trading at market speed with the latest Kafka features by Iñigo González at B...
Trading at market speed with the latest Kafka features by Iñigo González at B...
 
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
Unified Stream Processing at Scale with Apache Samza by Jake Maes at Big Data...
 
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
 The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a... The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
The Analytic Platform behind IBM’s Watson Data Platform by Luciano Resende a...
 
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
Artificial Intelligence and Data-centric businesses by Óscar Méndez at Big Da...
 
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017
 
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
Meme Index. Analyzing fads and sensations on the Internet by Miguel Romero at...
 
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
Vehicle Big Data that Drives Smart City Advancement by Mike Branch at Big Dat...
 
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
End of the Myth: Ultra-Scalable Transactional Management by Ricardo Jiménez-P...
 
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
Attacking Machine Learning used in AntiVirus with Reinforcement by Rubén Mart...
 
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
More people, less banking: Blockchain by Salvador Casquero at Big Data Spain ...
 
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
Make the elephant fly, once again by Sourygna Luangsay at Big Data Spain 2017
 

Recently uploaded

Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Jeffrey Haguewood
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Karmanjay Verma
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfAarwolf Industries LLC
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Strongerpanagenda
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...itnewsafrica
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFMichael Gough
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024TopCSSGallery
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...BookNet Canada
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxAna-Maria Mihalceanu
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 

Recently uploaded (20)

Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
Email Marketing Automation for Bonterra Impact Management (fka Social Solutio...
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#Microservices, Docker deploy and Microservices source code in C#
Microservices, Docker deploy and Microservices source code in C#
 
Landscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdfLandscape Catalogue 2024 Australia-1.pdf
Landscape Catalogue 2024 Australia-1.pdf
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better StrongerModern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
Modern Roaming for Notes and Nomad – Cheaper Faster Better Stronger
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
All These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDFAll These Sophisticated Attacks, Can We Really Detect Them - PDF
All These Sophisticated Attacks, Can We Really Detect Them - PDF
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024Top 10 Hubspot Development Companies in 2024
Top 10 Hubspot Development Companies in 2024
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
Transcript: New from BookNet Canada for 2024: BNC SalesData and LibraryData -...
 
A Glance At The Java Performance Toolbox
A Glance At The Java Performance ToolboxA Glance At The Java Performance Toolbox
A Glance At The Java Performance Toolbox
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 

The top five questions to ask about NoSQL. JONATHAN ELLIS at Big Data Spain 2012

  • 1. Five questions for your NoSQL solution! Jonathan Ellis CTO, DataStax Project Chair, Apache Cassandra
  • 2. how do I my application? model ©2012 DataStax
  • 3. Popular options • Key/value • Tabular • Document • Graph? ©2012 DataStax
  • 4. Schema is your friend { "id": "e451dd42-ece3-11e1-a0a3-34159e154f4c", "name": "jbellis", "state": "TX", "birthdate": "1/1/1976", "email_addresses": ["jbellis@gmail", "jbellis@datastax.com"], } ©2012 DataStax
  • 5. SQL can be your friend too CREATE TABLE users ( id uuid PRIMARY KEY, name text, state text, birth_date date ); CREATE INDEX ON users(state); SELECT * FROM users WHERE state=‘Texas’ AND birth_date > ‘1950-01-01’; ©2012 DataStax
  • 6. Collections CREATE TABLE users ( id uuid PRIMARY KEY, name text, state text, birth_date date ); CREATE TABLE users_addresses ( user_id uuid REFERENCES users, email text ); SELECT * FROM users NATURAL JOIN users_addresses; ©2012 DataStax
  • 7. Collections CREATE TABLE users ( id uuid PRIMARY KEY, name text, state text, X birth_date date ); CREATE TABLE users_addresses ( user_id uuid REFERENCES users, email text ); SELECT * FROM users NATURAL JOIN users_addresses; ©2012 DataStax
  • 8. Collections CREATE TABLE users ( id uuid PRIMARY KEY, name text, state text, birth_date date, email_addresses set<text> ); UPDATE users SET email_addresses = email_addresses + {‘jbellis@gmail.com’, ‘jbellis@datastax.com’}; ©2012 DataStax
  • 9. Joins don’t scale • No joins • No subqueries • No aggregation functions* or GROUP BY • ORDER BY? ©2012 DataStax
  • 10. SELECT * FROM tweets WHERE user_id IN (SELECT follower FROM followers WHERE user_id = ’driftx’) followers ? ©2012 DataStax tweets
  • 11. Clustering in Cassandra CREATE TABLE timeline ( user_id tweet_id _author _body   user_id uuid,   tweet_id timeuuid, jbellis 3290f9da.. rbranson lorem   tweet_author uuid, jbellis 3895411a.. tjake ipsum tweet_body text, ... ... ...   PRIMARY KEY (user_id, tweet_id) driftx 3290f9da.. rbranson lorem ); driftx 71b46a84.. yzhang dolor ... ... ... yukim 3290f9da.. rbranson lorem yukim e451dd42.. tjake amet ... ... ... ©2012 DataStax
  • 12. Clustering in Cassandra CREATE TABLE timeline ( user_id tweet_id _author _body   user_id uuid,   tweet_id timeuuid, jbellis 3290f9da.. rbranson lorem   tweet_author uuid, jbellis 3895411a.. tjake ipsum tweet_body text, ... ... ...   PRIMARY KEY (user_id, tweet_id) driftx 3290f9da.. rbranson lorem ); driftx 71b46a84.. yzhang dolor ... ... ... SELECT * FROM timeline WHERE user_id = ’driftx’; yukim 3290f9da.. rbranson lorem yukim e451dd42.. tjake amet ... ... ... ©2012 DataStax
  • 13. how does it perform? ©2012 DataStax
  • 17. UPDATE users SET email_addresses = email_addresses + {...} WHERE user_id = ‘jbellis’; ©2012 DataStax
  • 19. Log-structured storage engine write( k1 , c1:v1 ) Memory Memtable Commit log ©2012 DataStax Hard drive
  • 20. write( k1 , c1:v1 ) Memory k1 c1:v1 Memtable k1 c1:v1 Commit log ©2012 DataStax Hard drive
  • 21. write( k1 , c2:v2 ) Memory k1 c1:v1 c2:v2 k1 c1:v1 k1 c2:v2 ©2012 DataStax Hard drive
  • 22. write( k2 , c1:v1 c2:v2 ) Memory k1 c1:v1 c2:v2 k2 c1:v1 c2:v2 k1 c1:v1 k1 c2:v2 k2 c1:v1 c2:v2 ©2012 DataStax Hard drive
  • 23. write( k1 , c1:v4 c3:v3 ) Memory k1 c1:v4 c2:v2 c3:v3 k2 c1:v1 c2:v2 k1 c1:v1 k1 c2:v2 k2 c1:v1 c2:v2 k1 c1:v4 c3:v3 ©2012 DataStax Hard drive
  • 24. Memory flush index / BF cleanup k1 c1:v4 c2:v2 c3:v3 k2 c1:v1 c2:v2 SSTable ©2012 DataStax Hard drive
  • 27. Larger than memory datasets ©2012 DataStax
  • 28. how does it handle failure? ©2012 DataStax
  • 29. Classic partitioning with SPOF partition 1 partition 2 partition 3 partition 4 router client ©2012 DataStax
  • 30. Availability • “High availability implies that a single fault will not bring down your system. Not ‘we’ll recover quickly.’” -- Ben Coverston: DataStax • “The biggest problem with failover is that you're almost never using it until it really hurts. It's like backups that you never test.” -- Rick Branson: Instagram ©2012 DataStax
  • 31. Fully distributed, no SPOF client p3 p6 p1 p1 p1 ©2012 DataStax
  • 34. Self-healing request 1 Client Coordinator internal request 2 response 4 internal response 3 Replica ©2012 DataStax
  • 35. Self-healing request 1 Client Coordinator internal request 2 response 4 internal response 3 Replica ©2012 DataStax
  • 36. Self-healing request 1 Client Coordinator internal request 2 timeout response 4 Replica ©2012 DataStax replica fails
  • 37. Self-healing request 1 Client Coordinator internal request 2 X timeout response 4 Replica ©2012 DataStax replica fails
  • 38. Self-healing request 1 Client Coordinator internal request 2 timeout response 4 hint 3 Replica ©2012 DataStax replica fails
  • 39. Self-healing request 1 Client Coordinator internal request 2 X timeout response 4 hint 3 Replica ©2012 DataStax replica fails
  • 40. Other healing modes • AntiEntropyService • Read repair ©2012 DataStax
  • 41. Dynamic snitch (dealing with partial failure) 90% busy Client Coordinator 30% busy 40% busy ©2012 DataStax
  • 42. how does it scale? ©2012 DataStax
  • 44. Scaling antipatterns • Metadata servers • Router bottlenecks • Overloading existing nodes when adding capacity ©2012 DataStax
  • 45. how is it? flexible ©2012 DataStax
  • 47. Data model: Realtime LiveStocks stock last GOOG $95.52 AAPL $186.10 AMZN $112.98 Portfolios user stock shares jbellis GOOG 80 jbellis LNKD 20 yukim AMZN 100 StockHist stock date price GOOG 2011-01-01 $8.23 GOOG 2011-01-02 $6.14 GOOG 2011-001-03 $7.78 ©2012 DataStax
  • 48. Data model: Analytics HistLoss worst_date loss Portfolio1 2011-07-23 -$34.81 Portfolio2 2011-03-11 -$11432.24 Portfolio3 2011-05-21 -$1476.93 ©2012 DataStax
  • 49. Data model: Analytics 10dayreturns stock rdate return GOOG 2011-07-25 $8.23 GOOG 2011-07-24 $6.14 GOOG 2011-07-23 $7.78 AAPL 2011-07-25 $15.32 AAPL 2011-07-24 $12.68 INSERT OVERWRITE TABLE 10dayreturns SELECT a.stock, b.date as rdate, b.price - a.price FROM StockHist a JOIN StockHist b ON (a.stock = b.stock AND date_add(a.date, 10) = b.date); ©2012 DataStax
  • 50. Data model: Analytics portfolio_returns portfolio rdate preturn Portfolio1 2011-07-25 $118.21 Portfolio1 2011-07-24 $60.78 Portfolio1 2011-07-23 -$34.81 Portfolio2 2011-07-25 $2143.92 Portfolio3 2011-07-24 -$10.19 INSERT OVERWRITE TABLE portfolio_returns SELECT portfolio, rdate, SUM(b.return) FROM portfolios a JOIN 10dayreturns b ON (a.stock = b.stock) GROUP BY portfolio, rdate; ©2012 DataStax
  • 51. Data model: Analytics HistLoss worst_date loss Portfolio1 2011-07-23 -$34.81 Portfolio2 2011-03-11 -$11432.24 Portfolio3 2011-05-21 -$1476.93 INSERT OVERWRITE TABLE HistLoss SELECT a.portfolio, rdate, minp FROM ( SELECT portfolio, min(preturn) as minp FROM portfolio_returns GROUP BY portfolio ) a JOIN portfolio_returns b ON (a.portfolio = b.portfolio and a.minp = b.preturn); ©2012 DataStax
  • 54. Questions? Image credits • http://www.flickr.com/photos/26817893@N05/2573006312/ • http://www.flickr.com/photos/rowanbank/7686239548 • http://www.flickr.com/photos/mervtheswerve/6081933265 • http://www.flickr.com/photos/dg_pics/2526208830 • http://www.flickr.com/photos/wainwright/351684037 • http://www.flickr.com/photos/mikeneilson/1606662529 • http://www.flickr.com/photos/sbisson/3852905534 • http://www.flickr.com/photos/breadnbadger/2674928517