SlideShare a Scribd company logo
1 of 79
Download to read offline
Open source, high performance database




Anti-social Databases: NoSQL and MongoDB
Will LaForest
Senior Director of 10gen Federal
will@10gen.com
@WLaForest




                                           1
SQL                              Dynamic
                   invented                          Web Content                       released

                                                                                    10gen
                                                             Web applications      founded
                          Oracle              Client
IBM’s IMS                founded              Server                           SOA

                                                                        BigTable



1965        1970     1975     1980     1985   1990       1995         2000     2005      2010




     Codd publishes            PC’s gain                  3 tier                  Cloud
 relational model paper         traction               architecture             Computing
         in 1970
                                                                Brewer’s Cap            NoSQL
                                              WWW                   born               Movement
                                              born
                                                                                                  2
Attribute




Tuple




                    Relation
                               3
Category           BlogCategory
           N   1

                         N



                         1

           N   1                  N   1
 Author               Blog                BlogTag

                                              N



                                              N


                     Content               Tag




                                                    4
• Data stored in a RDBMS is very compact (disk was
  more expensive)
• SQL and RDBMS made queries flexible with rigid
  schemas
• Rigid schemas helps optimize joins and storage
• Massive ecosystem of tools, libraries and integrations
• Been around 40 years!




                                                           5
• Gartner uses the 3Vs to define
• Volume - Big/difficult/extreme volume is relative
• Variety
   –   Changing or evolving data
   –   Uncontrolled formats
   –   Does not easily adhere to a single schema
   –   Unknown at design time
• Velocity
   – High or volatile inbound data
   – High query and read operations
   – Low latency
                                                      6
VOLUME/VELOCITY &
                            NEW ARCHITECTURES
                            • Systems need to scale horizontal
                              not vertically
                            • Commodity servers
                            • Cloud Computing



DATA VARIETY &
VOLATILITY
• Extremely difficult to
  find a single fixed
  schema
• Don’t know data
  schema a-priori          AGILE DEVELOPMENT
                           • Iterative & continuous
                           • New and emerging Apps


                                                         7
• Non-relational been hanging around (heard of
  MUMPS?)
• Modern NoSQL theory and offerings started in early
  2000s
• NoSQL = Not Only SQL
• A collection of very different products
• Alternatives to relational databases when a bad fit
• Common motivations
   – Horizontally scalable (commodity server/cloud computing)
   – Schema Flexibility
                                                                8
• I group by data model
    – Some people use or include data arangement
• Data arrangement is important and cuts across data
  models (see my talk at BigData DC next week)
    – Column/Document
    – Consistent hashing partitioning
    – Range based partitioning


•   Key/Value
•   Big Table Descendents
•   Document Oriented
•   Graph
                                                       9
•   Value (Data) mapped to a key (think primary)
•   Some typed, some just BLOBs
•   Fast hash based queries
•   No range queries, no ordering, simple data model


         “Will”        • “will@10gen.com”

        “Chris”        • “chris@10gen.com”

       Will-obj        • [4e61 6d65 3a57 …

                                                       10
•   Data stored on disk in a column oriented fashion
•   Predominantly hash based indexing
•   Rudimentary or no secondary indexes
•   Range queries and ordering on one dimension (row key)
•   Some consistent hashing some range based
          row keys         column family                 column family
                             “contact”                     “personal”

                     “twitter”: “wlaforest”
                                                  “bio”: “Will attended … ”
    row     Will     “email”: “will@10gen.com
                                                  “picture”: …
                     “phone”: “555-555-5555”

                                                  “bio”: “ … ”
                     “email”: “chris@10gen.com”
    row    Chris                                  “picture”: …
                     “phone”: “555-555-5555”
                                                  “hobby”: “golf”




                                                                              11
•   Data modeled as documents or objects (XML and JSON)
•   No fixed schema
•   Richest data model
•   Consistent hashing and range based partitioning

       {name: “will”,           name: “jeff”,   {name: “brendan”,
        eyes: “blue”,           eyes: “blue”,    aliases: [“el diablo”]}
        birthplace: “NY”,       height: 72,
        aliases: [“bill”, “la   boss: “ben”}
       ciacco”],                                {name: “matt”,
        gender: ”???”,                           pizza: “DiGiorno”,
        boss: ”ben”}            name: “ben”,     height: 72,
                                hat: ”yes”}      boss: 555.555.1212}



                                                                           12
•   Not a database!
•   Map Reduce on HDFS or other data source
•   Great for grinding through data
•   When you want to use it
    – Can’t use a index
    – Distributing custom algorithms
    – ETL
• Many NoSQL offerings have native map reduce
  functionality


                                                13
14
•   2007 founded
•   2009 first release of MongoDB
•   MongoDB is open source
•   10gen has a Redhat-like business model
    – Subscriptions (subscriber build, support)
    – Training
    – Consulting
• ~80M in funding
    – Sequoia, NEA, In-Q-Tel, Union Square Ventures, Flybridge
      Capital,

                                                                 15
#2 on Indeed’s Fastest Growing Jobs        Jaspersoft BigData Index

                                                          Demand for
                                                          MongoDB, the
                                                          document-oriented
                                                          NoSQL database, saw
                                                          the biggest spike
                                                          with over 200%
                                                          growth in 2011.




                                                 451 Group
         Google Searches              “MongoDB increasing its dominance”




                                                                                16
#2 ON INDEED’S FASTEST GROWING JOBS




                                      17
“MongoDB INCREASING ITS DOMINANCE”




                                     18
• Scale horizontally over commodity hardware
• Agility essential (schema free/heterogeneous
  interface)
• RDBMSs great so keep what works
   – Rich data models
   – Adhoc queries
   – Fully featured indexes
• What doesn’t distribute well?
   – Long running multi-row transactions
   – Join
   – Both artifacts of the relational data model

                                                   19
•   Data stored as documents (JSON)
•   Schema free
•   CRUD operations – (Create Read Update Delete)
•   Atomic document operations
•   Consistent (but is tunable… advanced topic)
•   Rich indexing (secondary, geospatial, covered)
•   Ad hoc Queries like SQL
    –   Equality
    –   Ranges
    –   Regular expression searches
    –   Geospatial
•   Replication – HA, read scalability, geo centric reads
•   Sharding (sometimes called partitioning) for scalability

                                                               20
RDBMS      MongoDB



Database   Database


Table      Collection


Row        Document




                        21
22
var p = { author: “roger”,
     date: new Date(),
     text: “Spirited Away”,
     tags: *“Tezuka”, “Manga”+-

> db.posts.save(p)




                                  23
>db.posts.find()

 { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
   author : "roger",
   date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
   text : "Spirited Away",
   tags : [ "Tezuka", "Manga" ] }

Notes:
 - _id is unique, but can be anything you’d like
                                                       24
Create index on any Field in Document

 // 1 means ascending, -1 means descending

 >db.posts.ensureIndex({author: 1})

 >db.posts.find({author: 'roger'})

 { _id     : ObjectId("4c4ba5c0672c685e5e8aabf3"),
   author : "roger",
   ... }


                                                     25
• Conditional Operators
  – $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $size, $type
  – $lt, $lte, $gt, $gte

  // find posts with any tags
  > db.posts.find( {tags: {$exists: true }} )

  // find posts matching a regular expression
  > db.posts.find( {author: /^rog*/i } )

  // count posts by an author before a certain date
  > db.posts.find( {author: ‘roger’, date:, $lt: Sat… -- ).count()
                                                                     26
• $set, $unset, $inc, $push, $pushAll, $pull, $pullAll, $bit

> comment = { author: “fred”,
           date: new Date(),
           text: “Best Movie Ever”-

> db.posts.update( { _id: “...” -,
              $push: {comments: comment} );



                                                               27
{ _id : ObjectId("4c4ba5c0672c685e5e8aabf3"),
  author : "roger",
  date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)",
  text : "Spirited Away",
  tags : [ "Tezuka", "Manga" ],
  comments : [
   {
       author : "Fred",
       date : "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)",
       text : "Best Movie Ever"
   }
  ]}
                                                           28
// Index nested documents
> db.posts.ensureIndex( “comments.author”:1 )
 db.posts.find(,‘comments.author’:’Fred’-)

// Index on tags
> db.posts.ensureIndex( tags: 1)
> db.posts.find( { tags: ’Manga’ - )

// geospatial index
> db.posts.ensureIndex( “author.location”: “2d” )
> db.posts.find( “author.location” : , $near : [22,42] } )
                                                             29
30
•   Do them client side (Python or R)
•   Native Map/Reduce in JS in MongoDB
    –   Distributes across the cluster with good data locality
•   New aggregation framework
    –   Declarative (no JS required)
    –   Pipeline approach (like Unix ps -ax | tee processes.txt | more)
•   Hadoop
    –   Intersect the indexing of MongoDB with the brute force parallelization
        of hadoop
    –   Hadoop MongoDB connector




                                                                                 31
32
• Prod clusters on order of
   • 100B objects
   • 50k qps per server
   • ~1400 nodes


Better data locality   In-Memory                     Auto-Sharding
                        Caching




                                   Replication /HA



                                                      Horizontal Scaling



                                                                           33
•   Sharding Details
•   Replica Set Details
•   Consistency Details
•   Common Deployment Scenarios
•   Citations




                                  34
35
Write
                 Primary


                 Secondary
         Read

                 Secondary
Driver




         Read



                             36
Write
                 Primary
Driver




         Read

                 Secondary


                 Secondary



                             37
1. Write
                    Primary


                    Secondary   1. Replicate
         2. Read

                    Secondary
Driver




         2. Read



                                               38
• Fire and forget
• Wait for error
• Wait for fsync
• Wait for journal sync
• Wait for replication




                          39
Driver           Primary
         write

                           apply in memory




                                             40
Driver             Primary
            write
         getLastError
                             apply in memory




                                               41
Driver              Primary
            write
         getLastError
                              apply in memory
           j:true
                              Write to journal




                                                 42
Driver             Primary
            write
         getLastError
                             apply in memory
          fsync:true


                             fsync




                                               43
Driver             Primary                     Secondary
            write
         getLastError
                             apply in memory
           w:2
                                 replicate




                                                           44
Value          Meaning

<n:integer>    Replicate to N members of
               replica set
“majority”     Replicate to a majority of
               replica set members
<m:modeName>   Use cutom error mode
               name




                                            45
46
> db.runCommand( { shardcollection: “test.users”,
          key: { email: 1 }} )
      {
          name: “Jared”,
          email: “jsr@10gen.com”,
      }
      {
          name: “Scott”,
          email: “scott@10gen.com”,
      }
      {
          name: “Dan”,
          email: “dan@10gen.com”,
      }


                                                    47
-∞   +∞




      48
-∞                                              +∞




     dan@10gen.com            scott@10gen.com


              jsr@10gen.com




                                                 49
Split!




-∞                                              +∞




     dan@10gen.com            scott@10gen.com


              jsr@10gen.com




                                                 50
This is a                   Split!        This is a
     chunk                                     chunk


-∞                                                          +∞




                 dan@10gen.com            scott@10gen.com


                          jsr@10gen.com




                                                             51
-∞                                              +∞




     dan@10gen.com            scott@10gen.com


              jsr@10gen.com




                                                 52
Split!




-∞                                              +∞




     dan@10gen.com            scott@10gen.com


              jsr@10gen.com




                                                 53
-∞                adam@10gen.com    1
adam@10gen.com    jared@10gen.com   1
jared@10gen.com   scott@10gen.com   1
scott@10gen.com   +∞                1


• Stored in the config serers
• Cached in mongos
• Used to route requests and keep cluster balanced


                                                     54
mongos
                                                                          config
                                     balancer
                                                                          config
Chunks!
                                                                          config




  1   2    3    4     13   14   15   16         25   26   27   28    37   38   39   40

  5   6    7    8     17   18   19   20         29   30   31   32    41   42   43   44

  9   10   11   12    21   22   23   24         33   34   35   36    45   46   47   48


 Shard 1             Shard 2                Shard 3                 Shard 4



                                                                                         55
mongos
                                                                          config
                                     balancer
                                                                          config


                    Imbalance                                             config




 1   2    3    4

 5   6    7    8

 9   10   11   12     21   22   23   24         33   34   35   36    45   46   47   48


Shard 1             Shard 2                 Shard 3                 Shard 4



                                                                                         56
mongos
                                                                          config
                                     balancer
                                                                          config

                               Move chunk 1 to                            config
                               Shard 2




 1   2    3    4

 5   6    7    8

 9   10   11   12    21   22    23   24         33   34   35   36    45   46   47   48


Shard 1             Shard 2                 Shard 3                 Shard 4



                                                                                         57
mongos
                                                                         config
                                    balancer
                                                                         config

                                                                         config




 1   2    3    4

 5   6    7    8

 9   10   11   12    21   22   23   24         33   34   35   36    45   46   47   48


Shard 1             Shard 2                Shard 3                 Shard 4



                                                                                        58
mongos
                                                                         config
                                    balancer
                                                                         config
                                         Chunks 1,2, and 3
                                         have migrated
                                                                         config




               4

 5   6    7    8     1                         2                    3

 9   10   11   12    21   22   23   24         33   34   35   36    45   46   47   48


Shard 1             Shard 2                Shard 3                 Shard 4




                                                                                        59
60
1             1.Query arrives at
                                           mongos
                                4
                    mongos
                                         2.mongos routes query
                                           to a single shard

                                         3.Shard returns results
                       2
                                           of query
                       3
                                         4.Results returned to
                                           client

Shard 1   Shard 2              Shard 3




                                                                   61
1               1.Query arrives at
                                               mongos
                                 4
                    mongos                   2.mongos broadcasts
                                               query to all shards

                                             3.Each shard returns
           2                                   results for query
                     2               2
                                         3
           3             3                   4.Results combined
                                               and returned to client

Shard 1   Shard 2                Shard 3




                                                                        62
1               1.Query arrives at mongos

                                   6           2.mongos broadcasts query
                    mongos                       to all shards
                               5               3.Each shard locally sorts
                                                 results

             2                                 4.Results returned to
                       2                         mongos
                                   2
             4             4           4
                                               5.mongos merge sorts
                                                 individual results
   3          3                            3
                                               6.Combined sorted result
Shard 1   Shard 2              Shard 3           returned to client




                                                                        63
Inserts   Requires shard   db.users.insert({
          key                name: “Jared”,
                             email: “jsr@10gen.com”})

Removes   Routed           db.users.delete({
                             email: “jsr@10gen.com”})

          Scattered        db.users.delete({name: “Jared”})


Updates   Routed           db.users.update(
                             {email: “jsr@10gen.com”},
                             {$set: { state: “CA”}})
          Scattered        db.users.update(
                             {state: “FZ”},
                             {$set:{ state: “CA”}} )
                                                              64
By Shard      Routed            db.users.find(
                                  {email: “jsr@10gen.com”})
Key

Sorted by     Routed in order   db.users.find().sort({email:-1})
shard key
Find by non   Scatter Gather    db.users.find({state:”CA”})
shard key
Sorted by     Distributed merge db.users.find().sort({state:1})
              sort
non shard
key

                                                                   65
66
Data Center




     Primary   Secondary   Secondary




                                       67
Data Center




     Primary   Secondary   Secondary
                           hidden=true




                            backups




                                         68
Active Data Center                  Standby Data Center




  Primary            Secondary          Secondary
  priority = 1       priority = 1




                                                          69
West Coast DC   Central DC      East Coast DC




 Secondary       Primary           Secondary
                 priority = 1




                                                70
71
•   History of Database Management (http://bit.ly/w3r0dv)
•   EMC IDC Study (http://bit.ly/y1mJgJ)
•   Gartner & Big Data (http://bit.ly/xvRP3a)
•   SQL (http://en.wikipedia.org/wiki/SQL)
•   Database Management Systems
    http://en.wikipedia.org/wiki/Dbms)
• Dynamo: Amazon’s Highly Available Key-value Store
    (http://bit.ly/A8F8oy)
• CAP Theorem (http://bit.ly/zvA6O6)
• NoSQL Google File System and BigTable
    (http://oreil.ly/wOXliP)
• NoSQL Movement whitepaper (http://bit.ly/A8RBuJ)
• Sample ERD diagram (http://bit.ly/xV30v)

                                                            72
73
74
$project                 $match                  $limit              $skip
   $unwind                  $group                  $sort


{                                                    db.article.aggregate(
 title : “this is my title” ,                           { $project : {
 author : “bob” ,                                           author : 1,
 posted : new Date () ,                                     tags : 1,
 pageViews : 5 ,                                        }},
 tags : [ “fun” , “good” , “fun” ] ,                    { $unwind : "$tags" },
 comments : [                                           { $group : {
      { author :“joe” , text : “this is cool” } ,           _id : “$tags”,
      { author :“sam” , text : “this is bad” }              authors : { $addToSet : "$author" }
 ],                                                     }}
 other : { foo : 5 }                                 );
}



                                                                                              75
76
Input data   MAP   Intermediate data   REDUCE   Output data




              1                          1




              2                          2




              3                          3




                                                              77
Input data   MAP   Intermediate data   REDUCE   Output data




              1                          1




              2                          2




              3                          3




                                                              78
• Impossible for a distributed computer system to
  simultaneously provide all three of the following
  guarantees
   – Consistency - All nodes see the same data at the same
     time.
   – Availability - A guarantee that every request receives a
     response about whether it was successful or failed.
   – Partition tolerance - No set of failures less than total
     network failure is allowed to cause the system to respond
     incorrectly



                                                                 79

More Related Content

What's hot

Big Data and NoSQL in Microsoft-Land
Big Data and NoSQL in Microsoft-LandBig Data and NoSQL in Microsoft-Land
Big Data and NoSQL in Microsoft-LandAndrew Brust
 
Intro to Neo4j presentation
Intro to Neo4j presentationIntro to Neo4j presentation
Intro to Neo4j presentationjexp
 
Microsoft's Big Play for Big Data
Microsoft's Big Play for Big DataMicrosoft's Big Play for Big Data
Microsoft's Big Play for Big DataAndrew Brust
 
Big Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI ProsBig Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI ProsAndrew Brust
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenLorenzo Alberton
 
Relational and non relational database 7
Relational and non relational database 7Relational and non relational database 7
Relational and non relational database 7abdulrahmanhelan
 
NoSQL – Back to the Future or Yet Another DB Feature?
NoSQL – Back to the Future or Yet Another DB Feature?NoSQL – Back to the Future or Yet Another DB Feature?
NoSQL – Back to the Future or Yet Another DB Feature?Martin Scholl
 
Nosql Now 2012: MongoDB Use Cases
Nosql Now 2012: MongoDB Use CasesNosql Now 2012: MongoDB Use Cases
Nosql Now 2012: MongoDB Use CasesMongoDB
 
Big Data Strategy for the Relational World
Big Data Strategy for the Relational World Big Data Strategy for the Relational World
Big Data Strategy for the Relational World Andrew Brust
 
Evolved BI with SQL Server 2012
Evolved BIwith SQL Server 2012Evolved BIwith SQL Server 2012
Evolved BI with SQL Server 2012Andrew Brust
 
An Evening with MongoDB Detroit 2013
An Evening with MongoDB Detroit 2013An Evening with MongoDB Detroit 2013
An Evening with MongoDB Detroit 2013MongoDB
 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture OverviewChristopher Foot
 
Social network architecture - Part 3. Big data - Machine learning
Social network architecture - Part 3. Big data - Machine learningSocial network architecture - Part 3. Big data - Machine learning
Social network architecture - Part 3. Big data - Machine learningPhu Luong Trong
 
Australia SharePoint Conference 2012 - Quest Governance Solutions
Australia SharePoint Conference 2012 - Quest Governance SolutionsAustralia SharePoint Conference 2012 - Quest Governance Solutions
Australia SharePoint Conference 2012 - Quest Governance SolutionsChris McNulty
 
Optimizing and Accelerating your SharePoint Farm
Optimizing and Accelerating your SharePoint FarmOptimizing and Accelerating your SharePoint Farm
Optimizing and Accelerating your SharePoint FarmChris McNulty
 
Hitchhiker’s Guide to SharePoint BI
Hitchhiker’s Guide to SharePoint BIHitchhiker’s Guide to SharePoint BI
Hitchhiker’s Guide to SharePoint BIAndrew Brust
 
NoSQL Now! NoSQL Architecture Patterns
NoSQL Now! NoSQL Architecture PatternsNoSQL Now! NoSQL Architecture Patterns
NoSQL Now! NoSQL Architecture PatternsDATAVERSITY
 

What's hot (20)

Big Data and NoSQL in Microsoft-Land
Big Data and NoSQL in Microsoft-LandBig Data and NoSQL in Microsoft-Land
Big Data and NoSQL in Microsoft-Land
 
Intro to Neo4j presentation
Intro to Neo4j presentationIntro to Neo4j presentation
Intro to Neo4j presentation
 
Microsoft's Big Play for Big Data
Microsoft's Big Play for Big DataMicrosoft's Big Play for Big Data
Microsoft's Big Play for Big Data
 
Big Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI ProsBig Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI Pros
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and when
 
Relational and non relational database 7
Relational and non relational database 7Relational and non relational database 7
Relational and non relational database 7
 
NoSQL – Back to the Future or Yet Another DB Feature?
NoSQL – Back to the Future or Yet Another DB Feature?NoSQL – Back to the Future or Yet Another DB Feature?
NoSQL – Back to the Future or Yet Another DB Feature?
 
Sql no sql
Sql no sqlSql no sql
Sql no sql
 
Nosql Now 2012: MongoDB Use Cases
Nosql Now 2012: MongoDB Use CasesNosql Now 2012: MongoDB Use Cases
Nosql Now 2012: MongoDB Use Cases
 
Big Data Strategy for the Relational World
Big Data Strategy for the Relational World Big Data Strategy for the Relational World
Big Data Strategy for the Relational World
 
Evolved BI with SQL Server 2012
Evolved BIwith SQL Server 2012Evolved BIwith SQL Server 2012
Evolved BI with SQL Server 2012
 
An Evening with MongoDB Detroit 2013
An Evening with MongoDB Detroit 2013An Evening with MongoDB Detroit 2013
An Evening with MongoDB Detroit 2013
 
NoSQL Architecture Overview
NoSQL Architecture OverviewNoSQL Architecture Overview
NoSQL Architecture Overview
 
Social network architecture - Part 3. Big data - Machine learning
Social network architecture - Part 3. Big data - Machine learningSocial network architecture - Part 3. Big data - Machine learning
Social network architecture - Part 3. Big data - Machine learning
 
redis
redisredis
redis
 
Australia SharePoint Conference 2012 - Quest Governance Solutions
Australia SharePoint Conference 2012 - Quest Governance SolutionsAustralia SharePoint Conference 2012 - Quest Governance Solutions
Australia SharePoint Conference 2012 - Quest Governance Solutions
 
Optimizing and Accelerating your SharePoint Farm
Optimizing and Accelerating your SharePoint FarmOptimizing and Accelerating your SharePoint Farm
Optimizing and Accelerating your SharePoint Farm
 
Treasure Data and Heroku
Treasure Data and HerokuTreasure Data and Heroku
Treasure Data and Heroku
 
Hitchhiker’s Guide to SharePoint BI
Hitchhiker’s Guide to SharePoint BIHitchhiker’s Guide to SharePoint BI
Hitchhiker’s Guide to SharePoint BI
 
NoSQL Now! NoSQL Architecture Patterns
NoSQL Now! NoSQL Architecture PatternsNoSQL Now! NoSQL Architecture Patterns
NoSQL Now! NoSQL Architecture Patterns
 

Viewers also liked

O Diferencial de uma Estratégia Mobile...e Multiplataforma!
O Diferencial de uma Estratégia Mobile...e Multiplataforma!O Diferencial de uma Estratégia Mobile...e Multiplataforma!
O Diferencial de uma Estratégia Mobile...e Multiplataforma!Xpand IT
 
Av capabilities presentation
Av capabilities presentationAv capabilities presentation
Av capabilities presentationNAISales2
 
Data meets Creativity - Webbdagarna 2015
Data meets Creativity - Webbdagarna 2015Data meets Creativity - Webbdagarna 2015
Data meets Creativity - Webbdagarna 2015Webrepublic
 
Cartagena Data Festival | Telling Stories with Data 2015 04-21
Cartagena Data Festival | Telling Stories with Data 2015 04-21Cartagena Data Festival | Telling Stories with Data 2015 04-21
Cartagena Data Festival | Telling Stories with Data 2015 04-21ulrichatz
 
онлайн бронирование модуль для турагенств
онлайн бронирование модуль для турагенствонлайн бронирование модуль для турагенств
онлайн бронирование модуль для турагенствAdrian Parker
 
VirtualSense presentation at FBK
VirtualSense presentation at FBKVirtualSense presentation at FBK
VirtualSense presentation at FBKAlessandro Bogliolo
 
Review: Leadership Frameworks
Review: Leadership FrameworksReview: Leadership Frameworks
Review: Leadership FrameworksMariam Nazarudin
 
Heavy Metal PowerPivot Remastered
Heavy Metal PowerPivot RemasteredHeavy Metal PowerPivot Remastered
Heavy Metal PowerPivot RemasteredJason Himmelstein
 
Division of roles and responsibilities
Division of roles and responsibilitiesDivision of roles and responsibilities
Division of roles and responsibilitieskausargulaid
 
Samanage-Website-Redesign-Jan2017
Samanage-Website-Redesign-Jan2017Samanage-Website-Redesign-Jan2017
Samanage-Website-Redesign-Jan2017WhatConts
 
Online Travel: Today and Tomorrow
Online Travel: Today and TomorrowOnline Travel: Today and Tomorrow
Online Travel: Today and TomorrowYanis Dzenis
 
Old & wise(에듀시니어)
Old & wise(에듀시니어)Old & wise(에듀시니어)
Old & wise(에듀시니어)Jungku Hong
 
MongoDB and AWS Best Practices
MongoDB and AWS Best PracticesMongoDB and AWS Best Practices
MongoDB and AWS Best PracticesMongoDB
 
R Statistics With MongoDB
R Statistics With MongoDBR Statistics With MongoDB
R Statistics With MongoDBMongoDB
 
NOSQL Session GlueCon May 2010
NOSQL Session GlueCon May 2010NOSQL Session GlueCon May 2010
NOSQL Session GlueCon May 2010MongoDB
 
Secret Life of a Weather Datum end of project event
Secret Life of a Weather Datum end of project eventSecret Life of a Weather Datum end of project event
Secret Life of a Weather Datum end of project eventlifeofdata
 
Revving Up Revenue By Replenishing
Revving Up Revenue By ReplenishingRevving Up Revenue By Replenishing
Revving Up Revenue By ReplenishingWhatConts
 

Viewers also liked (20)

O Diferencial de uma Estratégia Mobile...e Multiplataforma!
O Diferencial de uma Estratégia Mobile...e Multiplataforma!O Diferencial de uma Estratégia Mobile...e Multiplataforma!
O Diferencial de uma Estratégia Mobile...e Multiplataforma!
 
Av capabilities presentation
Av capabilities presentationAv capabilities presentation
Av capabilities presentation
 
Creative Overview
Creative OverviewCreative Overview
Creative Overview
 
Data meets Creativity - Webbdagarna 2015
Data meets Creativity - Webbdagarna 2015Data meets Creativity - Webbdagarna 2015
Data meets Creativity - Webbdagarna 2015
 
Cartagena Data Festival | Telling Stories with Data 2015 04-21
Cartagena Data Festival | Telling Stories with Data 2015 04-21Cartagena Data Festival | Telling Stories with Data 2015 04-21
Cartagena Data Festival | Telling Stories with Data 2015 04-21
 
Ov big data
Ov big dataOv big data
Ov big data
 
GIT Best Practices V 0.1
GIT Best Practices V 0.1GIT Best Practices V 0.1
GIT Best Practices V 0.1
 
онлайн бронирование модуль для турагенств
онлайн бронирование модуль для турагенствонлайн бронирование модуль для турагенств
онлайн бронирование модуль для турагенств
 
VirtualSense presentation at FBK
VirtualSense presentation at FBKVirtualSense presentation at FBK
VirtualSense presentation at FBK
 
Review: Leadership Frameworks
Review: Leadership FrameworksReview: Leadership Frameworks
Review: Leadership Frameworks
 
Heavy Metal PowerPivot Remastered
Heavy Metal PowerPivot RemasteredHeavy Metal PowerPivot Remastered
Heavy Metal PowerPivot Remastered
 
Division of roles and responsibilities
Division of roles and responsibilitiesDivision of roles and responsibilities
Division of roles and responsibilities
 
Samanage-Website-Redesign-Jan2017
Samanage-Website-Redesign-Jan2017Samanage-Website-Redesign-Jan2017
Samanage-Website-Redesign-Jan2017
 
Online Travel: Today and Tomorrow
Online Travel: Today and TomorrowOnline Travel: Today and Tomorrow
Online Travel: Today and Tomorrow
 
Old & wise(에듀시니어)
Old & wise(에듀시니어)Old & wise(에듀시니어)
Old & wise(에듀시니어)
 
MongoDB and AWS Best Practices
MongoDB and AWS Best PracticesMongoDB and AWS Best Practices
MongoDB and AWS Best Practices
 
R Statistics With MongoDB
R Statistics With MongoDBR Statistics With MongoDB
R Statistics With MongoDB
 
NOSQL Session GlueCon May 2010
NOSQL Session GlueCon May 2010NOSQL Session GlueCon May 2010
NOSQL Session GlueCon May 2010
 
Secret Life of a Weather Datum end of project event
Secret Life of a Weather Datum end of project eventSecret Life of a Weather Datum end of project event
Secret Life of a Weather Datum end of project event
 
Revving Up Revenue By Replenishing
Revving Up Revenue By ReplenishingRevving Up Revenue By Replenishing
Revving Up Revenue By Replenishing
 

Similar to Anti-social Databases

An Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBAn Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBWilliam LaForest
 
Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011jexp
 
NoSQL in the context of Social Web
NoSQL in the context of Social WebNoSQL in the context of Social Web
NoSQL in the context of Social WebBogdan Gaza
 
An Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4jAn Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4jDebanjan Mahata
 
NOSQL, CouchDB, and the Cloud
NOSQL, CouchDB, and the CloudNOSQL, CouchDB, and the Cloud
NOSQL, CouchDB, and the Cloudboorad
 
DevNation Atlanta
DevNation AtlantaDevNation Atlanta
DevNation Atlantaboorad
 
NoSQL overview #phptostart turin 11.07.2011
NoSQL overview #phptostart turin 11.07.2011NoSQL overview #phptostart turin 11.07.2011
NoSQL overview #phptostart turin 11.07.2011David Funaro
 
Size does not matter (if your data is in a silo)
Size does not matter (if your data is in a silo)Size does not matter (if your data is in a silo)
Size does not matter (if your data is in a silo)Ora Lassila
 
Gilbane Boston 2011 big data
Gilbane Boston 2011 big dataGilbane Boston 2011 big data
Gilbane Boston 2011 big dataPeter O'Kelly
 
How to Get Started with Your MongoDB Pilot Project
How to Get Started with Your MongoDB Pilot ProjectHow to Get Started with Your MongoDB Pilot Project
How to Get Started with Your MongoDB Pilot ProjectDATAVERSITY
 
Intro to NoSQL and MongoDB
Intro to NoSQL and MongoDBIntro to NoSQL and MongoDB
Intro to NoSQL and MongoDBDATAVERSITY
 
No Sql Movement
No Sql MovementNo Sql Movement
No Sql MovementAjit Koti
 
Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsGeorge Stathis
 
Big Data Warehousing Meetup with Riak
Big Data Warehousing Meetup with RiakBig Data Warehousing Meetup with Riak
Big Data Warehousing Meetup with RiakCaserta
 
NOSQL overview and intro to graph databases with Neo4j (Geeknight May 2010)
NOSQL overview and intro to graph databases with Neo4j (Geeknight May 2010)NOSQL overview and intro to graph databases with Neo4j (Geeknight May 2010)
NOSQL overview and intro to graph databases with Neo4j (Geeknight May 2010)Emil Eifrem
 
No SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageNo SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageBethmi Gunasekara
 
Solr cloud the 'search first' nosql database extended deep dive
Solr cloud the 'search first' nosql database   extended deep diveSolr cloud the 'search first' nosql database   extended deep dive
Solr cloud the 'search first' nosql database extended deep divelucenerevolution
 
CSC 8101 Non Relational Databases
CSC 8101 Non Relational DatabasesCSC 8101 Non Relational Databases
CSC 8101 Non Relational Databasessjwoodman
 
Polyglot Persistence with MongoDB and Neo4j
Polyglot Persistence with MongoDB and Neo4jPolyglot Persistence with MongoDB and Neo4j
Polyglot Persistence with MongoDB and Neo4jCorie Pollock
 

Similar to Anti-social Databases (20)

An Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBAn Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDB
 
Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011
 
NoSQL in the context of Social Web
NoSQL in the context of Social WebNoSQL in the context of Social Web
NoSQL in the context of Social Web
 
An Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4jAn Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4j
 
NOSQL, CouchDB, and the Cloud
NOSQL, CouchDB, and the CloudNOSQL, CouchDB, and the Cloud
NOSQL, CouchDB, and the Cloud
 
DevNation Atlanta
DevNation AtlantaDevNation Atlanta
DevNation Atlanta
 
NoSQL overview #phptostart turin 11.07.2011
NoSQL overview #phptostart turin 11.07.2011NoSQL overview #phptostart turin 11.07.2011
NoSQL overview #phptostart turin 11.07.2011
 
Size does not matter (if your data is in a silo)
Size does not matter (if your data is in a silo)Size does not matter (if your data is in a silo)
Size does not matter (if your data is in a silo)
 
Gilbane Boston 2011 big data
Gilbane Boston 2011 big dataGilbane Boston 2011 big data
Gilbane Boston 2011 big data
 
How to Get Started with Your MongoDB Pilot Project
How to Get Started with Your MongoDB Pilot ProjectHow to Get Started with Your MongoDB Pilot Project
How to Get Started with Your MongoDB Pilot Project
 
Intro to NoSQL and MongoDB
Intro to NoSQL and MongoDBIntro to NoSQL and MongoDB
Intro to NoSQL and MongoDB
 
No Sql Movement
No Sql MovementNo Sql Movement
No Sql Movement
 
Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data Lessons
 
Big Data Warehousing Meetup with Riak
Big Data Warehousing Meetup with RiakBig Data Warehousing Meetup with Riak
Big Data Warehousing Meetup with Riak
 
NOSQL overview and intro to graph databases with Neo4j (Geeknight May 2010)
NOSQL overview and intro to graph databases with Neo4j (Geeknight May 2010)NOSQL overview and intro to graph databases with Neo4j (Geeknight May 2010)
NOSQL overview and intro to graph databases with Neo4j (Geeknight May 2010)
 
No SQL- The Future Of Data Storage
No SQL- The Future Of Data StorageNo SQL- The Future Of Data Storage
No SQL- The Future Of Data Storage
 
Solr cloud the 'search first' nosql database extended deep dive
Solr cloud the 'search first' nosql database   extended deep diveSolr cloud the 'search first' nosql database   extended deep dive
Solr cloud the 'search first' nosql database extended deep dive
 
CSC 8101 Non Relational Databases
CSC 8101 Non Relational DatabasesCSC 8101 Non Relational Databases
CSC 8101 Non Relational Databases
 
Polyglot Persistence with MongoDB and Neo4j
Polyglot Persistence with MongoDB and Neo4jPolyglot Persistence with MongoDB and Neo4j
Polyglot Persistence with MongoDB and Neo4j
 
Drill njhug -19 feb2013
Drill njhug -19 feb2013Drill njhug -19 feb2013
Drill njhug -19 feb2013
 

Recently uploaded

Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024D Cloud Solutions
 
Babel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxBabel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxYounusS2
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Commit University
 
GenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation IncGenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation IncObject Automation
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesDavid Newbury
 
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServicePicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServiceRenan Moreira de Oliveira
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfDianaGray10
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...Aggregage
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioChristian Posta
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.francesco barbera
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Will Schroeder
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureEric D. Schabell
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URLRuncy Oommen
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxMatsuo Lab
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UbiTrack UK
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaborationbruanjhuli
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024SkyPlanner
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesThousandEyes
 
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdfJamie (Taka) Wang
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataSafe Software
 

Recently uploaded (20)

Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024Artificial Intelligence & SEO Trends for 2024
Artificial Intelligence & SEO Trends for 2024
 
Babel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptxBabel Compiler - Transforming JavaScript for All Browsers.pptx
Babel Compiler - Transforming JavaScript for All Browsers.pptx
 
Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)Crea il tuo assistente AI con lo Stregatto (open source python framework)
Crea il tuo assistente AI con lo Stregatto (open source python framework)
 
GenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation IncGenAI and AI GCC State of AI_Object Automation Inc
GenAI and AI GCC State of AI_Object Automation Inc
 
Linked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond OntologiesLinked Data in Production: Moving Beyond Ontologies
Linked Data in Production: Moving Beyond Ontologies
 
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer ServicePicPay - GenAI Finance Assistant - ChatGPT for Customer Service
PicPay - GenAI Finance Assistant - ChatGPT for Customer Service
 
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdfUiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
UiPath Solutions Management Preview - Northern CA Chapter - March 22.pdf
 
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
The Data Metaverse: Unpacking the Roles, Use Cases, and Tech Trends in Data a...
 
Comparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and IstioComparing Sidecar-less Service Mesh from Cilium and Istio
Comparing Sidecar-less Service Mesh from Cilium and Istio
 
Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.Digital magic. A small project for controlling smart light bulbs.
Digital magic. A small project for controlling smart light bulbs.
 
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
Apres-Cyber - The Data Dilemma: Bridging Offensive Operations and Machine Lea...
 
OpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability AdventureOpenShift Commons Paris - Choose Your Own Observability Adventure
OpenShift Commons Paris - Choose Your Own Observability Adventure
 
Designing A Time bound resource download URL
Designing A Time bound resource download URLDesigning A Time bound resource download URL
Designing A Time bound resource download URL
 
Introduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptxIntroduction to Matsuo Laboratory (ENG).pptx
Introduction to Matsuo Laboratory (ENG).pptx
 
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
UWB Technology for Enhanced Indoor and Outdoor Positioning in Physiological M...
 
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online CollaborationCOMPUTER 10: Lesson 7 - File Storage and Online Collaboration
COMPUTER 10: Lesson 7 - File Storage and Online Collaboration
 
Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024Salesforce Miami User Group Event - 1st Quarter 2024
Salesforce Miami User Group Event - 1st Quarter 2024
 
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyesHow to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
How to Effectively Monitor SD-WAN and SASE Environments with ThousandEyes
 
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
20200723_insight_release_plan_v6.pdf20200723_insight_release_plan_v6.pdf
 
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial DataCloud Revolution: Exploring the New Wave of Serverless Spatial Data
Cloud Revolution: Exploring the New Wave of Serverless Spatial Data
 

Anti-social Databases

  • 1. Open source, high performance database Anti-social Databases: NoSQL and MongoDB Will LaForest Senior Director of 10gen Federal will@10gen.com @WLaForest 1
  • 2. SQL Dynamic invented Web Content released 10gen Web applications founded Oracle Client IBM’s IMS founded Server SOA BigTable 1965 1970 1975 1980 1985 1990 1995 2000 2005 2010 Codd publishes PC’s gain 3 tier Cloud relational model paper traction architecture Computing in 1970 Brewer’s Cap NoSQL WWW born Movement born 2
  • 3. Attribute Tuple Relation 3
  • 4. Category BlogCategory N 1 N 1 N 1 N 1 Author Blog BlogTag N N Content Tag 4
  • 5. • Data stored in a RDBMS is very compact (disk was more expensive) • SQL and RDBMS made queries flexible with rigid schemas • Rigid schemas helps optimize joins and storage • Massive ecosystem of tools, libraries and integrations • Been around 40 years! 5
  • 6. • Gartner uses the 3Vs to define • Volume - Big/difficult/extreme volume is relative • Variety – Changing or evolving data – Uncontrolled formats – Does not easily adhere to a single schema – Unknown at design time • Velocity – High or volatile inbound data – High query and read operations – Low latency 6
  • 7. VOLUME/VELOCITY & NEW ARCHITECTURES • Systems need to scale horizontal not vertically • Commodity servers • Cloud Computing DATA VARIETY & VOLATILITY • Extremely difficult to find a single fixed schema • Don’t know data schema a-priori AGILE DEVELOPMENT • Iterative & continuous • New and emerging Apps 7
  • 8. • Non-relational been hanging around (heard of MUMPS?) • Modern NoSQL theory and offerings started in early 2000s • NoSQL = Not Only SQL • A collection of very different products • Alternatives to relational databases when a bad fit • Common motivations – Horizontally scalable (commodity server/cloud computing) – Schema Flexibility 8
  • 9. • I group by data model – Some people use or include data arangement • Data arrangement is important and cuts across data models (see my talk at BigData DC next week) – Column/Document – Consistent hashing partitioning – Range based partitioning • Key/Value • Big Table Descendents • Document Oriented • Graph 9
  • 10. Value (Data) mapped to a key (think primary) • Some typed, some just BLOBs • Fast hash based queries • No range queries, no ordering, simple data model “Will” • “will@10gen.com” “Chris” • “chris@10gen.com” Will-obj • [4e61 6d65 3a57 … 10
  • 11. Data stored on disk in a column oriented fashion • Predominantly hash based indexing • Rudimentary or no secondary indexes • Range queries and ordering on one dimension (row key) • Some consistent hashing some range based row keys column family column family “contact” “personal” “twitter”: “wlaforest” “bio”: “Will attended … ” row Will “email”: “will@10gen.com “picture”: … “phone”: “555-555-5555” “bio”: “ … ” “email”: “chris@10gen.com” row Chris “picture”: … “phone”: “555-555-5555” “hobby”: “golf” 11
  • 12. Data modeled as documents or objects (XML and JSON) • No fixed schema • Richest data model • Consistent hashing and range based partitioning {name: “will”, name: “jeff”, {name: “brendan”, eyes: “blue”, eyes: “blue”, aliases: [“el diablo”]} birthplace: “NY”, height: 72, aliases: [“bill”, “la boss: “ben”} ciacco”], {name: “matt”, gender: ”???”, pizza: “DiGiorno”, boss: ”ben”} name: “ben”, height: 72, hat: ”yes”} boss: 555.555.1212} 12
  • 13. Not a database! • Map Reduce on HDFS or other data source • Great for grinding through data • When you want to use it – Can’t use a index – Distributing custom algorithms – ETL • Many NoSQL offerings have native map reduce functionality 13
  • 14. 14
  • 15. 2007 founded • 2009 first release of MongoDB • MongoDB is open source • 10gen has a Redhat-like business model – Subscriptions (subscriber build, support) – Training – Consulting • ~80M in funding – Sequoia, NEA, In-Q-Tel, Union Square Ventures, Flybridge Capital, 15
  • 16. #2 on Indeed’s Fastest Growing Jobs Jaspersoft BigData Index Demand for MongoDB, the document-oriented NoSQL database, saw the biggest spike with over 200% growth in 2011. 451 Group Google Searches “MongoDB increasing its dominance” 16
  • 17. #2 ON INDEED’S FASTEST GROWING JOBS 17
  • 18. “MongoDB INCREASING ITS DOMINANCE” 18
  • 19. • Scale horizontally over commodity hardware • Agility essential (schema free/heterogeneous interface) • RDBMSs great so keep what works – Rich data models – Adhoc queries – Fully featured indexes • What doesn’t distribute well? – Long running multi-row transactions – Join – Both artifacts of the relational data model 19
  • 20. Data stored as documents (JSON) • Schema free • CRUD operations – (Create Read Update Delete) • Atomic document operations • Consistent (but is tunable… advanced topic) • Rich indexing (secondary, geospatial, covered) • Ad hoc Queries like SQL – Equality – Ranges – Regular expression searches – Geospatial • Replication – HA, read scalability, geo centric reads • Sharding (sometimes called partitioning) for scalability 20
  • 21. RDBMS MongoDB Database Database Table Collection Row Document 21
  • 22. 22
  • 23. var p = { author: “roger”, date: new Date(), text: “Spirited Away”, tags: *“Tezuka”, “Manga”+- > db.posts.save(p) 23
  • 24. >db.posts.find() { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "roger", date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)", text : "Spirited Away", tags : [ "Tezuka", "Manga" ] } Notes: - _id is unique, but can be anything you’d like 24
  • 25. Create index on any Field in Document // 1 means ascending, -1 means descending >db.posts.ensureIndex({author: 1}) >db.posts.find({author: 'roger'}) { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "roger", ... } 25
  • 26. • Conditional Operators – $all, $exists, $mod, $ne, $in, $nin, $nor, $or, $size, $type – $lt, $lte, $gt, $gte // find posts with any tags > db.posts.find( {tags: {$exists: true }} ) // find posts matching a regular expression > db.posts.find( {author: /^rog*/i } ) // count posts by an author before a certain date > db.posts.find( {author: ‘roger’, date:, $lt: Sat… -- ).count() 26
  • 27. • $set, $unset, $inc, $push, $pushAll, $pull, $pullAll, $bit > comment = { author: “fred”, date: new Date(), text: “Best Movie Ever”- > db.posts.update( { _id: “...” -, $push: {comments: comment} ); 27
  • 28. { _id : ObjectId("4c4ba5c0672c685e5e8aabf3"), author : "roger", date : "Sat Jul 24 2010 19:47:11 GMT-0700 (PDT)", text : "Spirited Away", tags : [ "Tezuka", "Manga" ], comments : [ { author : "Fred", date : "Sat Jul 24 2010 20:51:03 GMT-0700 (PDT)", text : "Best Movie Ever" } ]} 28
  • 29. // Index nested documents > db.posts.ensureIndex( “comments.author”:1 ) db.posts.find(,‘comments.author’:’Fred’-) // Index on tags > db.posts.ensureIndex( tags: 1) > db.posts.find( { tags: ’Manga’ - ) // geospatial index > db.posts.ensureIndex( “author.location”: “2d” ) > db.posts.find( “author.location” : , $near : [22,42] } ) 29
  • 30. 30
  • 31. Do them client side (Python or R) • Native Map/Reduce in JS in MongoDB – Distributes across the cluster with good data locality • New aggregation framework – Declarative (no JS required) – Pipeline approach (like Unix ps -ax | tee processes.txt | more) • Hadoop – Intersect the indexing of MongoDB with the brute force parallelization of hadoop – Hadoop MongoDB connector 31
  • 32. 32
  • 33. • Prod clusters on order of • 100B objects • 50k qps per server • ~1400 nodes Better data locality In-Memory Auto-Sharding Caching Replication /HA Horizontal Scaling 33
  • 34. Sharding Details • Replica Set Details • Consistency Details • Common Deployment Scenarios • Citations 34
  • 35. 35
  • 36. Write Primary Secondary Read Secondary Driver Read 36
  • 37. Write Primary Driver Read Secondary Secondary 37
  • 38. 1. Write Primary Secondary 1. Replicate 2. Read Secondary Driver 2. Read 38
  • 39. • Fire and forget • Wait for error • Wait for fsync • Wait for journal sync • Wait for replication 39
  • 40. Driver Primary write apply in memory 40
  • 41. Driver Primary write getLastError apply in memory 41
  • 42. Driver Primary write getLastError apply in memory j:true Write to journal 42
  • 43. Driver Primary write getLastError apply in memory fsync:true fsync 43
  • 44. Driver Primary Secondary write getLastError apply in memory w:2 replicate 44
  • 45. Value Meaning <n:integer> Replicate to N members of replica set “majority” Replicate to a majority of replica set members <m:modeName> Use cutom error mode name 45
  • 46. 46
  • 47. > db.runCommand( { shardcollection: “test.users”, key: { email: 1 }} ) { name: “Jared”, email: “jsr@10gen.com”, } { name: “Scott”, email: “scott@10gen.com”, } { name: “Dan”, email: “dan@10gen.com”, } 47
  • 48. -∞ +∞ 48
  • 49. -∞ +∞ dan@10gen.com scott@10gen.com jsr@10gen.com 49
  • 50. Split! -∞ +∞ dan@10gen.com scott@10gen.com jsr@10gen.com 50
  • 51. This is a Split! This is a chunk chunk -∞ +∞ dan@10gen.com scott@10gen.com jsr@10gen.com 51
  • 52. -∞ +∞ dan@10gen.com scott@10gen.com jsr@10gen.com 52
  • 53. Split! -∞ +∞ dan@10gen.com scott@10gen.com jsr@10gen.com 53
  • 54. -∞ adam@10gen.com 1 adam@10gen.com jared@10gen.com 1 jared@10gen.com scott@10gen.com 1 scott@10gen.com +∞ 1 • Stored in the config serers • Cached in mongos • Used to route requests and keep cluster balanced 54
  • 55. mongos config balancer config Chunks! config 1 2 3 4 13 14 15 16 25 26 27 28 37 38 39 40 5 6 7 8 17 18 19 20 29 30 31 32 41 42 43 44 9 10 11 12 21 22 23 24 33 34 35 36 45 46 47 48 Shard 1 Shard 2 Shard 3 Shard 4 55
  • 56. mongos config balancer config Imbalance config 1 2 3 4 5 6 7 8 9 10 11 12 21 22 23 24 33 34 35 36 45 46 47 48 Shard 1 Shard 2 Shard 3 Shard 4 56
  • 57. mongos config balancer config Move chunk 1 to config Shard 2 1 2 3 4 5 6 7 8 9 10 11 12 21 22 23 24 33 34 35 36 45 46 47 48 Shard 1 Shard 2 Shard 3 Shard 4 57
  • 58. mongos config balancer config config 1 2 3 4 5 6 7 8 9 10 11 12 21 22 23 24 33 34 35 36 45 46 47 48 Shard 1 Shard 2 Shard 3 Shard 4 58
  • 59. mongos config balancer config Chunks 1,2, and 3 have migrated config 4 5 6 7 8 1 2 3 9 10 11 12 21 22 23 24 33 34 35 36 45 46 47 48 Shard 1 Shard 2 Shard 3 Shard 4 59
  • 60. 60
  • 61. 1 1.Query arrives at mongos 4 mongos 2.mongos routes query to a single shard 3.Shard returns results 2 of query 3 4.Results returned to client Shard 1 Shard 2 Shard 3 61
  • 62. 1 1.Query arrives at mongos 4 mongos 2.mongos broadcasts query to all shards 3.Each shard returns 2 results for query 2 2 3 3 3 4.Results combined and returned to client Shard 1 Shard 2 Shard 3 62
  • 63. 1 1.Query arrives at mongos 6 2.mongos broadcasts query mongos to all shards 5 3.Each shard locally sorts results 2 4.Results returned to 2 mongos 2 4 4 4 5.mongos merge sorts individual results 3 3 3 6.Combined sorted result Shard 1 Shard 2 Shard 3 returned to client 63
  • 64. Inserts Requires shard db.users.insert({ key name: “Jared”, email: “jsr@10gen.com”}) Removes Routed db.users.delete({ email: “jsr@10gen.com”}) Scattered db.users.delete({name: “Jared”}) Updates Routed db.users.update( {email: “jsr@10gen.com”}, {$set: { state: “CA”}}) Scattered db.users.update( {state: “FZ”}, {$set:{ state: “CA”}} ) 64
  • 65. By Shard Routed db.users.find( {email: “jsr@10gen.com”}) Key Sorted by Routed in order db.users.find().sort({email:-1}) shard key Find by non Scatter Gather db.users.find({state:”CA”}) shard key Sorted by Distributed merge db.users.find().sort({state:1}) sort non shard key 65
  • 66. 66
  • 67. Data Center Primary Secondary Secondary 67
  • 68. Data Center Primary Secondary Secondary hidden=true backups 68
  • 69. Active Data Center Standby Data Center Primary Secondary Secondary priority = 1 priority = 1 69
  • 70. West Coast DC Central DC East Coast DC Secondary Primary Secondary priority = 1 70
  • 71. 71
  • 72. History of Database Management (http://bit.ly/w3r0dv) • EMC IDC Study (http://bit.ly/y1mJgJ) • Gartner & Big Data (http://bit.ly/xvRP3a) • SQL (http://en.wikipedia.org/wiki/SQL) • Database Management Systems http://en.wikipedia.org/wiki/Dbms) • Dynamo: Amazon’s Highly Available Key-value Store (http://bit.ly/A8F8oy) • CAP Theorem (http://bit.ly/zvA6O6) • NoSQL Google File System and BigTable (http://oreil.ly/wOXliP) • NoSQL Movement whitepaper (http://bit.ly/A8RBuJ) • Sample ERD diagram (http://bit.ly/xV30v) 72
  • 73. 73
  • 74. 74
  • 75. $project $match $limit $skip $unwind $group $sort { db.article.aggregate( title : “this is my title” , { $project : { author : “bob” , author : 1, posted : new Date () , tags : 1, pageViews : 5 , }}, tags : [ “fun” , “good” , “fun” ] , { $unwind : "$tags" }, comments : [ { $group : { { author :“joe” , text : “this is cool” } , _id : “$tags”, { author :“sam” , text : “this is bad” } authors : { $addToSet : "$author" } ], }} other : { foo : 5 } ); } 75
  • 76. 76
  • 77. Input data MAP Intermediate data REDUCE Output data 1 1 2 2 3 3 77
  • 78. Input data MAP Intermediate data REDUCE Output data 1 1 2 2 3 3 78
  • 79. • Impossible for a distributed computer system to simultaneously provide all three of the following guarantees – Consistency - All nodes see the same data at the same time. – Availability - A guarantee that every request receives a response about whether it was successful or failed. – Partition tolerance - No set of failures less than total network failure is allowed to cause the system to respond incorrectly 79

Editor's Notes

  1. Database requirements are changing … because of i) volume ii) Type of data iii) Agile Development, iv) New architectures.. V) New Apps