SlideShare a Scribd company logo
1 of 58
No SQL?




          Image credit: http://browsertoolkit.com/fault-tolerance.png
No SQL?




          Image credit: http://browsertoolkit.com/fault-tolerance.png
No SQL?




          Image credit: http://browsertoolkit.com/fault-tolerance.png
Neo4j       and a
                 NOSQL overview



                                  #neo4j
Emil Eifrem                       @emileifrem
CEO, Neo Technology               emil@neotechnology.com
What's the plan?

  Why now? – Four trends

  NOSQL overview

  Graph databases && Neo4j

  Conclusions
Trend 1:
data set size




40
2007            Source: IDC 2007
988

Trend 1:
data set size




40
2007            2010   Source: IDC 2007
Trend 2: connectedness
                                                                                           Giant
                                                                                          Global
                                                                                          Graph
                                                                                          (GGG)
 Information connectivity




                                                                             Ontologies


                                                                      RDF

                                                                                  Folksonomies
                                                                  Tagging


                                                      Wikis             User-
                                                                      generated
                                                                       content
                                                              Blogs


                                                     RSS


                                         Hypertext


                               Text
                            documents     web 1.0             web 2.0             “web 3.0”
                                        1990         2000                   2010                   2020
Trend 3: semi-structure
  Individualization of content!
     In the salary lists of the 1970s, all elements had exactly
     one job
     In the salary lists of the 2000s, we need 5 job columns!
     Or 8? Or 15?

  Trend accelerated by the decentralization of content
  generation that is the hallmark of the age of participation
  (“web 2.0”)
Aside: RDBMS performance
                                                            Relational database
               Salary List
 Performance




                             Majority of
                             Webapps



                                           Social network

                                                                 Semantic Trading




                                                  }           custom


                                                Data complexity
Trend 4: architecture


         1990s: Database as integration hub


          Application   Application   Application




                           DB
Trend 4: architecture


        2000s:   (Slowly towards...)   Decoupled services
                    with their own backend

            Service             Service        Service




             DB                   DB            DB
Why NOSQL 2009?

 Trend 1: Size.

 Trend 2: Connectivity.

 Trend 3: Semi-structure.

 Trend 4: Architecture.
NOSQL
overview
First off: the name

  NoSQL is NOT “Never SQL”

  NoSQL is NOT “No To SQL”
NOSQL
      is simply


N ot O nly S Q L !
Four (emerging) NOSQL categories
 Key-value stores
   Based on Amazon's Dynamo paper
   Data model: (global) collection of K-V pairs
   Example: Dynomite, Voldemort, Tokyo*

 ColumnFamily / BigTable clones
   Based on Google's BigTable paper
   Data model: big table, column families
   Example: HBase, Hypertable, Cassandra
Four (emerging) NOSQL categories
 Document databases
   Inspired by Lotus Notes
   Data model: collections of K-V collections
   Example: CouchDB, MongoDB, Riak

 Graph databases
   Inspired by Euler & graph theory
   Data model: nodes, rels, K-V on both
   Example: AllegroGraph, Sones, Neo4j
NOSQL data models
            Key-value stores
Data size




                          Bigtable clones


                                            Document
                                            databases


                                                           Graph databases




                                                        Data complexity
NOSQL data models
               Key-value stores
   Data size




                             Bigtable clones


                                               Document
                                               databases


                                                              Graph databases

                                                                          (This is still billio ns of
 90%                                                                      nodes & relationships)
  of
 use
cases




                                                           Data complexity
Graph DBs
& Neo4j intro
The Graph DB model: representation
 Core abstractions:
                                              name = “Emil”

   Nodes
                                              age = 29
                                              sex = “yes”


   Relationships between nodes
   Properties on both                     1                         2



                         type = KNOWS
                         time = 4 years                       3

                                                                  type = car
                                                                  vendor = “SAAB”
                                                                  model = “95 Aero”
Example: The Matrix
                                                                           name = “The Architect”
                             name = “Morpheus”
                             rank = “Captain”
name = “Thomas Anderson”
                             occupation = “Total badass”                                            42
age = 29
                                                 disclosure = public


                 KNOWS                            KNOWS                                              CODED_BY
                                                                                KNO
     1                                  7                              3           W   S

                 KN
                                                                                                13
                                             S

                    OW                                   name = “Cypher”
                                        KNOW



                         S                               last name = “Reagan”
                                                                                             name = “Agent Smith”
                                                                       disclosure = secret   version = 1.0b
         age = 3 days                                                  age = 6 months        language = C++
                                    2

                             name = “Trinity”
Code (1): Building a node space
GraphDatabaseService graphDb = ... // Get factory


// Create Thomas 'Neo' Anderson
Node mrAnderson = graphDb.createNode();
mrAnderson.setProperty( "name", "Thomas Anderson" );
mrAnderson.setProperty( "age", 29 );

// Create Morpheus
Node morpheus = graphDb.createNode();
morpheus.setProperty( "name", "Morpheus" );
morpheus.setProperty( "rank", "Captain" );
morpheus.setProperty( "occupation", "Total bad ass" );

// Create a relationship representing that they know each other
mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );
// ...create Trinity, Cypher, Agent Smith, Architect similarly
Code (1): Building a node space
GraphDatabaseService graphDb = ... // Get factory
Transaction tx = graphDb.beginTx();

// Create Thomas 'Neo' Anderson
Node mrAnderson = graphDb.createNode();
mrAnderson.setProperty( "name", "Thomas Anderson" );
mrAnderson.setProperty( "age", 29 );

// Create Morpheus
Node morpheus = graphDb.createNode();
morpheus.setProperty( "name", "Morpheus" );
morpheus.setProperty( "rank", "Captain" );
morpheus.setProperty( "occupation", "Total bad ass" );

// Create a relationship representing that they know each other
mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS );
// ...create Trinity, Cypher, Agent Smith, Architect similarly

tx.commit();
Code (1b): Def ning RelationshipTypes
             i
// In package org.neo4j.graphdb
public interface RelationshipType
{
   String name();
}

// In package org.yourdomain.yourapp
// Example on how to roll dynamic RelationshipTypes
class MyDynamicRelType implements RelationshipType
{
   private final String name;
   MyDynamicRelType( String name ){ this.name = name; }
   public String name() { return this.name; }
}

// Example on how to kick it, static-RelationshipType-like
enum MyStaticRelTypes implements RelationshipType
{
   KNOWS,
   WORKS_FOR,
}
Whiteboard friendly



                                 owns
                      Björn                  Big Car
                        build             drives


                                DayCare
The Graph DB model: traversal
 Traverser framework for
 high-performance traversing                    name = “Emil”
                                                age = 31

 across the node space                          sex = “yes”




                                            1                         2



                           type = KNOWS
                           time = 4 years                       3

                                                                    type = car
                                                                    vendor = “SAAB”
                                                                    model = “95 Aero”
Example: Mr Anderson’s friends
                                                                           name = “The Architect”
                             name = “Morpheus”
                             rank = “Captain”
name = “Thomas Anderson”
                             occupation = “Total badass”                                            42
age = 29
                                                 disclosure = public


                 KNOWS                            KNOWS                                              CODED_BY
                                                                                KNO
     1                                  7                              3           W   S

                 KN
                                                                                                13
                                             S

                    OW                                   name = “Cypher”
                                        KNOW



                         S                               last name = “Reagan”
                                                                                             name = “Agent Smith”
                                                                       disclosure = secret   version = 1.0b
         age = 3 days                                                  age = 6 months        language = C++
                                    2

                             name = “Trinity”
Code (2): Traversing a node space
// Instantiate a traverser that returns Mr Anderson's friends
Traverser friendsTraverser = mrAnderson.traverse(
   Traverser.Order.BREADTH_FIRST,
   StopEvaluator.END_OF_GRAPH,
   ReturnableEvaluator.ALL_BUT_START_NODE,
   RelTypes.KNOWS,
   Direction.OUTGOING );

// Traverse the node space and print out the result
System.out.println( "Mr Anderson's friends:" );
for ( Node friend : friendsTraverser )
{
   System.out.printf( "At depth %d => %s%n",
       friendsTraverser.currentPosition().getDepth(),
       friend.getProperty( "name" ) );
}
name = “The Architect”
                              name = “Morpheus”
                              rank = “Captain”
 name = “Thomas Anderson”
                              occupation = “Total badass”                                            42
 age = 29
                                                  disclosure = public


                  KNOWS                            KNOWS                                              CODED_BY
                                                                                 KNO
      1                                  7                              3           W     S

                  KN                                                                              13

                                              S
                     OW                                   name = “Cypher”

                                         KNOW
                          S                               last name = “Reagan”
                                                                                               name = “Agent Smith”
                                                                        disclosure = secret    version = 1.0b
          age = 3 days                                                  age = 6 months         language = C++
                                     2

                              name = “Trinity”
                                                                    $ bin/start-neo-example
                                                                    Mr Anderson's friends:

                                                                    At      depth     1   =>   Morpheus
friendsTraverser = mrAnderson.traverse(
  Traverser.Order.BREADTH_FIRST,                                    At      depth     1   =>   Trinity
  StopEvaluator.END_OF_GRAPH,                                       At      depth     2   =>   Cypher
  ReturnableEvaluator.ALL_BUT_START_NODE,
  RelTypes.KNOWS,
                                                                    At      depth     3   =>   Agent Smith
  Direction.OUTGOING );                                             $
Example: Friends in love?
                                                                            name = “The Architect”
                               name = “Morpheus”
                               rank = “Captain”
name = “Thomas Anderson”
                               occupation = “Total badass”                                           42
age = 29
                                                  disclosure = public


                   KNOWS                           KNOWS                                              CODED_BY
                                                                                 KNO
     1                                    7                             3           W   S

                   KN                                                                            13
                                             S
                                        KNOW


                      OW                                  name = “Cypher”
                           S                              last name = “Reagan”
                                                                                              name = “Agent Smith”
         LO                                                             disclosure = secret   version = 1.0b
            VE                                                          age = 6 months        language = C++
               S
                                      2

                               name = “Trinity”
Code (3a): Custom traverser
// Create a traverser that returns all “friends in love”
Traverser loveTraverser = mrAnderson.traverse(
   Traverser.Order.BREADTH_FIRST,
   StopEvaluator.END_OF_GRAPH,
   new ReturnableEvaluator()
   {
       public boolean isReturnableNode( TraversalPosition pos )
       {
          return pos.currentNode().hasRelationship(
              RelTypes.LOVES, Direction.OUTGOING );
       }
   },
   RelTypes.KNOWS,
   Direction.OUTGOING );
Code (3a): Custom traverser
// Traverse the node space and print out the result
System.out.println( "Who’s a lover?" );
for ( Node person : loveTraverser )
{
   System.out.printf( "At depth %d => %s%n",
       loveTraverser.currentPosition().getDepth(),
       person.getProperty( "name" ) );
}
name = “The Architect”
                                name = “Morpheus”
                                rank = “Captain”
 name = “Thomas Anderson”
                                occupation = “Total badass”                                            42
 age = 29
                                                    disclosure = public


                    KNOWS                            KNOWS                         KNO                  CODED_BY
      1                                    7                              3           W   S

                    KN                                                                             13

                                                S
                                                            name = “Cypher”

                                           KNOW
                       OW
                            S                               last name = “Reagan”
                                                                                                name = “Agent Smith”
          LO                                                              disclosure = secret   version = 1.0b
             VE                                                           age = 6 months        language = C++
                S
                                       2

                                name = “Trinity”
                                                                     $ bin/start-neo-example
new ReturnableEvaluator()
                                                                     Who’s a lover?
{
  public boolean isReturnableNode(
    TraversalPosition pos)
                                                                     At depth 1 => Trinity
  {                                                                  $
    return pos.currentNode().
      hasRelationship( RelTypes.LOVES,
         Direction.OUTGOING );
  }
},
Bonus code: domain model
    How do you implement your domain model?
    Use the delegator pattern, i.e. every domain entity wraps a
    Neo4j primitive:
// In package org.yourdomain.yourapp
class PersonImpl implements Person
{
   private final Node underlyingNode;
   PersonImpl( Node node ){ this.underlyingNode = node; }

    public String getName()
    {
       return (String) this.underlyingNode.getProperty( "name" );
    }
    public void setName( String name )
    {
       this.underlyingNode.setProperty( "name", name );
    }
}
Domain layer frameworks
 Qi4j (www.qi4j.org)
   Framework for doing DDD in pure Java5
   Def nes Entities / Associations / Properties
      i
      Sound familiar? Nodes / Rel’s / Properties!
   Neo4j is an “EntityStore” backend

 Jo4neo (http://code.google.com/p/jo4neo)
   Annotation driven
   Weaves Neo4j-backed persistence into domain objects
   at runtime
Neo4j system characteristics
  Disk-based
     Native graph storage engine with custom binary on-disk
     format
  Transactional
    JTA/JTS, XA, 2PC, Tx recovery, deadlock detection,
    MVCC, etc
  Scales up
    Many billions of nodes/rels/props on single JVM
  Robust
    6+ years in 24/7 production
Social network pathE xists()
          12
                               ~1k persons
7         1
                           3   Avg 50 friends per
                               person
                               pathExists(a, b) limit
                               depth 4
    36
         41
                 5
                      77
                               Two backends
                               Eliminate disk IO so
                               warm up caches
Social network pathE xists()

                 2
                Emil
         1                                    5
                                      7
        Mike                                Kevin
                           3        John
                         Marcus
                   9                4
                 Bruce            Leigh

                                  # persons query time
Relational database                   1 000 2 000 ms
Graph database (Neo4j)                1 000      2 ms
Graph database (Neo4j)            1 000 000      2 ms
Pros & Cons compared to RDBMS
+ No O/R impedance mismatch (whiteboard friendly)
+ Can easily evolve schemas
+ Can represent semi-structured info
+ Can represent graphs/networks (with performance)

-   Lacks in tool and framework support
-   Few other implementations => potential lock in
-   No support for ad-hoc queries
Query languages
 SPARQL – “SQL for linked data”
    Ex: ”SELECT ?person WHERE {
          ?person neo4j:KNOWS ?friend .
          ?friend neo4j:KNOWS ?foe .
          ?foe neo4j:name “Larry Ellison” .
     }”


 Gremlin – “perl for graphs”
    Ex: ”./outE[@label='KNOWS']/inV[@age   > 30]/@name”
    Chec it out at http://try.neo4j.org
The Neo4j ecosystem
 Neo4j is an embedded database
   Tiny teeny lil jar f le
                      i
 Component ecosystem
   index
   meta-model
   graph-matching
   remote-graphdb
   sparql-engine
   ...
 See http://components.neo4j.org
Language bindings
 Neo4j.py – bindings for Jython and CPython
   http://components.neo4j.org/neo4j.py
 Neo4jrb – bindings for JRuby (incl RESTful API)
   http://wiki.neo4j.org/content/Ruby
 Neo4jrb-simple
   http://github.com/mdeiters/neo4jr-simple
 Clojure
   http://wiki.neo4j.org/content/Clojure
 Scala (incl RESTful API)
   http://wiki.neo4j.org/content/Scala
Grails Neoclipse screendump
Scale out – replication
  Neo4j HA in internal beta testing with cust's now
  Master-slave replication, 1st configuration
     MySQL style... ish
     Except all instances can write, synchronously between
     writing slave & master (strong consistency)
     Updates are asynchronously propagated to the other
     slaves (eventual consistency)
  This can handle billions of entities...
  … but not 100B
Scale out – partitioning
  “Sharding possible today”
     … but you have to do manual work
     … just as with MySQL
  Transparent partitioning? Neo4j 2.0
    100B? Easy to say. Sliiiiightly harder to do.
    Fundamentals: BASE & eventual consistency
    Generic clustering algorithm as base case, but give lots of
    knobs for developers
While we're on future stuff
  Neo4j 1.1 [July 2010]
     Full JMX / SNMP support
     Event framework (feedback plz!)
     Traverser framework iteration
     …
  Neo4j HA FCS this summer, GA this fall
  REST-ful API (already out in 0.8, please give feedback for 1.0!)
  Framework integration: Roo, Grails
  Database integration: CouchDB, MySQL
  Neo4j Spatial (uDig integr, {R,Quad}-tree etc, OSM imports)
  Client tools: Webling, Neoclipse, Monitoring console
How ego are you? (aka other impls?)
  Franz’ A lleg roG ra ph      (http://agraph.franz.com)

     Proprietary, Lisp, RDF-oriented but real graphdb
  Sones g ra phD B      (http://sones.com)

     Proprietary, .NET, in beta
  Twitter's Floc k D B    (http://github.com/twitter/flockdb)

     Twitter's graph database for large and shallow graphs
  Google P reg el   (http://bit.ly/dP9IP)

    We are oh-so-secret
  Some academic papers from ~10 years ago
     G = {V, E} #FAIL
Conclusion
 Graphs && Neo4j => teh awesome!
 Available NOW under AGPLv3 / commercial license
   AGPLv3: “if you’re open source, we’re open source”
   If you have proprietary software? Must buy a commercial
   license
   But the first one is free!
 Download
   http://neo4j.org
 Feedback
   http://lists.neo4j.org
Questions?




             Image credit: lost again! Sorry :(
http://neotechnology.com

More Related Content

Viewers also liked

Graph Databases for SQL Server Professionals
Graph Databases for SQL Server ProfessionalsGraph Databases for SQL Server Professionals
Graph Databases for SQL Server ProfessionalsStéphane Fréchette
 
Data Analytics with R and SQL Server
Data Analytics with R and SQL ServerData Analytics with R and SQL Server
Data Analytics with R and SQL ServerStéphane Fréchette
 
Intro to Spring Data Neo4j
Intro to Spring Data Neo4jIntro to Spring Data Neo4j
Intro to Spring Data Neo4jjexp
 
Intro to Neo4j presentation
Intro to Neo4j presentationIntro to Neo4j presentation
Intro to Neo4j presentationjexp
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph DatabasesMax De Marzi
 
Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater
Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater
Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater Neo4j
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenLorenzo Alberton
 

Viewers also liked (7)

Graph Databases for SQL Server Professionals
Graph Databases for SQL Server ProfessionalsGraph Databases for SQL Server Professionals
Graph Databases for SQL Server Professionals
 
Data Analytics with R and SQL Server
Data Analytics with R and SQL ServerData Analytics with R and SQL Server
Data Analytics with R and SQL Server
 
Intro to Spring Data Neo4j
Intro to Spring Data Neo4jIntro to Spring Data Neo4j
Intro to Spring Data Neo4j
 
Intro to Neo4j presentation
Intro to Neo4j presentationIntro to Neo4j presentation
Intro to Neo4j presentation
 
Introduction to Graph Databases
Introduction to Graph DatabasesIntroduction to Graph Databases
Introduction to Graph Databases
 
Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater
Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater
Neo4j Partner Tag Berlin - Potential für System-Integratoren und Berater
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and when
 

Similar to NOSQL overview and intro to graph databases with Neo4j (Geeknight May 2010)

A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)
A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)
A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)Emil Eifrem
 
Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011jexp
 
Django and Neo4j - Domain modeling that kicks ass
Django and Neo4j - Domain modeling that kicks assDjango and Neo4j - Domain modeling that kicks ass
Django and Neo4j - Domain modeling that kicks assTobias Lindaaker
 
NOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4jNOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4jTobias Lindaaker
 
Graph Abstractions Matter by Ora Lassila
Graph Abstractions Matter by Ora LassilaGraph Abstractions Matter by Ora Lassila
Graph Abstractions Matter by Ora LassilaConnected Data World
 
using Spring and MongoDB on Cloud Foundry
using Spring and MongoDB on Cloud Foundryusing Spring and MongoDB on Cloud Foundry
using Spring and MongoDB on Cloud FoundryJoshua Long
 
CSC 8101 Non Relational Databases
CSC 8101 Non Relational DatabasesCSC 8101 Non Relational Databases
CSC 8101 Non Relational Databasessjwoodman
 
An Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBAn Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBWilliam LaForest
 
No Sql Movement
No Sql MovementNo Sql Movement
No Sql MovementAjit Koti
 
An Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4jAn Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4jDebanjan Mahata
 
Graph database in sv meetup
Graph database in sv meetupGraph database in sv meetup
Graph database in sv meetupJoshua Bae
 
20130204 graph to-pacer-xml
20130204 graph to-pacer-xml20130204 graph to-pacer-xml
20130204 graph to-pacer-xmlDavid Colebatch
 
SQL Server 2008 Overview
SQL Server 2008 OverviewSQL Server 2008 Overview
SQL Server 2008 OverviewDavid Chou
 
Introduction to NoSQL and Couchbase
Introduction to NoSQL and CouchbaseIntroduction to NoSQL and Couchbase
Introduction to NoSQL and CouchbaseDipti Borkar
 
Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsGeorge Stathis
 

Similar to NOSQL overview and intro to graph databases with Neo4j (Geeknight May 2010) (20)

A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)
A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)
A NOSQL Overview And The Benefits Of Graph Databases (nosql east 2009)
 
Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011Spring Data Neo4j Intro SpringOne 2011
Spring Data Neo4j Intro SpringOne 2011
 
Django and Neo4j - Domain modeling that kicks ass
Django and Neo4j - Domain modeling that kicks assDjango and Neo4j - Domain modeling that kicks ass
Django and Neo4j - Domain modeling that kicks ass
 
NOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4jNOSQLEU - Graph Databases and Neo4j
NOSQLEU - Graph Databases and Neo4j
 
No Sql
No SqlNo Sql
No Sql
 
Anti-social Databases
Anti-social DatabasesAnti-social Databases
Anti-social Databases
 
Graph Abstractions Matter by Ora Lassila
Graph Abstractions Matter by Ora LassilaGraph Abstractions Matter by Ora Lassila
Graph Abstractions Matter by Ora Lassila
 
Grails goes Graph
Grails goes GraphGrails goes Graph
Grails goes Graph
 
using Spring and MongoDB on Cloud Foundry
using Spring and MongoDB on Cloud Foundryusing Spring and MongoDB on Cloud Foundry
using Spring and MongoDB on Cloud Foundry
 
CSC 8101 Non Relational Databases
CSC 8101 Non Relational DatabasesCSC 8101 Non Relational Databases
CSC 8101 Non Relational Databases
 
An Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDBAn Introduction to Big Data, NoSQL and MongoDB
An Introduction to Big Data, NoSQL and MongoDB
 
STI Summit 2011 - Digital Worlds
STI Summit 2011 - Digital WorldsSTI Summit 2011 - Digital Worlds
STI Summit 2011 - Digital Worlds
 
No Sql Movement
No Sql MovementNo Sql Movement
No Sql Movement
 
An Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4jAn Introduction to NOSQL, Graph Databases and Neo4j
An Introduction to NOSQL, Graph Databases and Neo4j
 
Graph database in sv meetup
Graph database in sv meetupGraph database in sv meetup
Graph database in sv meetup
 
20130204 graph to-pacer-xml
20130204 graph to-pacer-xml20130204 graph to-pacer-xml
20130204 graph to-pacer-xml
 
Graph Theory and Databases
Graph Theory and DatabasesGraph Theory and Databases
Graph Theory and Databases
 
SQL Server 2008 Overview
SQL Server 2008 OverviewSQL Server 2008 Overview
SQL Server 2008 Overview
 
Introduction to NoSQL and Couchbase
Introduction to NoSQL and CouchbaseIntroduction to NoSQL and Couchbase
Introduction to NoSQL and Couchbase
 
Sharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data LessonsSharing a Startup’s Big Data Lessons
Sharing a Startup’s Big Data Lessons
 

More from Emil Eifrem

Startups in Sweden vs Startups in Silicon Valley, 2015 edition
Startups in Sweden vs Startups in Silicon Valley, 2015 editionStartups in Sweden vs Startups in Silicon Valley, 2015 edition
Startups in Sweden vs Startups in Silicon Valley, 2015 editionEmil Eifrem
 
GraphConnect SF 2013 Keynote
GraphConnect SF 2013 KeynoteGraphConnect SF 2013 Keynote
GraphConnect SF 2013 KeynoteEmil Eifrem
 
An Overview of the Emerging Graph Landscape (Oct 2013)
An Overview of the Emerging Graph Landscape (Oct 2013)An Overview of the Emerging Graph Landscape (Oct 2013)
An Overview of the Emerging Graph Landscape (Oct 2013)Emil Eifrem
 
Startups in Sweden vs Startups in Silicon Valley
Startups in Sweden vs Startups in Silicon ValleyStartups in Sweden vs Startups in Silicon Valley
Startups in Sweden vs Startups in Silicon ValleyEmil Eifrem
 
An intro to Neo4j and some use cases (JFokus 2011)
An intro to Neo4j and some use cases (JFokus 2011)An intro to Neo4j and some use cases (JFokus 2011)
An intro to Neo4j and some use cases (JFokus 2011)Emil Eifrem
 
An overview of NOSQL (JFokus 2011)
An overview of NOSQL (JFokus 2011)An overview of NOSQL (JFokus 2011)
An overview of NOSQL (JFokus 2011)Emil Eifrem
 
NOSQL part of the SpringOne 2GX 2010 keynote
NOSQL part of the SpringOne 2GX 2010 keynoteNOSQL part of the SpringOne 2GX 2010 keynote
NOSQL part of the SpringOne 2GX 2010 keynoteEmil Eifrem
 

More from Emil Eifrem (7)

Startups in Sweden vs Startups in Silicon Valley, 2015 edition
Startups in Sweden vs Startups in Silicon Valley, 2015 editionStartups in Sweden vs Startups in Silicon Valley, 2015 edition
Startups in Sweden vs Startups in Silicon Valley, 2015 edition
 
GraphConnect SF 2013 Keynote
GraphConnect SF 2013 KeynoteGraphConnect SF 2013 Keynote
GraphConnect SF 2013 Keynote
 
An Overview of the Emerging Graph Landscape (Oct 2013)
An Overview of the Emerging Graph Landscape (Oct 2013)An Overview of the Emerging Graph Landscape (Oct 2013)
An Overview of the Emerging Graph Landscape (Oct 2013)
 
Startups in Sweden vs Startups in Silicon Valley
Startups in Sweden vs Startups in Silicon ValleyStartups in Sweden vs Startups in Silicon Valley
Startups in Sweden vs Startups in Silicon Valley
 
An intro to Neo4j and some use cases (JFokus 2011)
An intro to Neo4j and some use cases (JFokus 2011)An intro to Neo4j and some use cases (JFokus 2011)
An intro to Neo4j and some use cases (JFokus 2011)
 
An overview of NOSQL (JFokus 2011)
An overview of NOSQL (JFokus 2011)An overview of NOSQL (JFokus 2011)
An overview of NOSQL (JFokus 2011)
 
NOSQL part of the SpringOne 2GX 2010 keynote
NOSQL part of the SpringOne 2GX 2010 keynoteNOSQL part of the SpringOne 2GX 2010 keynote
NOSQL part of the SpringOne 2GX 2010 keynote
 

Recently uploaded

Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxBkGupta21
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 

Recently uploaded (20)

Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
unit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptxunit 4 immunoblotting technique complete.pptx
unit 4 immunoblotting technique complete.pptx
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 

NOSQL overview and intro to graph databases with Neo4j (Geeknight May 2010)

  • 1. No SQL? Image credit: http://browsertoolkit.com/fault-tolerance.png
  • 2. No SQL? Image credit: http://browsertoolkit.com/fault-tolerance.png
  • 3. No SQL? Image credit: http://browsertoolkit.com/fault-tolerance.png
  • 4. Neo4j and a NOSQL overview #neo4j Emil Eifrem @emileifrem CEO, Neo Technology emil@neotechnology.com
  • 5. What's the plan? Why now? – Four trends NOSQL overview Graph databases && Neo4j Conclusions
  • 6. Trend 1: data set size 40 2007 Source: IDC 2007
  • 7. 988 Trend 1: data set size 40 2007 2010 Source: IDC 2007
  • 8. Trend 2: connectedness Giant Global Graph (GGG) Information connectivity Ontologies RDF Folksonomies Tagging Wikis User- generated content Blogs RSS Hypertext Text documents web 1.0 web 2.0 “web 3.0” 1990 2000 2010 2020
  • 9. Trend 3: semi-structure Individualization of content! In the salary lists of the 1970s, all elements had exactly one job In the salary lists of the 2000s, we need 5 job columns! Or 8? Or 15? Trend accelerated by the decentralization of content generation that is the hallmark of the age of participation (“web 2.0”)
  • 10. Aside: RDBMS performance Relational database Salary List Performance Majority of Webapps Social network Semantic Trading } custom Data complexity
  • 11. Trend 4: architecture 1990s: Database as integration hub Application Application Application DB
  • 12. Trend 4: architecture 2000s: (Slowly towards...) Decoupled services with their own backend Service Service Service DB DB DB
  • 13. Why NOSQL 2009? Trend 1: Size. Trend 2: Connectivity. Trend 3: Semi-structure. Trend 4: Architecture.
  • 15. First off: the name NoSQL is NOT “Never SQL” NoSQL is NOT “No To SQL”
  • 16. NOSQL is simply N ot O nly S Q L !
  • 17. Four (emerging) NOSQL categories Key-value stores Based on Amazon's Dynamo paper Data model: (global) collection of K-V pairs Example: Dynomite, Voldemort, Tokyo* ColumnFamily / BigTable clones Based on Google's BigTable paper Data model: big table, column families Example: HBase, Hypertable, Cassandra
  • 18. Four (emerging) NOSQL categories Document databases Inspired by Lotus Notes Data model: collections of K-V collections Example: CouchDB, MongoDB, Riak Graph databases Inspired by Euler & graph theory Data model: nodes, rels, K-V on both Example: AllegroGraph, Sones, Neo4j
  • 19. NOSQL data models Key-value stores Data size Bigtable clones Document databases Graph databases Data complexity
  • 20. NOSQL data models Key-value stores Data size Bigtable clones Document databases Graph databases (This is still billio ns of 90% nodes & relationships) of use cases Data complexity
  • 22. The Graph DB model: representation Core abstractions: name = “Emil” Nodes age = 29 sex = “yes” Relationships between nodes Properties on both 1 2 type = KNOWS time = 4 years 3 type = car vendor = “SAAB” model = “95 Aero”
  • 23. Example: The Matrix name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS CODED_BY KNO 1 7 3 W S KN 13 S OW name = “Cypher” KNOW S last name = “Reagan” name = “Agent Smith” disclosure = secret version = 1.0b age = 3 days age = 6 months language = C++ 2 name = “Trinity”
  • 24. Code (1): Building a node space GraphDatabaseService graphDb = ... // Get factory // Create Thomas 'Neo' Anderson Node mrAnderson = graphDb.createNode(); mrAnderson.setProperty( "name", "Thomas Anderson" ); mrAnderson.setProperty( "age", 29 ); // Create Morpheus Node morpheus = graphDb.createNode(); morpheus.setProperty( "name", "Morpheus" ); morpheus.setProperty( "rank", "Captain" ); morpheus.setProperty( "occupation", "Total bad ass" ); // Create a relationship representing that they know each other mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS ); // ...create Trinity, Cypher, Agent Smith, Architect similarly
  • 25. Code (1): Building a node space GraphDatabaseService graphDb = ... // Get factory Transaction tx = graphDb.beginTx(); // Create Thomas 'Neo' Anderson Node mrAnderson = graphDb.createNode(); mrAnderson.setProperty( "name", "Thomas Anderson" ); mrAnderson.setProperty( "age", 29 ); // Create Morpheus Node morpheus = graphDb.createNode(); morpheus.setProperty( "name", "Morpheus" ); morpheus.setProperty( "rank", "Captain" ); morpheus.setProperty( "occupation", "Total bad ass" ); // Create a relationship representing that they know each other mrAnderson.createRelationshipTo( morpheus, RelTypes.KNOWS ); // ...create Trinity, Cypher, Agent Smith, Architect similarly tx.commit();
  • 26. Code (1b): Def ning RelationshipTypes i // In package org.neo4j.graphdb public interface RelationshipType { String name(); } // In package org.yourdomain.yourapp // Example on how to roll dynamic RelationshipTypes class MyDynamicRelType implements RelationshipType { private final String name; MyDynamicRelType( String name ){ this.name = name; } public String name() { return this.name; } } // Example on how to kick it, static-RelationshipType-like enum MyStaticRelTypes implements RelationshipType { KNOWS, WORKS_FOR, }
  • 27. Whiteboard friendly owns Björn Big Car build drives DayCare
  • 28.
  • 29. The Graph DB model: traversal Traverser framework for high-performance traversing name = “Emil” age = 31 across the node space sex = “yes” 1 2 type = KNOWS time = 4 years 3 type = car vendor = “SAAB” model = “95 Aero”
  • 30. Example: Mr Anderson’s friends name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS CODED_BY KNO 1 7 3 W S KN 13 S OW name = “Cypher” KNOW S last name = “Reagan” name = “Agent Smith” disclosure = secret version = 1.0b age = 3 days age = 6 months language = C++ 2 name = “Trinity”
  • 31. Code (2): Traversing a node space // Instantiate a traverser that returns Mr Anderson's friends Traverser friendsTraverser = mrAnderson.traverse( Traverser.Order.BREADTH_FIRST, StopEvaluator.END_OF_GRAPH, ReturnableEvaluator.ALL_BUT_START_NODE, RelTypes.KNOWS, Direction.OUTGOING ); // Traverse the node space and print out the result System.out.println( "Mr Anderson's friends:" ); for ( Node friend : friendsTraverser ) { System.out.printf( "At depth %d => %s%n", friendsTraverser.currentPosition().getDepth(), friend.getProperty( "name" ) ); }
  • 32. name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS CODED_BY KNO 1 7 3 W S KN 13 S OW name = “Cypher” KNOW S last name = “Reagan” name = “Agent Smith” disclosure = secret version = 1.0b age = 3 days age = 6 months language = C++ 2 name = “Trinity” $ bin/start-neo-example Mr Anderson's friends: At depth 1 => Morpheus friendsTraverser = mrAnderson.traverse( Traverser.Order.BREADTH_FIRST, At depth 1 => Trinity StopEvaluator.END_OF_GRAPH, At depth 2 => Cypher ReturnableEvaluator.ALL_BUT_START_NODE, RelTypes.KNOWS, At depth 3 => Agent Smith Direction.OUTGOING ); $
  • 33. Example: Friends in love? name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS CODED_BY KNO 1 7 3 W S KN 13 S KNOW OW name = “Cypher” S last name = “Reagan” name = “Agent Smith” LO disclosure = secret version = 1.0b VE age = 6 months language = C++ S 2 name = “Trinity”
  • 34. Code (3a): Custom traverser // Create a traverser that returns all “friends in love” Traverser loveTraverser = mrAnderson.traverse( Traverser.Order.BREADTH_FIRST, StopEvaluator.END_OF_GRAPH, new ReturnableEvaluator() { public boolean isReturnableNode( TraversalPosition pos ) { return pos.currentNode().hasRelationship( RelTypes.LOVES, Direction.OUTGOING ); } }, RelTypes.KNOWS, Direction.OUTGOING );
  • 35. Code (3a): Custom traverser // Traverse the node space and print out the result System.out.println( "Who’s a lover?" ); for ( Node person : loveTraverser ) { System.out.printf( "At depth %d => %s%n", loveTraverser.currentPosition().getDepth(), person.getProperty( "name" ) ); }
  • 36. name = “The Architect” name = “Morpheus” rank = “Captain” name = “Thomas Anderson” occupation = “Total badass” 42 age = 29 disclosure = public KNOWS KNOWS KNO CODED_BY 1 7 3 W S KN 13 S name = “Cypher” KNOW OW S last name = “Reagan” name = “Agent Smith” LO disclosure = secret version = 1.0b VE age = 6 months language = C++ S 2 name = “Trinity” $ bin/start-neo-example new ReturnableEvaluator() Who’s a lover? { public boolean isReturnableNode( TraversalPosition pos) At depth 1 => Trinity { $ return pos.currentNode(). hasRelationship( RelTypes.LOVES, Direction.OUTGOING ); } },
  • 37. Bonus code: domain model How do you implement your domain model? Use the delegator pattern, i.e. every domain entity wraps a Neo4j primitive: // In package org.yourdomain.yourapp class PersonImpl implements Person { private final Node underlyingNode; PersonImpl( Node node ){ this.underlyingNode = node; } public String getName() { return (String) this.underlyingNode.getProperty( "name" ); } public void setName( String name ) { this.underlyingNode.setProperty( "name", name ); } }
  • 38. Domain layer frameworks Qi4j (www.qi4j.org) Framework for doing DDD in pure Java5 Def nes Entities / Associations / Properties i Sound familiar? Nodes / Rel’s / Properties! Neo4j is an “EntityStore” backend Jo4neo (http://code.google.com/p/jo4neo) Annotation driven Weaves Neo4j-backed persistence into domain objects at runtime
  • 39. Neo4j system characteristics Disk-based Native graph storage engine with custom binary on-disk format Transactional JTA/JTS, XA, 2PC, Tx recovery, deadlock detection, MVCC, etc Scales up Many billions of nodes/rels/props on single JVM Robust 6+ years in 24/7 production
  • 40. Social network pathE xists() 12 ~1k persons 7 1 3 Avg 50 friends per person pathExists(a, b) limit depth 4 36 41 5 77 Two backends Eliminate disk IO so warm up caches
  • 41. Social network pathE xists() 2 Emil 1 5 7 Mike Kevin 3 John Marcus 9 4 Bruce Leigh # persons query time Relational database 1 000 2 000 ms Graph database (Neo4j) 1 000 2 ms Graph database (Neo4j) 1 000 000 2 ms
  • 42.
  • 43.
  • 44. Pros & Cons compared to RDBMS + No O/R impedance mismatch (whiteboard friendly) + Can easily evolve schemas + Can represent semi-structured info + Can represent graphs/networks (with performance) - Lacks in tool and framework support - Few other implementations => potential lock in - No support for ad-hoc queries
  • 45. Query languages SPARQL – “SQL for linked data” Ex: ”SELECT ?person WHERE { ?person neo4j:KNOWS ?friend . ?friend neo4j:KNOWS ?foe . ?foe neo4j:name “Larry Ellison” . }” Gremlin – “perl for graphs” Ex: ”./outE[@label='KNOWS']/inV[@age > 30]/@name” Chec it out at http://try.neo4j.org
  • 46. The Neo4j ecosystem Neo4j is an embedded database Tiny teeny lil jar f le i Component ecosystem index meta-model graph-matching remote-graphdb sparql-engine ... See http://components.neo4j.org
  • 47. Language bindings Neo4j.py – bindings for Jython and CPython http://components.neo4j.org/neo4j.py Neo4jrb – bindings for JRuby (incl RESTful API) http://wiki.neo4j.org/content/Ruby Neo4jrb-simple http://github.com/mdeiters/neo4jr-simple Clojure http://wiki.neo4j.org/content/Clojure Scala (incl RESTful API) http://wiki.neo4j.org/content/Scala
  • 48.
  • 49.
  • 51. Scale out – replication Neo4j HA in internal beta testing with cust's now Master-slave replication, 1st configuration MySQL style... ish Except all instances can write, synchronously between writing slave & master (strong consistency) Updates are asynchronously propagated to the other slaves (eventual consistency) This can handle billions of entities... … but not 100B
  • 52. Scale out – partitioning “Sharding possible today” … but you have to do manual work … just as with MySQL Transparent partitioning? Neo4j 2.0 100B? Easy to say. Sliiiiightly harder to do. Fundamentals: BASE & eventual consistency Generic clustering algorithm as base case, but give lots of knobs for developers
  • 53. While we're on future stuff Neo4j 1.1 [July 2010] Full JMX / SNMP support Event framework (feedback plz!) Traverser framework iteration … Neo4j HA FCS this summer, GA this fall REST-ful API (already out in 0.8, please give feedback for 1.0!) Framework integration: Roo, Grails Database integration: CouchDB, MySQL Neo4j Spatial (uDig integr, {R,Quad}-tree etc, OSM imports) Client tools: Webling, Neoclipse, Monitoring console
  • 54. How ego are you? (aka other impls?) Franz’ A lleg roG ra ph (http://agraph.franz.com) Proprietary, Lisp, RDF-oriented but real graphdb Sones g ra phD B (http://sones.com) Proprietary, .NET, in beta Twitter's Floc k D B (http://github.com/twitter/flockdb) Twitter's graph database for large and shallow graphs Google P reg el (http://bit.ly/dP9IP) We are oh-so-secret Some academic papers from ~10 years ago G = {V, E} #FAIL
  • 55. Conclusion Graphs && Neo4j => teh awesome! Available NOW under AGPLv3 / commercial license AGPLv3: “if you’re open source, we’re open source” If you have proprietary software? Must buy a commercial license But the first one is free! Download http://neo4j.org Feedback http://lists.neo4j.org
  • 56.
  • 57. Questions? Image credit: lost again! Sorry :(