22. • Millions of ‘joins’ per second
• Consistent query times as dataset grows
• Join Complexity and Performance
• Easy to evolve data model
• Easy to ‘layer’ different types of data together
Properties of graph databases
25. • Used to represent entity attributes and/or metadata (e.g. timestamps, version)
• Key-value pairs
• Java primitives
• Arrays
• null is not a valid value
• Every node can have different properties
Nodes can have properties
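The property rules on this slide can be pictured with a small Python sketch. This is only an illustration of the idea, not Neo4j's API: the `Node` class, `check_property_value` helper and the example property names are all hypothetical, and treating strings as allowed scalars is an assumption of the sketch.

```python
# Scalar types accepted here: a rough Python stand-in for the
# "Java primitives" on the slide, plus strings (an assumption of this sketch).
ALLOWED_SCALARS = (bool, int, float, str)


def check_property_value(value):
    """Raise ValueError unless value is a scalar or an array of scalars."""
    if value is None:
        raise ValueError("null is not a valid property value")
    if isinstance(value, list):
        if any(not isinstance(item, ALLOWED_SCALARS) for item in value):
            raise ValueError("arrays may only contain scalar values")
        return
    if not isinstance(value, ALLOWED_SCALARS):
        raise ValueError(f"unsupported property type: {type(value).__name__}")


class Node:
    """A node whose properties are plain key-value pairs."""

    def __init__(self, **properties):
        self.properties = {}
        for key, value in properties.items():
            self.set_property(key, value)

    def set_property(self, key, value):
        check_property_value(value)
        self.properties[key] = value


# Two nodes in the same graph need not share a property set.
player = Node(name="Example Player", squad_number=9)
stadium = Node(name="Example Ground", capacity=20000, stands=["North", "South"])
```

A missing attribute is simply left out of the node (the player has no `capacity` key) rather than stored as null.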
28. • Relationships are first class citizens
• Every relationship has a name and a direction
– Add structure to the graph
– Provide semantic context for nodes
• Properties used to represent quality or weight of relationship, or metadata
• Every relationship must have a start node and end node
Relationships
29. Nodes can be connected by more than one relationship
Nodes can have more than one relationship
Self relationships are allowed
Relationships
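As a rough illustration of the last two slides, the sketch below models relationships as named, directed records that always carry a start and an end node. The `Relationship` and `Graph` classes and the node names are hypothetical and are not part of Neo4j.

```python
from dataclasses import dataclass, field


@dataclass
class Relationship:
    start: str      # id of the start node (required)
    rel_type: str   # the relationship's name
    end: str        # id of the end node (required)
    properties: dict = field(default_factory=dict)

    def __post_init__(self):
        if not self.start or not self.end:
            raise ValueError("a relationship needs a start node and an end node")
        if not self.rel_type:
            raise ValueError("a relationship needs a name")


class Graph:
    def __init__(self):
        self.relationships = []

    def relate(self, start, rel_type, end, **properties):
        rel = Relationship(start, rel_type, end, properties)
        self.relationships.append(rel)
        return rel

    def outgoing(self, node):
        return [r for r in self.relationships if r.start == node]

    def incoming(self, node):
        return [r for r in self.relationships if r.end == node]


graph = Graph()
graph.relate("alice", "KNOWS", "bob", since=2010)  # property on the relationship
graph.relate("alice", "WORKS_WITH", "bob")         # second relationship, same pair
graph.relate("alice", "FOLLOWS", "alice")          # self relationship
```

Direction is kept by storing `start` and `end` separately, so `outgoing("alice")` and `incoming("alice")` give different answers.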
32. • Nodes
– Entities
• Relationships
– Connect entities and structure domain
• Properties
– Entity attributes, relationship qualities, and metadata
• Labels
– Group nodes by role
Four Building Blocks
59. • Goals scored in each month by Michu
• Tottenham results when Gareth Bale scores
• What did Wayne Rooney do in April?
• Which players only score when a game is televised?
Other football queries
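A query like "goals scored in each month" is a two-hop traversal: from a player to their goals, then from each goal to its month. The Python sketch below shows that shape only. The relationship names `SCORED` and `IN_MONTH`, the node ids and the sample data are all placeholders invented for the illustration, not real match data and not the schema used in the talk.

```python
from collections import Counter, defaultdict

# (start node, relationship name) -> list of end nodes
edges = defaultdict(list)


def relate(start, rel_type, end):
    edges[(start, rel_type)].append(end)


# Placeholder data: three goals by one hypothetical player.
relate("player_a", "SCORED", "goal_1")
relate("player_a", "SCORED", "goal_2")
relate("player_a", "SCORED", "goal_3")
relate("goal_1", "IN_MONTH", "August")
relate("goal_2", "IN_MONTH", "August")
relate("goal_3", "IN_MONTH", "September")


def goals_per_month(player):
    """Hop player -> goal -> month and count the goals per month."""
    counts = Counter()
    for goal in edges.get((player, "SCORED"), []):
        for month in edges.get((goal, "IN_MONTH"), []):
            counts[month] += 1
    return dict(counts)


print(goals_per_month("player_a"))
```

The other questions on the slide follow the same pattern with different hops, for example from a goal to its game and from the game to its result.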
62. • Relational: tables
– Assume records all have the same structure
• Graphs: nodes
– No need to set a property if it doesn’t exist
• Relational: foreign keys between tables
– Joins calculated at run time
– The more tables you join to a query, the slower the query gets
• Graphs: relationships
– Stored as a ‘pre-computed index’ at write time
– Very easy to do lots of ‘hops’ between relationships
Graph vs Relational
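The difference between a join worked out at run time and a relationship stored at write time can be sketched in Python. This is a toy model under stated assumptions: the `plays_for` rows, the `GraphStore` class and all ids are hypothetical, and it says nothing about how either kind of database is implemented internally.

```python
from collections import defaultdict

# Relational style: a foreign-key table of (player_id, team_id) rows.
plays_for = [("p1", "t1"), ("p2", "t1"), ("p3", "t2")]


def teammates_relational(player):
    """Work out the join at query time by scanning the whole table twice."""
    teams = {team for p, team in plays_for if p == player}
    return sorted(p for p, team in plays_for if team in teams and p != player)


# Graph style: adjacency is recorded once, when the relationship is written.
class GraphStore:
    def __init__(self):
        self.outgoing = defaultdict(list)
        self.incoming = defaultdict(list)

    def relate(self, start, end):
        self.outgoing[start].append(end)
        self.incoming[end].append(start)


store = GraphStore()
for player_id, team_id in plays_for:
    store.relate(player_id, team_id)


def teammates_graph(player):
    """Follow stored references: player -> team -> players, with no table scan."""
    return sorted(
        other
        for team in store.outgoing.get(player, [])
        for other in store.incoming.get(team, [])
        if other != player
    )
```

Both functions return the same teammates; the relational version touches every row for each hop, while the graph version only touches the relationships attached to the nodes it visits.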
In this talk, we'll look at how graph data and Neo4j can be used to model the English Premier League. We'll see how the graph model and the Cypher query language make it natural and fun to query multidimensional semi-structured data. We'll also see how graphs encourage discoverability, so that we can spot interesting correlations and become king of the arcane football facts at the local pub quiz (e.g. how many goals have been scored at grounds in the North West of England by players originating from South America). Finally, we'll see what the same model would look like in a relational database, where that approach reaches its limits, and how the graph addresses those challenges.
Let’s get started and talk about graphs. Now in this context we’re thinking more of what are sometimes known as networks and…
…many people when they hear the word graph think of this.
Which isn’t what we’re going to be talking about today!
Graphs aren’t a new thing: you’ll already be familiar with lots of things that are graphs, even if you don’t know it yet. The London tube map is perhaps the most famous example, and one that Londoners at least use every day.
Or if not then you’ve certainly heard of the social network (graph)
An organisational hierarchy is a common model
Or of course as we mentioned earlier, a social network of friends of friends and so on is a popular graph
Null values all over the place
Now, as I say, graph databases allow you to store, manage and query your data as a graph. Neo4j adopts a very particular graph model, which we call the property graph model. So I’m going to spend the next few minutes talking about the important aspects of this model in more detail. In fact, I’m going to talk about the enhanced property graph model, which will be available in Neo4j 2.0 sometime later this year.
Pointer in memory and ultimately on disk
Analogy: Gmail labels. Every mail can have zero or more labels attached. Allow you to associate filters with groups of emails.
Always motivated by needs, problems, goals: not a transparent window onto reality. C18: Seven Bridges of Königsberg. Goal: find a path through the city that crosses each bridge once and once only.
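The Königsberg puzzle can be checked with a few lines of Python. The sketch models the four land masses as nodes (the labels A to D are arbitrary, with A standing for the central island) and the seven bridges as edges. It then applies Euler's result that a connected graph has a walk using every edge exactly once only if zero or two nodes have an odd number of edges. The function names are made up for this illustration, and the check assumes the graph is connected.

```python
from collections import Counter

# The seven bridges, as pairs of land masses.
bridges = [
    ("A", "B"), ("A", "B"),  # two bridges from the island to one bank
    ("A", "C"), ("A", "C"),  # two bridges from the island to the other bank
    ("A", "D"),              # island to the second island
    ("B", "D"), ("C", "D"),  # each bank to the second island
]


def odd_degree_nodes(edges):
    """Return the nodes touched by an odd number of edges."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return sorted(node for node, count in degree.items() if count % 2 == 1)


def has_euler_walk(edges):
    """Euler's condition for a connected graph: zero or two odd-degree nodes."""
    return len(odd_degree_nodes(edges)) in (0, 2)


print(odd_degree_nodes(bridges))  # all four land masses have odd degree
print(has_euler_walk(bridges))    # so no such path exists
```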
Which leads us perfectly into Neo4j’s query language
Football is quite a nice domain for modelling in graphs because the data has a lot of dimensions to it
-> SQL: define your tables and relationships and generally don’t change them. You might denormalise or add indexes to speed up queries.
-> Graphs: define your initial nodes and relationships. You may then add ‘layers’ to the graph to make implicit relationships explicit.
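One way to picture adding a ‘layer’ is to derive a new, explicit relationship from ones that are already in the graph. The Python sketch below assumes a hypothetical `PLAYED_IN` relationship between players and games and adds a `PLAYED_WITH` relationship between every pair of players who appeared in the same game. The relationship names, ids and the `add_played_with_layer` function are all invented for the illustration.

```python
from collections import defaultdict
from itertools import combinations


def add_played_with_layer(triples):
    """Return the triples plus an explicit PLAYED_WITH layer.

    Each triple is (start node, relationship name, end node).
    """
    players_by_game = defaultdict(set)
    for start, rel_type, end in triples:
        if rel_type == "PLAYED_IN":
            players_by_game[end].add(start)

    layered = set(triples)  # the original relationships are kept untouched
    for players in players_by_game.values():
        for first, second in combinations(sorted(players), 2):
            layered.add((first, "PLAYED_WITH", second))
    return layered


base = {
    ("p1", "PLAYED_IN", "g1"),
    ("p2", "PLAYED_IN", "g1"),
    ("p3", "PLAYED_IN", "g2"),
}
layered = add_played_with_layer(base)
```

The original relationships stay in place and the new layer sits alongside them, so existing queries keep working while new ones can use the shorter path.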
How is this different to a relational database? We have tables (nodes) and foreign keys between tables (relationships). In a relational database those joins are calculated at run time; in a graph a relationship is a first-class citizen, effectively a pre-computed index. You can also traverse lots of ‘hops’, which becomes quite expensive when you do it with joins in a relational database.
If it’s not fun and it seems cumbersome, then perhaps it’s the wrong tool for that particular data problem, or it’s modeled in the wrong way.
Might be worth asking for help if that isn’t happening or you’re stuck. We have a good community on Stack Overflow and a mailing list as well. You’ll get answers to any questions you have pretty quickly.
Please take a copy of t