2. Background
● Bitsy is a small, fast, embeddable, durable,
in-memory graph database that implements
the TinkerPop Blueprints API.
● The original presentation on Bitsy is
available at
http://slideshare.net/lambdazen/bitsy-graphdatabase
● Bitsy 1.5 is faster and leaner than before!
○ Has a smaller memory footprint
○ Uses (mostly) lock-free read algorithms
● This presentation covers the improvements
in the 1.5 release.
3. Major features in the 1.5 release
● The 1.5 release features:
○ Memory-efficient data structures
○ Mostly lock-free read algorithms
● Bitsy’s new memory-efficient data structures
are designed to reduce the overhead of
maintaining adjacency lists and properties.
● Bitsy’s new read algorithms are designed to
use the latest Java “compare-and-set” (CAS)
concurrency features to reduce the overhead
of locks in highly threaded scenarios.
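As a rough illustration of what compare-and-set means in Java, the sketch below increments a shared counter without taking a lock. The class name `CasCounter` is hypothetical and does not come from Bitsy; only `java.util.concurrent.atomic.AtomicLong` is a real JDK class.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of a CAS retry loop; not a class from Bitsy.
final class CasCounter {
    private final AtomicLong value = new AtomicLong();

    // Read the current value, then try to swap in current + 1.
    // compareAndSet returns false if another thread changed the value
    // in between, in which case the loop simply tries again.
    long increment() {
        while (true) {
            long current = value.get();
            long next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    long get() {
        return value.get();
    }
}
```

No thread ever blocks here: a thread that loses the race repeats a cheap read and retry instead of waiting on a lock.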
4. Memory-efficient data structures
● Bitsy 1.0 relied on Java Collections to
maintain adjacency lists and properties of
vertices.
● Java Collections aren’t memory efficient for
small-sized data structures because they
create many holder objects.
● The 1.5 release stores small adjacency lists
(N<24) and small properties (N<16) in hand-
coded objects with minimal overhead.
5. Memory-efficient data structures
● Different concrete
classes capture
adjacency lists and
properties for small N.
○ This approach reduces
the overall number of
objects.
○ Large adjacency lists are
stored in a compact hash set
keyed by label, whose entries
refer to memory-efficient lists.
[Figure: adjacency lists for out-degree 0, 1 and 2; vertex properties for N = 0, 1 and 2]
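The idea of one concrete class per small N can be sketched as follows. This is a guess at the general technique, not Bitsy's code: the names `PropertyBag`, `Bag0`, `Bag1`, `Bag2` and `MapBag` are hypothetical, and the sketch falls back to a `HashMap` after two properties, whereas the slides give N<16 as the limit for hand-coded objects.

```java
import java.util.HashMap;
import java.util.Map;

// All names here are hypothetical; this is not Bitsy's source code.
interface PropertyBag {
    Object get(String key);
    PropertyBag with(String key, Object value); // returns a bag that also holds key=value
    int size();
}

// N = 0: a shared singleton, so a vertex without properties allocates nothing.
final class Bag0 implements PropertyBag {
    static final Bag0 EMPTY = new Bag0();
    private Bag0() {}
    public Object get(String key) { return null; }
    public PropertyBag with(String key, Object value) { return new Bag1(key, value); }
    public int size() { return 0; }
}

// N = 1: two fields in one object; no array, no map entry objects.
final class Bag1 implements PropertyBag {
    private final String k;
    private final Object v;
    Bag1(String k, Object v) { this.k = k; this.v = v; }
    public Object get(String key) { return k.equals(key) ? v : null; }
    public PropertyBag with(String key, Object value) {
        return k.equals(key) ? new Bag1(key, value) : new Bag2(k, v, key, value);
    }
    public int size() { return 1; }
}

// N = 2: four fields in one object.
final class Bag2 implements PropertyBag {
    private final String k1, k2;
    private final Object v1, v2;
    Bag2(String k1, Object v1, String k2, Object v2) {
        this.k1 = k1; this.v1 = v1; this.k2 = k2; this.v2 = v2;
    }
    public Object get(String key) {
        if (k1.equals(key)) return v1;
        if (k2.equals(key)) return v2;
        return null;
    }
    public PropertyBag with(String key, Object value) {
        if (k1.equals(key)) return new Bag2(k1, value, k2, v2);
        if (k2.equals(key)) return new Bag2(k1, v1, k2, value);
        Map<String, Object> m = new HashMap<>();
        m.put(k1, v1);
        m.put(k2, v2);
        m.put(key, value);
        return new MapBag(m);
    }
    public int size() { return 2; }
}

// Larger N: fall back to a general-purpose map (assumed cut-over point).
final class MapBag implements PropertyBag {
    private final Map<String, Object> map;
    MapBag(Map<String, Object> map) { this.map = map; }
    public Object get(String key) { return map.get(key); }
    public PropertyBag with(String key, Object value) {
        Map<String, Object> copy = new HashMap<>(map);
        copy.put(key, value);
        return new MapBag(copy);
    }
    public int size() { return map.size(); }
}
```

A vertex would start with `Bag0.EMPTY` and swap in the next class as properties are added, so the common small cases cost one object each rather than a map plus its entry objects.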
6. Lock-free reading
● Bitsy 1.5 also introduces lock-free reading
using sequential locks (seqlock).
● Each read operation records the sequence
number at its start and again at its end.
○ If they are the same -- Success.
○ If they are different -- Retry!
● A read doesn’t start until the counter is even.
● Writers increment the counter twice:
○ Before the write, making the counter odd
○ After the write, making the counter even again
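The protocol above can be sketched in Java roughly as follows. This is a minimal illustration and not Bitsy's implementation: the class `SeqLock`, its method names, and the use of a monitor to serialize writers are all assumptions. It needs Java 9 or later for `VarHandle.acquireFence()` and `Thread.onSpinWait()`.

```java
import java.lang.invoke.VarHandle;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

// Hypothetical seqlock sketch; not taken from Bitsy.
final class SeqLock {
    // Even value: no write in progress. Odd value: a write is in progress.
    private final AtomicLong sequence = new AtomicLong();
    private final Object writeMutex = new Object(); // writers still exclude each other

    // Runs the reader without taking a lock and retries until it sees a
    // stable snapshot. The reader must be side-effect free and must
    // tolerate observing a half-finished write, because its result is
    // discarded in that case.
    <T> T read(Supplier<T> reader) {
        while (true) {
            long start = sequence.get();
            if ((start & 1) != 0) {      // odd: wait for the writer to finish
                Thread.onSpinWait();
                continue;
            }
            T result = reader.get();
            VarHandle.acquireFence();    // keep the data reads ahead of the re-check
            if (sequence.get() == start) {
                return result;           // same number at start and end: success
            }
            // different number: a write overlapped the read, so retry
        }
    }

    void write(Runnable writer) {
        synchronized (writeMutex) {
            sequence.incrementAndGet();      // counter becomes odd
            try {
                writer.run();
            } finally {
                sequence.incrementAndGet();  // counter becomes even again
            }
        }
    }

    long sequence() {
        return sequence.get();
    }
}
```

Readers never write to shared memory in this scheme, which is why it scales with many reader threads; the cost is that a read may have to run more than once.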
7. (Mostly) lock-free reading
● Bitsy’s sequential locks can cause “livelock”
situations for readers when there are too many writers.
● To avoid this, readers degrade to RW locks
after a certain number of retries.
● Seqlocks are faster than RW locks in highly
threaded environments where the number of active
threads exceeds the number of cores.
● Bitsy uses locks on writes because
○ write-retries are complex with transactions, and
○ locking is not the bottleneck for writes -- the file
system is the bottleneck.
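One way the fallback from optimistic reads to a read-write lock could look is sketched below. The class `HybridSeqLock`, the retry limit of 3, and the fallback counter are assumptions made for illustration; the slides do not state Bitsy's actual threshold or class layout. Java 9 or later is assumed.

```java
import java.lang.invoke.VarHandle;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical sketch of "optimistic first, RW lock after N retries".
final class HybridSeqLock {
    static final int MAX_OPTIMISTIC_READS = 3; // assumed retry limit

    private final AtomicLong sequence = new AtomicLong();
    private final AtomicLong fallbacks = new AtomicLong();
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();

    <T> T read(Supplier<T> reader) {
        for (int attempt = 0; attempt < MAX_OPTIMISTIC_READS; attempt++) {
            long start = sequence.get();
            if ((start & 1) == 0) {           // even: no write in progress
                T result = reader.get();
                VarHandle.acquireFence();     // order data reads before the re-check
                if (sequence.get() == start) {
                    return result;            // lock-free success
                }
            }
            Thread.onSpinWait();
        }
        // Too many failed attempts: block writers for the duration of
        // this read so that it is guaranteed to finish.
        fallbacks.incrementAndGet();
        rwLock.readLock().lock();
        try {
            return reader.get();
        } finally {
            rwLock.readLock().unlock();
        }
    }

    // Writers always take the write lock and bump the counter around the write.
    void write(Runnable writer) {
        rwLock.writeLock().lock();
        try {
            sequence.incrementAndGet();       // odd
            try {
                writer.run();
            } finally {
                sequence.incrementAndGet();   // even
            }
        } finally {
            rwLock.writeLock().unlock();
        }
    }

    long fallbackCount() {
        return fallbacks.get();
    }
}
```

The JDK's `java.util.concurrent.locks.StampedLock` offers a comparable pattern through `tryOptimisticRead()` and `validate()`, and may be worth considering before hand-rolling a lock like this one.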
8. Benchmarks
● The plot below shows the read throughput* of a test†
application that repeatedly loops through a graph.
* Tests performed on a $600 HP p7-1287c desktop PC with a single 7200rpm hard disk.
† The code for this test can be found in BitsyGraphTest.java under the method testMultiThreadedCommits().
9. Benchmarks
● The lock-free read algorithms in Bitsy 1.5 show a
significantly higher throughput than Bitsy 1.0.
○ Bitsy 1.0 had a drop in performance when the
number of threads exceeded the number of cores.
○ The read throughput exceeds 10M reads/sec!
● Bitsy is now comparable to Neo4J in read throughput*.
○ This is an apples-to-apples comparison since Neo4J
is embedded and the graph is fully cached.
○ Most “bad” Neo4J benchmarks are taken when the
graph doesn’t fit in memory.
○ Neo4J is extremely fast when the graph fits in
memory -- and now, so is Bitsy!
10. Another read benchmark
● The following plot shows the traversal performance of
Bitsy 1.5 vs Neo4J 1.9.2 in a multi-threaded setting on a
bipartite graph with 1M vertices and out-degree of 3.
● Again, you can see that the performance is comparable.
11. Benchmarks for write
● As with the 1.0 release, Bitsy’s write throughput is much
higher than Neo4J because of the “No Seek” principle.
○ For more info, please refer to the project page at
http://bitbucket.org/lambdazen/bitsy/
12. Wrap-up
● The 1.5 release introduces memory-efficient
data structures and (mostly) lock-free
reading to the Bitsy graph database.
○ With these improvements, Bitsy’s read performance
is comparable to Neo4J’s when the graph is fully cached.
○ Bitsy’s “No Seek” write algorithms continue to
outperform other graph databases, including Neo4J.
● Bitsy is a dual-licensed product with
○ an AGPL license for open-source projects, and
○ a liberal unlimited-use OEM/end-user license for
commercial projects. Details at lambdazen.com.