Patrick McFadin presented on new features in Cassandra 2.0 including lightweight transactions, triggers, CQL improvements, and performance enhancements. Some key points are that lightweight transactions allow conditional updates in a single atomic operation, triggers allow custom Java code to modify mutations before writing, and various optimizations improve query performance, server performance, and operation throughput. Removed features include SuperColumns and on-heap row caching.
4. Lightweight transactions: the problem
Session 1
SELECT * FROM users
WHERE username = 'jbellis';
[empty resultset]
INSERT INTO users
(username, password)
VALUES ('jbellis', 'xdg44hh');
Session 2
SELECT * FROM users
WHERE username = 'jbellis';
[empty resultset]
INSERT INTO users
(username, password)
VALUES ('jbellis', '8dhh43k');
It’s a Race!
Who wins?
Thursday, October 3, 13
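The race on this slide is a check-then-insert problem, and it can be reproduced without a cluster. The following is a minimal sketch in plain Java, assuming an in-memory map as a stand-in for the users table; `RaceSketch`, `isFree`, and `insert` are illustrative names, not part of Cassandra or any driver. In this model the later call wins, whereas in Cassandra the write with the higher timestamp wins.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory stand-in for the users table: username -> password.
// Illustration only; nothing here talks to Cassandra.
class RaceSketch {
    static final Map<String, String> users = new HashMap<String, String>();

    // Models SELECT ... WHERE username = ?; true means an empty result set.
    static boolean isFree(String username) {
        return !users.containsKey(username);
    }

    // Models a plain INSERT: an upsert that silently overwrites any existing row.
    static void insert(String username, String password) {
        users.put(username, password);
    }

    public static void main(String[] args) {
        // Both sessions check first, and both see an empty result set.
        boolean session1SeesFree = isFree("jbellis");
        boolean session2SeesFree = isFree("jbellis");

        // Both then insert; the second write replaces the first without any error.
        if (session1SeesFree) insert("jbellis", "xdg44hh");
        if (session2SeesFree) insert("jbellis", "8dhh43k");

        System.out.println(users.get("jbellis")); // prints 8dhh43k
    }
}
```

Neither session is told that it lost, which is the gap lightweight transactions are meant to close.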
9. Paxos
• Consensus algorithm
• All operations are quorum-based
• Each replica sends information about unfinished operations to the leader during prepare
• Reference: "Paxos Made Simple" (Leslie Lamport)
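To make the prepare step concrete, here is a minimal sketch of textbook single-decree Paxos over in-memory acceptors. It is not Cassandra's implementation; `PaxosSketch`, `Acceptor`, `Promise`, and `propose` are hypothetical names chosen for illustration. It shows the two properties from the bullets: every phase needs a quorum, and a replica that already accepted a value reports it during prepare so the new leader finishes that operation first.

```java
import java.util.ArrayList;
import java.util.List;

// Textbook single-decree Paxos, in memory. All names are illustrative.
class PaxosSketch {
    static class Promise {
        final boolean ok;
        final long acceptedBallot;   // -1 when nothing was accepted yet
        final String acceptedValue;  // the unfinished value, if any
        Promise(boolean ok, long acceptedBallot, String acceptedValue) {
            this.ok = ok;
            this.acceptedBallot = acceptedBallot;
            this.acceptedValue = acceptedValue;
        }
    }

    static class Acceptor {
        long promised = -1;
        long acceptedBallot = -1;
        String acceptedValue = null;

        // Phase 1: promise to ignore lower ballots and report any accepted value.
        Promise prepare(long ballot) {
            if (ballot <= promised) return new Promise(false, -1, null);
            promised = ballot;
            return new Promise(true, acceptedBallot, acceptedValue);
        }

        // Phase 2: accept unless a higher ballot has been promised.
        boolean accept(long ballot, String value) {
            if (ballot < promised) return false;
            promised = ballot;
            acceptedBallot = ballot;
            acceptedValue = value;
            return true;
        }
    }

    // Returns the value chosen in this round, or null if no quorum was reached.
    static String propose(List<Acceptor> acceptors, long ballot, String value) {
        int quorum = acceptors.size() / 2 + 1;
        int promises = 0;
        long highestSeen = -1;
        String toPropose = value;
        for (Acceptor a : acceptors) {
            Promise p = a.prepare(ballot);
            if (!p.ok) continue;
            promises++;
            // An unfinished operation takes priority over our own value.
            if (p.acceptedValue != null && p.acceptedBallot > highestSeen) {
                highestSeen = p.acceptedBallot;
                toPropose = p.acceptedValue;
            }
        }
        if (promises < quorum) return null;
        int accepts = 0;
        for (Acceptor a : acceptors) {
            if (a.accept(ballot, toPropose)) accepts++;
        }
        return accepts >= quorum ? toPropose : null;
    }

    public static void main(String[] args) {
        List<Acceptor> replicas = new ArrayList<Acceptor>();
        for (int i = 0; i < 3; i++) replicas.add(new Acceptor());
        System.out.println(propose(replicas, 1, "xdg44hh")); // prints xdg44hh
        System.out.println(propose(replicas, 2, "8dhh43k")); // prints xdg44hh
    }
}
```

The second proposer ends up re-proposing the first value because the acceptors report it during prepare, so a decided value is never replaced.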
10. LWT: details
• 4 round trips vs 1 for normal updates
• Paxos state is durable
• Immediate consistency with no leader election or failover
• ConsistencyLevel.SERIAL
• http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
11. LWT: Use with caution
• Great for 1% of your application
• Eventual consistency is your friend
• http://www.slideshare.net/planetcassandra/c-summit-2013-eventual-consistency-hopeful-consistency-by-christos-kalantzis
12. Using LWT
• Don’t overwrite an existing record
• Only update record if condition is met
INSERT INTO USERS (username, email, ...)
VALUES ('jbellis', 'jbellis@datastax.com', ... )
IF NOT EXISTS;
UPDATE USERS
SET email = 'jonathan@datastax.com', ...
WHERE username = 'jbellis'
IF email = 'jbellis@datastax.com';
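The semantics of the two conditional statements can be modelled with compare-and-set operations on a concurrent map. This is a sketch only, assuming an in-memory map in place of the users table; `LwtSketch`, `insertIfNotExists`, and `updateIfEmail` are hypothetical names, and the boolean result plays the role of the `[applied]` column that Cassandra returns for a conditional statement.

```java
import java.util.concurrent.ConcurrentHashMap;

// In-memory model of conditional writes: username -> email.
// Illustration only; a real LWT runs Paxos across replicas.
class LwtSketch {
    static final ConcurrentHashMap<String, String> emails =
            new ConcurrentHashMap<String, String>();

    // Models INSERT ... IF NOT EXISTS: applied only when no row exists yet.
    static boolean insertIfNotExists(String username, String email) {
        return emails.putIfAbsent(username, email) == null;
    }

    // Models UPDATE ... IF email = expected: applied only when the current value matches.
    static boolean updateIfEmail(String username, String expected, String updated) {
        return emails.replace(username, expected, updated);
    }

    public static void main(String[] args) {
        // First insert is applied, the competing one is rejected.
        System.out.println(insertIfNotExists("jbellis", "jbellis@datastax.com")); // true
        System.out.println(insertIfNotExists("jbellis", "someone@example.com"));  // false

        // The update is applied once; repeating it fails because the condition no longer holds.
        System.out.println(updateIfEmail("jbellis", "jbellis@datastax.com", "jonathan@datastax.com")); // true
        System.out.println(updateIfEmail("jbellis", "jbellis@datastax.com", "other@example.com"));     // false
    }
}
```

Unlike a plain insert, the losing caller learns that its write was not applied and can react to it.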
13. Triggers
CREATE TRIGGER <name> ON <table> USING <classname>;
DROP TRIGGER <name> ON [<keyspace>.]<table>;
• Executed on the coordinator before mutation
• Takes the original mutation and adds any new mutations
• Jars deployed per server
14. Trigger implementation
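import java.nio.ByteBuffer;
import java.util.Collection;
import org.apache.cassandra.db.ColumnFamily;
import org.apache.cassandra.db.RowMutation;
import org.apache.cassandra.triggers.ITrigger;

// Called on the coordinator; returns extra mutations to apply with the update.
class MyTrigger implements ITrigger
{
    public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update)
    {
        ...
    }
}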
• You have to implement your own ITrigger (for now)
• Compile and deploy to each server
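The flow "take the original mutation, add any new ones" can be illustrated without the Cassandra jars. The sketch below uses hypothetical stand-in types (`Mutation`, `Trigger`, `AuditTrigger`, `coordinate`); they are not Cassandra's `RowMutation`, `ColumnFamily`, or `ITrigger`, and only mirror the shape of the `augment` contract.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Stand-in model of the trigger flow. None of these types are Cassandra classes.
class TriggerSketch {
    // Hypothetical mutation: one write to one table.
    static class Mutation {
        final String table;
        final String key;
        final String value;
        Mutation(String table, String key, String value) {
            this.table = table;
            this.key = key;
            this.value = value;
        }
    }

    // Hypothetical trigger contract: return extra mutations for a given write.
    interface Trigger {
        Collection<Mutation> augment(Mutation original);
    }

    // Example trigger: record every write in an audit table.
    static class AuditTrigger implements Trigger {
        public Collection<Mutation> augment(Mutation original) {
            List<Mutation> extra = new ArrayList<Mutation>();
            extra.add(new Mutation("audit_log", original.key, "wrote to " + original.table));
            return extra;
        }
    }

    // Coordinator side: run the trigger, then apply the original plus the additions.
    static List<Mutation> coordinate(Mutation original, Trigger trigger) {
        List<Mutation> batch = new ArrayList<Mutation>();
        batch.add(original);
        batch.addAll(trigger.augment(original));
        return batch;
    }

    public static void main(String[] args) {
        Mutation write = new Mutation("users", "jbellis", "jbellis@datastax.com");
        for (Mutation m : coordinate(write, new AuditTrigger())) {
            System.out.println(m.table + " <- " + m.key + ": " + m.value);
        }
    }
}
```

The trigger never replaces the original write in this model; it only contributes additional mutations that travel with it.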
15. Experimental!
• Relies on internal RowMutation, ColumnFamily classes
• Not sandboxed. Be careful!
• Expect changes in 2.1
16. CQL Improvements
• ALTER DROP
• Remove a field from a CQL table
• Conditional schema changes
• Only execute if condition met
CREATE KEYSPACE IF NOT EXISTS ks
WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': 3 };
CREATE TABLE IF NOT EXISTS test (k int PRIMARY KEY);
DROP KEYSPACE IF EXISTS ks;
ALTER TABLE users DROP address3;
17. CQL Improvements
• Aliases in SELECT
• Limit and TTL in prepared statements
SELECT event_id, dateOf(created_at) AS creation_date,
blobAsText(content) AS content
FROM timeline;
event_id | creation_date | content
-------------------------+--------------------------+----------------------
550e8400-e29b-41d4-a716 | 2013-07-26 10:44:33+0200 | Something happened!?
SELECT * FROM myTable LIMIT ?;
UPDATE myTable USING TTL ? SET v = 2 WHERE k = 'foo';
19. Query performance
• Hint when reading time series data
• Time series slices find data faster
• Hybrid approach to Leveled Compaction under stress
• Use size tiered until we catch up
• Reduce read latency impact
• Off-heap memory speedup
• Bytes moved on and off 10x faster
• Removal of row-level bloom filters
20. Server performance
• Single pass compaction
• No more incremental compaction for large storage rows
• LMAX Disruptor on Thrift interface
• Crazy fast and efficient concurrent threads. Faster HSHA
• Support for pluggable off-heap memory allocators
• JEMalloc support to start. Faster memory access.
• Bigger Level 0 file size
• 5M was just too small. Now 256M
21. Removed features
• SuperColumns are gone!
• Not the API, just the underlying implementation
• On-heap row cache
• Row cache is no longer an option in the JVM
• Memory pressure relief valves - Gone from yaml
• flush_largest_memtables_at
• reduce_cache_sizes_at
• reduce_cache_sizes_to
22. Operation Changes
• JDK 7 now required
• Vnodes are default
• Streaming overhaul
• Control. Streams are grouped and broken into plans
• Traceability. Each stream has an ID. Monitor each stream.
• Performance. Streams are now pipelined. No waiting for ACK
23. Thank you!
Apache Cassandra 2.0 - Data model on fire
Next talk in my data model series!