Presented at Cassandra London (April 7, 2014): the challenges of time-series storage and analytics in OpenNMS, with an introduction to Newts, a new Cassandra-based time-series data store.
5. OpenNMS: What It Is
● Network Management System
○ Discovery and Provisioning
○ Service monitoring
○ Data collection
○ Event management and notifications
● Java, open source, GPLv3
● Since 1999
7. RRDTool
● Round robin database
● First released 1999
● Time-series storage
● File-based
● Constant-size
● Automatic, amortized aggregation
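The constant-size and amortized-aggregation properties listed above can be illustrated with a small ring buffer. This is a minimal sketch of the idea only, not RRDTool's file format or API; the class name `RoundRobinArchive` and its parameters are hypothetical, and averaging is assumed as the consolidation function.

```python
class RoundRobinArchive:
    """Fixed-size archive: each row is the average of `steps_per_row` updates.

    Hypothetical illustration of a round-robin archive; not RRDTool code.
    """

    def __init__(self, rows, steps_per_row):
        self.rows = [None] * rows        # storage never grows past this size
        self.steps_per_row = steps_per_row
        self.next_row = 0                # slot the next consolidated row goes to
        self.pending = []                # raw values not yet consolidated

    def update(self, value):
        # Aggregation happens as part of the update, a little at a time.
        self.pending.append(value)
        if len(self.pending) == self.steps_per_row:
            self.rows[self.next_row] = sum(self.pending) / len(self.pending)
            # Wrap around, overwriting the oldest row once the archive is full.
            self.next_row = (self.next_row + 1) % len(self.rows)
            self.pending = []

    def values(self):
        """Consolidated rows, oldest first, skipping slots never written."""
        ordered = self.rows[self.next_row:] + self.rows[:self.next_row]
        return [v for v in ordered if v is not None]


rra = RoundRobinArchive(rows=3, steps_per_row=2)
for sample in range(1, 11):
    rra.update(sample)
print(rra.values())
```

Once the archive wraps, older rows are gone for good, and the raw values were discarded as soon as they were averaged.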
8. Consider
● 2 I/O operations per update (read-update-write)
● 1 RRD per data source (storeByGroup=false)
● 100,000s of data sources, 1,000s of IOPS
● 1,000,000s of data sources, 10,000s of IOPS
● 15,000 RPM SAS drive, ~175-200 IOPS
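The figures above can be checked with back-of-the-envelope arithmetic. The sketch below assumes a 5-minute (300-second) collection interval, which the slide does not state, and uses hypothetical function names; the 2 I/O operations per update and the 175-200 IOPS per drive come from the slide.

```python
import math


def required_iops(data_sources, interval_seconds=300, ios_per_update=2):
    """Sustained IOPS needed to update one RRD per data source each interval.

    interval_seconds=300 is an assumption, not a figure from the slide.
    """
    return data_sources * ios_per_update / interval_seconds


def drives_needed(iops, iops_per_drive=175):
    """Drives required at the low end of the quoted 175-200 IOPS range."""
    return math.ceil(iops / iops_per_drive)


for n in (300_000, 1_500_000):
    iops = required_iops(n)
    print(f"{n:,} data sources: {iops:,.0f} IOPS, about {drives_needed(iops)} drives")
```

Under that assumption, 300,000 data sources need 2,000 IOPS and 1,500,000 need 10,000 IOPS, in line with the orders of magnitude on the slide.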
9. Also
● Not everything is a graph
● Inflexible
● Incremental backups impractical
● ...
10. Observation #1
We collect and write a great deal; we read (graph) relatively little.
We are optimized for reading everything, always.
11. Observation #2
Samples are naturally collected and graphed together, in groups.
Grouping samples that are accessed together is an easy optimization.
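One way to picture this optimization: store every metric collected from a resource at a given time under a single key, so that a collection pass costs one write per group rather than one per metric. The sketch below is an in-memory illustration under that assumption; `GroupedStore`, its methods, and the resource and metric names are hypothetical and do not describe any real storage API.

```python
from collections import defaultdict


class GroupedStore:
    """Toy store keyed by (resource, timestamp); each row holds a whole group."""

    def __init__(self):
        self.rows = defaultdict(dict)   # (resource, timestamp) -> {metric: value}
        self.writes = 0                 # counts storage operations, not samples

    def write_group(self, resource, timestamp, samples):
        # All metrics collected together are persisted in one operation.
        self.rows[(resource, timestamp)].update(samples)
        self.writes += 1

    def read_group(self, resource, start, end):
        # One range read returns every metric needed to draw a graph.
        return {
            ts: dict(metrics)
            for (res, ts), metrics in sorted(self.rows.items())
            if res == resource and start <= ts <= end
        }


store = GroupedStore()
store.write_group("router1:eth0", 300, {"ifInOctets": 10, "ifOutOctets": 20})
store.write_group("router1:eth0", 600, {"ifInOctets": 15, "ifOutOctets": 32})
print(store.writes, store.read_group("router1:eth0", 0, 600))
```

Four samples were stored with two writes here; with one file per data source, the same samples would have needed four separate updates.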
12. Project: Newts
Goals:
● Stand-alone time-series data store
● High-throughput
● Horizontally scalable
● Grouped metric storage/retrieval
● Late-aggregating
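Late aggregation means keeping raw samples and consolidating them only when a query asks for it, so the resolution and the aggregate function are chosen at read time. The sketch below illustrates that idea with a hypothetical `aggregate` function; it is not Newts's API, and the sample data is made up for the example.

```python
def aggregate(samples, start, end, resolution, func):
    """Bucket raw (timestamp, value) samples at query time.

    Returns a list of (bucket_start, func(values)) for buckets that
    contain at least one sample in the half-open range [start, end).
    """
    buckets = {}
    for ts, value in samples:
        if start <= ts < end:
            bucket = start + ((ts - start) // resolution) * resolution
            buckets.setdefault(bucket, []).append(value)
    return [(bucket, func(values)) for bucket, values in sorted(buckets.items())]


def average(values):
    return sum(values) / len(values)


# Illustrative raw samples, one every 300 seconds.
raw = [(0, 1.0), (300, 3.0), (600, 5.0), (900, 7.0)]

print(aggregate(raw, 0, 1200, 600, average))
print(aggregate(raw, 0, 1200, 600, max))
```

The same stored samples answer both queries; nothing was thrown away at write time, which is the trade-off against the amortized aggregation done by RRDTool.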