SlideShare a Scribd company logo
1 of 27
Introduction to SolrCloud
Timothy Potter, LucidWorks
My SolrCloud Experience
• Solr Committer; currently working on hardening SolrCloud
• Operated 36 node cluster in AWS for Dachis Group (1.5 years
ago, 18 shards ~900M docs)
• Built a Fabric/boto framework for deploying and managing a
cluster in the cloud
– https://github.com/LucidWorks/solr-scale-tk
• Co-author of Solr In Action; wrote CH 13 which covers
SolrCloud
What is SolrCloud?
Subset of optional features in Solr to enable and
simplify horizontal scaling a search index using
sharding and replication.
Goals
performance, scalability, high-availability,
simplicity, and elasticity
Terminology
• ZooKeeper: Distributed coordination service that provides centralized configuration, cluster state
management, and leader election
• Node: JVM process bound to a specific port on a machine; hosts the Solr web application
• Collection: Search index distributed across multiple nodes; each collection has a name, shard
count, and replication factor
• Replication Factor: Number of copies of a document in a collection
• Shard: Logical slice of a collection; each shard has a name, hash range, leader, and replication factor.
Documents are assigned to one and only one shard per collection using a hash-based document
routing strategy.
• Replica: Solr index that hosts a copy of a shard in a collection; behind the scenes, each replica is
implemented as a Solr core
• Leader: Replica in a shard that assumes special duties needed to support distributed indexing in Solr;
each shard has one and only one leader at any time and leaders are elected using ZooKeeper
SolrCloud High-level Architecture
Java VM (J2SE v. 7)
Jetty (node 1) on port: 8984
Solr Web app
collection
shard1 - Leader
Jetty (node 3) on port: 8984
collection
shard1 - Replica
Java VM (J2SE v. 7)
Solr Web app
Java VM (J2SE v. 7)
Jetty (node 2) on port: 8985
Solr Web app
collection
shard2 - Leader
Jetty (node 4) on port: 8985
collection
shard2 - Replica
Java VM (J2SE v. 7)
Solr Web app
Zookeeper1
Zookeeper2
Zookeeper3
ZooKeeper Ensemble
Leader
Election
Replication
Replication
Sharding
Centralized
Configuration
Management
REST Web Services
XML / JSON / HTTP
Millions of
Documents
Millions of
Users
Server 1 Server 2
Load Balancer
Collection == Distributed Index
A collection is a distributed index defined by:
– named configuration stored in ZooKeeper
– number of shards: documents are distributed across N partitions of the index
– document routing strategy: how documents get assigned to shards
– replication factor: how many copies of each document in the collection
Collections API:
curl "http://localhost:8983/solr/admin/collections?
action=CREATE&name=logstash4solr&replicationFactor=2&
numShards=2&collection.configName=logs"
Demo
1. Start-up bootstrap node with embedded ZooKeeper
2. Add another shard
3. Add some replicas
4. Index some docs
5. Distributed queries
6. Knock-over a node, see cluster stay operational
Sharding
• Collection has a fixed number of shards
– existing shards can be split
• When to shard?
– Large number of docs
– Large document sizes
– Parallelization during indexing and queries
– Data partitioning (custom hashing)
Document Routing
• Each shard covers a hash-range
• Default: Hash ID into 32-bit integer, map to range
– leads to balanced (roughly) shards
• Custom-hashing (example in a few slides)
• Tri-level: app!user!doc
• Implicit: no hash-range set for shards
Replication
• Why replicate?
– High-availability
– Load balancing
• How does it work in SolrCloud?
– Near-real-time, not master-slave
– Leader forwards to replicas in parallel, waits for response
– Error handling during indexing is tricky
Distributed Indexing
View of cluster state from Zk
Shard 1
Leader
Node 1 Node 2
Shard 2
Replica
Shard 2
Leader
Shard 1
Replica
Zookeeper
CloudSolrServer
“smart client”
1
2
4
tlogtlog
Get URLs of current leaders?
35
shard1 range:
80000000-ffffffff
shard2 range:
0-7fffffff
1. Get cluster state from ZK
2. Route document directly to
leader (hash on doc ID)
3. Persist document on durable
storage (tlog)
4. Forward to healthy replicas
5. Acknowledge write
succeed to client
Shard Leader
• Additional responsibilities during indexing only! Not a
master node
• Leader is a replica (handles queries)
• Accepts update requests for the shard
• Increments the _version_ on the new or updated doc
• Sends updates (in parallel) to all replicas
Distributed Queries
View of cluster state from Zk
Shard 1
Leader
Node 1 Node 2
Shard 2
Leader
Shard 2
Replica
Shard 1
Replica
Zookeeper
CloudSolrServer
3
q=*:*
Get URLs of all live nodes
4
2
Query controller
Or just a load balancer works too
get fields
1
1. Query client can be ZK aware or just
query thru a load balancer
2. Client can send query to any node in
the cluster
3. Controller node distributes the query
to a replica for each shard to identify
documents matching query
4. Controller node sorts the results from
step 3 and issues a second query for
all fields for a page of results
Scalability / Stability Highlights
• All nodes in cluster perform indexing and execute queries; no master
node
• Distributed indexing: No SPoF, high throughput via direct updates to
leaders, automated failover to new leader
• Distributed queries: Add replicas to scale-out qps; parallelize complex
query computations; fault-tolerance
• Indexing / queries continue so long as there is 1 healthy replica per
shard
SolrCloud and CAP
• A distributed system should be: Consistent, Available, and Partition tolerant
– CAP says pick 2 of the 3! (slightly more nuanced than that in reality)
• SolrCloud favors consistency over write-availability (CP)
– All replicas in a shard have the same data
– Active replica sets concept (writes accepted so long as a shard has at least one active
replica available)
• No tools to detect or fix consistency issues in Solr
– Reads go to one replica; no concept of quorum
– Writes must fail if consistency cannot be guaranteed (SOLR-5468)
ZooKeeper
• Is a very good thing ... clusters are a zoo!
• Centralized configuration management
• Cluster state management
• Leader election (shard leader and overseer)
• Overseer distributed work queue
• Live Nodes
– Ephemeral znodes used to signal a server is gone
• Needs 3 nodes for quorum in production
ZooKeeper: Centralized Configuration
• Store config files in ZooKeeper
• Solr nodes pull config during core
initialization
• Config sets can be “shared” across
collections
• Changes are uploaded to ZK and
then collections should be reloaded
ZooKeeper: State management
• Keep track of live nodes /live_nodes znode
– ephemeral nodes
– ZooKeeper client timeout
• Collection metadata and replica state in /clusterstate.json
– Every core has watchers for /live_nodes and /clusterstate.json
• Leader election
– ZooKeeper sequence number on ephemeral znodes
Overseer
• What does it do?
– Persists collection state change events to
ZooKeeper
– Controller for Collection API commands
– Ordered updates
– One per cluster (for all collections); elected using
leader election
• How does it work?
– Asynchronous (pub/sub messaging)
– ZooKeeper as distributed queue recipe
– Automated failover to a healthy node
– Can be assigned to a dedicated node (SOLR-
5476)
Collection Aliases
Indexing
Client 1
Indexing
Client 2
Indexing
Client N...
logstash4solr collection
Search
Client 1
Search
Client 2
Search
Client N
...
logstash4solr-write
collection alias
logstash4solr-read
collection alias
Update requests
Query requests
logstash4solr collection
Queries continue to execute
against the logstash4solr collection
while the new one is building
Use the Collections API to
create a new collection named logstash4solr2
and update the logstash4solr-write alias
to direct writes to the new collection
Custom Hashing
{
"id" : ”httpd!2",
"level_s" : ”ERROR",
"lang_s" : "en",
...
},
Hash:
shardKey!docID
shard1 range:
80000000-ffffffff
shard2 range:
0-7fffffff
Shard 1
Leader
Shard 2
Leader
• Route documents to specific shards based on a shard key component in
the document ID
– Send all log messages from the same system to the same shard
• Direct queries to specific shards: q=...&_route_=httpd
Custom Hashing Highlights
• Co-locate documents having a common property in the same shard
– e.g. docs having IDs httpd!21 and httpd!33 will be in the same shard
• Scale-up the replicas for specific shards to address high query and/or
indexing volume from specific apps
• Not as much control over the distribution of keys
– httpd, mysql, and collectd all in same shard
• Can split unbalanced shards when using custom hashing
Shard Splitting
• Split range in half
Shard 1_1
Leader
Node 1 Node 2
Shard 2
Leader
Shard 2
Replica
Shard 1_1
Replica
shard1_0 range:
80000000-bfffffff
shard2 range:
0-7fffffff
Shard 1_0
Leader
Shard 1_0
Replica
shard1_1 range:
c0000000-ffffffff
Shard 1
Leader
Node 1 Node 2
Shard 2
Leader
Shard 2
Replica
Shard 1
Replica
shard1 range:
80000000-ffffffff
shard2 range:
0-7fffffff
Other Features / Highlights
• Near-Real-Time Search: Documents are visible within a second or so after being
indexed
• Partial Document Update: Just update the fields you need to change on existing
documents
• Optimistic Locking: Ensure updates are applied to the correct version of a
document
• Transaction log: Better recoverability; peer-sync between nodes after hiccups
• HTTPS
• Use HDFS for storing indexes
• Use MapReduce for building index (SOLR-1301)
What’s Next?
• Constantly hardening existing features
– More Chaos monkey tests to cover tricky areas in the code
• Large-scale performance testing; 1000’s of collections, 100’s of Solr nodes,
billions of documents
• Splitting collection state into separate znodes (SOLR-5473)
• Collection management UI (SOLR-4388)
• Cluster deployment / management tools
– My talk tomorrow: http://sched.co/1bsKUMn
• Ease of use!
– Please contribute to the mailing list, wiki, JIRA
Wrap-up / Questions
• LucidWorks: http://www.lucidworks.com
• Solr Scale Toolkit: https://github.com/LucidWorks/solr-scale-tk
• SiLK: http://www.lucidworks.com/lucidworks-silk/
• Solr In Action: http://www.manning.com/grainger/
• Connect: @thelabdude / tim.potter@lucidworks.com

More Related Content

What's hot

A Hands-on Introduction on Terraform Best Concepts and Best Practices
A Hands-on Introduction on Terraform Best Concepts and Best Practices A Hands-on Introduction on Terraform Best Concepts and Best Practices
A Hands-on Introduction on Terraform Best Concepts and Best Practices Nebulaworks
 
Custom DevOps Monitoring System in MelOn (with InfluxDB + Telegraf + Grafana)
Custom DevOps Monitoring System in MelOn (with InfluxDB + Telegraf + Grafana)Custom DevOps Monitoring System in MelOn (with InfluxDB + Telegraf + Grafana)
Custom DevOps Monitoring System in MelOn (with InfluxDB + Telegraf + Grafana)Seungmin Yu
 
Efficient Kubernetes scaling using Karpenter
Efficient Kubernetes scaling using KarpenterEfficient Kubernetes scaling using Karpenter
Efficient Kubernetes scaling using KarpenterMarko Bevc
 
The journey to GitOps
The journey to GitOpsThe journey to GitOps
The journey to GitOpsNicola Baldi
 
Docker and Kubernetes 101 workshop
Docker and Kubernetes 101 workshopDocker and Kubernetes 101 workshop
Docker and Kubernetes 101 workshopSathish VJ
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Cloudera, Inc.
 
AWS 기반 블록체인 (2부) - 블록체인 서비스 개발하기 (김준형 & 박천구, AWS 솔루션즈 아키텍트) :: AWS DevDay2018
AWS 기반 블록체인 (2부) - 블록체인 서비스 개발하기 (김준형 & 박천구, AWS 솔루션즈 아키텍트) :: AWS DevDay2018AWS 기반 블록체인 (2부) - 블록체인 서비스 개발하기 (김준형 & 박천구, AWS 솔루션즈 아키텍트) :: AWS DevDay2018
AWS 기반 블록체인 (2부) - 블록체인 서비스 개발하기 (김준형 & 박천구, AWS 솔루션즈 아키텍트) :: AWS DevDay2018Amazon Web Services Korea
 
Implementing Domain Events with Kafka
Implementing Domain Events with KafkaImplementing Domain Events with Kafka
Implementing Domain Events with KafkaAndrei Rugina
 
APACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka StreamsAPACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka StreamsKetan Gote
 
Real time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafkaReal time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafkaTimothy Spann
 
HDFSネームノードのHAについて #hcj13w
HDFSネームノードのHAについて #hcj13wHDFSネームノードのHAについて #hcj13w
HDFSネームノードのHAについて #hcj13wCloudera Japan
 
Scouter와 influx db – grafana 연동 가이드
Scouter와 influx db – grafana 연동 가이드Scouter와 influx db – grafana 연동 가이드
Scouter와 influx db – grafana 연동 가이드Ji-Woong Choi
 
Kubernetes design principles, patterns and ecosystem
Kubernetes design principles, patterns and ecosystemKubernetes design principles, patterns and ecosystem
Kubernetes design principles, patterns and ecosystemSreenivas Makam
 

What's hot (20)

A Hands-on Introduction on Terraform Best Concepts and Best Practices
A Hands-on Introduction on Terraform Best Concepts and Best Practices A Hands-on Introduction on Terraform Best Concepts and Best Practices
A Hands-on Introduction on Terraform Best Concepts and Best Practices
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Custom DevOps Monitoring System in MelOn (with InfluxDB + Telegraf + Grafana)
Custom DevOps Monitoring System in MelOn (with InfluxDB + Telegraf + Grafana)Custom DevOps Monitoring System in MelOn (with InfluxDB + Telegraf + Grafana)
Custom DevOps Monitoring System in MelOn (with InfluxDB + Telegraf + Grafana)
 
The basics of fluentd
The basics of fluentdThe basics of fluentd
The basics of fluentd
 
What Is Helm
 What Is Helm What Is Helm
What Is Helm
 
Efficient Kubernetes scaling using Karpenter
Efficient Kubernetes scaling using KarpenterEfficient Kubernetes scaling using Karpenter
Efficient Kubernetes scaling using Karpenter
 
The journey to GitOps
The journey to GitOpsThe journey to GitOps
The journey to GitOps
 
The basics of fluentd
The basics of fluentdThe basics of fluentd
The basics of fluentd
 
Kafka Security
Kafka SecurityKafka Security
Kafka Security
 
Docker and Kubernetes 101 workshop
Docker and Kubernetes 101 workshopDocker and Kubernetes 101 workshop
Docker and Kubernetes 101 workshop
 
Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive

Apache Kudu: Technical Deep Dive


Apache Kudu: Technical Deep Dive


 
AWS 기반 블록체인 (2부) - 블록체인 서비스 개발하기 (김준형 & 박천구, AWS 솔루션즈 아키텍트) :: AWS DevDay2018
AWS 기반 블록체인 (2부) - 블록체인 서비스 개발하기 (김준형 & 박천구, AWS 솔루션즈 아키텍트) :: AWS DevDay2018AWS 기반 블록체인 (2부) - 블록체인 서비스 개발하기 (김준형 & 박천구, AWS 솔루션즈 아키텍트) :: AWS DevDay2018
AWS 기반 블록체인 (2부) - 블록체인 서비스 개발하기 (김준형 & 박천구, AWS 솔루션즈 아키텍트) :: AWS DevDay2018
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101
 
Vert.x vs akka
Vert.x vs akkaVert.x vs akka
Vert.x vs akka
 
Implementing Domain Events with Kafka
Implementing Domain Events with KafkaImplementing Domain Events with Kafka
Implementing Domain Events with Kafka
 
APACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka StreamsAPACHE KAFKA / Kafka Connect / Kafka Streams
APACHE KAFKA / Kafka Connect / Kafka Streams
 
Real time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafkaReal time stock processing with apache nifi, apache flink and apache kafka
Real time stock processing with apache nifi, apache flink and apache kafka
 
HDFSネームノードのHAについて #hcj13w
HDFSネームノードのHAについて #hcj13wHDFSネームノードのHAについて #hcj13w
HDFSネームノードのHAについて #hcj13w
 
Scouter와 influx db – grafana 연동 가이드
Scouter와 influx db – grafana 연동 가이드Scouter와 influx db – grafana 연동 가이드
Scouter와 influx db – grafana 연동 가이드
 
Kubernetes design principles, patterns and ecosystem
Kubernetes design principles, patterns and ecosystemKubernetes design principles, patterns and ecosystem
Kubernetes design principles, patterns and ecosystem
 

Similar to Solr Exchange: Introduction to SolrCloud

GIDS2014: SolrCloud: Searching Big Data
GIDS2014: SolrCloud: Searching Big DataGIDS2014: SolrCloud: Searching Big Data
GIDS2014: SolrCloud: Searching Big DataShalin Shekhar Mangar
 
Introduction to SolrCloud
Introduction to SolrCloudIntroduction to SolrCloud
Introduction to SolrCloudVarun Thacker
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scalethelabdude
 
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...Lucidworks
 
Solr Compute Cloud - An Elastic SolrCloud Infrastructure
Solr Compute Cloud - An Elastic SolrCloud Infrastructure Solr Compute Cloud - An Elastic SolrCloud Infrastructure
Solr Compute Cloud - An Elastic SolrCloud Infrastructure Nitin S
 
Solr Lucene Conference 2014 - Nitin Presentation
Solr Lucene Conference 2014 - Nitin PresentationSolr Lucene Conference 2014 - Nitin Presentation
Solr Lucene Conference 2014 - Nitin PresentationNitin Sharma
 
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr PerformanceSFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr PerformanceLucidworks (Archived)
 
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014Shalin Shekhar Mangar
 
Solr Lucene Revolution 2014 - Solr Compute Cloud - Nitin
Solr Lucene Revolution 2014 - Solr Compute Cloud - NitinSolr Lucene Revolution 2014 - Solr Compute Cloud - Nitin
Solr Lucene Revolution 2014 - Solr Compute Cloud - Nitinbloomreacheng
 
Benchmarking Solr Performance
Benchmarking Solr PerformanceBenchmarking Solr Performance
Benchmarking Solr PerformanceLucidworks
 
Scaling SolrCloud to a Large Number of Collections: Presented by Shalin Shekh...
Scaling SolrCloud to a Large Number of Collections: Presented by Shalin Shekh...Scaling SolrCloud to a Large Number of Collections: Presented by Shalin Shekh...
Scaling SolrCloud to a Large Number of Collections: Presented by Shalin Shekh...Lucidworks
 
First oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoyFirst oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoyCominvent AS
 
Deploying and managing SolrCloud in the cloud using the Solr Scale Toolkit
Deploying and managing SolrCloud in the cloud using the Solr Scale ToolkitDeploying and managing SolrCloud in the cloud using the Solr Scale Toolkit
Deploying and managing SolrCloud in the cloud using the Solr Scale Toolkitthelabdude
 
[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex
[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex
[Hic2011] using hadoop lucene-solr-for-large-scale-search by systexJames Chen
 
Deploying and managing Solr at scale
Deploying and managing Solr at scaleDeploying and managing Solr at scale
Deploying and managing Solr at scaleAnshum Gupta
 
Meetup on Apache Zookeeper
Meetup on Apache ZookeeperMeetup on Apache Zookeeper
Meetup on Apache ZookeeperAnshul Patel
 
Real time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache SparkReal time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache SparkRahul Jain
 
Scaling SolrCloud to a large number of Collections
Scaling SolrCloud to a large number of CollectionsScaling SolrCloud to a large number of Collections
Scaling SolrCloud to a large number of CollectionsAnshum Gupta
 
Zookeeper Introduce
Zookeeper IntroduceZookeeper Introduce
Zookeeper Introducejhao niu
 

Similar to Solr Exchange: Introduction to SolrCloud (20)

GIDS2014: SolrCloud: Searching Big Data
GIDS2014: SolrCloud: Searching Big DataGIDS2014: SolrCloud: Searching Big Data
GIDS2014: SolrCloud: Searching Big Data
 
Introduction to SolrCloud
Introduction to SolrCloudIntroduction to SolrCloud
Introduction to SolrCloud
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
 
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...
 
Solr Compute Cloud - An Elastic SolrCloud Infrastructure
Solr Compute Cloud - An Elastic SolrCloud Infrastructure Solr Compute Cloud - An Elastic SolrCloud Infrastructure
Solr Compute Cloud - An Elastic SolrCloud Infrastructure
 
Solr Lucene Conference 2014 - Nitin Presentation
Solr Lucene Conference 2014 - Nitin PresentationSolr Lucene Conference 2014 - Nitin Presentation
Solr Lucene Conference 2014 - Nitin Presentation
 
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr PerformanceSFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
 
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
 
Solr Lucene Revolution 2014 - Solr Compute Cloud - Nitin
Solr Lucene Revolution 2014 - Solr Compute Cloud - NitinSolr Lucene Revolution 2014 - Solr Compute Cloud - Nitin
Solr Lucene Revolution 2014 - Solr Compute Cloud - Nitin
 
Benchmarking Solr Performance
Benchmarking Solr PerformanceBenchmarking Solr Performance
Benchmarking Solr Performance
 
Scaling search with SolrCloud
Scaling search with SolrCloudScaling search with SolrCloud
Scaling search with SolrCloud
 
Scaling SolrCloud to a Large Number of Collections: Presented by Shalin Shekh...
Scaling SolrCloud to a Large Number of Collections: Presented by Shalin Shekh...Scaling SolrCloud to a Large Number of Collections: Presented by Shalin Shekh...
Scaling SolrCloud to a Large Number of Collections: Presented by Shalin Shekh...
 
First oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoyFirst oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoy
 
Deploying and managing SolrCloud in the cloud using the Solr Scale Toolkit
Deploying and managing SolrCloud in the cloud using the Solr Scale ToolkitDeploying and managing SolrCloud in the cloud using the Solr Scale Toolkit
Deploying and managing SolrCloud in the cloud using the Solr Scale Toolkit
 
[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex
[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex
[Hic2011] using hadoop lucene-solr-for-large-scale-search by systex
 
Deploying and managing Solr at scale
Deploying and managing Solr at scaleDeploying and managing Solr at scale
Deploying and managing Solr at scale
 
Meetup on Apache Zookeeper
Meetup on Apache ZookeeperMeetup on Apache Zookeeper
Meetup on Apache Zookeeper
 
Real time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache SparkReal time Analytics with Apache Kafka and Apache Spark
Real time Analytics with Apache Kafka and Apache Spark
 
Scaling SolrCloud to a large number of Collections
Scaling SolrCloud to a large number of CollectionsScaling SolrCloud to a large number of Collections
Scaling SolrCloud to a large number of Collections
 
Zookeeper Introduce
Zookeeper IntroduceZookeeper Introduce
Zookeeper Introduce
 

More from thelabdude

Running Solr in the Cloud at Memory Speed with Alluxio
Running Solr in the Cloud at Memory Speed with AlluxioRunning Solr in the Cloud at Memory Speed with Alluxio
Running Solr in the Cloud at Memory Speed with Alluxiothelabdude
 
NYC Lucene/Solr Meetup: Spark / Solr
NYC Lucene/Solr Meetup: Spark / SolrNYC Lucene/Solr Meetup: Spark / Solr
NYC Lucene/Solr Meetup: Spark / Solrthelabdude
 
ApacheCon NA 2015 Spark / Solr Integration
ApacheCon NA 2015 Spark / Solr IntegrationApacheCon NA 2015 Spark / Solr Integration
ApacheCon NA 2015 Spark / Solr Integrationthelabdude
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsthelabdude
 
Scaling Through Partitioning and Shard Splitting in Solr 4
Scaling Through Partitioning and Shard Splitting in Solr 4Scaling Through Partitioning and Shard Splitting in Solr 4
Scaling Through Partitioning and Shard Splitting in Solr 4thelabdude
 
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...thelabdude
 
Boosting Documents in Solr (Lucene Revolution 2011)
Boosting Documents in Solr (Lucene Revolution 2011)Boosting Documents in Solr (Lucene Revolution 2011)
Boosting Documents in Solr (Lucene Revolution 2011)thelabdude
 
Dachis Group Pig Hackday: Pig 202
Dachis Group Pig Hackday: Pig 202Dachis Group Pig Hackday: Pig 202
Dachis Group Pig Hackday: Pig 202thelabdude
 

More from thelabdude (8)

Running Solr in the Cloud at Memory Speed with Alluxio
Running Solr in the Cloud at Memory Speed with AlluxioRunning Solr in the Cloud at Memory Speed with Alluxio
Running Solr in the Cloud at Memory Speed with Alluxio
 
NYC Lucene/Solr Meetup: Spark / Solr
NYC Lucene/Solr Meetup: Spark / SolrNYC Lucene/Solr Meetup: Spark / Solr
NYC Lucene/Solr Meetup: Spark / Solr
 
ApacheCon NA 2015 Spark / Solr Integration
ApacheCon NA 2015 Spark / Solr IntegrationApacheCon NA 2015 Spark / Solr Integration
ApacheCon NA 2015 Spark / Solr Integration
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applications
 
Scaling Through Partitioning and Shard Splitting in Solr 4
Scaling Through Partitioning and Shard Splitting in Solr 4Scaling Through Partitioning and Shard Splitting in Solr 4
Scaling Through Partitioning and Shard Splitting in Solr 4
 
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...
Lucene Revolution 2013 - Scaling Solr Cloud for Large-scale Social Media Anal...
 
Boosting Documents in Solr (Lucene Revolution 2011)
Boosting Documents in Solr (Lucene Revolution 2011)Boosting Documents in Solr (Lucene Revolution 2011)
Boosting Documents in Solr (Lucene Revolution 2011)
 
Dachis Group Pig Hackday: Pig 202
Dachis Group Pig Hackday: Pig 202Dachis Group Pig Hackday: Pig 202
Dachis Group Pig Hackday: Pig 202
 

Recently uploaded

VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Developmentvyaparkranti
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfkalichargn70th171
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...OnePlan Solutions
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecturerahul_net
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...OnePlan Solutions
 
Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024Anthony Dahanne
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogueitservices996
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Rob Geurden
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Angel Borroy López
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprisepreethippts
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalLionel Briand
 
Osi security architecture in network.pptx
Osi security architecture in network.pptxOsi security architecture in network.pptx
Osi security architecture in network.pptxVinzoCenzo
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identityteam-WIBU
 
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...Bert Jan Schrijver
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesVictoriaMetrics
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slidesvaideheekore1
 
VictoriaMetrics Anomaly Detection Updates: Q1 2024
VictoriaMetrics Anomaly Detection Updates: Q1 2024VictoriaMetrics Anomaly Detection Updates: Q1 2024
VictoriaMetrics Anomaly Detection Updates: Q1 2024VictoriaMetrics
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLionel Briand
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfRTS corp
 

Recently uploaded (20)

VK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web DevelopmentVK Business Profile - provides IT solutions and Web Development
VK Business Profile - provides IT solutions and Web Development
 
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdfExploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
Exploring Selenium_Appium Frameworks for Seamless Integration with HeadSpin.pdf
 
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
Tech Tuesday Slides - Introduction to Project Management with OnePlan's Work ...
 
Understanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM ArchitectureUnderstanding Flamingo - DeepMind's VLM Architecture
Understanding Flamingo - DeepMind's VLM Architecture
 
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
Tech Tuesday - Mastering Time Management Unlock the Power of OnePlan's Timesh...
 
Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024Not a Kubernetes fan? The state of PaaS in 2024
Not a Kubernetes fan? The state of PaaS in 2024
 
Ronisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited CatalogueRonisha Informatics Private Limited Catalogue
Ronisha Informatics Private Limited Catalogue
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...
 
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
Alfresco TTL#157 - Troubleshooting Made Easy: Deciphering Alfresco mTLS Confi...
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Odoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 EnterpriseOdoo 14 - eLearning Module In Odoo 14 Enterprise
Odoo 14 - eLearning Module In Odoo 14 Enterprise
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
Osi security architecture in network.pptx
Osi security architecture in network.pptxOsi security architecture in network.pptx
Osi security architecture in network.pptx
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identity
 
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
JavaLand 2024 - Going serverless with Quarkus GraalVM native images and AWS L...
 
What’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 UpdatesWhat’s New in VictoriaMetrics: Q1 2024 Updates
What’s New in VictoriaMetrics: Q1 2024 Updates
 
Introduction to Firebase Workshop Slides
Introduction to Firebase Workshop SlidesIntroduction to Firebase Workshop Slides
Introduction to Firebase Workshop Slides
 
VictoriaMetrics Anomaly Detection Updates: Q1 2024
VictoriaMetrics Anomaly Detection Updates: Q1 2024VictoriaMetrics Anomaly Detection Updates: Q1 2024
VictoriaMetrics Anomaly Detection Updates: Q1 2024
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdfEnhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
Enhancing Supply Chain Visibility with Cargo Cloud Solutions.pdf
 

Solr Exchange: Introduction to SolrCloud

  • 2.
  • 3. My SolrCloud Experience • Solr Committer; currently working on hardening SolrCloud • Operated 36 node cluster in AWS for Dachis Group (1.5 years ago, 18 shards ~900M docs) • Built a Fabric/boto framework for deploying and managing a cluster in the cloud – https://github.com/LucidWorks/solr-scale-tk • Co-author of Solr In Action; wrote CH 13 which covers SolrCloud
  • 4. What is SolrCloud? Subset of optional features in Solr to enable and simplify horizontal scaling a search index using sharding and replication. Goals performance, scalability, high-availability, simplicity, and elasticity
  • 5. Terminology • ZooKeeper: Distributed coordination service that provides centralized configuration, cluster state management, and leader election • Node: JVM process bound to a specific port on a machine; hosts the Solr web application • Collection: Search index distributed across multiple nodes; each collection has a name, shard count, and replication factor • Replication Factor: Number of copies of a document in a collection • Shard: Logical slice of a collection; each shard has a name, hash range, leader, and replication factor. Documents are assigned to one and only one shard per collection using a hash-based document routing strategy. • Replica: Solr index that hosts a copy of a shard in a collection; behind the scenes, each replica is implemented as a Solr core • Leader: Replica in a shard that assumes special duties needed to support distributed indexing in Solr; each shard has one and only one leader at any time and leaders are elected using ZooKeeper
  • 6. SolrCloud High-level Architecture Java VM (J2SE v. 7) Jetty (node 1) on port: 8984 Solr Web app collection shard1 - Leader Jetty (node 3) on port: 8984 collection shard1 - Replica Java VM (J2SE v. 7) Solr Web app Java VM (J2SE v. 7) Jetty (node 2) on port: 8985 Solr Web app collection shard2 - Leader Jetty (node 4) on port: 8985 collection shard2 - Replica Java VM (J2SE v. 7) Solr Web app Zookeeper1 Zookeeper2 Zookeeper3 ZooKeeper Ensemble Leader Election Replication Replication Sharding Centralized Configuration Management REST Web Services XML / JSON / HTTP Millions of Documents Millions of Users Server 1 Server 2 Load Balancer
  • 7. Collection == Distributed Index A collection is a distributed index defined by: – named configuration stored in ZooKeeper – number of shards: documents are distributed across N partitions of the index – document routing strategy: how documents get assigned to shards – replication factor: how many copies of each document in the collection Collections API: curl "http://localhost:8983/solr/admin/collections? action=CREATE&name=logstash4solr&replicationFactor=2& numShards=2&collection.configName=logs"
  • 8. Demo 1. Start-up bootstrap node with embedded ZooKeeper 2. Add another shard 3. Add some replicas 4. Index some docs 5. Distributed queries 6. Knock-over a node, see cluster stay operational
  • 9. Sharding • Collection has a fixed number of shards – existing shards can be split • When to shard? – Large number of docs – Large document sizes – Parallelization during indexing and queries – Data partitioning (custom hashing)
  • 10. Document Routing • Each shard covers a hash-range • Default: Hash ID into 32-bit integer, map to range – leads to balanced (roughly) shards • Custom-hashing (example in a few slides) • Tri-level: app!user!doc • Implicit: no hash-range set for shards
  • 11. Replication • Why replicate? – High-availability – Load balancing • How does it work in SolrCloud? – Near-real-time, not master-slave – Leader forwards to replicas in parallel, waits for response – Error handling during indexing is tricky
  • 12. Distributed Indexing View of cluster state from Zk Shard 1 Leader Node 1 Node 2 Shard 2 Replica Shard 2 Leader Shard 1 Replica Zookeeper CloudSolrServer “smart client” 1 2 4 tlogtlog Get URLs of current leaders? 35 shard1 range: 80000000-ffffffff shard2 range: 0-7fffffff 1. Get cluster state from ZK 2. Route document directly to leader (hash on doc ID) 3. Persist document on durable storage (tlog) 4. Forward to healthy replicas 5. Acknowledge write succeed to client
  • 13. Shard Leader • Additional responsibilities during indexing only! Not a master node • Leader is a replica (handles queries) • Accepts update requests for the shard • Increments the _version_ on the new or updated doc • Sends updates (in parallel) to all replicas
  • 14. Distributed Queries View of cluster state from Zk Shard 1 Leader Node 1 Node 2 Shard 2 Leader Shard 2 Replica Shard 1 Replica Zookeeper CloudSolrServer 3 q=*:* Get URLs of all live nodes 4 2 Query controller Or just a load balancer works too get fields 1 1. Query client can be ZK aware or just query thru a load balancer 2. Client can send query to any node in the cluster 3. Controller node distributes the query to a replica for each shard to identify documents matching query 4. Controller node sorts the results from step 3 and issues a second query for all fields for a page of results
  • 15. Scalability / Stability Highlights • All nodes in cluster perform indexing and execute queries; no master node • Distributed indexing: No SPoF, high throughput via direct updates to leaders, automated failover to new leader • Distributed queries: Add replicas to scale-out qps; parallelize complex query computations; fault-tolerance • Indexing / queries continue so long as there is 1 healthy replica per shard
  • 16. SolrCloud and CAP • A distributed system should be: Consistent, Available, and Partition tolerant – CAP says pick 2 of the 3! (slightly more nuanced than that in reality) • SolrCloud favors consistency over write-availability (CP) – All replicas in a shard have the same data – Active replica sets concept (writes accepted so long as a shard has at least one active replica available) • No tools to detect or fix consistency issues in Solr – Reads go to one replica; no concept of quorum – Writes must fail if consistency cannot be guaranteed (SOLR-5468)
  • 17. ZooKeeper • Is a very good thing ... clusters are a zoo! • Centralized configuration management • Cluster state management • Leader election (shard leader and overseer) • Overseer distributed work queue • Live Nodes – Ephemeral znodes used to signal a server is gone • Needs 3 nodes for quorum in production
  • 18. ZooKeeper: Centralized Configuration • Store config files in ZooKeeper • Solr nodes pull config during core initialization • Config sets can be “shared” across collections • Changes are uploaded to ZK and then collections should be reloaded
  • 19. ZooKeeper: State management • Keep track of live nodes /live_nodes znode – ephemeral nodes – ZooKeeper client timeout • Collection metadata and replica state in /clusterstate.json – Every core has watchers for /live_nodes and /clusterstate.json • Leader election – ZooKeeper sequence number on ephemeral znodes
  • 20. Overseer • What does it do? – Persists collection state change events to ZooKeeper – Controller for Collection API commands – Ordered updates – One per cluster (for all collections); elected using leader election • How does it work? – Asynchronous (pub/sub messaging) – ZooKeeper as distributed queue recipe – Automated failover to a healthy node – Can be assigned to a dedicated node (SOLR- 5476)
  • 21. Collection Aliases Indexing Client 1 Indexing Client 2 Indexing Client N... logstash4solr collection Search Client 1 Search Client 2 Search Client N ... logstash4solr-write collection alias logstash4solr-read collection alias Update requests Query requests logstash4solr collection Queries continue to execute against the logstash4solr collection while the new one is building Use the Collections API to create a new collection named logstash4solr2 and update the logstash4solr-write alias to direct writes to the new collection
  • 22. Custom Hashing { "id" : ”httpd!2", "level_s" : ”ERROR", "lang_s" : "en", ... }, Hash: shardKey!docID shard1 range: 80000000-ffffffff shard2 range: 0-7fffffff Shard 1 Leader Shard 2 Leader • Route documents to specific shards based on a shard key component in the document ID – Send all log messages from the same system to the same shard • Direct queries to specific shards: q=...&_route_=httpd
  • 23. Custom Hashing Highlights • Co-locate documents having a common property in the same shard – e.g. docs having IDs httpd!21 and httpd!33 will be in the same shard • Scale-up the replicas for specific shards to address high query and/or indexing volume from specific apps • Not as much control over the distribution of keys – httpd, mysql, and collectd all in same shard • Can split unbalanced shards when using custom hashing
  • 24. Shard Splitting • Split range in half Shard 1_1 Leader Node 1 Node 2 Shard 2 Leader Shard 2 Replica Shard 1_1 Replica shard1_0 range: 80000000-bfffffff shard2 range: 0-7fffffff Shard 1_0 Leader Shard 1_0 Replica shard1_1 range: c0000000-ffffffff Shard 1 Leader Node 1 Node 2 Shard 2 Leader Shard 2 Replica Shard 1 Replica shard1 range: 80000000-ffffffff shard2 range: 0-7fffffff
  • 25. Other Features / Highlights • Near-Real-Time Search: Documents are visible within a second or so after being indexed • Partial Document Update: Just update the fields you need to change on existing documents • Optimistic Locking: Ensure updates are applied to the correct version of a document • Transaction log: Better recoverability; peer-sync between nodes after hiccups • HTTPS • Use HDFS for storing indexes • Use MapReduce for building index (SOLR-1301)
  • 26. What’s Next? • Constantly hardening existing features – More Chaos monkey tests to cover tricky areas in the code • Large-scale performance testing; 1000’s of collections, 100’s of Solr nodes, billions of documents • Splitting collection state into separate znodes (SOLR-5473) • Collection management UI (SOLR-4388) • Cluster deployment / management tools – My talk tomorrow: http://sched.co/1bsKUMn • Ease of use! – Please contribute to the mailing list, wiki, JIRA
  • 27. Wrap-up / Questions • LucidWorks: http://www.lucidworks.com • Solr Scale Toolkit: https://github.com/LucidWorks/solr-scale-tk • SiLK: http://www.lucidworks.com/lucidworks-silk/ • Solr In Action: http://www.manning.com/grainger/ • Connect: @thelabdude / tim.potter@lucidworks.com

Editor's Notes

  1. Tag cloud showing the major concepts in SolrCloud
  2. Optional: You don’t have to use SolrCloud if you don’t need itHorizontal scaling: add more nodesSharding: split a large index into slices, where each slice contains a subset of the entire document setReplication: add copies of each document in an index to support more queries per second and high-availability
  3. This slide just for reference
  4. Shard Leaders are elected using ZooKeeperLeaders forward documents to replicas in real-timeZooKeeper provides centralized configuration, leader election, and cluster state managementZooKeeper can be clustered into multiple nodes called an “ensemble” and is highly scalable and fault tolerant
  5. Logstash4SolrCome to my other talk if you want to see more SolrClouddev-opsdebug=track, shards.info
  6. Parallelize during indexing and query executionData partitioning
  7. http://searchhub.org/2014/01/06/10590/
  8. Near-real-time
  9. TODO: mention streamingTODO: Better diagramTODO: Better coverage of TLOGEach shard covers a unique hash rangeShard leader applies document versioning to support optimistic locking and directs update requests to healthy replicas.CloudSolrServer supports high-throughput indexing by sending batches of documents in parallel directly to shard leaders.CloudSolrServer is a “smart client” in that it queries ZooKeeper for cluster state (and watches for cluster state changes).If you provide a batch of 100 documents to CloudSolrServer, it will break the batch up into sub-batches for each shard and then send the sub-batches in parallel directly to the shard leaders.
  10. TODO: this slide is weakAutomated failoverWhy do we need a leader?
  11. Better diagram
  12. TODO: Work this slide into others
  13. Why consistency?What does that require? ie. what do I give up?How does this affect me in reality?
  14. Mention other uses of Collection Aliases tooShard aliases (future)
  15. Allows you to target queries to specific shards (when that makes sense)Non-distributed queries
  16. Split range by _route_ (SOLR-5308, 5338)Mention over-sharding (diagram maybe)TODO: Animate and show moving to another nodeShow the Collections API command