SlideShare a Scribd company logo
1 of 21
Search | Discover | Analyze
Confidential and Proprietary © Copyright 2013
Deploying and Managing
SolrCloud in the Cloud
ApacheCon, April 8, 2014
Timothy Potter
Confidential and Proprietary © Copyright 2013
My SolrCloud Experience
• Currently, working on scaling up to a 200+ node deployment at
LucidWorks
• Operated 36 node cluster in AWS for Dachis Group (1.5 years ago, 18
shards ~900M docs)
• Contributed several tests and patches to the code base
• Built a Fabric/boto framework for deploying and managing a cluster in
EC2
• Co-author of Solr In Action; wrote CH 13 which covers SolrCloud
Confidential and Proprietary © Copyright 2013
Solr Scaling Toolkit
• Requirements
• High-level overview
• Nuts and Bolts (live demo)
• Roadmap
• Q&A
Confidential and Proprietary © Copyright 2013
• Provisioning N machine instances in EC2
• Configuring / starting ZooKeeper (1 to n servers)
• Configuring / starting N Solr instances in cloud
mode (M x N nodes)
• Integrating with Logstash4Solr and other supporting
services, e.g. collectd
• Day-to-day operations on an existing cluster
Tasks to Automate
Confidential and Proprietary © Copyright 2013
Python-based Tools
boto – Python API for AWS (EC2, S3, etc)
Fabric – Python-based tool for automating system admin tasks
over SSH
pysolr – Python library for Solr (sending commits, queries, ...)
kazoo – Python client tools for ZooKeeper
Supporting Cast:
JMeter – run tests, generate reports
collectd – system monitoring
Logstash4Solr – log aggregation
JConsole/VisualVM – monitor JVM during indexing / queries
Confidential and Proprietary © Copyright 2013
Fabric in 3 minutes or Less ...
Fabric helps you do common system administration tasks on
multiple hosts over SSH ...
• Just Python
• Easy to install / learn; good documentation
• http://docs.fabfile.org/en/1.8/
def kill(cluster):
ec2 = _connect_ec2()
taggedInstances = _find_instances_in_cluster(ec2, cluster)
instance_ids = taggedInstances.keys()
if confirm(('Found %d instances to terminate, continue? '
% len(instance_ids))):
ec2.terminate_instances(instance_ids)
ec2.close()
Confidential and Proprietary © Copyright 2013
Fabric in 3 minutes or Less, cont. ...
• Define all commands in a file named: fabfile.py
• Get a list of supported commands with short description
• Get extended documentation for a command
$ fab -l
Available commands:
backup_to_s3 Backup an existing collection to S3
check_zk Performs health check against all ...
commit Sends a hard commit to the ...
...
$ fab -d new_solr_cloud
Displaying detailed information for task 'new_solrcloud’:
Provisions n EC2 instances and then deploys SolrCloud; uses
the new_ec2_instances and setup_solrcloud commands ...
Confidential and Proprietary © Copyright 2013
Meta Node
SiLK
SolrCloud Nodes (NxM nodes)
Node 1: Custom AMI
...
...
Solr Node 1: 8983
...
core
core
Solr Node N: 898x
...core
core
M of these machines
system monitoring
of M machines w/
collectd and JMX
deploy and manage
SolrCloud cluster
Solr-Scale-Toolkit
ZooKeeper-1
ZK Host 1
ZooKeeper-N
ZK Host N
ZooKeeper Ensemble
...
Solr Scale Toolkit: Architecture
Confidential and Proprietary © Copyright 2013
Solr Scale Toolkit: Demo
• Launch a meta node
– Log agg / basic monitoring using SiLK
• Launch ZooKeeper Ensemble
– 3 nodes to establish quorum
– Setup cron job to clean-up snapshots
• Launch SolrCloud cluster
• Create new collection and index some docs
– Attach JConsole while indexing
• Run a healthcheck on the collection
• Checkout Banana Dashboard
• Backup / Restore
– Requires patch for SOLR-5956
– Use fab patch_jars to update jars and do a rolling restart
Confidential and Proprietary © Copyright 2013
• Custom built AMI?
• Block device mapping
– dedicated disk per Solr node
• Launch and then poll status until they are live
– verify SSH connectivity
• Tag each instance with a cluster ID and username
Provisioning machines
fab new_ec2_instances:test1,n=3,instance_type=m3.xlarge
Confidential and Proprietary © Copyright 2013
• Two options:
– provision 1 to N nodes when you launch Solr cluster
– use existing named ensemble
• Fabric command simply creates the myid
files and zoo.cfg file for the ensemble
– and some cron scripts for managing snapshots
• Basic health checking of ZooKeeper status:
– echo srvr | nc localhost 2181
ZooKeeper
fab new_zk_ensemble:zk1,n=3
Confidential and Proprietary © Copyright 2013
SolrCloud Cluster: NxM nodes
EC2 Instance: RedHat Enterprise Linux, 64-bit
Solr 4.7.1 Node 1
MMapDirectory
dedicated
disk 1
Limit to 50-100M docs across all cores per node
Solr 4.7.1 Node N
MMapDirectory
...
dedicated
disk N
... x M
instances
OS
cache
memory
mapped
I/O
collection1
shard1 / replica1
(Solr core)
...
collection2
shard2 / replica1
(Solr core)
collection3
shard1 / replica1
(Solr core)
...
collection1
shard2 / replica1
(Solr core)
...
Must design to give bulk of
the memory to OS cache
Confidential and Proprietary © Copyright 2013
• Upload a BASH script that starts/stops Solr
• Set system props: jetty.port, host, zkHost, JVM
opts
• One or more Solr nodes per machine
• JVM mem opts dependent on instance type and
# of Solr nodes per instance
• Optionally configure log4j.properties to append
messages to Rabbitmq for Logstash4Solr
integration
SolrCloud
fab new_solrcloud:test1,zk=zk1,nodesPerHost=2
Confidential and Proprietary © Copyright 2013
• BASH script that implements:
– start/stop Solr nodes on each EC2 instance
– sets JVM memory options, system properties
(jetty.port), enable remote JMX, etc
– backup log files before restarting nodes
– ensure JVM is killed correctly before restarting
• Environment variables in:
solr-ctl-env.sh
solr-ctl.sh
Confidential and Proprietary © Copyright 2013
• Deploy a configuration directory to ZooKeeper
• Create a new collection
• Attach a local JConsole/VisualVM to a remote JVM
• Rolling restart (with Overseer awareness)
• Build Solr locally and patch remote
– Use a relay server to scp the JARs to Amazon network once and then
scp them to other nodes from within the network
• Put/get files
• Grep over all log files (across the cluster)
Miscellaneous Utility Tasks
Confidential and Proprietary © Copyright 2013
• fab mine: See clusters I’m running (or for other users too)
• fab kill_mine: Terminate all instances I’m running
– Use termination protection in production
• fab ssh_to: Quick way to SSH to one of the nodes in a
cluster
• fab stop/recover/kill: Basic commands for controlling
specific Solr nodes in the cluster
• fab jmeter: Execute a JMeter test plan against your cluster
– Example test plan and Java sampler is included with the source
Other useful stuff ...
Confidential and Proprietary © Copyright 2013
• Java-based command-line application that uses SolrJ’s
CloudSolrServer to perform advanced cluster
management operations:
– healthcheck: collect metadata and health information from all
replicas for a collection from ZooKeeper
– backup: create a snapshot of each shard in a collection for
backing up to remote storage (S3)
• Framework for building complex tools that benefit from
having access to cluster state information in ZooKeeper
SolrCloud Tools (SolrJ client app)
./tools.sh –tool healthcheck
Confidential and Proprietary © Copyright 2013
SiLK Integration
• SiLK: Solr integrated with Logstash and Kibana
– Index time-series data, such as log data (collectd, Solr logs, ...)
– Build cool dashboards with Banana (fork of Kibana)
• Easily aggregate all WARN and more severe log
messages from all Solr servers into logstash4solr
• Send collectd metrics to logstash4solr
Confidential and Proprietary © Copyright 2013
SiLK Integration
Solr Node 1: 8983
...
core
core
AMQP
Log4J
Appender
logstash4solr
logstash4solr
index
parsing/
indexing
decouple
log write
performance
from log
indexing
Ad hoc log
analysis
Solr Node N: 8983
...
core
core
...
many of these
Log Records Include:
- host:port
- collection
- shard
- test label
+ standard Log4J message fields
MQ
banana
Dashboard
Confidential and Proprietary © Copyright 2013
What’s Next?
• Migrate to using Apache libcloud instead of using boto
directly
• Use this framework to perform large-scale performance
testing
– Report results back to community
• Ability to request spot instances
– Good for testing only
• Chaos monkey tests
– integrate jepsen?
• Open source so please kick the tires!
Confidential and Proprietary © Copyright 2013
Wrap-up
• Solr Scale Toolkit: https://github.com/LucidWorks/solr-scale-tk
• LucidWorks: http://www.lucidworks.com
• SiLK: http://www.lucidworks.com/lucidworks-silk/
• Solr In Action: http://www.manning.com/grainger/
• Connect: @thelabdude / tim.potter@lucidworks.com
Questions?

More Related Content

What's hot

Solrcloud Leader Election
Solrcloud Leader ElectionSolrcloud Leader Election
Solrcloud Leader Electionravikgiitk
 
Call me maybe: Jepsen and flaky networks
Call me maybe: Jepsen and flaky networksCall me maybe: Jepsen and flaky networks
Call me maybe: Jepsen and flaky networksShalin Shekhar Mangar
 
NYC Lucene/Solr Meetup: Spark / Solr
NYC Lucene/Solr Meetup: Spark / SolrNYC Lucene/Solr Meetup: Spark / Solr
NYC Lucene/Solr Meetup: Spark / Solrthelabdude
 
What's New on AWS and What it Means to You
What's New on AWS and What it Means to YouWhat's New on AWS and What it Means to You
What's New on AWS and What it Means to YouAmazon Web Services
 
SolrCloud Failover and Testing
SolrCloud Failover and TestingSolrCloud Failover and Testing
SolrCloud Failover and TestingMark Miller
 
Scaling SolrCloud to a large number of Collections
Scaling SolrCloud to a large number of CollectionsScaling SolrCloud to a large number of Collections
Scaling SolrCloud to a large number of CollectionsAnshum Gupta
 
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr PerformanceSFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr PerformanceLucidworks (Archived)
 
GIDS2014: SolrCloud: Searching Big Data
GIDS2014: SolrCloud: Searching Big DataGIDS2014: SolrCloud: Searching Big Data
GIDS2014: SolrCloud: Searching Big DataShalin Shekhar Mangar
 
Cross Datacenter Replication in Apache Solr 6
Cross Datacenter Replication in Apache Solr 6Cross Datacenter Replication in Apache Solr 6
Cross Datacenter Replication in Apache Solr 6Shalin Shekhar Mangar
 
First oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoyFirst oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoyCominvent AS
 
Best practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloudBest practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloudAnshum Gupta
 
What's new in Solr 5.0
What's new in Solr 5.0What's new in Solr 5.0
What's new in Solr 5.0Anshum Gupta
 
Leveraging the Power of Solr with Spark: Presented by Johannes Weigend, QAware
Leveraging the Power of Solr with Spark: Presented by Johannes Weigend, QAwareLeveraging the Power of Solr with Spark: Presented by Johannes Weigend, QAware
Leveraging the Power of Solr with Spark: Presented by Johannes Weigend, QAwareLucidworks
 
Meetup on Apache Zookeeper
Meetup on Apache ZookeeperMeetup on Apache Zookeeper
Meetup on Apache ZookeeperAnshul Patel
 
Building and Running Solr-as-a-Service: Presented by Shai Erera, IBM
Building and Running Solr-as-a-Service: Presented by Shai Erera, IBMBuilding and Running Solr-as-a-Service: Presented by Shai Erera, IBM
Building and Running Solr-as-a-Service: Presented by Shai Erera, IBMLucidworks
 
Rebalance API for SolrCloud: Presented by Nitin Sharma, Netflix & Suruchi Sha...
Rebalance API for SolrCloud: Presented by Nitin Sharma, Netflix & Suruchi Sha...Rebalance API for SolrCloud: Presented by Nitin Sharma, Netflix & Suruchi Sha...
Rebalance API for SolrCloud: Presented by Nitin Sharma, Netflix & Suruchi Sha...Lucidworks
 
High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...
High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...
High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...Lucidworks
 
Distributed Applications with Apache Zookeeper
Distributed Applications with Apache ZookeeperDistributed Applications with Apache Zookeeper
Distributed Applications with Apache ZookeeperAlex Ehrnschwender
 
Running Solr at Memory Speed with Alluxio - Timothy Potter, Lucidworks
Running Solr at Memory Speed with Alluxio - Timothy Potter, LucidworksRunning Solr at Memory Speed with Alluxio - Timothy Potter, Lucidworks
Running Solr at Memory Speed with Alluxio - Timothy Potter, LucidworksLucidworks
 

What's hot (20)

Solrcloud Leader Election
Solrcloud Leader ElectionSolrcloud Leader Election
Solrcloud Leader Election
 
Call me maybe: Jepsen and flaky networks
Call me maybe: Jepsen and flaky networksCall me maybe: Jepsen and flaky networks
Call me maybe: Jepsen and flaky networks
 
NYC Lucene/Solr Meetup: Spark / Solr
NYC Lucene/Solr Meetup: Spark / SolrNYC Lucene/Solr Meetup: Spark / Solr
NYC Lucene/Solr Meetup: Spark / Solr
 
What's New on AWS and What it Means to You
What's New on AWS and What it Means to YouWhat's New on AWS and What it Means to You
What's New on AWS and What it Means to You
 
SolrCloud Failover and Testing
SolrCloud Failover and TestingSolrCloud Failover and Testing
SolrCloud Failover and Testing
 
Scaling SolrCloud to a large number of Collections
Scaling SolrCloud to a large number of CollectionsScaling SolrCloud to a large number of Collections
Scaling SolrCloud to a large number of Collections
 
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr PerformanceSFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
SFBay Area Solr Meetup - June 18th: Benchmarking Solr Performance
 
GIDS2014: SolrCloud: Searching Big Data
GIDS2014: SolrCloud: Searching Big DataGIDS2014: SolrCloud: Searching Big Data
GIDS2014: SolrCloud: Searching Big Data
 
Cross Datacenter Replication in Apache Solr 6
Cross Datacenter Replication in Apache Solr 6Cross Datacenter Replication in Apache Solr 6
Cross Datacenter Replication in Apache Solr 6
 
SolrCloud on Hadoop
SolrCloud on HadoopSolrCloud on Hadoop
SolrCloud on Hadoop
 
First oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoyFirst oslo solr community meetup lightning talk janhoy
First oslo solr community meetup lightning talk janhoy
 
Best practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloudBest practices for highly available and large scale SolrCloud
Best practices for highly available and large scale SolrCloud
 
What's new in Solr 5.0
What's new in Solr 5.0What's new in Solr 5.0
What's new in Solr 5.0
 
Leveraging the Power of Solr with Spark: Presented by Johannes Weigend, QAware
Leveraging the Power of Solr with Spark: Presented by Johannes Weigend, QAwareLeveraging the Power of Solr with Spark: Presented by Johannes Weigend, QAware
Leveraging the Power of Solr with Spark: Presented by Johannes Weigend, QAware
 
Meetup on Apache Zookeeper
Meetup on Apache ZookeeperMeetup on Apache Zookeeper
Meetup on Apache Zookeeper
 
Building and Running Solr-as-a-Service: Presented by Shai Erera, IBM
Building and Running Solr-as-a-Service: Presented by Shai Erera, IBMBuilding and Running Solr-as-a-Service: Presented by Shai Erera, IBM
Building and Running Solr-as-a-Service: Presented by Shai Erera, IBM
 
Rebalance API for SolrCloud: Presented by Nitin Sharma, Netflix & Suruchi Sha...
Rebalance API for SolrCloud: Presented by Nitin Sharma, Netflix & Suruchi Sha...Rebalance API for SolrCloud: Presented by Nitin Sharma, Netflix & Suruchi Sha...
Rebalance API for SolrCloud: Presented by Nitin Sharma, Netflix & Suruchi Sha...
 
High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...
High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...
High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...
 
Distributed Applications with Apache Zookeeper
Distributed Applications with Apache ZookeeperDistributed Applications with Apache Zookeeper
Distributed Applications with Apache Zookeeper
 
Running Solr at Memory Speed with Alluxio - Timothy Potter, Lucidworks
Running Solr at Memory Speed with Alluxio - Timothy Potter, LucidworksRunning Solr at Memory Speed with Alluxio - Timothy Potter, Lucidworks
Running Solr at Memory Speed with Alluxio - Timothy Potter, Lucidworks
 

Viewers also liked

Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scalethelabdude
 
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014Shalin Shekhar Mangar
 
Boosting Documents in Solr (Lucene Revolution 2011)
Boosting Documents in Solr (Lucene Revolution 2011)Boosting Documents in Solr (Lucene Revolution 2011)
Boosting Documents in Solr (Lucene Revolution 2011)thelabdude
 
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...rhatr
 
Faceting optimizations for Solr
Faceting optimizations for SolrFaceting optimizations for Solr
Faceting optimizations for SolrToke Eskildsen
 
ApacheCon NA 2015 Spark / Solr Integration
ApacheCon NA 2015 Spark / Solr IntegrationApacheCon NA 2015 Spark / Solr Integration
ApacheCon NA 2015 Spark / Solr Integrationthelabdude
 
Faceting Optimizations for Solr: Presented by Toke Eskildsen, State & Univers...
Faceting Optimizations for Solr: Presented by Toke Eskildsen, State & Univers...Faceting Optimizations for Solr: Presented by Toke Eskildsen, State & Univers...
Faceting Optimizations for Solr: Presented by Toke Eskildsen, State & Univers...Lucidworks
 
Top apache solr features
Top apache solr featuresTop apache solr features
Top apache solr featuresIntellipaat
 
Solr: 4 big features
Solr: 4 big featuresSolr: 4 big features
Solr: 4 big featuresDavid Smiley
 
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...Lucidworks
 
Lessons From Sharding Solr At Etsy: Presented by Gregg Donovan, Etsy
Lessons From Sharding Solr At Etsy: Presented by Gregg Donovan, EtsyLessons From Sharding Solr At Etsy: Presented by Gregg Donovan, Etsy
Lessons From Sharding Solr At Etsy: Presented by Gregg Donovan, EtsyLucidworks
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersSematext Group, Inc.
 
How SolrCloud Changes the User Experience In a Sharded Environment
How SolrCloud Changes the User Experience In a Sharded EnvironmentHow SolrCloud Changes the User Experience In a Sharded Environment
How SolrCloud Changes the User Experience In a Sharded Environmentlucenerevolution
 
SolrCloud - High Availability and Fault Tolerance: Presented by Mark Miller, ...
SolrCloud - High Availability and Fault Tolerance: Presented by Mark Miller, ...SolrCloud - High Availability and Fault Tolerance: Presented by Mark Miller, ...
SolrCloud - High Availability and Fault Tolerance: Presented by Mark Miller, ...Lucidworks
 
Inside Solr 5 - Bangalore Solr/Lucene Meetup
Inside Solr 5 - Bangalore Solr/Lucene MeetupInside Solr 5 - Bangalore Solr/Lucene Meetup
Inside Solr 5 - Bangalore Solr/Lucene MeetupShalin Shekhar Mangar
 
Benchmarking Solr Performance
Benchmarking Solr PerformanceBenchmarking Solr Performance
Benchmarking Solr PerformanceLucidworks
 
Why Is My Solr Slow?: Presented by Mike Drob, Cloudera
Why Is My Solr Slow?: Presented by Mike Drob, ClouderaWhy Is My Solr Slow?: Presented by Mike Drob, Cloudera
Why Is My Solr Slow?: Presented by Mike Drob, ClouderaLucidworks
 

Viewers also liked (20)

Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
 
Apache SolrCloud
Apache SolrCloudApache SolrCloud
Apache SolrCloud
 
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
Scaling SolrCloud to a Large Number of Collections - Fifth Elephant 2014
 
Boosting Documents in Solr (Lucene Revolution 2011)
Boosting Documents in Solr (Lucene Revolution 2011)Boosting Documents in Solr (Lucene Revolution 2011)
Boosting Documents in Solr (Lucene Revolution 2011)
 
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...
Building Google-in-a-box: using Apache SolrCloud and Bigtop to index your big...
 
Faceting optimizations for Solr
Faceting optimizations for SolrFaceting optimizations for Solr
Faceting optimizations for Solr
 
ApacheCon NA 2015 Spark / Solr Integration
ApacheCon NA 2015 Spark / Solr IntegrationApacheCon NA 2015 Spark / Solr Integration
ApacheCon NA 2015 Spark / Solr Integration
 
Faceting Optimizations for Solr: Presented by Toke Eskildsen, State & Univers...
Faceting Optimizations for Solr: Presented by Toke Eskildsen, State & Univers...Faceting Optimizations for Solr: Presented by Toke Eskildsen, State & Univers...
Faceting Optimizations for Solr: Presented by Toke Eskildsen, State & Univers...
 
Top apache solr features
Top apache solr featuresTop apache solr features
Top apache solr features
 
Solr: 4 big features
Solr: 4 big featuresSolr: 4 big features
Solr: 4 big features
 
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...
Solr Compute Cloud – An Elastic Solr Infrastructure: Presented by Nitin Sharm...
 
Lessons From Sharding Solr At Etsy: Presented by Gregg Donovan, Etsy
Lessons From Sharding Solr At Etsy: Presented by Gregg Donovan, EtsyLessons From Sharding Solr At Etsy: Presented by Gregg Donovan, Etsy
Lessons From Sharding Solr At Etsy: Presented by Gregg Donovan, Etsy
 
Administering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud ClustersAdministering and Monitoring SolrCloud Clusters
Administering and Monitoring SolrCloud Clusters
 
How SolrCloud Changes the User Experience In a Sharded Environment
How SolrCloud Changes the User Experience In a Sharded EnvironmentHow SolrCloud Changes the User Experience In a Sharded Environment
How SolrCloud Changes the User Experience In a Sharded Environment
 
SolrCloud - High Availability and Fault Tolerance: Presented by Mark Miller, ...
SolrCloud - High Availability and Fault Tolerance: Presented by Mark Miller, ...SolrCloud - High Availability and Fault Tolerance: Presented by Mark Miller, ...
SolrCloud - High Availability and Fault Tolerance: Presented by Mark Miller, ...
 
Inside Solr 5 - Bangalore Solr/Lucene Meetup
Inside Solr 5 - Bangalore Solr/Lucene MeetupInside Solr 5 - Bangalore Solr/Lucene Meetup
Inside Solr 5 - Bangalore Solr/Lucene Meetup
 
Benchmarking Solr Performance
Benchmarking Solr PerformanceBenchmarking Solr Performance
Benchmarking Solr Performance
 
SolrCloud and Shard Splitting
SolrCloud and Shard SplittingSolrCloud and Shard Splitting
SolrCloud and Shard Splitting
 
Why Is My Solr Slow?: Presented by Mike Drob, Cloudera
Why Is My Solr Slow?: Presented by Mike Drob, ClouderaWhy Is My Solr Slow?: Presented by Mike Drob, Cloudera
Why Is My Solr Slow?: Presented by Mike Drob, Cloudera
 
Intro to Apache Solr
Intro to Apache SolrIntro to Apache Solr
Intro to Apache Solr
 

Similar to Deploying and managing SolrCloud in the cloud using the Solr Scale Toolkit

Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DCIntro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DCLucidworks (Archived)
 
Docker Security Paradigm
Docker Security ParadigmDocker Security Paradigm
Docker Security ParadigmAnis LARGUEM
 
Codership's galera cluster installation and quickstart webinar march 2016
Codership's galera cluster installation and quickstart webinar march 2016Codership's galera cluster installation and quickstart webinar march 2016
Codership's galera cluster installation and quickstart webinar march 2016Sakari Keskitalo
 
Codership's galera cluster installation and quickstart webinar march 2016
Codership's galera cluster installation and quickstart webinar march 2016Codership's galera cluster installation and quickstart webinar march 2016
Codership's galera cluster installation and quickstart webinar march 2016Sakari Keskitalo
 
Unraveling Docker Security: Lessons From a Production Cloud
Unraveling Docker Security: Lessons From a Production CloudUnraveling Docker Security: Lessons From a Production Cloud
Unraveling Docker Security: Lessons From a Production CloudSalman Baset
 
Tokyo OpenStack Summit 2015: Unraveling Docker Security
Tokyo OpenStack Summit 2015: Unraveling Docker SecurityTokyo OpenStack Summit 2015: Unraveling Docker Security
Tokyo OpenStack Summit 2015: Unraveling Docker SecurityPhil Estes
 
Docker Security Overview
Docker Security OverviewDocker Security Overview
Docker Security OverviewSreenivas Makam
 
Docker Runtime Security
Docker Runtime SecurityDocker Runtime Security
Docker Runtime SecuritySysdig
 
OpenStack hands-on (All-in-One)
OpenStack hands-on (All-in-One)OpenStack hands-on (All-in-One)
OpenStack hands-on (All-in-One)JeSam Kim
 
Docker Container Security
Docker Container SecurityDocker Container Security
Docker Container SecuritySuraj Khetani
 
Best Practices for Running Kafka on Docker Containers
Best Practices for Running Kafka on Docker ContainersBest Practices for Running Kafka on Docker Containers
Best Practices for Running Kafka on Docker ContainersBlueData, Inc.
 
Solr Powered Lucene
Solr Powered LuceneSolr Powered Lucene
Solr Powered LuceneErik Hatcher
 
Lightweight Virtualization in Linux
Lightweight Virtualization in LinuxLightweight Virtualization in Linux
Lightweight Virtualization in LinuxSadegh Dorri N.
 
Managing Oracle Enterprise Manager Cloud Control 12c with Oracle Clusterware
Managing Oracle Enterprise Manager Cloud Control 12c with Oracle ClusterwareManaging Oracle Enterprise Manager Cloud Control 12c with Oracle Clusterware
Managing Oracle Enterprise Manager Cloud Control 12c with Oracle ClusterwareLeighton Nelson
 
Continuous Deployment with Akka.Cluster and Kubernetes (Akka.NET)
Continuous Deployment with Akka.Cluster and Kubernetes (Akka.NET)Continuous Deployment with Akka.Cluster and Kubernetes (Akka.NET)
Continuous Deployment with Akka.Cluster and Kubernetes (Akka.NET)petabridge
 
State of Containers in OpenStack
State of Containers in OpenStackState of Containers in OpenStack
State of Containers in OpenStackopenstackindia
 
State of Containers in Openstack
State of Containers in OpenstackState of Containers in Openstack
State of Containers in OpenstackMadhuri Kumari
 
Ippevent : openshift Introduction
Ippevent : openshift IntroductionIppevent : openshift Introduction
Ippevent : openshift Introductionkanedafromparis
 

Similar to Deploying and managing SolrCloud in the cloud using the Solr Scale Toolkit (20)

Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DCIntro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
Intro to Solr Cloud, Presented by Tim Potter at SolrExchage DC
 
Effective Spark on Multi-Tenant Clusters
Effective Spark on Multi-Tenant ClustersEffective Spark on Multi-Tenant Clusters
Effective Spark on Multi-Tenant Clusters
 
Docker Security Paradigm
Docker Security ParadigmDocker Security Paradigm
Docker Security Paradigm
 
Codership's galera cluster installation and quickstart webinar march 2016
Codership's galera cluster installation and quickstart webinar march 2016Codership's galera cluster installation and quickstart webinar march 2016
Codership's galera cluster installation and quickstart webinar march 2016
 
Codership's galera cluster installation and quickstart webinar march 2016
Codership's galera cluster installation and quickstart webinar march 2016Codership's galera cluster installation and quickstart webinar march 2016
Codership's galera cluster installation and quickstart webinar march 2016
 
Codership's galera cluster installation and quickstart webinar march 2016
Codership's galera cluster installation and quickstart webinar march 2016Codership's galera cluster installation and quickstart webinar march 2016
Codership's galera cluster installation and quickstart webinar march 2016
 
Unraveling Docker Security: Lessons From a Production Cloud
Unraveling Docker Security: Lessons From a Production CloudUnraveling Docker Security: Lessons From a Production Cloud
Unraveling Docker Security: Lessons From a Production Cloud
 
Tokyo OpenStack Summit 2015: Unraveling Docker Security
Tokyo OpenStack Summit 2015: Unraveling Docker SecurityTokyo OpenStack Summit 2015: Unraveling Docker Security
Tokyo OpenStack Summit 2015: Unraveling Docker Security
 
Docker Security Overview
Docker Security OverviewDocker Security Overview
Docker Security Overview
 
Docker Runtime Security
Docker Runtime SecurityDocker Runtime Security
Docker Runtime Security
 
OpenStack hands-on (All-in-One)
OpenStack hands-on (All-in-One)OpenStack hands-on (All-in-One)
OpenStack hands-on (All-in-One)
 
Docker Container Security
Docker Container SecurityDocker Container Security
Docker Container Security
 
Best Practices for Running Kafka on Docker Containers
Best Practices for Running Kafka on Docker ContainersBest Practices for Running Kafka on Docker Containers
Best Practices for Running Kafka on Docker Containers
 
Solr Powered Lucene
Solr Powered LuceneSolr Powered Lucene
Solr Powered Lucene
 
Lightweight Virtualization in Linux
Lightweight Virtualization in LinuxLightweight Virtualization in Linux
Lightweight Virtualization in Linux
 
Managing Oracle Enterprise Manager Cloud Control 12c with Oracle Clusterware
Managing Oracle Enterprise Manager Cloud Control 12c with Oracle ClusterwareManaging Oracle Enterprise Manager Cloud Control 12c with Oracle Clusterware
Managing Oracle Enterprise Manager Cloud Control 12c with Oracle Clusterware
 
Continuous Deployment with Akka.Cluster and Kubernetes (Akka.NET)
Continuous Deployment with Akka.Cluster and Kubernetes (Akka.NET)Continuous Deployment with Akka.Cluster and Kubernetes (Akka.NET)
Continuous Deployment with Akka.Cluster and Kubernetes (Akka.NET)
 
State of Containers in OpenStack
State of Containers in OpenStackState of Containers in OpenStack
State of Containers in OpenStack
 
State of Containers in Openstack
State of Containers in OpenstackState of Containers in Openstack
State of Containers in Openstack
 
Ippevent : openshift Introduction
Ippevent : openshift IntroductionIppevent : openshift Introduction
Ippevent : openshift Introduction
 

Recently uploaded

办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一F sss
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxdolaknnilon
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024Susanna-Assunta Sansone
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesTimothy Spann
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Seán Kennedy
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
Business Analytics using Microsoft Excel
Business Analytics using Microsoft ExcelBusiness Analytics using Microsoft Excel
Business Analytics using Microsoft Excelysmaelreyes
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhYasamin16
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Cathrine Wilhelmsen
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Seán Kennedy
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024thyngster
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...Boston Institute of Analytics
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 

Recently uploaded (20)

办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
办理学位证加利福尼亚大学洛杉矶分校毕业证,UCLA成绩单原版一比一
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
FAIR, FAIRsharing, FAIR Cookbook and ELIXIR - Sansone SA - Boston 2024
 
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming PipelinesConf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
Conf42-LLM_Adding Generative AI to Real-Time Streaming Pipelines
 
Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...Student Profile Sample report on improving academic performance by uniting gr...
Student Profile Sample report on improving academic performance by uniting gr...
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
Business Analytics using Microsoft Excel
Business Analytics using Microsoft ExcelBusiness Analytics using Microsoft Excel
Business Analytics using Microsoft Excel
 
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲中央昆士兰大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhhThiophen Mechanism khhjjjjjjjhhhhhhhhhhh
Thiophen Mechanism khhjjjjjjjhhhhhhhhhhh
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)Data Factory in Microsoft Fabric (MsBIP #82)
Data Factory in Microsoft Fabric (MsBIP #82)
 
Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...Student profile product demonstration on grades, ability, well-being and mind...
Student profile product demonstration on grades, ability, well-being and mind...
 
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
Consent & Privacy Signals on Google *Pixels* - MeasureCamp Amsterdam 2024
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
NLP Data Science Project Presentation:Predicting Heart Disease with NLP Data ...
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 

Deploying and managing SolrCloud in the cloud using the Solr Scale Toolkit

  • 1. Search | Discover | Analyze Confidential and Proprietary © Copyright 2013 Deploying and Managing SolrCloud in the Cloud ApacheCon, April 8, 2014 Timothy Potter
  • 2. Confidential and Proprietary © Copyright 2013 My SolrCloud Experience • Currently, working on scaling up to a 200+ node deployment at LucidWorks • Operated 36 node cluster in AWS for Dachis Group (1.5 years ago, 18 shards ~900M docs) • Contributed several tests and patches to the code base • Built a Fabric/boto framework for deploying and managing a cluster in EC2 • Co-author of Solr In Action; wrote CH 13 which covers SolrCloud
  • 3. Confidential and Proprietary © Copyright 2013 Solr Scaling Toolkit • Requirements • High-level overview • Nuts and Bolts (live demo) • Roadmap • Q&A
  • 4. Confidential and Proprietary © Copyright 2013 • Provisioning N machine instances in EC2 • Configuring / starting ZooKeeper (1 to n servers) • Configuring / starting N Solr instances in cloud mode (M x N nodes) • Integrating with Logstash4Solr and other supporting services, e.g. collectd • Day-to-day operations on an existing cluster Tasks to Automate
  • 5. Confidential and Proprietary © Copyright 2013 Python-based Tools boto – Python API for AWS (EC2, S3, etc) Fabric – Python-based tool for automating system admin tasks over SSH pysolr – Python library for Solr (sending commits, queries, ...) kazoo – Python client tools for ZooKeeper Supporting Cast: JMeter – run tests, generate reports collectd – system monitoring Logstash4Solr – log aggregation JConsole/VisualVM – monitor JVM during indexing / queries
  • 6. Confidential and Proprietary © Copyright 2013 Fabric in 3 minutes or Less ... Fabric helps you do common system administration tasks on multiple hosts over SSH ... • Just Python • Easy to install / learn; good documentation • http://docs.fabfile.org/en/1.8/ def kill(cluster): ec2 = _connect_ec2() taggedInstances = _find_instances_in_cluster(ec2, cluster) instance_ids = taggedInstances.keys() if confirm(('Found %d instances to terminate, continue? ' % len(instance_ids))): ec2.terminate_instances(instance_ids) ec2.close()
  • 7. Confidential and Proprietary © Copyright 2013 Fabric in 3 minutes or Less, cont. ... • Define all commands in a file named: fabfile.py • Get a list of supported commands with short description • Get extended documentation for a command $ fab -l Available commands: backup_to_s3 Backup an existing collection to S3 check_zk Performs health check against all ... commit Sends a hard commit to the ... ... $ fab -d new_solr_cloud Displaying detailed information for task 'new_solrcloud’: Provisions n EC2 instances and then deploys SolrCloud; uses the new_ec2_instances and setup_solrcloud commands ...
  • 8. Confidential and Proprietary © Copyright 2013 Meta Node SiLK SolrCloud Nodes (NxM nodes) Node 1: Custom AMI ... ... Solr Node 1: 8983 ... core core Solr Node N: 898x ...core core M of these machines system monitoring of M machines w/ collectd and JMX deploy and manage SolrCloud cluster Solr-Scale-Toolkit ZooKeeper-1 ZK Host 1 ZooKeeper-N ZK Host N ZooKeeper Ensemble ... Solr Scale Toolkit: Architecture
  • 9. Confidential and Proprietary © Copyright 2013 Solr Scale Toolkit: Demo • Launch a meta node – Log agg / basic monitoring using SiLK • Launch ZooKeeper Ensemble – 3 nodes to establish quorum – Setup cron job to clean-up snapshots • Launch SolrCloud cluster • Create new collection and index some docs – Attach JConsole while indexing • Run a healthcheck on the collection • Checkout Banana Dashboard • Backup / Restore – Requires patch for SOLR-5956 – Use fab patch_jars to update jars and do a rolling restart
  • 10. Confidential and Proprietary © Copyright 2013 • Custom built AMI? • Block device mapping – dedicated disk per Solr node • Launch and then poll status until they are live – verify SSH connectivity • Tag each instance with a cluster ID and username Provisioning machines fab new_ec2_instances:test1,n=3,instance_type=m3.xlarge
  • 11. Confidential and Proprietary © Copyright 2013 • Two options: – provision 1 to N nodes when you launch Solr cluster – use existing named ensemble • Fabric command simply creates the myid files and zoo.cfg file for the ensemble – and some cron scripts for managing snapshots • Basic health checking of ZooKeeper status: – echo srvr | nc localhost 2181 ZooKeeper fab new_zk_ensemble:zk1,n=3
  • 12. Confidential and Proprietary © Copyright 2013 SolrCloud Cluster: NxM nodes EC2 Instance: RedHat Enterprise Linux, 64-bit Solr 4.7.1 Node 1 MMapDirectory dedicated disk 1 Limit to 50-100M docs across all cores per node Solr 4.7.1 Node N MMapDirectory ... dedicated disk N ... x M instances OS cache memory mapped I/O collection1 shard1 / replica1 (Solr core) ... collection2 shard2 / replica1 (Solr core) collection3 shard1 / replica1 (Solr core) ... collection1 shard2 / replica1 (Solr core) ... Must design to give bulk of the memory to OS cache
  • 13. Confidential and Proprietary © Copyright 2013 • Upload a BASH script that starts/stops Solr • Set system props: jetty.port, host, zkHost, JVM opts • One or more Solr nodes per machine • JVM mem opts dependent on instance type and # of Solr nodes per instance • Optionally configure log4j.properties to append messages to Rabbitmq for Logstash4Solr integration SolrCloud fab new_solrcloud:test1,zk=zk1,nodesPerHost=2
  • 14. Confidential and Proprietary © Copyright 2013 • BASH script that implements: – start/stop Solr nodes on each EC2 instance – sets JVM memory options, system properties (jetty.port), enable remote JMX, etc – backup log files before restarting nodes – ensure JVM is killed correctly before restarting • Environment variables in: solr-ctl-env.sh solr-ctl.sh
  • 15. Confidential and Proprietary © Copyright 2013 • Deploy a configuration directory to ZooKeeper • Create a new collection • Attach a local JConsole/VisualVM to a remote JVM • Rolling restart (with Overseer awareness) • Build Solr locally and patch remote – Use a relay server to scp the JARs to Amazon network once and then scp them to other nodes from within the network • Put/get files • Grep over all log files (across the cluster) Miscellaneous Utility Tasks
  • 16. Confidential and Proprietary © Copyright 2013 • fab mine: See clusters I’m running (or for other users too) • fab kill_mine: Terminate all instances I’m running – Use termination protection in production • fab ssh_to: Quick way to SSH to one of the nodes in a cluster • fab stop/recover/kill: Basic commands for controlling specific Solr nodes in the cluster • fab jmeter: Execute a JMeter test plan against your cluster – Example test plan and Java sampler is included with the source Other useful stuff ...
  • 17. Confidential and Proprietary © Copyright 2013 • Java-based command-line application that uses SolrJ’s CloudSolrServer to perform advanced cluster management operations: – healthcheck: collect metadata and health information from all replicas for a collection from ZooKeeper – backup: create a snapshot of each shard in a collection for backing up to remote storage (S3) • Framework for building complex tools that benefit from having access to cluster state information in ZooKeeper SolrCloud Tools (SolrJ client app) ./tools.sh –tool healthcheck
  • 18. Confidential and Proprietary © Copyright 2013 SiLK Integration • SiLK: Solr integrated with Logstash and Kibana – Index time-series data, such as log data (collectd, Solr logs, ...) – Build cool dashboards with Banana (fork of Kibana) • Easily aggregate all WARN and more severe log messages from all Solr servers into logstash4solr • Send collectd metrics to logstash4solr
  • 19. Confidential and Proprietary © Copyright 2013 SiLK Integration Solr Node 1: 8983 ... core core AMQP Log4J Appender logstash4solr logstash4solr index parsing/ indexing decouple log write performance from log indexing Ad hoc log analysis Solr Node N: 8983 ... core core ... many of these Log Records Include: - host:port - collection - shard - test label + standard Log4J message fields MQ banana Dashboard
  • 20. Confidential and Proprietary © Copyright 2013 What’s Next? • Migrate to using Apache libcloud instead of using boto directly • Use this framework to perform large-scale performance testing – Report results back to community • Ability to request spot instances – Good for testing only • Chaos monkey tests – integrate jepsen? • Open source so please kick the tires!
  • 21. Confidential and Proprietary © Copyright 2013 Wrap-up • Solr Scale Toolkit: https://github.com/LucidWorks/solr-scale-tk • LucidWorks: http://www.lucidworks.com • SiLK: http://www.lucidworks.com/lucidworks-silk/ • Solr In Action: http://www.manning.com/grainger/ • Connect: @thelabdude / tim.potter@lucidworks.com Questions?