OpenSearch
-Abhi Jain
Agenda
● OpenSearch
○ What is it?
○ Benefits/ Uses
○ How to use it
○ Features
● Migrate from Elastic to OpenSearch
● Tools & Plugins
About Me
● Lead Dev
● Located in Florida
● Trainer
● Presenter
● .NET Developer
● Youtuber: Coach4Dev
● Husband/ Father
Amazon Elasticsearch
● Launched in 2015
● Gained popularity for log analytics usage
● Used open-source Elasticsearch under Apache License v2
● Jan 2021
○ Elastic NV changed licensing strategy
○ After ElasticSearch 7.10.2 & Kibana 7.10.2
■ No longer released under Apache License v2
■ Released under the Elastic License and SSPL instead
OpenSearch
● Sep 2021:
○ Amazon Elasticsearch Service renamed to Amazon OpenSearch Service
● Open-source fork of Elasticsearch 7.10.2 and Kibana 7.10.2
● Highly scalable
● Fast access & response to large volumes of data
● Powered by Apache Lucene Search library
Apache Lucene
● Apache Lucene project develops open-source search software
○ Releases a core search library named Lucene core
● Lucene Core
○ Java Library providing powerful indexing and search features
Apache Solr
● Open source search platform
● Built on Apache Lucene
Solr vs ElasticSearch
● Similar performance mostly.
● ES has better support for scalability
○ due to horizontal scaling
■ Better cloud support too
● ES can support multiple doc types in a single index better
○ More difficult to do this in Solr
● ES supports native DSL (Domain Specific Language)
○ Need to program queries in Solr
● https://mindmajix.com/elasticsearch-vs-solr
Why OpenSearch
● Huge amount of machine generated data these days
○ Growing exponentially
● Getting insights is important
● Interactive log analytics
● Real-time application monitoring
● Website Search, etc.
OpenSearch Features
● Easy to set-up and configure
● In-place upgrades
● Enables data monitoring & setting alerts based on thresholds
● Supports authentication, encryption & compliance requirements
OpenSearch vs ElasticSearch
● OpenSearch was forked from ElasticSearch
○ Now they are separate from each other
● Each is adding features separately
● OpenSearch
○ Inbuilt support from AWS
OpenSearch features not in ES (free version)
● Centralized user accounts / access control
● Cross-cluster replication
● IP filtering
● Configurable retention period
● Anomaly detection
● Tableau connector
● JDBC driver
● ODBC driver
● Machine learning features such as regression and classification
● Link
ElasticSearch Features
● Based on subscription levels
● https://www.elastic.co/subscriptions
OpenSearch & ElasticSearch Version Support
● Amazon OpenSearch Service currently supports the following OpenSearch versions:
○ 1.3, 1.2, 1.1, 1.0
● And supports the following ElasticSearch versions:
○ 7.10, 7.9, 7.8, 7.7, 7.4, 7.1
○ 6.8, 6.7, 6.5, 6.4, 6.3, 6.2, 6.0
○ 5.6, 5.5, 5.3, 5.1
○ 2.3
○ 1.5
What is Kibana
● Free & open front end application
● Charting tool for Elastic Stack
● Sits on top of Elastic Stack
● Sample Dashboard
OpenSearch Dashboards
● Default visualization tool for data in OpenSearch
● Filter data with queries
● Comes with Amazon OpenSearch Service
Terminologies
OpenSearch Cluster
● Synonymous to domain
● Domains are clusters with
○ settings,
○ instance types,
○ instance counts,
○ and storage resources that you specify.
● Group of nodes
○ With same cluster.name attribute
OpenSearch Node
● Member of a cluster
● A distinct host
● With IP address
Getting Started
● Create a domain
● Size the domain appropriately for your workload
● Control access to your domain using a domain access policy or fine-grained
access control
● Index data manually or from other AWS services
● Use OpenSearch Dashboards to search your data and create visualizations
Custom Endpoint
● For an easier-to-read or custom domain name
● Can use HTTPS
○ Upload SSL certificate
Run OpenSearch locally
● Install docker
● wsl -d docker-desktop
● sysctl -w vm.max_map_count=262144
● Ctrl+C
● docker-compose up
● Visit http://localhost:5601/
● Use admin/admin to login and explore
● Link
Upload Data
● One at a time
● Bulk
Upload Data One At a time
● curl -XPUT -u "master:XXXX" \
"https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/movies/_doc/1" \
-H "Content-Type: application/json" \
-d '{"director": "Burton, Tim", "genre": ["Comedy","Sci-Fi"], "year": 1996, "actor": ["Jack Nicholson","Pierce Brosnan","Sarah Jessica Parker"], "title": "Mars Attacks!"}'
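The same single-document upload can be sketched in Python with only the standard library. This is a minimal sketch, not the presenter's code: the endpoint is a placeholder, the helper name `build_index_request` is made up for illustration, and the request is only built, not sent (sending it would also need credentials, as in the curl command).

```python
import json
import urllib.request

# Placeholder endpoint (assumption); substitute your own domain URL.
ENDPOINT = "https://search-my-domain.us-west-1.es.amazonaws.com"


def build_index_request(index, doc_id, document):
    """Build (but do not send) a PUT request that indexes one document."""
    url = f"{ENDPOINT}/{index}/_doc/{doc_id}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


movie = {
    "director": "Burton, Tim",
    "genre": ["Comedy", "Sci-Fi"],
    "year": 1996,
    "actor": ["Jack Nicholson", "Pierce Brosnan", "Sarah Jessica Parker"],
    "title": "Mars Attacks!",
}
request = build_index_request("movies", 1, movie)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the upload once authentication is added.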
Upload Data Bulk
● curl -XPOST -u "master:XXXXX" \
"https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/_bulk" \
--data-binary @bulk_movies.txt -H "Content-Type: application/json"
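The bulk file is newline-delimited JSON: an action line followed by a source line for each document, with a trailing newline at the end. A minimal sketch of producing such a payload follows; the helper name `build_bulk_body` and the two sample documents are illustrative assumptions, not the contents of `bulk_movies.txt`.

```python
import json


def build_bulk_body(index, documents):
    """Return an NDJSON _bulk payload: one action line plus one source line per document."""
    lines = []
    for doc_id, source in documents.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk API expects the payload to end with a newline.
    return "\n".join(lines) + "\n"


# Illustrative sample documents (assumption).
movies = {
    "2": {"title": "Interstellar", "year": 2014},
    "3": {"title": "Star Trek Beyond", "year": 2016},
}
payload = build_bulk_body("movies", movies)
print(payload, end="")
```

Writing `payload` to a file gives something that can be passed to curl with `--data-binary @file`.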
How to Query?
Searching Data
● URI Searches
● Command Line
● OpenSearch Dashboards
Searching Data - URI
● GET Request
● https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/movies/_search?q=rebel&pretty=true
● Searches all fields of the movies index
URI Search Specific fields
● Search movies index and title property
● GET
https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search?q=title:house
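Both URI search forms (all fields, or one field via `field:text`) can be generated by a small helper. This is a sketch under assumptions: `uri_search_url` is a made-up name, and `urlencode` percent-encodes the colon, which the server decodes back to `title:house`.

```python
from urllib.parse import urlencode

# Placeholder endpoint (assumption); substitute your own domain URL.
ENDPOINT = "https://search-my-domain.us-west-1.es.amazonaws.com"


def uri_search_url(index, text, field=None, pretty=False):
    """Build a URI search URL; with a field, restrict the match to that field."""
    query = f"{field}:{text}" if field else text
    params = {"q": query}
    if pretty:
        params["pretty"] = "true"
    return f"{ENDPOINT}/{index}/_search?{urlencode(params)}"


print(uri_search_url("movies", "rebel", pretty=True))
print(uri_search_url("movies", "house", field="title"))
```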
Get Search Results - Command Line
● curl -XGET -u "master:XXXXX" \
"https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/movies/_search?q=rebel&pretty=true"
Query DSL
● For more complex queries
○ OpenSearch Domain Specific Language (DSL)
● POST request with query body
Get Search Results - Dev Tools
● https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/_dashboards/app/dev_tools#/console
GET _search
{
"query": {
"match_all": {}
}
}
Search on only specific fields
GET _search
{
"size": 20,
"query": {
"multi_match": {
"query": "U.S.",
"fields": ["title", "actor", "director"]
}
}
}
Search - Boosting fields
GET _search
{
"size": 20,
"query": {
"multi_match": {
"query": "john",
"fields": ["title^4", "actor", "director^4"]
}
}
}
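The `field^boost` syntax used above is easy to generate from a mapping of field names to weights. A minimal sketch, assuming a made-up helper name `multi_match_query`; a weight of 1 leaves the field unboosted.

```python
def multi_match_query(text, field_boosts, size=20):
    """Build a multi_match body; a boost other than 1 is appended as field^boost."""
    fields = [
        name if boost == 1 else f"{name}^{boost}"
        for name, boost in field_boosts.items()
    ]
    return {"size": size, "query": {"multi_match": {"query": text, "fields": fields}}}


body = multi_match_query("john", {"title": 4, "actor": 1, "director": 4})
print(body["query"]["multi_match"]["fields"])
```

Here matches in `title` and `director` count four times as much toward the relevance score as matches in `actor`.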
Search - Pagination
GET _search
{
"from": 0,
"size": 1,
"query": {
"multi_match": {
"query": "Drama",
"fields": ["genre"]
}
}
}
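`from` is a zero-based offset and `size` is the page length, so a 1-based page number maps to `from = (page - 1) * size`. A small sketch of that arithmetic; the helper names are assumptions for illustration.

```python
def page_window(page, page_size):
    """Translate a 1-based page number into the from/size pair used by _search."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {"from": (page - 1) * page_size, "size": page_size}


def paged_genre_query(genre, page, page_size):
    """Build a paginated multi_match query against the genre field."""
    body = page_window(page, page_size)
    body["query"] = {"multi_match": {"query": genre, "fields": ["genre"]}}
    return body


print(paged_genre_query("Drama", page=3, page_size=10))
```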
Query -With Highlights In Response
GET _search
{
"size": 20,
"query": {
"multi_match": {
"query": "Manchurian",
"fields": ["title^4", "actor", "director"]
}
},
"highlight": {
"fields": {
"title": {}
},
"pre_tags": "<strong>",
"post_tags": "</strong>",
"fragment_size": 200,
"boundary_chars": ".,!? "
}
}
Query - Count
GET movies/_count
{
"query": {
"multi_match": {
"query": "Manchurian",
"fields": ["title^4", "actor", "director"]
}
}
}
Dashboard Query Language
● Use DQL in Dashboards
○ Search for data and visualizations
● Terms Query
○ Search for any text
■ E.g. www.example.com
○ Access object’s nested field
■ E.g. coordinates.lat:43.7102
○ Leading and trailing wildcards
■ host.keyword:*.example.com/*
● Operators
○ AND
○ OR
Dashboard Query Language
● Date and range Queries
○ bytes >= 15 and memory < 15
○ @timestamp > "2020-12-14T09:35:33"
● Nested field query
○ superheroes: {hero-name: Superman}
Dashboard Plugins
Query Workbench
● SQL
○ Run SQL
○ Treat indices as tables
● PPL
○ Piped Processing Language
○ Commands delimited by pipes
Reporting
● Multiple file formats
● On demand/ Scheduled
● Generate from
○ Dashboard
○ Visualization
○ Discover
Anomaly Detection
● Detect unusual behavior in time series data
● Anomaly Grade
● Confidence Score
Notifications
● Supported
○ Amazon Chime
○ SNS
○ SES
○ SMTP
○ Slack
○ Custom Webhooks
Observability plugin
● Visualize/Query time series data
● Event analytics
● Compare the data the way you like
Index Management
● Create ISM policy
● To manage your indexes
Security plugin
● Set up RBAC
Migrate from ElasticSearch to OpenSearch
Three major approaches
● Snapshot
● Rolling Upgrade
● Cluster Restart
Snapshot Method
● Generate snapshot in ElasticSearch
● Save in shared directory
● Restore in OpenSearch
● Snapshot
○ Backup of entire cluster state
○ Useful for recovery from failure and migration
● Link
Snapshot Method
● Check Index compatibility
○ E.g.: Can't restore a 7.6.0 snapshot into a 7.5.0 cluster
● Link
● Fastest
● Easiest
● Most efficient
Rolling Upgrade
● Official way to migrate cluster
● Without interruption
● Rolling upgrades are supported:
○ Between minor versions
○ From 5.6 to 6.8
○ From 6.8 to 7.14.1
○ From any version since 7.14.0 to 7.14.1
Rolling Upgrade
● Shut down one node at a time
○ Minimal disruption
Cluster Restart Upgrades
● Shut down all nodes
● Perform the upgrade
● Restart the cluster
Mapping
OpenSearch Mapping
● Dynamic
○ When you index a document
○ OpenSearch adds fields automatically
○ It deduces their types by itself
● Explicit
○ If you know your data types
○ Preferred way of doing things
OpenSearch Mapping
● If you do not define a mapping ahead of time, OpenSearch dynamically
creates a mapping for you.
● If you do decide to define your own mapping, you can do so at index creation.
● ONE mapping is defined per index. Once the index has been created, we can
only add new fields to a mapping. We CANNOT change the mapping of an
existing field.
● If you must change the type of an existing field, you must create a new index
with the desired mapping, then reindex all documents into the new index.
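The two request bodies involved in changing a field type can be sketched as Python dictionaries. This is an illustrative sketch: the index name `movies-v2` and the chosen field types are assumptions, and the bodies are only printed, not sent.

```python
import json

# Body for "PUT movies-v2": explicit mapping defined at index creation.
# Index name and field types are hypothetical.
create_index_body = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},
            "director": {"type": "keyword"},
            "year": {"type": "integer"},
        }
    }
}

# Body for "POST _reindex": copy every document into the index with the new mapping.
reindex_body = {
    "source": {"index": "movies"},
    "dest": {"index": "movies-v2"},
}

print(json.dumps(create_index_body, indent=2))
print(json.dumps(reindex_body, indent=2))
```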
Text vs keyword data types
● Text type
○ Full text searches
● Keyword type
○ Exact searches
○ Aggregations
○ Sorting
Text vs Keyword
● Inverted Index
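A toy inverted index shows why the two types behave differently. This sketch is a deliberate simplification and not how Lucene is implemented: the "analyzer" only lowercases and splits on whitespace, and the second title is made-up sample data.

```python
from collections import defaultdict


def build_inverted_index(docs, analyzed):
    """Map each term to the set of document ids containing it.

    analyzed=True mimics a text field (lowercased, split on whitespace);
    analyzed=False mimics a keyword field (the whole value is one term).
    """
    index = defaultdict(set)
    for doc_id, value in docs.items():
        terms = value.lower().split() if analyzed else [value]
        for term in terms:
            index[term].add(doc_id)
    return dict(index)


titles = {1: "Mars Attacks!", 2: "Life on Mars"}  # second title is hypothetical
text_index = build_inverted_index(titles, analyzed=True)
keyword_index = build_inverted_index(titles, analyzed=False)

print(sorted(text_index["mars"]))                 # both documents contain the term
print(sorted(keyword_index["Mars Attacks!"]))     # exact value matches one document
print(sorted(keyword_index.get("mars", set())))   # a partial value matches nothing
```

The text-style index finds a word anywhere in the value, while the keyword-style index only matches the complete value, which is what makes it suitable for exact filters, sorting, and aggregations.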
Aggregations
OpenSearch Aggregations
● Analyze data
○ In real time too
● Extract statistics
● Generally more expensive than queries
○ In both CPU and memory
Aggregation Query
● Use aggs or aggregations
Example
● Get average of
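As one possible illustration of an average aggregation, the sketch below builds the request body and reads the result. The field `year` (from the movies documents earlier) and the helper names are assumptions, not the example shown on the original slide.

```python
def average_aggregation(field, name=None):
    """Build a search body that returns only the average of a numeric field."""
    name = name or f"avg_{field}"
    return {
        "size": 0,  # skip the hits; only the aggregation result is needed
        "aggs": {name: {"avg": {"field": field}}},
    }


def read_average(response, name):
    """Pull the computed value out of a search response."""
    return response["aggregations"][name]["value"]


body = average_aggregation("year")
print(body)
```

The body would be sent as `GET movies/_search`, and `read_average(response, "avg_year")` extracts the number from the parsed JSON response.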
Data Streams
Data Streams in OpenSearch
● Ingesting time series data
○ Logs
○ Events
○ Metrics, etc.
● Number of documents grows rapidly
● Append Only data
● Don't need to update older documents (Very rarely)
Rollover
● If data is growing rapidly
● Write to an index up to a certain threshold
○ Then create a new index
○ And start writing to it
● Optimize the active index for high ingest rates on high-performance hot
nodes.
● Optimize for search performance on warm nodes.
● Shift older, less frequently accessed data to less expensive cold nodes.
● Delete data according to your retention policies by removing entire indices.
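The rollover idea (write until a threshold, then switch to a fresh index) can be simulated in a few lines. This is a toy model for intuition only: it counts documents in memory, and the index naming scheme and function names are assumptions rather than what the cluster does internally.

```python
def next_index_name(current):
    """Increment the numeric suffix, e.g. logs-000001 -> logs-000002."""
    prefix, _, counter = current.rpartition("-")
    return f"{prefix}-{int(counter) + 1:0{len(counter)}d}"


def write_with_rollover(doc_count, max_docs, first_index="logs-000001"):
    """Simulate append-only writes, rolling to a new index once max_docs is reached."""
    counts = {first_index: 0}
    active = first_index
    for _ in range(doc_count):
        if counts[active] >= max_docs:
            active = next_index_name(active)
            counts[active] = 0
        counts[active] += 1
    return counts


print(write_with_rollover(doc_count=7, max_docs=3))
```

Because old data lives in whole indices, retention becomes a matter of deleting the oldest index rather than deleting individual documents.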
Index Template
● Data Stream requires an index template
● A name or wildcard (*) pattern for the data stream.
● The data stream’s timestamp field. This field must be mapped as a date or
date_nanos field data type and must be included in every document indexed
to the data stream.
● The mappings and settings applied to each backing index when it’s created.
ILM Policy
● Index Lifecycle Management Policy
● Can be applied to any number of indices
● Usage
○ Allocate
○ Delete
○ Rollover
○ Read Only
○ Wait for snapshot
ILM Policy
● Create a policy:
● Link
Create ILM Policy
Index Template
● Tells ElasticSearch how to configure an index when it is created
● For data streams
○ Configures the stream’s backing indices
○ Configured prior to index creation
Templates Types
● Component Templates
○ Reusable building blocks that configure
■ mappings,
■ settings, and
■ Aliases
○ Not directly applied to indices
● Index Template
○ Collection of component templates
○ Directly applied to indices
○ Some defaults: metrics-*-*, logs-*-*
Create Component Template
● Link
Create Index Template
● Data Stream requires matching index template
● PUT _index_template/{template_name}
Create Index Template
● Link
Create data stream
● Documents must contain timestamp field
● PUT _data_stream/my-data-stream
● Stream’s name must match one of your index template’s index patterns
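The name-matching rule can be checked locally before creating a stream. The sketch below pairs a minimal index template body with a wildcard check; the template name, the patterns, and the helper `matches_template` are assumptions for illustration.

```python
from fnmatch import fnmatch

# Body for "PUT _index_template/logs-template" (template name is hypothetical).
index_template = {
    "index_patterns": ["my-data-stream", "logs-*"],
    "data_stream": {},  # marks matching names as data streams
    "priority": 100,
}


def matches_template(stream_name, template):
    """Return True if the stream name matches any of the template's index patterns."""
    return any(fnmatch(stream_name, pattern) for pattern in template["index_patterns"])


print(matches_template("my-data-stream", index_template))
print(matches_template("metrics-cpu", index_template))
```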
Get Info About Data Stream
● GET _data_stream/my-data-stream
Delete Data Stream
● DELETE _data_stream/my-data-stream
Cross Cluster Replication
Cross Cluster Replication
● Cross Cluster replication plugin
○ Replicates indexes, mapping & metadata from one cluster to another
● Advantages
○ Continue to handle search requests if there is an outage
○ Can help reduce latency in application
■ Replicating data across geographically distant data centers
Replication
● Active passive model
○ Follower index pulls data from leader index
● It can be
○ Started
○ Paused
○ Stopped
○ Resumed
● Can be secured
○ Security plugin
○ Encrypt cross cluster traffic
Exercise
● Create 2 domains in AWS OpenSearch
● Link
Exercise
● Source Domain Connections Tab -> Outbound ->
○ Create Connection to Destination Domain
● Set access policy on destination domain:
● Link
Exercise
● Get Connection status
○ GET _plugins/_replication/connect1/_status
● Start syncing
○ PUT _plugins/_replication/connect1/_start
{
"leader_alias": "Connect1",
"leader_index": "movies",
"use_roles":{
"leader_cluster_role": "all_access",
"follower_cluster_role": "all_access"
}
}
Plugins
OpenSearch plugins
● Standalone components
○ That add features and capabilities
● Huge number of plugins available
● E.g.
○ Replication Plugin
○ Security plugin
○ Notification plugin
SQL Plugin
● Lets you run SQL queries on ESDB
● Add data
○ PUT movies/_doc/1
{ "title": "Spirited Away" }
● Query data
○ POST _plugins/_sql
{
"query": "SELECT * FROM movies LIMIT 50"
}
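When the SQL request is generated from application code, the body is just a JSON object with a `query` string. A minimal sketch, assuming a made-up helper `sql_request` that only accepts simple index names so the name can be placed in the statement safely.

```python
import json


def sql_request(table, limit=50):
    """Build the JSON body for POST _plugins/_sql selecting rows from an index."""
    # Accept only letters, digits, and underscores in the index name.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected index name: {table!r}")
    return json.dumps({"query": f"SELECT * FROM {table} LIMIT {int(limit)}"})


print(sql_request("movies"))
```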
SQL Plugin
● Delete data from ESDB Index
● Enable Delete via SQL plugin
○ PUT _plugins/_query/settings
{
"transient": {
"plugins.sql.delete.enabled": "true"
}
}
SQL Plugin - Delete
● To delete the data
○ POST _plugins/_sql
{
"query": "DELETE FROM movies"
}
Asynchronous Search
● Large volumes of data
● Can take longer to search
● Async
○ Run searches in the background
○ Monitor progress of these searches
○ Get back partial results as they become available
Asynchronous Search
● POST _plugins/_asynchronous_search
● Response contents:
○ ID
■ Can be used to track the state of the search
■ Get partial results
○ State
■ Running
■ Completed
■ Persisted
● Link
OpenSearch Clients
Clients
● OpenSearch Python client
● OpenSearch JavaScript (Node.js) client
● OpenSearch .NET clients
● OpenSearch Go client
● OpenSearch PHP client
OpenSearch Client for .NET
● OpenSearch.Net
○ Low level client
● OpenSearch.Client
○ High level client
● Sample code: Link
Exercise
● Create a .NET application
● Add a document to OpenSearch using the .NET Application
○ OpenSearch.Client (.NET High level client)
Agents and Ingestion Tools
Beats
● Data shippers
● Agents on servers
● Send data to ES/ Logstash
Grafana
● An open source visualization tool
● Various sources can be used as data source:
○ InfluxDB
○ MySQL
○ ElasticSearch
○ PostgreSQL
● Better suited for metrics visualizations
● Does not allow full text data querying
Logstash
● Free/ Open-Source
● Data processing pipeline
● Ingests data from multitude of sources
● Transforms it
● Sends it to your favorite stash
Logstash - Ingestion
● Data of all shapes/ sizes/ source
○ Can be ingested
● It can parse/ transform your data
Logstash - Output
● ElasticSearch
● Mongodb
● S3
● Etc.
● Link
AWS OpenSearch Security
● Use multi-factor authentication (MFA) with each account.
● Use SSL/TLS to communicate with AWS resources. We recommend TLS 1.2
or later.
● Set up API and user activity logging with AWS CloudTrail.
● Use AWS encryption solutions, along with all default security controls within
AWS services.
● Use advanced managed security services such as Amazon Macie, which
assists in discovering and securing personal data that is stored in Amazon S3.
● If you require FIPS 140-2 validated cryptographic modules when accessing
AWS through a command line interface or an API, use a FIPS endpoint.
Summary
● OpenSearch
○ Open Source Search solution
● Upcoming and supported by AWS
● Caters to most search use cases
○ Great Query performance
● Powerful tools
● Community Support
Connect with me
● Trainings on various tech topics
● For any questions:
○ https://linkedin.com/in/coach4dev

Test Automation Strategy for Frontend and BackendArshad QA
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 

Recently uploaded (20)

Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Exploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the ProcessExploring iOS App Development: Simplifying the Process
Exploring iOS App Development: Simplifying the Process
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online  ☂️
CALL ON ➥8923113531 🔝Call Girls Kakori Lucknow best sexual service Online ☂️
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Test Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and BackendTest Automation Strategy for Frontend and Backend
Test Automation Strategy for Frontend and Backend
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 

OpenSearch: A Guide to the Powerful Open Source Search and Analytics Engine

  • 2. Agenda ● OpenSearch ○ What is it? ○ Benefits/ Uses ○ How to use it ○ Features ● Migrate from Elastic to OpenSearch ● Tools & Plugins
  • 3. About Me ● Lead Dev ● Located in Florida ● Trainer ● Presenter ● .NET Developer ● Youtuber: Coach4Dev ● Husband/ Father
  • 4. Amazon Elasticsearch ● Launched in 2015 ● Gained popularity for log analytics usage ● Used open-source Elasticsearch under Apache License v2 ● Jan 2021 ○ Elastic NV changed licensing strategy ○ Versions after Elasticsearch 7.10.2 & Kibana 7.10.2 ■ Not released under Apache License v2 ■ Released under Elastic License
  • 5. OpenSearch ● Sep 2021: ○ Amazon Elasticsearch Service renamed to Amazon OpenSearch Service ● Open-source fork of Elasticsearch 7.10.2 and Kibana 7.10.2 ● Highly scalable ● Fast access & response to large volumes of data ● Powered by the Apache Lucene search library
  • 6. Apache Lucene ● Apache Lucene project develops open-source search software ○ Releases a core search library named Lucene Core ● Lucene Core ○ Java library providing powerful indexing and search features
  • 7. Apache Solr ● Open source search platform ● Built on Apache Lucene
  • 8. Solr vs ElasticSearch ● Similar performance mostly. ● ES has better support for scalability ○ due to horizontal scaling ■ Better cloud support too ● ES can support multiple doc types in a single index better ○ More difficult to do this in Solr ● ES supports native DSL (Domain Specific Language) ○ Need to program queries in Solr ● https://mindmajix.com/elasticsearch-vs-solr
  • 9. Why OpenSearch ● Huge amount of machine generated data these days ○ Growing exponentially ● Getting insights is important ● Interactive log analytics ● Real-time application monitoring ● Website Search, etc.
  • 10. OpenSearch Features ● Easy to set-up and configure ● In-place upgrades ● Enables data monitoring & setting alerts based on thresholds ● Supports authentication, encryption & compliance requirements
  • 11. OpenSearch vs ElasticSearch ● OpenSearch was forked from ElasticSearch ○ Now they are separate from each other ● Each is adding features separately ● OpenSearch ○ Built-in support from AWS
  • 12. OpenSearch features not in ES (free version) ● Centralized user accounts / access control ● Cross-cluster replication ● IP filtering ● Configurable retention period ● Anomaly detection ● Tableau connector ● JDBC driver ● ODBC driver ● Machine learning features such as regression and classification ● Link
  • 13. ElasticSearch Features ● Based on subscription levels ● https://www.elastic.co/subscriptions
  • 14. OpenSearch & ElasticSearch Version Support ● Amazon OpenSearch Service currently supports the following OpenSearch versions: ○ 1.3, 1.2, 1.1, 1.0 ● And the following ElasticSearch versions: ○ 7.10, 7.9, 7.8, 7.7, 7.4, 7.1 ○ 6.8, 6.7, 6.5, 6.4, 6.3, 6.2, 6.0 ○ 5.6, 5.5, 5.3, 5.1 ○ 2.3 ○ 1.5
  • 15. What is Kibana ● Free & open front end application ● Charting tool for Elastic Stack ● Sits on top of Elastic Stack ● Sample Dashboard
  • 16. OpenSearch Dashboards ● Default visualization tool for data in OpenSearch ● Filter data with queries ● Comes with OpenSearch Service
  • 18. OpenSearch Cluster ● Synonymous with domain ● Domains are clusters with ○ settings, ○ instance types, ○ instance counts, ○ and storage resources that you specify. ● Group of nodes ○ With the same cluster.name attribute
  • 19. OpenSearch Node ● Member of a cluster ● A distinct host ● With its own IP address
  • 20. Getting Started ● Create a domain ● Size the domain appropriately for your workload ● Control access to your domain using a domain access policy or fine-grained access control ● Index data manually or from other AWS services ● Use OpenSearch Dashboards to search your data and create visualizations
  • 21. Custom Endpoint ● For an easier-to-read or custom domain name ● Can use HTTPS ○ Upload SSL certificate
  • 22. Run OpenSearch locally ● Install Docker ● wsl -d docker-desktop ● sysctl -w vm.max_map_count=262144 ● Ctrl+C ● docker-compose up ● Visit http://localhost:5601/ ● Log in with admin/admin and explore ● Link
  • 23. Upload Data ● One at a time ● Bulk
  • 24. Upload Data One at a Time ● curl -XPUT -u "master:XXXX" "https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/movies/_doc/1" -d '{"director": "Burton, Tim", "genre": ["Comedy","Sci-Fi"], "year": 1996, "actor": ["Jack Nicholson","Pierce Brosnan","Sarah Jessica Parker"], "title": "Mars Attacks!"}' -H "Content-Type: application/json"
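The same single-document upload can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the helper name `build_index_request` is hypothetical, the endpoint is the placeholder domain used elsewhere in the deck, and the request is only built, not sent.

```python
import base64
import json
import urllib.request


def build_index_request(endpoint, index, doc_id, document, user, password):
    """Build (but do not send) a PUT request that indexes one document.

    Hypothetical helper for illustration; mirrors the curl command's
    URL, JSON body, basic-auth credentials and Content-Type header.
    """
    url = f"{endpoint.rstrip('/')}/{index}/_doc/{doc_id}"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=json.dumps(document).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


movie = {
    "director": "Burton, Tim",
    "genre": ["Comedy", "Sci-Fi"],
    "year": 1996,
    "actor": ["Jack Nicholson", "Pierce Brosnan", "Sarah Jessica Parker"],
    "title": "Mars Attacks!",
}

# Placeholder endpoint and credentials; replace with your own domain's values.
request = build_index_request(
    "https://search-my-domain.us-west-1.es.amazonaws.com",
    "movies", 1, movie, "master", "XXXX",
)
# urllib.request.urlopen(request) would actually send it.
```

Building the request separately from sending it makes it easy to inspect the URL and body before touching a real domain.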
  • 25. Upload Data Bulk ● curl -XPOST -u "master:XXXXX" "https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/_bulk" --data-binary @bulk_movies.txt -H "Content-Type: application/json"
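The contents of bulk_movies.txt are not shown on the slide. The `_bulk` endpoint expects newline-delimited JSON: an action line followed by the document source, with a trailing newline at the end. A minimal sketch of producing such a body, assuming a hypothetical helper `build_bulk_body` and two sample documents:

```python
import json


def build_bulk_body(index, documents):
    """Return a newline-delimited JSON body for the _bulk endpoint.

    `documents` is an iterable of (doc_id, source_dict) pairs.
    Hypothetical helper for illustration only.
    """
    lines = []
    for doc_id, source in documents:
        # Action line: index this document into `index` under `doc_id`.
        lines.append(json.dumps({"index": {"_index": index, "_id": str(doc_id)}}))
        # Source line: the document itself, on a single line.
        lines.append(json.dumps(source))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body(
    "movies",
    [
        (1, {"title": "Mars Attacks!", "year": 1996}),
        (2, {"title": "Spirited Away"}),
    ],
)
```

Writing `bulk_body` to a file gives something that can be passed to curl with `--data-binary @file`.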
  • 27. Searching Data ● URI Searches ● Command Line ● OpenSearch Dashboards
  • 28. Searching Data - URI ● GET Request ● https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/movies/_search?q=rebel&pretty=true ● Searches all fields of the movies index
  • 29. URI Search Specific fields ● Search movies index and title property ● GET https://search-my-domain.us-west-1.es.amazonaws.com/movies/_search?q=title:house
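URI searches like the two above differ only in the `q` parameter. A small sketch of assembling them, assuming a hypothetical helper `build_uri_search`; the colon in `title:house` is percent-encoded by `urlencode`, which the server decodes back to the same query:

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_uri_search(endpoint, index, text, field=None, pretty=False):
    """Build a URI search URL; restrict to one field when `field` is given.

    Hypothetical helper for illustration only.
    """
    q = f"{field}:{text}" if field else text
    params = {"q": q}
    if pretty:
        params["pretty"] = "true"
    return f"{endpoint.rstrip('/')}/{index}/_search?{urlencode(params)}"


endpoint = "https://search-my-domain.us-west-1.es.amazonaws.com"
all_fields_url = build_uri_search(endpoint, "movies", "rebel", pretty=True)
title_only_url = build_uri_search(endpoint, "movies", "house", field="title")

# Decoding the query string recovers the original field:value expression.
decoded = parse_qs(urlsplit(title_only_url).query)
```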
  • 30. Get Search Results - Command Line ● curl -XGET -u "master:XXXXX" "https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/movies/_search?q=rebel&pretty=true"
  • 31. Query DSL ● For more complex queries ○ OpenSearch Domain Specific Language (DSL) ● POST request with query body
  • 32. Get Search Results - Dev Tools ● https://search-test-domain-s7g5csgqurpevadhaonp75mwgm.us-west-1.es.amazonaws.com/_dashboards/app/dev_tools#/console ○ GET _search { "query": { "match_all": {} } }
  • 33. Search on only specific fields GET _search { "size": 20, "query": { "multi_match": { "query": "U.S.", "fields": ["title", "actor", "director"] } } }
  • 34. Search - Boosting fields GET _search { "size": 20, "query": { "multi_match": { "query": "john", "fields": ["title^4", "actor", "director^4"] } } }
  • 35. Search - Pagination GET _search { "from": 0, "size": 1, "query": { "multi_match": { "query": "Drama", "fields": ["genre"] } } }
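The three query bodies above (field selection, boosting, pagination) share one shape. A minimal sketch that builds them as Python dictionaries, assuming a hypothetical helper `multi_match_query`; the result can be serialized with `json.dumps` and pasted into Dev Tools or sent as a request body:

```python
def multi_match_query(text, fields, boosts=None, page=0, page_size=20):
    """Build a multi_match search body with optional boosts and paging.

    `boosts` maps a field name to its boost factor (written as field^factor).
    `page` is zero-based; it is translated into the `from` offset.
    Hypothetical helper for illustration only.
    """
    boosts = boosts or {}
    boosted_fields = [
        f"{field}^{boosts[field]}" if field in boosts else field
        for field in fields
    ]
    return {
        "from": page * page_size,
        "size": page_size,
        "query": {"multi_match": {"query": text, "fields": boosted_fields}},
    }


# Title and director matches count four times as much as actor matches.
boosted = multi_match_query(
    "john", ["title", "actor", "director"], boosts={"title": 4, "director": 4}
)

# Third page of one-result pages for the genre search.
third_page = multi_match_query("Drama", ["genre"], page=2, page_size=1)
```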
  • 36. Query - With Highlights in Response GET _search { "size": 20, "query": { "multi_match": { "query": "Manchurian", "fields": ["title^4", "actor", "director"] } }, "highlight": { "fields": { "title": {} }, "pre_tags": "<strong>", "post_tags": "</strong>", "fragment_size": 200, "boundary_chars": ".,!? " } }
  • 37. Query - Count GET movies/_count { "query": { "multi_match": { "query": "Manchurian", "fields": ["title^4", "actor", "director"] } } }
  • 38. Dashboard Query Language ● Use DQL in Dashboards ○ Search for data and visualizations ● Terms Query ○ Search for any text ■ E.g. www.example.com ○ Access object’s nested field ■ E.g. coordinates.lat:43.7102 ○ Leading and trailing wildcards ■ host.keyword:*.example.com/* ● Operators ○ AND ○ OR
  • 39. Dashboard Query Language ● Date and range Queries ○ bytes >= 15 and memory < 15 ○ @timestamp > "2020-12-14T09:35:33" ● Nested field query ○ superheroes: {hero-name: Superman}
  • 41. Query Workbench ● SQL ○ Run SQL ○ Treat indices as tables ● PPL ○ Piped Processing Language ○ Commands delimited by pipes
  • 42. Reporting ● Multiple file formats ● On demand/ Scheduled ● Generate from ○ Dashboard ○ Visualization ○ Discover
  • 43. Anomaly Detection ● Detect unusual behavior in time series data ● Anomaly Grade ● Confidence Score
  • 44. Notifications ● Supported ○ Amazon Chime ○ SNS ○ SES ○ SMTP ○ Slack ○ Custom Webhooks
  • 45. Observability plugin ● Visualize/Query time series data ● Event analytics ● Compare the data the way you like
  • 46. Index Management ● Create ISM policy ● To manage your indexes
  • 47. Security plugin ● Set up RBAC
  • 48. Migrate from ElasticSearch to OpenSearch
  • 49. Three major approaches ● Snapshot ● Rolling Upgrade ● Cluster Restart
  • 50. Snapshot Method ● Generate snapshot in ElasticSearch ● Save in shared directory ● Restore in OpenSearch ● Snapshot ○ Backup of entire cluster state ○ Useful for recovery from failure and migration ● Link
  • 51. Snapshot Method ● Check index compatibility ○ E.g.: Can't restore a 7.6.0 snapshot into a 7.5.0 cluster ● Link ● Fastest ● Easiest ● Most efficient
  • 52. Rolling Upgrade ● Official way to migrate cluster ● Without interruption ● Rolling upgrades are supported: ○ Between minor versions ○ From 5.6 to 6.8 ○ From 6.8 to 7.14.1 ○ From any version since 7.14.0 to 7.14.1
  • 53. Rolling Upgrade ● Shut down one node at a time ○ Minimal disruption
  • 54. Cluster Restart Upgrades ● Shut down all nodes ● Perform the upgrade ● Restart the cluster
  • 56. OpenSearch Mapping ● Dynamic ○ When you index a document ○ OpenSearch adds fields automatically ○ It deduces their types by itself ● Explicit ○ If you know your data types ○ Preferred way of doing things
  • 57. OpenSearch Mapping ● If you do not define a mapping ahead of time, OpenSearch dynamically creates a mapping for you. ● If you do decide to define your own mapping, you can do so at index creation. ● ONE mapping is defined per index. Once the index has been created, we can only add new fields to a mapping. We CANNOT change the mapping of an existing field. ● If you must change the type of an existing field, you must create a new index with the desired mapping, then reindex all documents into the new index.
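The "new index, then reindex" workflow can be sketched as two request bodies: one for creating the index with an explicit mapping, one for the `_reindex` call. The helper names and the target index name `movies-v2` are hypothetical, and the field types are only an example:

```python
def explicit_mapping(field_types):
    """Body for `PUT <index>` that defines field types up front.

    `field_types` maps field name -> OpenSearch type name.
    Hypothetical helper for illustration only.
    """
    return {
        "mappings": {
            "properties": {name: {"type": t} for name, t in field_types.items()}
        }
    }


def reindex_body(source_index, dest_index):
    """Body for `POST _reindex` that copies all documents between indices."""
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}


# Step 1: create the new index (assumed name: movies-v2) with the desired types.
new_index_body = explicit_mapping(
    {"title": "text", "genre": "keyword", "year": "integer"}
)

# Step 2: copy the documents from the old index into the new one.
copy_body = reindex_body("movies", "movies-v2")
```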
  • 58. Text vs keyword data types ● Text type ○ Full text searches ● Keyword type ○ Exact searches ○ Aggregations ○ Sorting
  • 59. Text vs Keyword ● Inverted Index
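One way to picture the difference is a toy inverted index. In this simplified sketch a text field is split into lowercase word tokens before indexing, while a keyword field is stored as a single exact term. Real analyzers do considerably more (stemming, stop words, and so on); the function name is hypothetical.

```python
import re


def build_inverted_index(docs, analyzed):
    """Map each term to the set of document ids that contain it.

    analyzed=True  -> text-like: lowercase and split into word tokens.
    analyzed=False -> keyword-like: the whole value is one exact term.
    Toy illustration only, not how Lucene stores its index.
    """
    index = {}
    for doc_id, value in docs.items():
        terms = re.findall(r"\w+", value.lower()) if analyzed else [value]
        for term in terms:
            index.setdefault(term, set()).add(doc_id)
    return index


titles = {1: "Mars Attacks!", 2: "Spirited Away"}

text_index = build_inverted_index(titles, analyzed=True)
keyword_index = build_inverted_index(titles, analyzed=False)

# Full-text lookup: a single word finds the document.
text_hit = text_index.get("mars")

# Exact lookup: only the complete original value matches.
keyword_hit = keyword_index.get("Mars Attacks!")
keyword_miss = keyword_index.get("mars")
```

This is why keyword fields suit exact matches, sorting and aggregations, while text fields suit full-text search.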
  • 61. OpenSearch Aggregations ● Analyze data ○ In real time too ● Extract statistics ● More expensive than queries ○ On CPU and memory ○ In general
  • 62. Aggregation Query ● Use aggs or aggregations
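As an illustration, a terms aggregation that counts movies per genre could be built like this. The helper name is hypothetical, and the field `genre.keyword` assumes dynamic mapping created a keyword sub-field for `genre`; adjust it to your own mapping.

```python
def terms_aggregation(name, field, buckets=10):
    """Search body that returns only bucket counts for `field`.

    "size": 0 suppresses the document hits so that only the
    aggregation results come back. Hypothetical helper.
    """
    return {
        "size": 0,
        "aggs": {name: {"terms": {"field": field, "size": buckets}}},
    }


# Assumed field name: genre.keyword (keyword sub-field from dynamic mapping).
genre_counts = terms_aggregation("genres", "genre.keyword", buckets=5)
```

The body would be sent with `GET movies/_search`; the key `aggregations` can be used in place of `aggs`.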
  • 65. Data Streams in OpenSearch ● Ingesting time series data ○ Logs ○ Events ○ Metrics, etc. ● Number of documents grows rapidly ● Append Only data ● Don't need to update older documents (Very rarely)
  • 66. Rollover ● If data is growing rapidly ● Write to an index up to a certain threshold ○ Then create a new index ○ And start writing to it ● Optimize the active index for high ingest rates on high-performance hot nodes. ● Optimize for search performance on warm nodes. ● Shift older, less frequently accessed data to less expensive cold nodes. ● Delete data according to your retention policies by removing entire indices.
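The rollover idea can be simulated in memory. This is a toy sketch, not the OpenSearch rollover API: the class name, the document-count threshold and the `-000001` style suffix are assumptions chosen for illustration.

```python
class RolloverWriter:
    """In-memory simulation of writing through a rollover alias.

    Documents go to the active index until it holds `max_docs`
    documents; the next write creates a new index and switches to it.
    """

    def __init__(self, alias, max_docs):
        self.alias = alias
        self.max_docs = max_docs
        self.generation = 1
        self.indices = {self.active_index: []}

    @property
    def active_index(self):
        # Assumed naming scheme: alias plus a zero-padded generation number.
        return f"{self.alias}-{self.generation:06d}"

    def write(self, document):
        """Store the document and return the index it was written to."""
        if len(self.indices[self.active_index]) >= self.max_docs:
            self.generation += 1  # threshold reached: roll over
            self.indices[self.active_index] = []
        self.indices[self.active_index].append(document)
        return self.active_index


writer = RolloverWriter("logs", max_docs=2)
written_to = [writer.write({"n": i}) for i in range(5)]
```

Because older data ends up in separate indices, retention becomes a matter of deleting whole indices rather than individual documents.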
  • 67. Index Template ● Data Stream requires an index template ● A name or wildcard (*) pattern for the data stream. ● The data stream’s timestamp field. This field must be mapped as a date or date_nanos field data type and must be included in every document indexed to the data stream. ● The mappings and settings applied to each backing index when it’s created.
  • 68. ILM Policy ● Index Lifecycle Management Policy ● Can be applied to any number of indices ● Usage ○ Allocate ○ Delete ○ Rollover ○ Read Only ○ Wait for snapshot
  • 69. ILM Policy ● Create a policy: ● Link
  • 73. Index Template ● Tells ElasticSearch how to configure an index when it is created ● For data streams ○ Configures the stream’s backing indices ○ Configured prior to index creation
  • 74. Templates Types ● Component Templates ○ Reusable building blocks that configure ■ mappings, ■ settings, and ■ Aliases ○ Not directly applied to indices ● Index Template ○ Collection of component templates ○ Directly applied to indices ○ Some defaults: metrics-*-*, logs-*-*
  • 76. Create Index Template ● Data Stream requires matching index template ● PUT _index_template/{template_name}
  • 78. Create data stream ● Documents must contain timestamp field ● PUT _data_stream/my-data-stream ● Stream’s name must match one of your index template’s index patterns
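The name-matching requirement can be checked before calling the API. A minimal sketch, assuming hypothetical helper names and a hypothetical pattern `my-data-stream*`; the template body uses an empty `data_stream` object, which is assumed here to fall back to the default timestamp field:

```python
from fnmatch import fnmatchcase


def data_stream_template(index_patterns, priority=100):
    """Body for `PUT _index_template/<name>` that enables data streams.

    Hypothetical helper; the empty "data_stream" object marks matching
    names as data streams rather than regular indices.
    """
    return {
        "index_patterns": list(index_patterns),
        "data_stream": {},
        "priority": priority,
    }


def matches_template(stream_name, template):
    """True if the stream name matches one of the template's patterns."""
    return any(
        fnmatchcase(stream_name, pattern)
        for pattern in template["index_patterns"]
    )


template = data_stream_template(["my-data-stream*"])

can_create = matches_template("my-data-stream", template)
cannot_create = matches_template("metrics-cpu", template)
```

Only when the name matches would `PUT _data_stream/my-data-stream` be expected to succeed.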
  • 79. Get Info About Data Stream ● GET _data_stream/my-data-stream
  • 80. Delete Data Stream ● DELETE _data_stream/my-data-stream
  • 82. Cross Cluster Replication ● Cross Cluster replication plugin ○ Replicates indexes, mapping & metadata from one cluster to another ● Advantages ○ Continue to handle search requests if there is an outage ○ Can help reduce latency in application ■ Replicating data across geographically distant data centers
  • 83. Replication ● Active passive model ○ Follower index pulls data from leader index ● It can be ○ Started ○ Paused ○ Stopped ○ Resumed ● Can be secured ○ Security plugin ○ Encrypt cross cluster traffic
  • 84. Exercise ● Create 2 domains in AWS OpenSearch ● Link
  • 85. Exercise ● Source Domain Connections Tab -> Outbound -> ○ Create Connection to Destination Domain ● Set access policy on destination domain: ● Link
  • 86. Exercise ● Get Connection status ○ GET _plugins/_replication/connect1/_status ● Start syncing ○ PUT _plugins/_replication/connect1/_start { "leader_alias": "Connect1", "leader_index": "movies", "use_roles": { "leader_cluster_role": "all_access", "follower_cluster_role": "all_access" } }
  • 88. OpenSearch plugins ● Standalone components ○ That add features and capabilities ● Huge number of plugins available ● E.g. ○ Replication plugin ○ Security plugin ○ Notification plugin
  • 89. SQL Plugin ● Lets you run SQL queries on ESDB ● Add data ○ PUT movies/_doc/1 { "title": "Spirited Away" } ● Query data ○ POST _plugins/_sql { "query": "SELECT * FROM movies LIMIT 50" }
  • 90. SQL Plugin ● Delete data from ESDB Index ● Enable Delete via SQL plugin ○ PUT _plugins/_query/settings { "transient": { "plugins.sql.delete.enabled": "true" } }
  • 91. SQL Plugin - Delete ● To delete the data ○ POST _plugins/_sql { "query": "DELETE FROM movies" }
  • 92. Asynchronous Search ● Large volumes of data ● Can take longer to search ● Async ○ Run searches in the background ○ Monitor progress of these searches ○ Get back partial results as they become available
  • 93. Asynchronous Search ● POST _plugins/_asynchronous_search ● Response contents: ○ ID ■ Can be used to track the state of the search ■ Get partial results ○ State ■ Running ■ Completed ■ Persisted ● Link
  • 95. Clients ● OpenSearch Python client ● OpenSearch JavaScript (Node.js) client ● OpenSearch .NET clients ● OpenSearch Go client ● OpenSearch PHP client
  • 96. OpenSearch Client for .NET ● OpenSearch.Net ○ Low level client ● OpenSearch.Client ○ High level client ● Sample code: Link
  • 97. Exercise ● Create a .NET application ● Add a document to OpenSearch using the .NET Application ○ OpenSearch.Client (.NET High level client)
  • 99. Beats ● Data shippers ● Agents on servers ● Send data to ES/ Logstash
  • 100. Grafana ● An open source visualization tool ● Various sources can be used as data source: ○ InfluxDB ○ MySQL ○ ElasticSearch ○ PostgreSQL ● Better suited for metrics visualizations ● Does not allow full text data querying
  • 101. Logstash ● Free/ Open-Source ● Data processing pipeline ● Ingests data from multitude of sources ● Transforms it ● Sends it to your favorite stash
  • 102. Logstash - Ingestion ● Data of all shapes/ sizes/ source ○ Can be ingested ● It can parse/ transform your data
  • 103. Logstash - Output ● ElasticSearch ● MongoDB ● S3 ● Etc. ● Link
  • 104. AWS OpenSearch Security ● Use multi-factor authentication (MFA) with each account. ● Use SSL/TLS to communicate with AWS resources. We recommend TLS 1.2 or later. ● Set up API and user activity logging with AWS CloudTrail. ● Use AWS encryption solutions, along with all default security controls within AWS services. ● Use advanced managed security services such as Amazon Macie, which assists in discovering and securing personal data that is stored in Amazon S3. ● If you require FIPS 140-2 validated cryptographic modules when accessing AWS through a command line interface or an API, use a FIPS endpoint.
  • 105. Summary ● OpenSearch ○ Open-source search solution ● Upcoming and supported by AWS ● Caters to most search use cases ○ Great query performance ● Powerful tools ● Community support
  • 106. Connect with me ● Trainings on various tech topics ● For any questions: ○ https://linkedin.com/in/coach4dev