Indexing, Query Optimization, the Query Optimizer

                                  Richard M Kreuter
                                       10gen Inc.
                                  richard@10gen.com


                                       July 27, 2010




MongoDB – Indexing and Query Optimiz(ation—er)
Indexing Basics




         Indexes are tree-structured sets of references to your
         documents.
         The query planner can employ indexes to efficiently enumerate
         and sort matching documents.
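
   As a rough illustration of the two points above (and not MongoDB's actual B-tree implementation), the sketch below models an index as a sorted list of (key, reference) entries and uses a binary search to enumerate a range in key order. The names buildIndex, lowerBound and rangeScan are hypothetical and exist only for this example.

```javascript
// Toy model of an index: a sorted list of { key, ref } entries, where ref
// is the position of the document in the collection array.
// Illustration only; a real index is a B-tree maintained by the server.
function buildIndex(docs, field) {
  return docs
    .map((doc, ref) => ({ key: doc[field], ref }))
    .filter((entry) => entry.key !== undefined)
    .sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : a.ref - b.ref));
}

// Return the first position in the index whose key is >= target.
function lowerBound(index, target) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].key < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Enumerate documents with min <= field <= max, already sorted by key.
function rangeScan(docs, index, min, max) {
  const out = [];
  for (let i = lowerBound(index, min); i < index.length && index[i].key <= max; i++) {
    out.push(docs[index[i].ref]);
  }
  return out;
}

const people = [
  { name: "c", age: 41 },
  { name: "a", age: 23 },
  { name: "d", age: 35 },
  { name: "b", age: 29 },
];
const ageIndex = buildIndex(people, "age");
const inRange = rangeScan(people, ageIndex, 25, 40);
console.log(inRange.map((p) => p.name)); // [ 'b', 'd' ], in age order
```

   The scan touches only the entries inside the range and returns them already sorted, which is why one structure can serve both matching and sorting.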




However, indexing strikes people as a gray art




         As is the case with relational systems, schema design and
         indexing go hand in hand...
         ... but you also need to know about your actual (not just
         predicted) query patterns.




Some indexing generalities




         A collection may have at most 64 indexes.
         A query may only use 1 index (at present).
         Indexes entail additional work on inserts, updates, deletes.




Creating Indexes
   The _id attribute is always indexed. Additional indexes can be
   created with ensureIndex():

      // Create an index on the user attribute
      db.collection.ensureIndex({ user : 1 })
      // Create a compound index on
      // the user and email attributes
      db.collection.ensureIndex({ user : 1, email : 1 })
      // Create an index on the favorites
      // attribute, will index all values in list
      db.collection.ensureIndex({ favorites : 1 })
      // Create a unique index on the user attribute
      db.collection.ensureIndex({user:1}, {unique:true})
      // Create an index in the background.
      db.collection.ensureIndex({user:1}, {background:true})

Index maintenance




   // Drops an index on x
   db.collection.dropIndex({x:1})
   // drops all indexes
   db.collection.dropIndexes()
   // Rebuild indexes (need for this will go away in 1.6)
   db.collection.reIndex()




Indexes are smart about data types and structures




         Indexes on attributes whose values are of different types in
         different documents can speed up queries by skipping
         documents where the relevant attribute isn’t of the
         appropriate type.
         Indexes on attributes whose values are lists will index each
         element, speeding up queries that look into these attributes.
         (You really want to do this for querying on tags.)
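
   To make the list case concrete, here is a minimal sketch of a multikey-style index, assuming a simple in-memory map from each value to the ids of the documents that contain it. The function name buildMultikeyIndex and the sample documents are made up for this illustration; the server's real structure is a B-tree.

```javascript
// Toy multikey index: one entry per array element, mapping value -> doc ids.
// Illustration only; names and data are invented for this sketch.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const value = doc[field];
    if (value === undefined) continue;
    // A scalar contributes a single key; an array contributes one key per element.
    const keys = Array.isArray(value) ? value : [value];
    for (const key of new Set(keys)) {
      if (!index.has(key)) index.set(key, []);
      index.get(key).push(doc._id);
    }
  }
  return index;
}

const posts = [
  { _id: 1, tags: ["mongodb", "indexing"] },
  { _id: 2, tags: ["indexing"] },
  { _id: 3, tags: "mongodb" },
  { _id: 4 },
];
const tagIndex = buildMultikeyIndex(posts, "tags");
console.log(tagIndex.get("mongodb"));  // [ 1, 3 ]
console.log(tagIndex.get("indexing")); // [ 1, 2 ]
```

   A lookup for a single tag becomes one map access instead of a walk over every document's tag list.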




When can indexes be used?


   In short, if you can envision how an index might be used, it
   probably can be. These will all use an index on x:
         db.collection.find( { x : 1 } )
         db.collection.find( { x : { $in : [1,2,3] } } )
         db.collection.find( { x : { $gt : 1 } } )
         db.collection.find( { x : /^a/ } )
         db.collection.count( { x : 2 } )
         // distinct() takes a key name first; the filter on x goes second
         db.collection.distinct( "y", { x : 2 } )
         db.collection.find().sort( { x : 1 } )
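
   The anchored regular expression /^a/ is the least obvious entry in that list. One way to picture it, sketched below under the assumption that keys are compared as plain JavaScript strings, is that a fixed prefix corresponds to a single contiguous key range, whereas an unanchored pattern such as /a/ does not. The helpers prefixRange and scanPrefix are hypothetical.

```javascript
// Sketch: why /^a/ can use an index on a string field while /a/ cannot.
// An anchored prefix corresponds to one contiguous key range [lower, upper).
function prefixRange(prefix) {
  // Upper bound: the prefix with its last UTF-16 code unit incremented.
  // Assumes a non-empty prefix whose last code unit is below 0xFFFF.
  const last = prefix.charCodeAt(prefix.length - 1);
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { lower: prefix, upper };
}

// Keys as they would appear in an index: sorted ascending.
const sortedKeys = ["alice", "amy", "bob", "carla", "dana"].sort();

function scanPrefix(keys, prefix) {
  const { lower, upper } = prefixRange(prefix);
  // A real index would seek to `lower` directly; filter keeps the sketch short.
  return keys.filter((k) => k >= lower && k < upper);
}

console.log(prefixRange("a"));            // { lower: 'a', upper: 'b' }
console.log(scanPrefix(sortedKeys, "a")); // [ 'alice', 'amy' ]
```

   Note that "carla" and "dana" contain an "a" but fall outside the range, which is the reason /a/ would have to examine every key.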




Trickier cases where indexes can be used




         db.collection.find({ x : 1 }).sort({ y : 1 })
         will use an index on y for sorting, if there’s no index on x.
         (For this sort of case, use a compound index on both x and y
         in that order.)
         db.collection.update( { x : 2 } , { x : 3 } )
         will use an index on x (but older MongoDB versions didn’t
         permit $inc and other modifiers on indexed fields).
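
   The advice to use a compound index on x and y, in that order, can be pictured with a small sketch. It assumes numeric x and y and models the index as an array sorted by x, then y; buildCompoundIndex and findXSortY are hypothetical names.

```javascript
// Sketch: entries of a compound index on { x: 1, y: 1 } are ordered by x,
// then by y. Restricting to one x value leaves a run already sorted by y.
function buildCompoundIndex(docs) {
  return docs
    .map((doc, ref) => ({ x: doc.x, y: doc.y, ref }))
    .sort((a, b) => a.x - b.x || a.y - b.y);
}

function findXSortY(docs, index, x) {
  // No separate sort step: index order within a single x value is y order.
  // (A real index would seek to the first entry for x instead of filtering.)
  return index.filter((e) => e.x === x).map((e) => docs[e.ref]);
}

const rows = [
  { x: 1, y: 30 },
  { x: 2, y: 5 },
  { x: 1, y: 10 },
  { x: 1, y: 20 },
  { x: 2, y: 1 },
];
const xyIndex = buildCompoundIndex(rows);
console.log(findXSortY(rows, xyIndex, 1).map((d) => d.y)); // [ 10, 20, 30 ]
```

   With the fields in the opposite order (y, then x), the matching entries for one x value would be scattered through the index rather than adjacent.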




Some array examples



   The following queries will use an index on x, and will match
   documents whose x attribute is the array [2,10]:
         db.collection.find({ x : 2 })
         db.collection.find({ x : 10 })
         db.collection.find({ x : { $gt : 5 } })
         db.collection.find({ x : [2,10] })
         db.collection.find({ x : { $in : [2,5] } })
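
   Why all five queries match the same document may be easier to see in code. The sketch below is a simplified matcher, not the server's implementation: it assumes a condition holds for an array field if any element satisfies it, or if the whole array equals the query value, and it handles only the operators used above. All function names are invented for this example.

```javascript
// Simplified matching rules for a field that may hold an array.
function sameArray(a, b) {
  return Array.isArray(a) && Array.isArray(b) &&
    a.length === b.length && a.every((v, i) => v === b[i]);
}

function matchesScalar(value, cond) {
  if (cond !== null && typeof cond === "object" && !Array.isArray(cond)) {
    if ("$gt" in cond) return value > cond.$gt;
    if ("$in" in cond) return cond.$in.includes(value);
    return false; // operator not handled in this sketch
  }
  return value === cond;
}

function matchesField(value, cond) {
  if (Array.isArray(value)) {
    if (sameArray(value, cond)) return true;            // whole-array equality
    return value.some((el) => matchesScalar(el, cond)); // any element matches
  }
  return matchesScalar(value, cond);
}

const xValue = [2, 10];
const conditions = [2, 10, { $gt: 5 }, [2, 10], { $in: [2, 5] }];
console.log(conditions.map((c) => matchesField(xValue, c)));
// [ true, true, true, true, true ]
```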




Geospatial indexes


   Geospatial indexes are a sort of special case; the operators that can
   take advantage of them can only be used if the relevant indexes
   have been created. Some examples:
         db.collection.find({ a : [50, 50] }) finds a
         document with this point for a.
         db.collection.find({ a : { $near : [50, 50] } })
         sorts results by distance.
         db.collection.find({ a : { $within : { $box : [[40,40],[60,60]] } } })
         db.collection.find({ a : { $within : { $center : [[50,50],10] } } })
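
   For readers unfamiliar with these operators, the following sketch shows the flat-plane geometry they express: $box takes two opposite corners, $center takes a center point and a radius, and $near orders by distance. It is plain JavaScript with hypothetical helper names, and its inclusive treatment of boundary points is an assumption of the sketch rather than a statement about the server.

```javascript
// Containment tests corresponding to $box and $center on a flat plane.
function withinBox(point, box) {
  const [[minX, minY], [maxX, maxY]] = box;
  const [px, py] = point;
  return px >= minX && px <= maxX && py >= minY && py <= maxY;
}

function withinCenter(point, center, radius) {
  const dx = point[0] - center[0];
  const dy = point[1] - center[1];
  // Compare squared distances to avoid a square root.
  return dx * dx + dy * dy <= radius * radius;
}

// Order points by distance from a location, as $near does for its results.
function nearest(points, location) {
  const dist2 = (p) => (p[0] - location[0]) ** 2 + (p[1] - location[1]) ** 2;
  return [...points].sort((a, b) => dist2(a) - dist2(b));
}

const pts = [[50, 50], [58, 58], [45, 52], [70, 70]];
console.log(pts.filter((p) => withinBox(p, [[40, 40], [60, 60]])));
// [ [ 50, 50 ], [ 58, 58 ], [ 45, 52 ] ]
console.log(pts.filter((p) => withinCenter(p, [50, 50], 10)));
// [ [ 50, 50 ], [ 45, 52 ] ]
console.log(nearest(pts, [50, 50]));
// [ [ 50, 50 ], [ 45, 52 ], [ 58, 58 ], [ 70, 70 ] ]
```

   The point [58, 58] lies inside the box but outside the circle, which shows that the two queries above are not interchangeable.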



When indexes cannot be used



         Many sorts of negations, e.g., $ne, $not.
         Tricky arithmetic, e.g., $mod.
         Most regular expressions (e.g., /a/).
         Expressions in $where clauses don’t take advantage of indexes.
         map/reduce can’t take advantage of indexes (mapping
         function is opaque to the query optimizer).
   As a rule, if you can’t imagine how an index might be used, it
   probably can’t!




Schema/index relationships

   Sometimes the question isn’t “given the shape of these documents,
   how do I index them?”, but “how might I shape the data so I can
   take advantage of indexing?”

   // Consider a schema that uses a list of
   // attribute/value pairs:
   db.c.insert({ product : "SuperDooHickey",
                 attribs : [ { stock : 50,
                               price : 29.95,
                               ... } ] });
   db.c.ensureIndex({ attribs : 1 });
   // All attribute queries can use one index.
   db.c.find( { attribs : { stock : { $gt : 0 } } } )


Index sizes



   Of course, indexes take up space. For many interesting databases,
   real query performance will depend on index sizes; so it’s useful to
   see these numbers.
         db.collection.stats() shows indexSizes, the size of
         each index in the collection.
         db.collection.totalIndexSize() returns the combined size of
         all indexes in the collection.




explain()

   It’s useful to be able to ensure that your query is doing what you
   want it to do. For this, we have explain(). Query plans that use
   an index have cursor type BtreeCursor.

   db.collection.find({x:{$gt:5}}).explain()
   {
       "cursor" : "BtreeCursor x_1",
       ...
       "nscanned" : 12345,
       ...
       "n" : 100,
       "millis" : 4,
       ...
   }


explain(), continued

   If the query plan doesn’t use the index, the cursor type will be
   BasicCursor.

   db.collection.find({x:{$gt:5}}).explain()
   {
       "cursor" : "BasicCursor",
       ...
       "nscanned" : 12345,
       ...
       "n" : 42,
       "millis" : 4,
       ...
   }
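
   When checking many queries, it can help to reduce an explain() document to the few facts that matter. The helper below is a sketch that reads only the cursor, nscanned and n fields shown in the outputs above; the name summarizeExplain and the shape of its return value are assumptions made for this example, not part of any MongoDB API.

```javascript
// Summarize an explain() document of the shape shown in the slides.
function summarizeExplain(plan) {
  const usesIndex = plan.cursor.startsWith("BtreeCursor");
  return {
    usesIndex,
    // The index name follows "BtreeCursor " when an index was used.
    indexName: usesIndex ? plan.cursor.slice("BtreeCursor ".length) : null,
    // Entries examined per document returned; values near 1 are ideal.
    scannedPerReturned: plan.n > 0 ? plan.nscanned / plan.n : null,
  };
}

// The two example outputs, with the elided fields left out.
const indexedPlan = { cursor: "BtreeCursor x_1", nscanned: 12345, n: 100, millis: 4 };
const unindexedPlan = { cursor: "BasicCursor", nscanned: 12345, n: 42, millis: 4 };

console.log(summarizeExplain(indexedPlan));
console.log(summarizeExplain(unindexedPlan));
```

   A high ratio of nscanned to n suggests the query examines far more entries than it returns, even when an index is in use.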


Query Optimizer




         MongoDB’s query optimizer is empirical, not cost-based.
         To test query plans, it tries several in parallel, and records the
         plan that finishes fastest.
         If a plan’s performance changes over time (e.g., as data
         changes), the database will reoptimize (i.e., retry all possible
         plans).
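
   The empirical approach can be simulated in a few lines. The sketch below is a loose model and not the server's code: it assumes each candidate plan is a generator that yields once per unit of work, advances the plans in lockstep, and records the first plan to finish in a hypothetical plan cache. Re-optimization when performance drifts is not modeled.

```javascript
// Candidate plans as generators: each yield is one unit of work.
function* collectionScan(docs, pred) {
  const out = [];
  for (const doc of docs) {
    if (pred(doc)) out.push(doc);
    yield; // one document examined
  }
  return out;
}

function* indexScan(docs, refs, pred) {
  const out = [];
  for (const ref of refs) {
    if (pred(docs[ref])) out.push(docs[ref]);
    yield; // one index entry followed
  }
  return out;
}

// Advance all plans round-robin; the first to finish wins.
function racePlans(plans) {
  const running = plans.map(({ name, start }) => ({ name, it: start(), work: 0 }));
  for (;;) {
    for (const plan of running) {
      const step = plan.it.next();
      if (step.done) return { winner: plan.name, work: plan.work, results: step.value };
      plan.work++;
    }
  }
}

const docs = Array.from({ length: 1000 }, (_, i) => ({ _id: i, x: i % 100 }));
const pred = (d) => d.x === 7;
// Positions an index on x would return for x == 7 (precomputed for brevity).
const refsForX7 = docs.filter(pred).map((d) => d._id);

const planCache = new Map(); // hypothetical cache keyed by query shape
const outcome = racePlans([
  { name: "BasicCursor", start: () => collectionScan(docs, pred) },
  { name: "BtreeCursor x_1", start: () => indexScan(docs, refsForX7, pred) },
]);
planCache.set("{ x : <equality> }", outcome.winner);
console.log(outcome.winner, outcome.work, outcome.results.length);
// BtreeCursor x_1 10 10
```

   The index plan finishes after 10 units of work, while the collection scan would have needed 1000, so the index plan is the one remembered for this query shape.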




Hinting the query plan




   Sometimes, you might want to force the query plan. For this, we
   have hint().

   // Force the use of an index on attribute x:
   db.collection.find({x: 1, ...}).hint({x:1})
   // Force indexes to be avoided!
   db.collection.find({x: 1, ...}).hint({$natural:1})




Going forward



         www.mongodb.org — downloads, docs, community
         mongodb-user@googlegroups.com — mailing list
         #mongodb on irc.freenode.net
         try.mongodb.org — web-based shell
         10gen is hiring. Email jobs@10gen.com.
         10gen offers support, training, and advising services for
         MongoDB.




   MongoDB – Indexing and Query Optimiz(ation—er)

More Related Content

What's hot

Indexing documents
Indexing documentsIndexing documents
Indexing documentsMongoDB
 
Python dictionary : past, present, future
Python dictionary: past, present, futurePython dictionary: past, present, future
Python dictionary : past, present, futuredelimitry
 
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.GeeksLab Odessa
 
MongoDB and Indexes - MUG Denver - 20160329
MongoDB and Indexes - MUG Denver - 20160329MongoDB and Indexes - MUG Denver - 20160329
MongoDB and Indexes - MUG Denver - 20160329Douglas Duncan
 
learn you some erlang - chap 9 to chap10
learn you some erlang - chap 9 to chap10learn you some erlang - chap 9 to chap10
learn you some erlang - chap 9 to chap10경미 김
 
Advanced Django ORM techniques
Advanced Django ORM techniquesAdvanced Django ORM techniques
Advanced Django ORM techniquesDaniel Roseman
 
Encontra presentation
Encontra presentationEncontra presentation
Encontra presentationRicardo Dias
 
PostgreSQL Modules Tutorial - chkpass, hstore, fuzzystrmach, isn
PostgreSQL Modules Tutorial - chkpass, hstore, fuzzystrmach, isn PostgreSQL Modules Tutorial - chkpass, hstore, fuzzystrmach, isn
PostgreSQL Modules Tutorial - chkpass, hstore, fuzzystrmach, isn Sagar Arlekar
 
Big Data LDN 2017: From Zero to AI in 30 Minutes
Big Data LDN 2017: From Zero to AI in 30 MinutesBig Data LDN 2017: From Zero to AI in 30 Minutes
Big Data LDN 2017: From Zero to AI in 30 MinutesMatt Stubbs
 
11. session 11 functions and objects
11. session 11   functions and objects11. session 11   functions and objects
11. session 11 functions and objectsPhúc Đỗ
 
Webinar: Indexing and Query Optimization
Webinar: Indexing and Query OptimizationWebinar: Indexing and Query Optimization
Webinar: Indexing and Query OptimizationMongoDB
 
1403 app dev series - session 5 - analytics
1403   app dev series - session 5 - analytics1403   app dev series - session 5 - analytics
1403 app dev series - session 5 - analyticsMongoDB
 
High Performance GPU computing with Ruby, Rubykaigi 2018
High Performance GPU computing with Ruby, Rubykaigi 2018High Performance GPU computing with Ruby, Rubykaigi 2018
High Performance GPU computing with Ruby, Rubykaigi 2018Prasun Anand
 
Spring data presentation
Spring data presentationSpring data presentation
Spring data presentationOleksii Usyk
 
Java script objects 1
Java script objects 1Java script objects 1
Java script objects 1H K
 

What's hot (20)

Indexing documents
Indexing documentsIndexing documents
Indexing documents
 
Apex collection patterns
Apex collection patternsApex collection patterns
Apex collection patterns
 
Python dictionary : past, present, future
Python dictionary: past, present, futurePython dictionary: past, present, future
Python dictionary : past, present, future
 
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
 
MongoDB and Indexes - MUG Denver - 20160329
MongoDB and Indexes - MUG Denver - 20160329MongoDB and Indexes - MUG Denver - 20160329
MongoDB and Indexes - MUG Denver - 20160329
 
learn you some erlang - chap 9 to chap10
learn you some erlang - chap 9 to chap10learn you some erlang - chap 9 to chap10
learn you some erlang - chap 9 to chap10
 
ORM in Django
ORM in DjangoORM in Django
ORM in Django
 
Advanced Django ORM techniques
Advanced Django ORM techniquesAdvanced Django ORM techniques
Advanced Django ORM techniques
 
Encontra presentation
Encontra presentationEncontra presentation
Encontra presentation
 
Django Pro ORM
Django Pro ORMDjango Pro ORM
Django Pro ORM
 
PostgreSQL Modules Tutorial - chkpass, hstore, fuzzystrmach, isn
PostgreSQL Modules Tutorial - chkpass, hstore, fuzzystrmach, isn PostgreSQL Modules Tutorial - chkpass, hstore, fuzzystrmach, isn
PostgreSQL Modules Tutorial - chkpass, hstore, fuzzystrmach, isn
 
MongoDB
MongoDB MongoDB
MongoDB
 
Big Data LDN 2017: From Zero to AI in 30 Minutes
Big Data LDN 2017: From Zero to AI in 30 MinutesBig Data LDN 2017: From Zero to AI in 30 Minutes
Big Data LDN 2017: From Zero to AI in 30 Minutes
 
11. session 11 functions and objects
11. session 11   functions and objects11. session 11   functions and objects
11. session 11 functions and objects
 
Webinar: Indexing and Query Optimization
Webinar: Indexing and Query OptimizationWebinar: Indexing and Query Optimization
Webinar: Indexing and Query Optimization
 
Base r
Base rBase r
Base r
 
1403 app dev series - session 5 - analytics
1403   app dev series - session 5 - analytics1403   app dev series - session 5 - analytics
1403 app dev series - session 5 - analytics
 
High Performance GPU computing with Ruby, Rubykaigi 2018
High Performance GPU computing with Ruby, Rubykaigi 2018High Performance GPU computing with Ruby, Rubykaigi 2018
High Performance GPU computing with Ruby, Rubykaigi 2018
 
Spring data presentation
Spring data presentationSpring data presentation
Spring data presentation
 
Java script objects 1
Java script objects 1Java script objects 1
Java script objects 1
 

Similar to MongoDB – Indexing and Query Optimiz(ation—er

Indexing and Query Optimizer (Mongo Austin)
Indexing and Query Optimizer (Mongo Austin)Indexing and Query Optimizer (Mongo Austin)
Indexing and Query Optimizer (Mongo Austin)MongoDB
 
Indexing Strategies to Help You Scale
Indexing Strategies to Help You ScaleIndexing Strategies to Help You Scale
Indexing Strategies to Help You ScaleMongoDB
 
Indexing with MongoDB
Indexing with MongoDBIndexing with MongoDB
Indexing with MongoDBMongoDB
 
unit 4,Indexes in database.docx
unit 4,Indexes in database.docxunit 4,Indexes in database.docx
unit 4,Indexes in database.docxRaviRajput416403
 
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & AggregationWebinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & AggregationMongoDB
 
Indexing and Query Optimisation
Indexing and Query OptimisationIndexing and Query Optimisation
Indexing and Query OptimisationMongoDB
 
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesBack to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesMongoDB
 
Fast querying indexing for performance (4)
Fast querying   indexing for performance (4)Fast querying   indexing for performance (4)
Fast querying indexing for performance (4)MongoDB
 
Mongo db a deep dive of mongodb indexes
Mongo db  a deep dive of mongodb indexesMongo db  a deep dive of mongodb indexes
Mongo db a deep dive of mongodb indexesRajesh Kumar
 
Indexing and Query Optimization
Indexing and Query OptimizationIndexing and Query Optimization
Indexing and Query OptimizationMongoDB
 
SH 2 - SES 3 - MongoDB Aggregation Framework.pptx
SH 2 - SES 3 -  MongoDB Aggregation Framework.pptxSH 2 - SES 3 -  MongoDB Aggregation Framework.pptx
SH 2 - SES 3 - MongoDB Aggregation Framework.pptxMongoDB
 
10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data ModelingDATAVERSITY
 
Mongo Performance Optimization Using Indexing
Mongo Performance Optimization Using IndexingMongo Performance Optimization Using Indexing
Mongo Performance Optimization Using IndexingChinmay Naik
 
Indexing and Query Optimisation
Indexing and Query OptimisationIndexing and Query Optimisation
Indexing and Query OptimisationMongoDB
 
Introduction To MongoDB
Introduction To MongoDBIntroduction To MongoDB
Introduction To MongoDBElieHannouch
 

Similar to MongoDB – Indexing and Query Optimiz(ation—er (20)

Indexing and Query Optimizer (Mongo Austin)
Indexing and Query Optimizer (Mongo Austin)Indexing and Query Optimizer (Mongo Austin)
Indexing and Query Optimizer (Mongo Austin)
 
Indexing Strategies to Help You Scale
Indexing Strategies to Help You ScaleIndexing Strategies to Help You Scale
Indexing Strategies to Help You Scale
 
Indexing with MongoDB
Indexing with MongoDBIndexing with MongoDB
Indexing with MongoDB
 
unit 4,Indexes in database.docx
unit 4,Indexes in database.docxunit 4,Indexes in database.docx
unit 4,Indexes in database.docx
 
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & AggregationWebinar: Applikationsentwicklung mit MongoDB: Teil 5: Reporting & Aggregation
Webinar: Applikationsentwicklung mit MongoDB : Teil 5: Reporting & Aggregation
 
Indexing and Query Optimisation
Indexing and Query OptimisationIndexing and Query Optimisation
Indexing and Query Optimisation
 
Indexing
IndexingIndexing
Indexing
 
Query Optimization in MongoDB
Query Optimization in MongoDBQuery Optimization in MongoDB
Query Optimization in MongoDB
 
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial IndexesBack to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
Back to Basics Webinar 4: Advanced Indexing, Text and Geospatial Indexes
 
Nosql part 2
Nosql part 2Nosql part 2
Nosql part 2
 
Fast querying indexing for performance (4)
Fast querying   indexing for performance (4)Fast querying   indexing for performance (4)
Fast querying indexing for performance (4)
 
Mongo db a deep dive of mongodb indexes
Mongo db  a deep dive of mongodb indexesMongo db  a deep dive of mongodb indexes
Mongo db a deep dive of mongodb indexes
 
Mongo db queries
Mongo db queriesMongo db queries
Mongo db queries
 
Indexing and Query Optimization
Indexing and Query OptimizationIndexing and Query Optimization
Indexing and Query Optimization
 
SH 2 - SES 3 - MongoDB Aggregation Framework.pptx
SH 2 - SES 3 -  MongoDB Aggregation Framework.pptxSH 2 - SES 3 -  MongoDB Aggregation Framework.pptx
SH 2 - SES 3 - MongoDB Aggregation Framework.pptx
 
10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling10gen Presents Schema Design and Data Modeling
10gen Presents Schema Design and Data Modeling
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
Mongo Performance Optimization Using Indexing
Mongo Performance Optimization Using IndexingMongo Performance Optimization Using Indexing
Mongo Performance Optimization Using Indexing
 
Indexing and Query Optimisation
Indexing and Query OptimisationIndexing and Query Optimisation
Indexing and Query Optimisation
 
Introduction To MongoDB
Introduction To MongoDBIntroduction To MongoDB
Introduction To MongoDB
 

More from MongoDB

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump StartMongoDB
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB
 

More from MongoDB (20)

MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB AtlasMongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
MongoDB SoCal 2020: Migrate Anything* to MongoDB Atlas
 
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
MongoDB SoCal 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
MongoDB SoCal 2020: Using MongoDB Services in Kubernetes: Any Platform, Devel...
 
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDBMongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
MongoDB SoCal 2020: A Complete Methodology of Data Modeling for MongoDB
 
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
MongoDB SoCal 2020: From Pharmacist to Analyst: Leveraging MongoDB for Real-T...
 
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series DataMongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
MongoDB SoCal 2020: Best Practices for Working with IoT and Time-series Data
 
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 MongoDB SoCal 2020: MongoDB Atlas Jump Start MongoDB SoCal 2020: MongoDB Atlas Jump Start
MongoDB SoCal 2020: MongoDB Atlas Jump Start
 
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
MongoDB .local San Francisco 2020: Powering the new age data demands [Infosys]
 
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
MongoDB .local San Francisco 2020: Using Client Side Encryption in MongoDB 4.2
 
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
MongoDB .local San Francisco 2020: Using MongoDB Services in Kubernetes: any ...
 
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
MongoDB .local San Francisco 2020: Go on a Data Safari with MongoDB Charts!
 
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your MindsetMongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
MongoDB .local San Francisco 2020: From SQL to NoSQL -- Changing Your Mindset
 
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas JumpstartMongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
MongoDB .local San Francisco 2020: MongoDB Atlas Jumpstart
 
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
MongoDB .local San Francisco 2020: Tips and Tricks++ for Querying and Indexin...
 
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
MongoDB .local San Francisco 2020: Aggregation Pipeline Power++
 
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
MongoDB .local San Francisco 2020: A Complete Methodology of Data Modeling fo...
 
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep DiveMongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
MongoDB .local San Francisco 2020: MongoDB Atlas Data Lake Technical Deep Dive
 
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & GolangMongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
MongoDB .local San Francisco 2020: Developing Alexa Skills with MongoDB & Golang
 
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
MongoDB .local Paris 2020: Realm : l'ingrédient secret pour de meilleures app...
 
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
MongoDB .local Paris 2020: Upply @MongoDB : Upply : Quand le Machine Learning...
 

MongoDB – Indexing and Query Optimiz(ation—er

  • 1. Indexing,Query Optimization, the Query Optimizer Richard M Kreuter 10gen Inc. richard@10gen.com July 27, 2010 MongoDB – Indexing and Query Optimiz(ation—er)
  • 2. Indexing Basics Indexes are tree-structured sets of references to your documents. The query planner can employ indexes to efficiently enumerate and sort matching documents. MongoDB – Indexing and Query Optimiz(ation—er)
  • 3. However, indexing strikes people as a gray art
       As is the case with relational systems, schema design and indexing go hand in hand...
       ... but you also need to know about your actual (not just predicted) query patterns.
  • 4. Some indexing generalities
       A collection may have at most 64 indexes.
       A query may only use 1 index (at present).
       Indexes entail additional work on inserts, updates, deletes.
  • 5. Creating Indexes
       The _id attribute is always indexed. Additional indexes can be created with ensureIndex():
         // Create an index on the user attribute
         db.collection.ensureIndex({ user : 1 })
         // Create a compound index on the user and email attributes
         db.collection.ensureIndex({ user : 1, email : 1 })
         // Create an index on the favorites attribute; this indexes every value in the list
         db.collection.ensureIndex({ favorites : 1 })
         // Create a unique index on the user attribute
         db.collection.ensureIndex({user:1}, {unique:true})
         // Create an index in the background
         db.collection.ensureIndex({user:1}, {background:true})
  • 6. Index maintenance
         // Drop an index on x
         db.collection.dropIndex({x:1})
         // Drop all indexes
         db.collection.dropIndexes()
         // Rebuild indexes (need for this will go away in 1.6)
         db.collection.reIndex()
  • 7. Indexes are smart about data types and structures
       Indexes on attributes whose values are of different types in different documents can speed up queries by skipping documents where the relevant attribute isn’t of the appropriate type.
       Indexes on attributes whose values are lists will index each element, speeding up queries that look into these attributes. (You really want to do this for querying on tags.)
  • 8. When can indexes be used?
       In short, if you can envision how the index might get used, it probably will be. These will all use an index on x:
         db.collection.find( { x: 1 } )
         db.collection.find( { x :{ $in : [1,2,3] } } )
         db.collection.find( { x : { $gt : 1 } } )
         db.collection.find( { x : /^a/ } )
         db.collection.count( { x : 2 } )
         db.collection.distinct( "x" )
         db.collection.find().sort( { x : 1 } )
  • 9. Trickier cases where indexes can be used
         db.collection.find({ x : 1 }).sort({ y : 1 })
       will use an index on y for sorting, if there’s no index on x. (For this sort of case, use a compound index on both x and y in that order.)
         db.collection.update( { x : 2 } , { x : 3 } )
       will use an index on x (but older MongoDB versions didn’t permit $inc and other modifiers on indexed fields).
  • 10. Some array examples
       The following queries will use an index on x, and will match documents whose x attribute is the array [2,10]:
         db.collection.find({ x : 2 })
         db.collection.find({ x : 10 })
         db.collection.find({ x : { $gt : 5 } })
         db.collection.find({ x : [2,10] })
         db.collection.find({ x : { $in : [2,5] }})
  • 11. Geospatial indexes
       Geospatial indexes are a sort of special case; the operators that can take advantage of them can only be used if the relevant indexes have been created. Some examples:
         db.collection.find({ a : [50, 50]})
       finds a document with this point for a.
         db.collection.find({a : {$near : [50, 50]}})
       sorts results by distance.
         db.collection.find({ a:{$within:{$box:[[40,40],[60,60]]}}})
         db.collection.find({ a:{$within:{$center:[[50,50],10]}}})
  • 12. When indexes cannot be used
       Many sorts of negations, e.g., $ne, $not.
       Tricky arithmetic, e.g., $mod.
       Most regular expressions (e.g., /a/).
       Expressions in $where clauses don’t take advantage of indexes.
       map/reduce can’t take advantage of indexes (mapping function is opaque to the query optimizer).
       As a rule, if you can’t imagine how an index might be used, it probably can’t!
  • 13. Schema/index relationships
       Sometimes the question isn’t “given the shape of these documents, how do I index them?”, but “how might I shape the data so I can take advantage of indexing?”
         // Consider a schema that uses a list of attribute/value pairs:
         db.c.insert({ product : "SuperDooHickey",
                       attribs : [ { stock : 50, price : 29.95, ... } ] });
         db.c.ensureIndex({ attribs : 1 });
         // All attribute queries can use one index.
         db.c.find( { attribs : { stock : { $gt : 0 } } } )
  • 14. Index sizes
       Of course, indexes take up space. For many interesting databases, real query performance will depend on index sizes; so it’s useful to see these numbers.
       db.collection.stats() shows indexSizes, the size of each index in the collection.
       db.collection.totalIndexSize() displays the size of all indexes in the collection.
  • 15. explain()
       It’s useful to be able to ensure that your query is doing what you want it to do. For this, we have explain(). Query plans that use an index have cursor type BtreeCursor.
         db.collection.find({x:{$gt:5}}).explain()
         {
           "cursor" : "BtreeCursor x_1",
           ...
           "nscanned" : 12345,
           ...
           "n" : 100,
           "millis" : 4,
           ...
         }
  • 16. explain(), continued
       If the query plan doesn’t use the index, the cursor type will be BasicCursor.
         db.collection.find({x:{$gt:5}}).explain()
         {
           "cursor" : "BasicCursor",
           ...
           "nscanned" : 12345,
           ...
           "n" : 42,
           "millis" : 4,
           ...
         }
  • 17. Query Optimizer
       MongoDB’s query optimizer is empirical, not cost-based.
       To test query plans, it tries several in parallel, and records the plan that finishes fastest.
       If a plan’s performance changes over time (e.g., as data changes), the database will reoptimize (i.e., retry all possible plans).
  • 18. Hinting the query plan
       Sometimes, you might want to force the query plan. For this, we have hint().
         // Force the use of an index on attribute x:
         db.collection.find({x: 1, ...}).hint({x:1})
         // Force indexes to be avoided!
         db.collection.find({x: 1, ...}).hint({$natural:1})
  • 19. Going forward
       www.mongodb.org: downloads, docs, community
       mongodb-user@googlegroups.com: mailing list
       #mongodb on irc.freenode.net
       try.mongodb.org: web-based shell
       10gen is hiring. Email jobs@10gen.com.
       10gen offers support, training, and advising services for MongoDB.
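
Slides 7 and 10 say that an index on a list-valued attribute indexes each element. The following is a minimal sketch of that idea in plain JavaScript, runnable without a database. It is a toy model, not MongoDB's implementation: the names `buildIndex` and `lookup` and the sample documents are assumptions made for illustration, and the exact-array query from slide 10 is left out.

```javascript
// Toy model of an index on one field (an illustration, not MongoDB internals).
// An array value contributes one index entry per element.
function buildIndex(docs, field) {
  const entries = [];
  for (const doc of docs) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) {
      entries.push({ key, id: doc._id });
    }
  }
  // Keep entries ordered by key, as a tree-structured index would.
  entries.sort((a, b) => a.key - b.key);
  return entries;
}

// Return the distinct ids of documents that have at least one matching key.
// A real index would seek into the tree; this sketch scans linearly for brevity.
function lookup(index, matches) {
  const ids = new Set();
  for (const entry of index) {
    if (matches(entry.key)) ids.add(entry.id);
  }
  return [...ids].sort((a, b) => a - b);
}

// Hypothetical sample data: document 1 has the array [2, 10] from slide 10.
const docs = [
  { _id: 1, x: [2, 10] },
  { _id: 2, x: 5 },
  { _id: 3, x: [1, 3] },
];
const index = buildIndex(docs, "x");

const eq2 = lookup(index, (k) => k === 2);             // like { x : 2 }
const eq10 = lookup(index, (k) => k === 10);           // like { x : 10 }
const gt5 = lookup(index, (k) => k > 5);               // like { x : { $gt : 5 } }
const in25 = lookup(index, (k) => [2, 5].includes(k)); // like { x : { $in : [2,5] } }

console.log(eq2, eq10, gt5, in25);
```

Document 1 is found by every one of these lookups because each of its elements has its own entry, which is why the slide recommends this for tag-style attributes.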
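
Slide 9 recommends a compound index on x and y, in that order, for a query that filters on x and sorts by y. The sketch below suggests why the order matters, again as a toy model in plain JavaScript rather than MongoDB's actual code. The function names and sample documents are hypothetical.

```javascript
// Toy compound index on (x, y): entries are ordered by x, then by y.
function buildCompoundIndex(docs) {
  return docs
    .map((doc) => ({ x: doc.x, y: doc.y, id: doc._id }))
    .sort((a, b) => a.x - b.x || a.y - b.y);
}

// Emulates find({ x : value }).sort({ y : 1 }) by walking the index in order.
// Entries with the same x sit next to each other and are already ordered by y,
// so no separate sort step is needed.
function findByXSortedByY(index, value) {
  const ids = [];
  for (const entry of index) {
    if (entry.x === value) ids.push(entry.id);
  }
  return ids;
}

// Hypothetical sample data.
const sampleDocs = [
  { _id: 1, x: 1, y: 30 },
  { _id: 2, x: 2, y: 5 },
  { _id: 3, x: 1, y: 10 },
  { _id: 4, x: 1, y: 20 },
  { _id: 5, x: 3, y: 1 },
];

const compoundIndex = buildCompoundIndex(sampleDocs);
const sortedIds = findByXSortedByY(compoundIndex, 1);
console.log(sortedIds);
```

With the fields in the opposite order, (y, x), the entries for a given x would be scattered through the index, so the same walk could not stop at a contiguous range.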
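
Slide 17 describes the optimizer as empirical: it tries several plans in parallel and records the one that finishes fastest. A rough way to picture this is to interleave candidate plans one unit of work at a time and keep whichever completes first. The sketch below does that with generators. It only illustrates the idea under simplifying assumptions: the plan functions, the round-robin scheduling, and the sample data are all hypothetical and say nothing about how the server actually schedules plans.

```javascript
// Each plan is a generator that yields once per document examined and
// returns the ids of the matching documents when it finishes.
function* collectionScan(docs, pred) {
  const out = [];
  for (const doc of docs) {
    yield;
    if (pred(doc)) out.push(doc._id);
  }
  return out;
}

// Stand-in for an index scan on x for the query x > lo.
function* indexScan(docsSortedByX, lo) {
  const out = [];
  for (const doc of docsSortedByX) {
    if (doc.x <= lo) continue; // a real index would seek here instead of skipping
    yield;
    out.push(doc._id);
  }
  return out;
}

// Advance every plan one step at a time; the first to finish wins.
function race(plans) {
  const running = plans.map(({ name, run }) => ({ name, it: run(), steps: 0 }));
  for (;;) {
    for (const plan of running) {
      const { done, value } = plan.it.next();
      if (done) return { winner: plan.name, steps: plan.steps, result: value };
      plan.steps += 1;
    }
  }
}

// Hypothetical data: 1000 documents with x = 0..999, queried for x > 989.
const allDocs = Array.from({ length: 1000 }, (_, i) => ({ _id: i, x: i }));
const byX = [...allDocs].sort((a, b) => a.x - b.x);

const outcome = race([
  { name: "BasicCursor", run: () => collectionScan(allDocs, (d) => d.x > 989) },
  { name: "BtreeCursor x_1", run: () => indexScan(byX, 989) },
]);
console.log(outcome.winner, outcome.steps, outcome.result.length);
```

Here the index plan examines only the ten matching documents and finishes long before the full scan, so it is the plan that would be remembered for this query shape.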