Indexing, Query Optimization, the Query
                 Optimizer — MongoPhilly

                                  Richard M Kreuter
                                       10gen Inc.
                                  richard@10gen.com


                                      April 26, 2011




MongoDB – Indexing and Query Optimiz(ation—er) — MongoPhilly
Indexing Basics




         Indexes are tree-structured sets of references to your
         documents.
         The query planner can employ indexes to efficiently enumerate
         and sort matching documents.
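
As a rough mental model (and not MongoDB's actual B-tree implementation), an index can be pictured as a list of (key, document reference) pairs kept in key order. The sketch below uses a plain sorted array and hypothetical helper names (`buildIndex`, `lowerBound`, `rangeScan`) to show why a range lookup can skip non-matching documents and return results already sorted.

```javascript
// Hypothetical sketch: a sorted array stands in for the B-tree.
const docs = [
  { _id: 0, x: 7 },
  { _id: 1, x: 2 },
  { _id: 2, x: 9 },
  { _id: 3, x: 4 },
];

// Build index entries [key, position in docs], ordered by key.
function buildIndex(collection, field) {
  return collection
    .map((doc, pos) => [doc[field], pos])
    .sort((a, b) => a[0] - b[0]);
}

// Binary search for the first entry whose key is >= the given key.
function lowerBound(index, key) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid][0] < key) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Return documents with min <= field <= max, in key order,
// touching only the index entries inside the range.
function rangeScan(collection, index, min, max) {
  const out = [];
  for (let i = lowerBound(index, min); i < index.length && index[i][0] <= max; i++) {
    out.push(collection[index[i][1]]);
  }
  return out;
}

const xIndex = buildIndex(docs, "x");
const found = rangeScan(docs, xIndex, 3, 8);
console.log(found.map((d) => d.x));
```

The same ordered structure serves both purposes named above: enumerating matches and handing them back in sorted order without a separate sort step.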




However, indexing strikes people as a gray art




         As is the case with relational systems, schema design and
         indexing go hand in hand...
         ... but you also need to know about your actual (not just
         predicted) query patterns.




Some indexing generalities




         A collection may have at most 64 indexes.
         A query may only use 1 index (except for disjuncts of $or
         queries).
         Indexes entail additional work on inserts, updates, deletes.




Creating Indexes
   The _id attribute is always indexed. Additional indexes can be
   created with ensureIndex():

      // Create an index on the user attribute
      db.collection.ensureIndex({ user : 1 })
      // Create a compound index on
      // the user and email attributes
      db.collection.ensureIndex({ user : 1, email : 1 })
      // Create an index on the favorites
      // attribute, will index all values in list
      db.collection.ensureIndex({ favorites : 1 })
      // Create a unique index on the user attribute
      db.collection.ensureIndex({user:1}, {unique:true})
      // Create an index in the background.
      db.collection.ensureIndex({user:1}, {background:true})

Index maintenance




   // Drop the index on x
   db.collection.dropIndex({x:1})
   // Drop all indexes (the _id index is kept)
   db.collection.dropIndexes()
   // Rebuild indexes (need for this reduced in 1.6)
   db.collection.reIndex()




Indexes are smart about data types and structures




         Indexes on attributes whose values are of different types in
         different documents can speed up queries by skipping
         documents where the relevant attribute isn’t of the
         appropriate type.
         Indexes on attributes whose values are lists will index each
         element, speeding up queries that look into these attributes.
         (You really want to do this for querying on tags.)
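
To picture what "index each element" means, here is a minimal sketch under simplifying assumptions. The function `multikeyEntries` and the sample `posts` data are hypothetical; the point is only that a list value produces one index entry per element, so a lookup by a single tag finds every document whose list contains it.

```javascript
// Hypothetical sketch of list indexing: one entry per array element.
const posts = [
  { _id: 1, tags: ["mongodb", "indexing"] },
  { _id: 2, tags: ["indexing"] },
  { _id: 3, tags: "mongodb" }, // a scalar value gets a single entry
];

function multikeyEntries(collection, field) {
  const entries = [];
  for (const doc of collection) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) entries.push({ key, id: doc._id });
  }
  // Keep entries in key order (ties broken by id), as an index would.
  return entries.sort((a, b) =>
    a.key < b.key ? -1 : a.key > b.key ? 1 : a.id - b.id
  );
}

const tagIndex = multikeyEntries(posts, "tags");

// Look up the ids of all documents carrying a given tag.
const idsForTag = (tag) =>
  tagIndex.filter((e) => e.key === tag).map((e) => e.id);

console.log(idsForTag("indexing"));
console.log(idsForTag("mongodb"));
```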




When can indexes be used?


   In short, if you can envision how an index might be used, it
   probably will be. These will all use an index on x:
         db.collection.find( { x : 1 } )
         db.collection.find( { x : { $in : [1,2,3] } } )
         db.collection.find( { x : { $gt : 1 } } )
         db.collection.find( { x : /^a/ } )
         db.collection.count( { x : 2 } )
         db.collection.distinct( "x" )
         db.collection.find().sort( { x : 1 } )




Trickier cases where indexes can be used




         db.collection.find({ x : 1 }).sort({ y : 1 })
         will use an index on y for sorting, if there’s no index on x.
         (For this sort of case, use a compound index on both x and y
         in that order.)
         db.collection.update( { x : 2 } , { x : 3 } )
         will use an index on x (but older MongoDB versions didn’t
         permit $inc and other modifiers on indexed fields).




Some array examples



   The following queries will use an index on x, and will match
   documents whose x attribute is the array [2,10]:
         db.collection.find({ x : 2 })
         db.collection.find({ x : 10 })
         db.collection.find({ x : { $gt : 5 } })
         db.collection.find({ x : [2,10] })
         db.collection.find({ x : { $in : [2,5] } })




Geospatial indexes


   Geospatial indexes are a sort of special case; the operators that can
   take advantage of them can only be used if the relevant indexes
   have been created. Some examples:
         db.collection.find({ a : [50, 50] }) finds a
         document with this point for a.
         db.collection.find({ a : { $near : [50, 50] } })
         sorts results by distance.
         db.collection.find({
         a : { $within : { $box : [[40,40],[60,60]] } } })
         db.collection.find({
         a : { $within : { $center : [[50,50],10] } } })
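
For readers new to these operators, the following sketch models what the three queries above select, assuming flat two-dimensional coordinates. The helper names (`near`, `withinBox`, `withinCenter`) and the sample points are hypothetical, and the code filters a plain array; it says nothing about how the geospatial index itself is organized.

```javascript
// Hypothetical helpers modelling the flat 2d meaning of the operators above.
const points = [
  { _id: 1, a: [50, 50] },
  { _id: 2, a: [45, 58] },
  { _id: 3, a: [70, 20] },
];

// Euclidean distance between two [x, y] points.
const dist = (p, q) => Math.hypot(p[0] - q[0], p[1] - q[1]);

// $near: order documents by distance from the query point.
const near = (coll, q) =>
  [...coll].sort((m, n) => dist(m.a, q) - dist(n.a, q));

// $within with $box: [[minX, minY], [maxX, maxY]].
const withinBox = (coll, [[x0, y0], [x1, y1]]) =>
  coll.filter(({ a: [x, y] }) => x >= x0 && x <= x1 && y >= y0 && y <= y1);

// $within with $center: [[cx, cy], radius].
const withinCenter = (coll, [center, radius]) =>
  coll.filter((d) => dist(d.a, center) <= radius);

const ids = (coll) => coll.map((d) => d._id);
console.log(ids(near(points, [50, 50])));
console.log(ids(withinBox(points, [[40, 40], [60, 60]])));
console.log(ids(withinCenter(points, [[50, 50], 10])));
```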



When indexes cannot be used

         Many sorts of negations, e.g., $ne, $not.
         Tricky arithmetic, e.g., $mod.
         Most regular expressions (e.g., /a/).
         Expressions in $where clauses don’t take advantage of
         indexes.
                Of course, $where clauses are mostly for complex queries that
                often can’t be indexed anyway, e.g., "where a > b". (If
                these cases matter to you and you can precompute the match,
                you can store the result as an additional attribute,
                index it, and skip the $where clause entirely.)
         map/reduce can’t take advantage of indexes (mapping
         function is opaque to the query optimizer).
   As a rule, if you can’t imagine how an index might be used, it
   probably can’t be!
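
The precomputation idea mentioned for $where clauses can be sketched as follows. The field name `aGtB` and the function `withPrecomputedFlag` are illustrative assumptions, not names from the talk; the sketch works on plain objects rather than a live collection.

```javascript
// Hypothetical sketch: store the result of "a > b" as its own attribute.
function withPrecomputedFlag(doc) {
  // aGtB is an illustrative field name chosen for this example.
  return { ...doc, aGtB: doc.a > doc.b };
}

const raw = [
  { _id: 1, a: 5, b: 3 },
  { _id: 2, a: 1, b: 4 },
  { _id: 3, a: 2, b: 2 },
];

// Compute the flag once, at write time.
const stored = raw.map(withPrecomputedFlag);

// An equality match on the stored flag replaces the opaque expression,
// and equality matches are the kind of query an index can serve.
const matches = stored.filter((d) => d.aGtB === true);
console.log(matches.map((d) => d._id));
```

The trade-off is that the flag must be recomputed whenever a or b changes, which is extra work on every write.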
Never forget about compound indexes




         Whenever you’re querying on multiple attributes, whether as
         part of the selector document or in a sort(), compound
         indexes can be used.




Schema/index relationships
   Sometimes the question isn’t “given the shape of these documents,
   how do I index them?”, but “how might I shape the data so I can
   take advantage of indexing?”

   // Consider a schema that uses a list of
   // attribute/value pairs:
   db.c.insert({ product : "SuperDooHickey",
                 manufacturer : "Foo Enterprises",
                 catalog : [ { stock : 50,
                               modtime : "2010-09-02" },
                             { price : 29.95,
                               modtime : "2010-06-14" } ] });
   db.c.ensureIndex({ catalog : 1 });
   // All attribute queries can use one index.
   db.c.find( { catalog : { stock : { $gt : 0 } } } )

Sparse Indexes




   Sparse indexes are a new flavor of index that may be useful when
   you want to index a field that is present in only a smallish
   subset of a collection. A sparse index is created by passing
   { sparse : true } as an option when creating the index, and it only
   creates entries for documents that contain the field.
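
A small sketch may help show the difference in what gets indexed. The function `indexEntries` and the sample `users` data are hypothetical; the model assumes that a regular index records a null key for documents lacking the field, while a sparse index leaves those documents out.

```javascript
// Hypothetical sketch comparing regular and sparse index entries.
const users = [
  { _id: 1, name: "ann", twitter: "@ann" },
  { _id: 2, name: "bob" },                 // no twitter field at all
  { _id: 3, name: "cy", twitter: null },   // field present, value null
];

function indexEntries(collection, field, { sparse = false } = {}) {
  const entries = [];
  for (const doc of collection) {
    const present = Object.prototype.hasOwnProperty.call(doc, field);
    if (sparse && !present) continue; // sparse: skip documents lacking the field
    entries.push({ key: present ? doc[field] : null, id: doc._id });
  }
  return entries;
}

const regular = indexEntries(users, "twitter");
const sparse = indexEntries(users, "twitter", { sparse: true });
console.log(regular.length, sparse.length);
```

One consequence worth keeping in mind: a scan that relies only on the sparse index never sees the documents it omitted.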




Covered Indexes


   A covered index is an index from which a query’s results can be
   produced without needing to access full document records. So, for
   example, if you have an index on attributes foo and bar and you
   execute find({ bar : { $gt : 10 } },
   { foo : 1 , _id : 0 }), the results can be computed just by
   examining the index.
   Note that the _id attribute is not present in indexes by default, and
   so in order to take advantage of covered indexes, you’ll need to
   exclude it from a query’s projection argument or include it in the
   index explicitly.
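
The covering rule can be written down as a small check. This is a simplified sketch with a hypothetical function `isCovered`; it only compares field names and ignores other conditions a real server may impose (for example, list-valued fields).

```javascript
// Hypothetical check: can this query be answered from the index alone?
function isCovered(indexFields, queryFields, projection) {
  const inIndex = (f) => indexFields.includes(f);
  // Fields the projection explicitly asks for (value 1).
  const included = Object.keys(projection).filter((f) => projection[f] === 1);
  // Without an inclusion list, whole documents are returned.
  if (included.length === 0) return false;
  // _id comes back unless the projection explicitly excludes it.
  const returned =
    projection._id === 0 ? included : [...new Set([...included, "_id"])];
  return queryFields.every(inIndex) && returned.every(inIndex);
}

// Index on foo and bar, query on bar, project foo only.
console.log(isCovered(["foo", "bar"], ["bar"], { foo: 1, _id: 0 }));
// Same query, but _id is not excluded and not in the index.
console.log(isCovered(["foo", "bar"], ["bar"], { foo: 1 }));
// _id added to the index explicitly.
console.log(isCovered(["foo", "bar", "_id"], ["bar"], { foo: 1 }));
```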




Index sizes



   Of course, indexes take up space. For many interesting databases,
   real query performance will depend on index sizes; so it’s useful to
   see these numbers.
         db.collection.stats() shows indexSizes, the size of
         each index in the collection.
         db.collection.totalIndexSize() returns the combined size of
         all indexes in the collection.




explain()

   It’s useful to be able to ensure that your query is doing what you
   want it to do. For this, we have explain(). Query plans that use
   an index have cursor type BtreeCursor.

   db.collection.find({x:{$gt:5}}).explain()
   {
   "cursor" : "BtreeCursor x_1",
           ...
   "nscanned" : 12345,
           ...
   "n" : 100,
   "millis" : 4,
           ...
   }


explain(), continued

   If the query plan doesn’t use the index, the cursor type will be
   BasicCursor.

   db.collection.find({x:{$gt:5}}).explain()
   {
   "cursor" : "BasicCursor",
          ...
   "nscanned" : 12345,
           ...
   "n" : 42,
   "millis" : 4,
           ...
   }
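
When reading explain() output like the two examples above, two things are usually worth checking: whether the cursor is index-backed, and how many entries were scanned per result returned. The helper below is a hypothetical convenience (`summarizeExplain` is not a shell built-in) and reads only the fields shown on these slides.

```javascript
// Hypothetical helper that summarizes the explain() fields shown above.
function summarizeExplain(plan) {
  return {
    // Index-backed plans report a cursor type starting with "BtreeCursor".
    usesIndex: plan.cursor.startsWith("BtreeCursor"),
    // Entries or documents examined per result returned; closer to 1 is better.
    scannedPerResult: plan.n > 0 ? plan.nscanned / plan.n : Infinity,
  };
}

const indexed = summarizeExplain({
  cursor: "BtreeCursor x_1", nscanned: 12345, n: 100, millis: 4,
});
const unindexed = summarizeExplain({
  cursor: "BasicCursor", nscanned: 12345, n: 42, millis: 4,
});

console.log(indexed);
console.log(unindexed);
```

A high scanned-per-result ratio on an index-backed plan often suggests the index is not selective enough for the query, which is where compound indexes tend to help.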


Really, compound indexes are important

   Try this at home:
      1   Create a collection with a few tens of thousands of documents
          having two attributes (let’s call them a and b).
      2   Create a compound index on {a : 1, b : 1}.
      3   Do a db.collection.find({a : constant}).sort({b : 1}).explain().
      4   Note the explain result’s millis.
      5   Drop the compound index.
      6   Create another compound index with the attributes reversed.
          (This will be a suboptimal compound index.)
      7   Explain the above query again.
      8   The suboptimal index should produce a slower explain result.
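
Before running the experiment against a real server, it can help to see why the attribute order matters. The simulation below is a sketch under stated assumptions: sorted arrays stand in for the two compound indexes, `scanAB` and `scanBA` are hypothetical names, and the reversed-index plan is assumed to walk the whole index in b order while filtering on a (one plausible plan, not necessarily the one a server would pick).

```javascript
// Hypothetical simulation: count index entries examined for
// find({ a : target }).sort({ b : 1 }) under two compound indexes.
const N = 20000;
const coll = [];
for (let i = 0; i < N; i++) coll.push({ a: i % 100, b: (i * 7919) % N });

const byFields = (f, g) => (p, q) => p[f] - q[f] || p[g] - q[g];
const indexAB = [...coll].sort(byFields("a", "b")); // { a : 1, b : 1 }
const indexBA = [...coll].sort(byFields("b", "a")); // { b : 1, a : 1 }

// With { a : 1, b : 1 }: seek to the first entry for the target value of a,
// then read a contiguous run that is already ordered by b.
function scanAB(index, target) {
  let lo = 0;
  let hi = index.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (index[mid].a < target) lo = mid + 1;
    else hi = mid;
  }
  const out = [];
  for (let i = lo; i < index.length && index[i].a === target; i++) out.push(index[i]);
  return { scanned: out.length, out };
}

// With { b : 1, a : 1 }: entries are ordered by b, so every entry is
// examined and filtered on a to keep the b ordering.
function scanBA(index, target) {
  let scanned = 0;
  const out = [];
  for (const e of index) {
    scanned++;
    if (e.a === target) out.push(e);
  }
  return { scanned, out };
}

const good = scanAB(indexAB, 42);
const bad = scanBA(indexBA, 42);
console.log(good.scanned, bad.scanned, good.out.length, bad.out.length);
```

Both plans return the same documents in the same order; the difference is how many index entries each has to examine to get there.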

The DB Profiler
  MongoDB includes a database profiler that, when enabled, records
  the timing measurements and result counts in a collection within
  the database.
  // Enable the profiler on this database.
  > db.setProfilingLevel(1, 100)
  { "was" : 0, "slowms" : 100, "ok" : 1 }
  > db.foo.find({a: { $mod : [3, 0] } });
  ...
  // See the profiler info.
  > db.system.profile.find()
  { "ts" : "Thu Nov 18 2010 06:46:16 GMT-0500 (EST)",
     "info" : "query test.$cmd ntoreturn:1
         command: { count: "foo",
                               query: { a: { $mod: [ 3.0, 0.0 ] } },
         fields: {} } reslen:64 406ms",
     "millis" : 406 }
Query Optimizer




         MongoDB’s query optimizer is empirical, not cost-based.
         To test query plans, it tries several in parallel, and records the
         plan that finishes fastest.
         If a plan’s performance changes over time (e.g., as data
         changes), the database will reoptimize (i.e., retry all possible
         plans).
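
The "try several plans and remember the winner" idea can be illustrated with a toy model. Everything here is an assumption made for illustration: plans are modelled as generators that yield once per unit of work, `race` steps them in round-robin, and `planCache` is a hypothetical map from a query shape to the winning plan. It is not a description of the server's internals.

```javascript
// Hypothetical model: each plan is a generator that yields once per unit of work.
function* plan(name, workUnits) {
  for (let i = 0; i < workUnits; i++) yield;
  return name;
}

// Advance all candidates in round-robin; the first to finish wins.
function race(candidates) {
  const running = candidates.map((make) => make());
  for (;;) {
    for (const it of running) {
      const step = it.next();
      if (step.done) return step.value;
    }
  }
}

// Remember the winner per query shape so later queries skip the race.
const planCache = new Map();
function choosePlan(shape, candidates) {
  if (!planCache.has(shape)) planCache.set(shape, race(candidates));
  return planCache.get(shape);
}

const candidates = [
  () => plan("BtreeCursor x_1", 200),
  () => plan("BtreeCursor y_1", 20000),
  () => plan("BasicCursor", 20000),
];
const winner = choosePlan("x equality, sort on y", candidates);
console.log(winner);
```

In this model, re-optimization amounts to deleting the cached entry for a shape so that the next query runs the race again.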




Hinting the query plan




   Sometimes, you might want to force the query plan. For this, we
   have hint().

   // Force the use of an index on attribute x:
   db.collection.find({x: 1, ...}).hint({x:1})
   // Force indexes to be avoided!
   db.collection.find({x: 1, ...}).hint({$natural:1})




Going forward



         www.mongodb.org — downloads, docs, community
         mongodb-user@googlegroups.com — mailing list
         #mongodb on irc.freenode.net
         try.mongodb.org — web-based shell
         10gen is hiring. Email jobs@10gen.com.
         10gen offers support, training, and advising services for
         mongodb




