MongoDB on Windows Azure
               A 10gen White Paper

MongoDB on Windows Azure brings the power of the leading NoSQL database to Microsoft’s flexible, open, and scalable cloud.

MongoDB is an open source, document-oriented database designed with scalability and developer agility in mind. Windows Azure is the cloud services operating system that provides the development, service hosting, and service management environment for the Azure Services Platform. Together, MongoDB and Windows Azure provide customers the tools to build limitlessly scalable applications in the cloud.

This paper begins with an overview of MongoDB. Next, we describe the two primary deployment options available on Microsoft’s cloud platform, Windows Azure Virtual Machines and Windows Azure Cloud Services. Finally, to help those evaluating deploying MongoDB on Windows Azure, we outline the pros and cons of the two deployment options available.




Figure 1: Sample JSON Document



{
    "_id": ObjectId("504e4dd43796b3da50183991"),
    "text": "Study Implicates Immune System in Parkinson’s Disease Pathogenesis http://bit.ly/duhe4P",
    "source": "<a href=\"http://twitterfeed.com\" rel=\"nofollow\">twitterfeed</a>",
    "coordinates": null,
    "truncated": false,
    "entities": {
        "urls": [{
            "indices": [67, 87],
            "url": "http://bit.ly/duhe4P",
            "expanded_url": null
        }],
        "hashtags": []
    },
    "retweeted": false,
    "place": null,
    "user": {
        "friends_count": 780,
        "created_at": "Fri Jan 08 17:40:11 +0000 2010",
        "description": "Latest medical news, articles, and features from Medscape Pathology.",
        "time_zone": "Eastern Time (US & Canada)",
        "url": "http://www.medscape.com/pathology",
        "screen_name": "MedscapePath",
        "utc_offset": -18000
    },
    "favorited": false,
    "in_reply_to_user_id": null,
    "id": NumberLong("22819397000")
}




About MongoDB
MongoDB is an open source, document-oriented database. MongoDB bridges the gap between key-value stores, which are fast and scalable, and relational databases, which have rich functionality. Instead of storing data in tables and rows as one would with a relational database, MongoDB stores documents in a binary form of JSON (BSON, or ‘binary JSON’). An example of a document is shown in Figure 1.

The document serves as the fundamental unit within MongoDB (like a row in an RDBMS); one can add fields (like a column in an RDBMS), as well as nested fields and embedded documents. Rather than imposing a flat, rigid schema across an entire table, the schema is implicit in what fields are used in the documents. Thus, MongoDB allows developers to have variable schemas across documents and to adapt schemas as their applications evolve.

Unlike relational databases, MongoDB does not use SQL syntax. Rather, MongoDB has a query language based on JSON. It also has drivers for most modern languages, such as C#, Java, Ruby, Python, and many others. MongoDB’s flexible data model and support for modern programming languages simplify development and administration significantly.



MongoDB Architecture
MongoDB’s core capabilities deliver reliability, high availability, high performance, and scalability.

Figure 2: Replica Sets with MongoDB
[Diagram: the application reads from and writes to the primary; two secondaries receive asynchronous replication from the primary; a new primary is chosen by automatic leader election.]

Replication through replica sets provides for high availability and data safety. A replica set consists of one primary node and some number of secondary nodes (determined by the user). Figure 2 shows an example replica set, with one primary and two secondaries (a common deployment model). By default, the primary node takes all reads and writes from the application; the secondaries replicate asynchronously in the background. If the primary node goes down for any reason, one of the secondaries is automatically promoted to primary status and begins to take all reads and writes. Replica sets help protect applications from hardware and data center-related downtime. Moreover, they make it easy for DBAs to conduct operational tasks, including software upgrades and hardware changes.
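
The failover behavior described above can be sketched as a toy model in Python. This is an illustration under simplifying assumptions, not MongoDB's election protocol: the `Node` and `ReplicaSet` classes are hypothetical, replication is triggered by an explicit `replicate()` call rather than running in the background, and the first live secondary is promoted without any voting.

```python
class Node:
    """One member of the toy replica set."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True


class ReplicaSet:
    """Toy model: one primary takes all reads and writes, secondaries copy it."""

    def __init__(self, names):
        self.nodes = [Node(name) for name in names]
        self.primary = self.nodes[0]

    def _ensure_primary(self):
        # If the primary is down, promote the first live member.
        # (Assumption: a real election involves voting among members.)
        if not self.primary.alive:
            candidates = [node for node in self.nodes if node.alive]
            if not candidates:
                raise RuntimeError("no live members left in the replica set")
            self.primary = candidates[0]

    def write(self, key, value):
        self._ensure_primary()
        self.primary.data[key] = value

    def read(self, key):
        self._ensure_primary()
        return self.primary.data.get(key)

    def replicate(self):
        """Copy the primary's data to every live secondary."""
        for node in self.nodes:
            if node is not self.primary and node.alive:
                node.data = dict(self.primary.data)


rs = ReplicaSet(["node-a", "node-b", "node-c"])
rs.write("greeting", "hello")
rs.replicate()               # secondaries catch up with the primary
rs.nodes[0].alive = False    # the primary goes down
value_after_failover = rs.read("greeting")
print(rs.primary.name, value_after_failover)  # prints node-b hello
```

Because replication is asynchronous, a write that has not yet reached any secondary when the primary fails would not be visible after promotion; in this sketch that corresponds to skipping the `replicate()` call.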




Figure 3: Sharding with MongoDB
[Diagram: a horizontally scalable row of shards. Shard A holds 0...30, Shard B holds 31...60, Shard C holds 61...90, and so on through Shard N, which holds n...n+30.]

Sharding enables users to scale horizontally as their data volumes grow and/or as demands on their data stores grow. A shard is a subset of the database, similar to a partition of the data. In Figure 3, Shard A contains documents 0-30; Shard B contains documents 31-60; and so on. One can choose any key on which to shard the collection (e.g., user name), and MongoDB will automatically shard the data store based on this key. One can keep scaling a database out using sharding by adding new nodes to a cluster. When a new node is added, MongoDB recognizes it and redistributes the data across the cluster. Because sharding distributes the data, and therefore the load (i.e., traffic), across the cluster, it enables horizontal scalability as well as high performance.
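
As a rough illustration of range-based routing, the following Python sketch maps a shard key to a shard using the ranges drawn in Figure 3. The constant names and the `shard_for` function are hypothetical, and the fixed three-shard layout is an assumption made for the example; MongoDB manages and rebalances its own ranges.

```python
import bisect
from collections import Counter

# Inclusive upper bound of each shard's key range, mirroring Figure 3:
# Shard A holds 0...30, Shard B holds 31...60, Shard C holds 61...90.
SHARD_UPPER_BOUNDS = [30, 60, 90]
SHARD_NAMES = ["Shard A", "Shard B", "Shard C"]


def shard_for(key: int) -> str:
    """Return the name of the shard whose range contains the key."""
    if key < 0 or key > SHARD_UPPER_BOUNDS[-1]:
        raise KeyError(f"key {key} is outside every shard range")
    # bisect_left finds the first upper bound that is >= key.
    index = bisect.bisect_left(SHARD_UPPER_BOUNDS, key)
    return SHARD_NAMES[index]


# Route a handful of keys and count how many land on each shard.
keys = [0, 30, 31, 45, 60, 61, 90]
load = Counter(shard_for(key) for key in keys)
print(shard_for(45))  # prints Shard B
print(dict(load))     # prints {'Shard A': 2, 'Shard B': 3, 'Shard C': 2}
```

Spreading keys across ranges in this way is what distributes both the stored data and the request traffic across shards.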




Figure 4: MongoDB Architecture
[Diagram: the application connects to mongos, which routes requests to Replica Set A (0...30), Replica Set B (31...60), Replica Set C (61...90), and so on through Replica Set N (n...n+30). Each replica set has one primary and two secondaries.]




An overview of the MongoDB architecture is shown in Figure 4. In a multi-shard environment, the application communicates with mongos, an intermediary router that directs reads and writes to the appropriate shard. Each shard is a replica set, providing scalability, availability, and performance to developers.

MongoDB was built for the cloud. Cloud services like Windows Azure are therefore a natural fit for MongoDB. By coupling MongoDB’s easy-to-scale architecture and Azure’s elastic cloud capacity, users can quickly and easily build, scale, and manage their applications.


About Windows Azure Services for MongoDB
Windows Azure is Microsoft’s suite of cloud services, providing developers on-demand compute and storage to create, host and manage scalable and available web applications through Microsoft data centers. When deploying MongoDB to Windows Azure, users can choose from two deployment options:

»» Windows Azure Virtual Machines. Windows Azure Virtual Machines (VMs) is Microsoft’s Infrastructure-as-a-Service (IaaS) offering. Similar to Amazon Web Services EC2, Azure VMs give users access to elastic, on-demand virtual servers. Users can install Windows or Linux on a VM and configure it based on their own preferences or their apps’ specific needs. Users manage the VMs themselves, including scaling, installing security patches, and ongoing performance monitoring and management. Azure VMs give users a relatively significant degree of control over their environments, but by the same token require users to take on the VM management. Note: This service is currently in preview (beta).

»» Windows Azure Cloud Services. Windows Azure Cloud Services (Worker Roles and Web Roles) is Microsoft’s Platform-as-a-Service (PaaS) offering. Similar to Heroku, Worker Roles provide users with prebuilt, preconfigured instances of compute power. In contrast with Azure VMs, users do not have to configure or manage Azure Worker Roles. Windows Azure handles the deployment details, from provisioning and load balancing to health monitoring for continuous availability. This can be helpful to some users who prefer not to manage their applications at the infrastructure level, though it restricts the level of control users have over their environments.

Understanding the Deployment Options
Given that MongoDB can be deployed on either Windows Azure Virtual Machines (IaaS) or Windows Azure Cloud Services (PaaS), it is important for users to consider the different capabilities and implementation details of each service to determine which deployment model makes the most sense for their applications.


Azure Virtual Machines
BASIC SETUP
After being granted access to the preview functionality for Azure Virtual Machines, users can launch an instance and install and configure MongoDB on it manually. Alternatively, users can use the recently released Windows Azure installer for MongoDB to set up a MongoDB replica set quickly and easily on Windows Azure VMs.

The installer is built on top of Windows PowerShell and the Windows Azure command line tool. It contains a number of deployment scripts. The tool is designed to help users get single or multi-node MongoDB configurations up and running quickly. There are only two steps to installing and configuring a MongoDB replica set on Azure VMs. Note: the installer is designed to run on a user’s local machine (i.e., not directly on an Azure VM), and then to deploy output to Windows Azure VMs. To start, download the publish settings file. Next, run the installer from the command prompt.

With Azure Virtual Machines, users can create their own VMs or they can create a VM instance from one of several pre-installed operating system configurations. Both Windows and Linux are supported on Azure Virtual Machines. To deploy MongoDB on Linux, visit the MongoDB wiki (wiki.mongodb.org) for step-by-step instructions.

PROS AND CONS OF AZURE VIRTUAL MACHINES
The pros and cons of deploying MongoDB on Azure Virtual Machines are generally consistent with the considerations around using IaaS more broadly. Overall, Azure Virtual Machines allow users to fine-tune their deployments but by the same token require increased operational effort.

The advantages of using MongoDB on Azure Virtual Machines are as follows:

»» Increased Control. Users have more control over their infrastructural configuration relative to Azure Cloud Services. For instance, they can install and configure services on the OS, define policies, etc. This consideration may be important for enterprises that have regimented policies and processes for IT security and compliance.

»» OS Choice. Users can use Windows or Linux.

Azure Virtual Machines may not always be the right fit for the following reasons:

»» Increased Operational Effort. The increased control that Azure Virtual Machines provide comes with increased effort, as well. Users must define and implement their own security measures, apply patches, and locate instances for fault tolerance. This consideration may be important for developers who lack experience managing their own infrastructure or for companies that don’t have the operational bandwidth to devote to managing this component of the stack.

»» Beta. The Azure Virtual Machines service is still in Preview (beta).


Azure Cloud Services
BASIC SETUP
Users can also deploy MongoDB on Azure Cloud Services. To do so, download the MongoDB Azure Worker Role package, which is a preconfigured Worker Role with MongoDB. When deployed, each replica set member runs as a separate Worker Role instance; MongoDB data files are stored in Azure Cloud Drives. For detailed instructions, visit the MongoDB wiki (wiki.mongodb.org).

PROS AND CONS OF AZURE CLOUD SERVICES
The pros and cons of running MongoDB on Azure Cloud Services are generally consistent with those of using PaaS in general, though there are some Azure-specific considerations. Overall, Windows Azure Cloud Services decreases the operational burden on users but affords them less control from an infrastructure configuration standpoint. The advantages of using Azure Cloud Services are as follows:

»» Lower Operational Effort. Microsoft manages OS updates and security, decreasing the operational burden on the users.

»» Built-in Fault Tolerance. When deploying multiple MongoDB worker role instances, Windows Azure automatically deploys the instances across multiple fault and update domains to guarantee better uptime.

»» Secure by Default. Microsoft takes measures to ensure that worker and web roles are secure. Endpoints on instances can be enabled for instance-to-instance communication without making them public. Thus, one can configure MongoDB to be secure by enabling it only for other roles in the same deployment.
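
To illustrate why spreading replica set members across fault domains matters, the Python sketch below places instances on domains and reports how many members survive the loss of any one domain. The round-robin placement, the function names, and the domain counts are assumptions made for the example; this paper does not describe how Windows Azure actually assigns instances to fault or update domains.

```python
def assign_fault_domains(instance_count: int, domain_count: int) -> dict:
    """Place instance i on domain i % domain_count (assumed round-robin policy)."""
    return {f"instance-{i}": i % domain_count for i in range(instance_count)}


def survivors_if_domain_fails(placement: dict, failed_domain: int) -> list:
    """Return the instances that are not on the failed domain."""
    return [name for name, domain in placement.items() if domain != failed_domain]


def worst_case_survivors(instance_count: int, domain_count: int) -> int:
    """Smallest number of surviving instances over every single-domain failure."""
    placement = assign_fault_domains(instance_count, domain_count)
    return min(
        len(survivors_if_domain_fails(placement, domain))
        for domain in range(domain_count)
    )


# Three replica set members spread over three domains versus over two.
print(worst_case_survivors(3, 3))  # prints 2
print(worst_case_survivors(3, 2))  # prints 1
```

With three members on three domains, any single-domain failure leaves two members running. With only two domains, one failure can take out two of the three members at once.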




By the same token, there are some aspects of Azure Cloud Services that may be considered drawbacks:

»» Windows Only. Worker Roles can only be deployed with Windows; Linux is not an option.

»» Fixed OS Configuration. Users cannot configure the OS, and must therefore develop applications that run on the pre-defined machine configurations available.

Table 1 summarizes the pros and cons of using Windows Azure Worker Roles and Windows Azure Virtual Machines.

Table 1: Pros and Cons Summary - Windows Azure Virtual Machines and Windows Azure Cloud Services

                        PROS                           CONS

IaaS -                  - Increased control            - Increased operational effort
Windows Azure           - OS choice
Virtual Machines

PaaS -                  - Lower operational effort     - Windows only
Windows Azure           - Built-in fault tolerance     - Fixed OS configuration
Cloud Services          - Secure by default



Summary
MongoDB was built for ease of use, scalability, availability, and performance, and it’s quickly becoming an attractive alternative to relational databases. Windows Azure provides a flexible cloud platform for hosting MongoDB, with two deployment models to choose from. Developers and enterprises looking at deploying MongoDB on Windows Azure should consider the pros and cons discussed here when evaluating which option is most appropriate for them. We hope that this paper helps customers better understand these solutions, how they work, and how to assess them.

To learn more about MongoDB and how to deploy it in the cloud, or to speak to a sales representative, please email info@10gen.com.




New York 578 Broadway, New York, NY 10012 • London 5-25 Scrutton St., London EC2A 4HJ
info@10gen.com • US (866) 237-8815 • INTL +1 (650) 440-4474
Published by 10gen, Inc.
October 2012

MongoDB Quick Reference Card
 
AWS & MongoDB
AWS & MongoDBAWS & MongoDB
AWS & MongoDB
 
Mongodb Introduction
Mongodb IntroductionMongodb Introduction
Mongodb Introduction
 

MongoDB on Windows Azure

  • 1. MongoDB on Windows Azure A 10gen White Paper
  • 2. MongoDB on Windows Azure

MongoDB on Windows Azure brings the power of the leading NoSQL database to Microsoft’s flexible, open, and scalable cloud.

MongoDB is an open source, document-oriented database designed with scalability and developer agility in mind. Windows Azure is the cloud services operating system that provides the development, service hosting, and service management environment for the Azure Services Platform. Together, MongoDB and Windows Azure provide customers the tools to build limitlessly scalable applications in the cloud.

This paper begins with an overview of MongoDB. Next, we describe the two primary deployment options available on Microsoft’s cloud platform, Windows Azure Virtual Machines and Windows Azure Cloud Services. Finally, to help those evaluating deploying MongoDB on Windows Azure, we outline the pros and cons of the two deployment options available.
  • 3. Figure 1: Sample JSON Document

{
    "_id": ObjectId("504e4dd43796b3da50183991"),
    "text": "Study Implicates Immune System in Parkinson’s Disease Pathogenesis http://bit.ly/duhe4P",
    "source": "<a href=\"http://twitterfeed.com\" rel=\"nofollow\">twitterfeed</a>",
    "coordinates": null,
    "truncated": false,
    "entities": {
        "urls": [{
            "indices": [67, 87],
            "url": "http://bit.ly/duhe4P",
            "expanded_url": null
        }],
        "hashtags": []
    },
    "retweeted": false,
    "place": null,
    "user": {
        "friends_count": 780,
        "created_at": "Fri Jan 08 17:40:11 +0000 2010",
        "description": "Latest medical news, articles, and features from Medscape Pathology.",
        "time_zone": "Eastern Time (US & Canada)",
        "url": "http://www.medscape.com/pathology",
        "screen_name": "MedscapePath",
        "utc_offset": -18000
    },
    "favorited": false,
    "in_reply_to_user_id": null,
    "id": NumberLong("22819397000")
}

About MongoDB

MongoDB is an open source, document-oriented database. MongoDB bridges the gap between key-value stores, which are fast and scalable, and relational databases, which have rich functionality. Instead of storing data in tables and rows as one would with a relational database, MongoDB stores documents in a binary form of JSON (BSON, or ‘binary JSON’). An example of a document is shown in Figure 1.

The document serves as the fundamental unit within MongoDB (like a row in an RDBMS); one can add fields (like a column in an RDBMS), as well as nested fields and embedded documents. Rather than imposing a flat, rigid schema across an entire table, MongoDB leaves the schema implicit in the fields that the documents use. Thus, MongoDB allows developers to have variable schemas across documents and to adapt schemas as their applications evolve.

Unlike relational databases, MongoDB does not use SQL syntax. Rather, MongoDB has a query language based on JSON. It also has drivers for most modern languages, such as C#, Java, Ruby, Python, and many others. MongoDB’s flexible data model and support for modern programming languages simplify development and administration significantly.
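To make the ideas of variable schemas and a JSON-shaped query more concrete, here is a minimal sketch in plain Python. It does not use MongoDB or any driver; the `tweets` list, the `example_user` document, and the `get_path`, `matches`, and `find` helpers are all hypothetical names invented for illustration. The sketch only mimics the general shape of a query-by-example lookup, including dotted paths into nested fields, under those assumptions.

```python
from typing import Any

# Two documents in the same hypothetical collection. The second one carries a
# nested "place" field that the first lacks: the schema varies per document.
tweets = [
    {"id": 1, "text": "first", "retweeted": False,
     "user": {"screen_name": "MedscapePath", "friends_count": 780}},
    {"id": 2, "text": "second", "retweeted": True,
     "user": {"screen_name": "example_user"},
     "place": {"country": "US"}},
]

_MISSING = object()  # sentinel meaning "this document has no such field"


def get_path(doc: dict, path: str) -> Any:
    """Follow a dotted path such as 'user.screen_name' into nested dicts."""
    current: Any = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def matches(doc: dict, query: dict) -> bool:
    """True if every path named in the query equals the value in the document."""
    return all(get_path(doc, path) == value for path, value in query.items())


def find(collection: list, query: dict) -> list:
    """Return the documents that match a query expressed as a dict."""
    return [doc for doc in collection if matches(doc, query)]


# The query itself is a JSON-like document (illustrative only).
print([d["id"] for d in find(tweets, {"user.screen_name": "MedscapePath"})])
print([d["id"] for d in find(tweets, {"place.country": "US"})])
```

Documents that lack a queried field simply do not match, so adding a new field to some documents does not require changing the others.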
  • 4. MongoDB Architecture

MongoDB’s core capabilities deliver reliability, high availability, high performance, and scalability.

Replication through replica sets provides for high availability and data safety. A replica set consists of one primary node and some number of secondary nodes (determined by the user). Figure 2 shows an example replica set, with one primary and two secondaries (a common deployment model). By default, the primary node takes all reads and writes from the application; the secondaries replicate asynchronously in the background. If the primary node goes down for any reason, one of the secondaries is automatically promoted to primary status and begins to take all reads and writes. Replica sets help protect applications from hardware and data center-related downtime. Moreover, they make it easy for DBAs to conduct operational tasks, including software upgrades and hardware changes.

Figure 2: Replica Sets with MongoDB (the application reads from and writes to the primary; two secondaries replicate asynchronously; leader election is automatic)

Sharding enables users to scale horizontally as their data volumes grow and/or as demands on their data stores grow. A shard is a subset of the database, similar to a partition of the data. In Figure 3, Shard A contains documents 1-30; Shard B contains documents 31-60; and so on. One can choose any key on which to shard the collection (e.g., user name), and MongoDB will automatically shard the data store based on this key. One can keep scaling a database out by adding new nodes to a cluster. When a new node is added, MongoDB recognizes it and redistributes the data across the cluster. Because sharding distributes the data, and therefore the load (i.e., traffic), across the cluster, it enables horizontal scalability as well as high performance.

Figure 3: Sharding with MongoDB (Shard A: 0...30; Shard B: 31...60; Shard C: 61...90; ...; Shard N: n...n+30; horizontally scalable)
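The failover behavior described for replica sets can be sketched as a toy simulation. This is not MongoDB's election protocol, which the paper does not detail; the `Node` and `ReplicaSet` classes, the `applied_ops` counter, and the rule "promote the healthy secondary that has applied the most writes" are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str              # "primary" or "secondary"
    healthy: bool = True
    applied_ops: int = 0   # number of writes this node has applied (hypothetical)


class ReplicaSet:
    """Toy model: one primary takes writes, secondaries copy them later."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def primary(self):
        for node in self.nodes:
            if node.role == "primary" and node.healthy:
                return node
        return None

    def write(self, n_ops=1):
        node = self.primary()
        if node is None:
            raise RuntimeError("no primary available")
        node.applied_ops += n_ops

    def replicate(self):
        """Asynchronous replication step: secondaries catch up to the primary."""
        node = self.primary()
        if node is None:
            return
        for other in self.nodes:
            if other.role == "secondary" and other.healthy:
                other.applied_ops = node.applied_ops

    def fail_primary(self):
        """Take the primary down, then promote a secondary."""
        node = self.primary()
        if node is not None:
            node.healthy = False
            node.role = "secondary"
        return self.elect()

    def elect(self):
        # Assumed rule for this sketch: the most up-to-date healthy secondary wins.
        candidates = [n for n in self.nodes if n.healthy and n.role == "secondary"]
        if not candidates:
            return None
        winner = max(candidates, key=lambda n: n.applied_ops)
        winner.role = "primary"
        return winner


rs = ReplicaSet([Node("a", "primary"), Node("b", "secondary"), Node("c", "secondary")])
rs.write(3)
rs.replicate()      # secondaries have now applied the same 3 writes
new_primary = rs.fail_primary()
print(new_primary.name, new_primary.applied_ops)
rs.write(1)         # the application keeps writing, now to the new primary
```

The application code never names a specific node; it always asks for the current primary, which is why a promotion is transparent to it in this model.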
  • 5. Figure 4: MongoDB Architecture (the application connects to mongos, which routes to Replica Set A: 0...30; Replica Set B: 31...60; Replica Set C: 61...90; ...; Replica Set N: n...n+30; each replica set has one primary and two secondaries)

An overview of the MongoDB architecture is shown in Figure 4. In a multi-shard environment, the application communicates with mongos, an intermediary router that directs reads and writes to the appropriate shard. Each shard is a replica set, providing scalability, availability, and performance to developers.

MongoDB was built for the cloud. Cloud services like Windows Azure are therefore a natural fit for MongoDB. By coupling MongoDB’s easy-to-scale architecture and Azure’s elastic cloud capacity, users can quickly and easily build, scale, and manage their applications.

About Windows Azure Services for MongoDB

Windows Azure is Microsoft’s suite of cloud services, providing developers on-demand compute and storage to create, host, and manage scalable and available web applications through Microsoft data centers. When deploying MongoDB to Windows Azure, users can choose from two deployment options:

»» Windows Azure Virtual Machines. Windows Azure Virtual Machines (VMs) is Microsoft’s Infrastructure-as-a-Service (IaaS) offering. Like Amazon Web Services EC2, Azure VMs give users access to elastic, on-demand virtual servers. Users can install Windows or Linux on a VM and configure it based on their own preferences or their apps’ specific needs. Users manage the VMs themselves, including scaling, installing security patches, and ongoing performance monitoring and management. Azure VMs give users a considerable degree of control over their environments, but by the same token require users to take on the VM management. Note: This service is currently in preview (beta).

»» Windows Azure Cloud Services. Windows Azure Cloud Services (Worker Roles and Web Roles) is Microsoft’s Platform-as-a-Service (PaaS) offering. Similar to Heroku, Worker Roles provide users with prebuilt, preconfigured instances of compute power. In contrast with Azure VMs, users do not have to configure or manage Azure Worker Roles. Windows Azure handles the deployment details, from provisioning and load balancing to health monitoring for continuous availability. This can be helpful to users who prefer not to manage their applications at the infrastructure level, though it restricts the level of control users have over their environments.
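The routing role that mongos plays in Figure 4 can be illustrated with a small range-based router. This is a sketch under stated assumptions, not how mongos is implemented: the `RangeRouter` class, its in-memory `stores`, and the inclusive integer key ranges (taken from the figure labels 0...30, 31...60, 61...90) are hypothetical, and real MongoDB routing is more involved.

```python
import bisect


class RangeRouter:
    """Toy router: sends each shard-key value to the shard that owns its range."""

    def __init__(self, shards):
        # shards: list of (name, low, high) with inclusive bounds, sorted by low.
        self.names = [name for name, _, _ in shards]
        self.lows = [low for _, low, _ in shards]
        self.highs = [high for _, _, high in shards]
        # One in-memory dict per shard stands in for a replica set.
        self.stores = {name: {} for name in self.names}

    def shard_for(self, key):
        # Find the first shard whose upper bound is >= key.
        i = bisect.bisect_left(self.highs, key)
        if i == len(self.highs) or key < self.lows[i]:
            raise KeyError(f"no shard owns key {key!r}")
        return self.names[i]

    def insert(self, key, doc):
        self.stores[self.shard_for(key)][key] = doc

    def find(self, key):
        return self.stores[self.shard_for(key)].get(key)


router = RangeRouter([
    ("Replica Set A", 0, 30),
    ("Replica Set B", 31, 60),
    ("Replica Set C", 61, 90),
])

router.insert(12, {"user": "alice"})   # hypothetical documents
router.insert(45, {"user": "bob"})

print(router.shard_for(12))   # owned by the first range
print(router.shard_for(45))   # owned by the second range
print(router.find(45))
```

The application only ever talks to the router, so adding a shard means changing the router's range table rather than the application code.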
  • 6. Understanding the Deployment Options

Given that MongoDB can be deployed on either Windows Azure Virtual Machines (IaaS) or Windows Azure Cloud Services (PaaS), it is important for users to consider the different capabilities and implementation details of each service to determine which deployment model makes the most sense for their applications.

Azure Virtual Machines

BASIC SETUP

After being granted access to the preview functionality for Azure Virtual Machines, users can launch an instance and install and configure MongoDB on it manually. Alternatively, users can use the recently released Windows Azure installer for MongoDB to set up a MongoDB replica set quickly and easily on Windows Azure VMs.

The installer is built on top of Windows PowerShell and the Windows Azure command line tool. It contains a number of deployment scripts. The tool is designed to help users get single or multi-node MongoDB configurations up and running quickly. There are only two steps to installing and configuring a MongoDB replica set on Azure VMs. Note: the installer is designed to run on a user’s local machine (i.e., not directly on an Azure VM), and then to deploy output to Windows Azure VMs. To start, download the publish settings file. Next, run the installer from the command prompt.

With Azure Virtual Machines, users can create their own VMs or they can create a VM instance from one of several pre-installed operating system configurations. Both Windows and Linux are supported on Azure Virtual Machines. To deploy MongoDB on Linux, visit the MongoDB wiki (wiki.mongodb.org) for step-by-step instructions.

PROS AND CONS OF AZURE VIRTUAL MACHINES

The pros and cons of deploying MongoDB on Azure Virtual Machines are generally consistent with the considerations around using IaaS more broadly. Overall, Azure Virtual Machines allow users to fine-tune their deployments but by the same token require increased operational effort.

The advantages of using MongoDB on Azure Virtual Machines are as follows:

»» Increased Control. Users have more control over their infrastructural configuration relative to Azure Cloud Services. For instance, they can install and configure services on the OS, define policies, etc. This consideration may be important for enterprises that have regimented policies and processes for IT security and compliance.

»» OS Choice. Users can use Windows or Linux.

Azure Virtual Machines may not always be the right fit for the following reasons:

»» Increased Operational Effort. The increased control that Azure Virtual Machines provide comes with increased effort, as well. Users must define and implement their own security measures, apply patches, and place instances for fault tolerance. This consideration may be important for developers who lack experience managing their own infrastructure or for companies that don’t have the operational bandwidth to devote to managing this component of the stack.

»» Beta. The Azure Virtual Machines service is still in Preview (beta).

Azure Cloud Services

BASIC SETUP

Users can also deploy MongoDB on Azure Cloud Services. To do so, download the MongoDB Azure Worker Role package, which is a preconfigured Worker Role with MongoDB. When deployed, each replica set member runs as a separate Worker Role instance; MongoDB data files are stored in Azure Cloud Drives. For detailed instructions, visit the MongoDB wiki (wiki.mongodb.org).

PROS AND CONS OF AZURE CLOUD SERVICES

The pros and cons of running MongoDB on Azure Cloud Services are generally consistent with those of using PaaS in general, though there are some Azure-specific considerations. Overall, Windows Azure Cloud Services decreases the operational burden on users but affords them less control from an infrastructure configuration standpoint. The advantages of using Azure Cloud Services are as follows:

»» Lower Operational Effort. Microsoft manages OS updates and security, decreasing the operational burden on the users.

»» Built-in Fault Tolerance. When deploying multiple MongoDB worker role instances, Windows Azure automatically deploys the instances across multiple fault and update domains to improve uptime.

»» Secure by Default. Microsoft takes measures to ensure that worker and web roles are secure. Endpoints on instances can be enabled for instance-to-instance communication without making them public. Thus, one can configure MongoDB to be secure by enabling it only for other roles in the same deployment.
  • 7. By the same token, there are some aspects of Azure Cloud Services that may be considered drawbacks:

»» Windows Only. Worker Roles can only be deployed with Windows; Linux is not an option.

»» Fixed OS Configuration. Users cannot configure the OS, and must therefore develop applications that run on the pre-defined machine configurations available.

Table 1 summarizes the pros and cons of using Windows Azure Worker Roles and Windows Azure Virtual Machines.

Table 1: Pros and Cons Summary - Windows Azure Virtual Machines and Windows Azure Cloud Services

                                          PROS                           CONS
IaaS - Windows Azure Virtual Machines     - Increased control            - Increased operational effort
                                          - OS choice
PaaS - Windows Azure Cloud Services       - Lower operational effort     - Windows only
                                          - Built-in fault tolerance     - Fixed OS configuration
                                          - Secure by default

Summary

MongoDB was built for ease of use, scalability, availability, and performance, and it’s quickly becoming an attractive alternative to relational databases. Windows Azure provides a flexible cloud platform for hosting MongoDB, with two deployment models to choose from. Developers and enterprises looking at deploying MongoDB on Windows Azure should consider the pros and cons discussed here when evaluating which option is most appropriate for them.

We hope that this paper helps customers better understand these solutions, how they work, and how to assess them. To learn more about MongoDB and how to deploy it in the cloud, or to speak to a sales representative, please email info@10gen.com.
  • 8. New York 578 Broadway, New York, NY 10012 • London 5-25 Scrutton St., London EC2A 4HJ info@10gen.com • US (866) 237-8815 • INTL +1 (650) 440-4474
  • 9. Published by 10gen, Inc. October 2012