Tuesday 19 February 2013

An Introduction to MongoDB

MongoDB is a database management system designed for web applications and Internet infrastructure.

It is a scalable, high-performance, open-source NoSQL database written in C++.

It is a very attractive database because of its intuitive data model.

Because it uses a document data model instead of a relational data model, it can also represent rich hierarchical data structures.
Another database that uses a document data model is Apache CouchDB.

Document-based databases avoid some of the disadvantages of relational databases, such as:

  • Data is spread across many tables that have to be combined with complicated multi-table joins.
  • Retrieving data means writing complicated queries with a lot of joins.

If we use a database with a document data model, such as MongoDB, we can work with each database object as a whole rather than with lots of different tables joined to each other.

For example, let's say we maintain a "Plantations" database.

In a relational database we have to deal with the fertilizers for each plant at each stage, climatic conditions, structural configurations, images, suppliers, and so on. Each of those will be a different table in our relational database. In a document-based database, we can instead deal with each plantation object as a whole rather than considering every table separately.
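To make the idea concrete, here is a minimal sketch of what one plantation "document" could look like, written as a plain Python dictionary. This is not MongoDB code, and every field name and value below is a made-up assumption for illustration only.

```python
# Hypothetical plantation record: all field names and values are illustrative.
plantation = {
    "name": "Hillside Estate",
    "climate": {"avg_temp_c": 18, "annual_rainfall_mm": 2500},
    "suppliers": ["Supplier A", "Supplier B"],
    "plants": [
        {
            "species": "tea",
            # Fertilizers for this plant at each growth stage, nested in place.
            "fertilizers": [
                {"stage": "seedling", "product": "Fertilizer X"},
                {"stage": "mature", "product": "Fertilizer Y"},
            ],
        }
    ],
}

# Everything about the plantation is reachable from this one object,
# without joining separate tables.
first_plant = plantation["plants"][0]
stages = [f["stage"] for f in first_plant["fertilizers"]]
print(stages)
```

In a relational design the climate, suppliers, plants, and fertilizers would each sit in their own table; here they are simply nested inside the one object.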

Since most of today's developers use object-oriented languages, a document database is a very effective fit: the objects in their code can be mapped almost directly to the documents in their database.

Another feature MongoDB offers is ad hoc querying. Relational databases support this, but not every database does. With ad hoc querying, we do not have to define in advance which types of queries the system will accept.
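The sketch below imitates the idea in plain Python: a query is just a dictionary built at run time and matched against each document. The `matches` and `find` helpers and the sample data are hypothetical, written only for this illustration. The operator names `$gt` and `$lt` are borrowed from MongoDB's query style, but this toy version is not how MongoDB itself evaluates queries.

```python
def matches(doc, query):
    """Return True if doc satisfies every condition in query.

    A condition is either a plain value (equality) or a dict such as
    {"$gt": 10}. Only "$gt" and "$lt" are handled in this toy version.
    """
    for field, condition in query.items():
        value = doc.get(field)
        if isinstance(condition, dict):
            for op, operand in condition.items():
                if op == "$gt":
                    if value is None or not value > operand:
                        return False
                elif op == "$lt":
                    if value is None or not value < operand:
                        return False
                else:
                    raise ValueError(f"unsupported operator: {op}")
        elif value != condition:
            return False
    return True


def find(collection, query):
    """Filter a list of documents with a query supplied at run time."""
    return [doc for doc in collection if matches(doc, query)]


# Hypothetical sample data.
plantations = [
    {"name": "Hillside", "crop": "tea", "area_ha": 40},
    {"name": "Riverside", "crop": "rubber", "area_ha": 120},
    {"name": "Lakeview", "crop": "tea", "area_ha": 95},
]

# Neither of these queries was declared anywhere in advance.
tea = find(plantations, {"crop": "tea"})
large_tea = find(plantations, {"crop": "tea", "area_ha": {"$gt": 50}})
print([p["name"] for p in tea])
print([p["name"] for p in large_tea])
```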

Moreover, MongoDB supports secondary indexes. A secondary index is an index built on some piece of information other than the primary key. For example, let's take a "Student" database in which we need to search for students by their last name. Here Student_ID is the primary key, and last_name will be a secondary key for indexing. MongoDB's secondary indexes are implemented as B-trees.
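As a rough illustration of the Student example, the sketch below keeps a sorted list of (last_name, student_id) pairs and searches it with binary search. A sorted list is only a stand-in for a B-tree (both keep keys in order so a lookup does not have to scan every record); it is not MongoDB's actual implementation, and the student records are invented.

```python
import bisect

# Primary storage, keyed by the primary key (student ID). Hypothetical data.
students = {
    101: {"last_name": "Perera", "first_name": "Nimal"},
    102: {"last_name": "Silva", "first_name": "Kamala"},
    103: {"last_name": "Fernando", "first_name": "Ruwan"},
    104: {"last_name": "Perera", "first_name": "Sunil"},
}

# Secondary index: (last_name, student_id) pairs kept in sorted order.
index = sorted((doc["last_name"], sid) for sid, doc in students.items())


def find_by_last_name(last_name):
    """Return the IDs of all students with the given last name."""
    # Binary search for the first entry whose name is >= last_name.
    start = bisect.bisect_left(index, (last_name,))
    result = []
    for name, sid in index[start:]:
        if name != last_name:
            break  # entries are sorted, so no later entry can match
        result.append(sid)
    return result


print(find_by_last_name("Perera"))
```

Without the index, finding every "Perera" would mean checking each student record one by one.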

Other key features of MongoDB include replication support and support for both speed and durability. MongoDB keeps a journal, which is enabled by default, so that the data files can be recovered.
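The general idea behind a journal can be sketched in a few lines of Python: every write is recorded in an append-only log before it is applied, so the data can be rebuilt by replaying the log. This is a simplified, in-memory assumption of how journaling works in principle, with hypothetical names throughout; it does not reflect MongoDB's on-disk journal format.

```python
journal = []  # stands in for an append-only journal file
data = {}     # stands in for the data files


def write(key, value):
    """Record the write in the journal first, then apply it to the data."""
    journal.append((key, value))
    data[key] = value


def recover(journal_entries):
    """Rebuild the data by replaying journal entries in order."""
    restored = {}
    for key, value in journal_entries:
        restored[key] = value
    return restored


write("plant:1", "tea")
write("plant:2", "rubber")
write("plant:1", "coffee")  # a later write to the same key wins

snapshot = dict(data)
data.clear()                   # simulate losing the data files
data.update(recover(journal))  # replay the journal to get them back
print(data == snapshot)
```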

MongoDB is built to scale. Vertical scaling means expanding a single node by upgrading its hardware. MongoDB instead supports horizontal scaling, where the database is distributed across multiple machines.

Scaling horizontally has several advantages. Some of them are listed below:

1. Commodity hardware can be used.
2. The cost of hosting the total data set can be reduced.
3. The consequences of a failure are reduced.

MongoDB uses a range-based partitioning method called auto-sharding, in which the data is automatically distributed among the nodes in the network.
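Range-based partitioning can be pictured with the small Python sketch below, which routes each key to a shard according to the range it falls into. The split points and shard names are made up, and here they are fixed by hand; in MongoDB the ranges are managed automatically, which is the "auto" part that this sketch does not attempt to show.

```python
import bisect

# Hypothetical split points on the shard key (here, a last name).
# Keys below "H" go to shard0, "H" up to (not including) "P" go to shard1,
# and everything from "P" onward goes to shard2.
split_points = ["H", "P"]
shards = ["shard0", "shard1", "shard2"]


def shard_for(key):
    """Pick the shard whose key range contains the given key."""
    return shards[bisect.bisect_right(split_points, key)]


for name in ["Fernando", "Kumar", "Perera", "Silva"]:
    print(name, "->", shard_for(name))
```

Because neighbouring keys land on the same shard, a query over a range of keys only needs to touch the shards that cover that range.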

MongoDB's core server and tools run on Mac OS X, Windows, and Linux. Its JavaScript shell can be used to administer the database and manipulate data.

So ultimately, MongoDB can be chosen over other databases for its high scalability and speed.