group() vs Aggregation Framework vs MapReduce in MongoDB

The group() command, the Aggregation Framework, and MapReduce are the three aggregation features of MongoDB.

group():

Performs simple aggregation operations on the documents of a collection. It is similar to GROUP BY in MySQL.

Output format: Returns the result set inline.

Sharding: Not supported in sharded environments.

Limitations:

  • Will not group into a result set with more than 20,000 keys (since MongoDB 2.2; earlier versions were limited to 10,000 keys).
  • Results must fit within the limitations of a BSON document (currently 16MB).
  • Takes a read lock and does not allow any other threads to execute JavaScript while it is running.
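
As a rough illustration of what group() computes, the following pure-Python sketch folds documents into one accumulator per distinct key. It does not talk to MongoDB: the function name group, the orders data, and the add_amount reducer are all made up for this example, and the key-limit check simply mirrors the 20,000 figure listed above.

```python
MAX_GROUP_KEYS = 20_000  # limit quoted above for MongoDB 2.2 and later


def group(docs, key_fields, initial, reduce_fn):
    """Fold documents into one accumulator per distinct key (illustrative only)."""
    results = {}
    for doc in docs:
        key = tuple(doc.get(field) for field in key_fields)
        if key not in results:
            if len(results) >= MAX_GROUP_KEYS:
                raise ValueError("result set would exceed 20,000 unique keys")
            # Each accumulator starts with the key fields plus a copy of `initial`.
            acc = dict(zip(key_fields, key))
            acc.update(initial)
            results[key] = acc
        # The reducer mutates the accumulator in place, one document at a time.
        reduce_fn(doc, results[key])
    return list(results.values())


# Hypothetical sample data.
orders = [
    {"customer": "ann", "amount": 10},
    {"customer": "bob", "amount": 5},
    {"customer": "ann", "amount": 25},
]


def add_amount(doc, acc):
    acc["total"] += doc["amount"]
    acc["count"] += 1


totals = group(orders, ["customer"], {"total": 0, "count": 0}, add_amount)
```

The whole result is returned as one in-memory list, which is why the result set has to stay small enough to fit in a single BSON document.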

MapReduce():

  • Can be used for incremental aggregation over large collections.
  • There have been significant improvements in Map/Reduce in MongoDB version 2.4. The SpiderMonkey JavaScript engine has been replaced by the V8 JavaScript engine, and there is no longer a global JavaScript lock, which means that multiple Map/Reduce threads can run concurrently.
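
The map and reduce steps can be sketched in plain Python. This is only a model of the data flow, not the MongoDB API: map_reduce, map_hits, sum_values, and the hits data are hypothetical names chosen for the example. In MongoDB itself the map and reduce functions are written in JavaScript.

```python
from collections import defaultdict


def map_reduce(docs, map_fn, reduce_fn):
    """Run map over every document, then reduce the values emitted per key."""
    emitted = defaultdict(list)

    def emit(key, value):
        emitted[key].append(value)

    for doc in docs:
        map_fn(doc, emit)

    results = {}
    for key, values in emitted.items():
        # A key with a single emitted value is passed through without reducing.
        results[key] = values[0] if len(values) == 1 else reduce_fn(key, values)
    return results


# Hypothetical sample data: one document per page hit.
hits = [{"url": "/home"}, {"url": "/docs"}, {"url": "/home"}]


def map_hits(doc, emit):
    emit(doc["url"], 1)


def sum_values(key, values):
    return sum(values)


counts = map_reduce(hits, map_hits, sum_values)
```

Because the reduce step may be skipped for keys with one value, or applied repeatedly to partial results, the reducer should return the same kind of value that the map step emits.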

Output format: MapReduce can return results inline or write them to a collection, using the new collection, merge, replace, or reduce output options.

Sharding: Supports both sharded and non-sharded collections as input and output. If the output collection does not exist, MapReduce creates it and shards it on the _id field.

Limitations:

  • With inline output, the results are not stored in a collection, so find(), sort(), and limit() cannot be run on them.
  • A single emit can hold at most half of MongoDB's maximum BSON document size (the maximum is 16MB, so an emit is limited to 8MB).
  • The Map/Reduce engine is still considerably slower than the aggregation framework, for two main reasons: (1) The JavaScript engine is interpreted, while the Aggregation Framework runs compiled C++ code. (2) The JavaScript engine still requires every document being examined to be converted from BSON to JSON; if you are saving the output in a collection, the result set must then be converted from JSON back to BSON.
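
The difference between the replace, merge, and reduce output options is easiest to see on a small example. The sketch below models an output collection as a Python dict keyed by _id; write_output, the mode strings, and the sample counts are assumptions made for illustration and are not MongoDB calls.

```python
def write_output(existing, new_results, mode, reduce_fn=None):
    """Combine new map-reduce results with an existing output collection."""
    if mode == "replace":
        # The old contents are discarded.
        return dict(new_results)
    if mode == "merge":
        # New results overwrite existing entries that share a key.
        merged = dict(existing)
        merged.update(new_results)
        return merged
    if mode == "reduce":
        # Entries that share a key are combined with the reduce function.
        if reduce_fn is None:
            raise ValueError("reduce mode needs a reduce function")
        combined = dict(existing)
        for key, value in new_results.items():
            if key in combined:
                combined[key] = reduce_fn(key, [combined[key], value])
            else:
                combined[key] = value
        return combined
    raise ValueError(f"unknown output mode: {mode}")


def sum_values(key, values):
    return sum(values)


# Hypothetical hit counts from two separate runs.
yesterday = {"/home": 2, "/docs": 1}
today = {"/home": 3, "/blog": 4}

replaced = write_output(yesterday, today, "replace")
merged = write_output(yesterday, today, "merge")
reduced = write_output(yesterday, today, "reduce", sum_values)
```

The reduce mode is what makes incremental aggregation possible: each run only has to process the new documents, and its results are folded into the totals already stored.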

Aggregation Framework:

  • Introduced in the MongoDB 2.2.0 production release.
  • Uses a "pipeline" approach where objects are transformed as they pass through a series of pipeline operators such as $match, $project, $sort, $group, $limit, $skip, $unwind, and $geoNear.
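
The pipeline idea can be sketched in plain Python by treating each operator as a function that takes a list of documents and returns a new list. The helper names (match, group_sum, sort_by, limit, run_pipeline) and the orders data are hypothetical; they only imitate the behaviour of $match, $group, $sort, and $limit and are not part of any MongoDB driver.

```python
def match(predicate):
    """Keep only the documents that satisfy the predicate (like $match)."""
    return lambda docs: [d for d in docs if predicate(d)]


def group_sum(key_field, value_field):
    """Sum value_field per distinct key_field (a narrow stand-in for $group)."""
    def stage(docs):
        totals = {}
        for d in docs:
            totals[d[key_field]] = totals.get(d[key_field], 0) + d[value_field]
        return [{"_id": k, "total": v} for k, v in totals.items()]
    return stage


def sort_by(field, descending=False):
    return lambda docs: sorted(docs, key=lambda d: d[field], reverse=descending)


def limit(n):
    return lambda docs: docs[:n]


def run_pipeline(docs, stages):
    # Each stage consumes the output of the previous one.
    for stage in stages:
        docs = stage(docs)
    return docs


# Hypothetical sample data.
orders = [
    {"customer": "ann", "amount": 10, "status": "shipped"},
    {"customer": "bob", "amount": 5, "status": "shipped"},
    {"customer": "ann", "amount": 25, "status": "shipped"},
    {"customer": "cat", "amount": 40, "status": "pending"},
    {"customer": "dan", "amount": 7, "status": "shipped"},
]

top_customers = run_pipeline(orders, [
    match(lambda d: d["status"] == "shipped"),
    group_sum("customer", "amount"),
    sort_by("total", descending=True),
    limit(2),
])
```

Filtering early with a match stage keeps the later stages small, which is the usual way to order a pipeline.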

Output format: Returns the result set inline.

Sharding: Supports both sharded and non-sharded input collections. When operating on a sharded collection, it pushes all operations up to the first $group or $sort to every shard. The remaining operations, starting from the first $group or $sort, are then run as a second pipeline on the combined shard results.
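
The split described above can be sketched as a small function over a pipeline written as a list of stage dictionaries. This is a simplification under stated assumptions: split_for_shards is a hypothetical helper, and it reads "up to" as exclusive, so the first $group or $sort appears only in the second part. The server's actual behaviour may differ, for example by also running a partial version of that stage on each shard.

```python
SPLIT_OPERATORS = ("$group", "$sort")


def split_for_shards(pipeline):
    """Split a pipeline at its first $group or $sort stage (illustrative only)."""
    for index, stage in enumerate(pipeline):
        if any(op in stage for op in SPLIT_OPERATORS):
            # Stages before the split run on every shard; the rest run on
            # the combined shard results.
            return pipeline[:index], pipeline[index:]
    # No $group or $sort: the whole pipeline can run on the shards.
    return pipeline, []


# Hypothetical pipeline.
pipeline = [
    {"$match": {"status": "shipped"}},
    {"$project": {"customer": 1, "amount": 1}},
    {"$group": {"_id": "$customer", "total": {"$sum": "$amount"}}},
    {"$sort": {"total": -1}},
    {"$limit": 5},
]

shard_part, merge_part = split_for_shards(pipeline)
```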

  • Designed with specific goals of improving performance and usability.
  • Pipeline operators can be repeated as needed.
  • The aggregation framework is reported to be around 10 times faster than MapReduce, although the actual difference depends on the workload.

Limitations:

  • If any single aggregation operation consumes more than 10 percent of system RAM, the operation produces an error.
  • Output from the pipeline cannot exceed the BSON document size limit.
  • The aggregation pipeline cannot operate on values of the following types: Symbol, MinKey, MaxKey, DBRef, Code, and CodeWScope.

Posted on 21 October 2012 by Micropyramid
