Short answer: use the MongoDB aggregation pipeline. For any grouping, rollup, or analytics query in MongoDB today, the aggregation pipeline (db.collection.aggregate()) is the correct and recommended tool. The legacy db.collection.group() command was removed in MongoDB 4.2 (2019) and no longer exists. Map-reduce has been deprecated since MongoDB 5.0 (2021) - it still runs, but it is slower, harder to read, and earmarked for eventual removal. So the modern answer is not 'pick one of three equals'; it is: reach for the pipeline by default, and only fall back to map-reduce in the rare case where you need arbitrary JavaScript the pipeline genuinely cannot express.
This 2026 guide explains why each feature existed, gives the clear verdict, and shows real $group examples, a map-reduce-to-pipeline migration, and both mongosh and Python (PyMongo) code.
Key takeaways
db.collection.group()is gone. It was deprecated in MongoDB 3.4 and removed in 4.2. Do not write new code with it - it will simply fail to run on any supported server.- Map-reduce is deprecated (since 5.0). It still functions for backward compatibility, but it runs interpreted JavaScript and converts BSON to JSON and back, making it markedly slower than the pipeline.
- The aggregation pipeline is the standard. It runs natively in compiled C++, uses your indexes, works on sharded collections, and expresses grouping with
$groupplus accumulators such as$sum,$avg, and$push. - Materialize results with
$mergeor$out. These replace map-reduce's output-to-collection modes when you need to persist aggregated data. - Modern analytics stages -
$setWindowFields(window functions),$facet(multi-pass), and$bucket(histograms) - cover work that once forced people into map-reduce.
A short history of grouping in MongoDB
Grouping in MongoDB evolved through three generations:
group()(early MongoDB, removed in 4.2) - the original command, conceptually similar to SQLGROUP BY. It ran server-side JavaScript, took a global JS read lock, returned results inline (capped by the 16MB BSON document limit), and did not work on sharded clusters.- Map-reduce (added early, deprecated in 5.0) - a more flexible map/
emit/reduce model for large or incremental aggregations, with several output modes. Powerful but slow, because it interprets JavaScript and serializes documents between BSON and JSON. - Aggregation pipeline (introduced in 2.2, 2012; the standard today) - a declarative sequence of stages that the server compiles and optimizes, uses indexes for, and runs natively. Every release since has expanded it with new stages and operators.
The trajectory is clear: MongoDB has consolidated all grouping and analytics onto the pipeline.
group() vs map-reduce vs aggregation pipeline at a glance
| Aspect | db.collection.group() |
Map-Reduce | Aggregation Pipeline |
|---|---|---|---|
| Status (2026) | Removed in 4.2 | Deprecated since 5.0 | Recommended, actively developed |
| Execution engine | Interpreted JavaScript | Interpreted JavaScript | Native compiled C++ |
| Performance | Slow (global JS lock) | Slow (JS + BSON to JSON) | Fast |
| Uses indexes | Limited | Limited | Yes (esp. $match/$sort early) |
| Sharded collections | Not supported | Supported (input and output) | Supported |
| Output | Inline only (16MB cap) | Inline, collection, merge, reduce | Cursor; $out/$merge to a collection |
| Expressiveness | Basic | Arbitrary JavaScript | 30+ stages, 150+ operators |
| Use it today? | No (it no longer exists) | Only for legacy jobs | Yes - default choice |
group(): removed in 4.2
group() was the first-generation grouping command, similar to SQL GROUP BY. Because it relied on server-side JavaScript with a global lock, returned everything inline, and could not run on sharded clusters, it was deprecated in 3.4 and fully removed in MongoDB 4.2. There is no reason - and no way - to use it on a current server.
Map-reduce: deprecated since 5.0
Map-reduce uses a map function that calls emit(key, value), a reduce function that folds values per key, and an optional finalize step. It supports incremental output to a collection, which historically made it attractive for large rollups. But it is deprecated as of MongoDB 5.0: the engine interprets JavaScript and converts each document from BSON to JSON (and back when writing output), so it is consistently slower than the pipeline. MongoDB's own guidance is to rewrite map-reduce jobs as aggregation pipelines.
Aggregation pipeline: the standard
The aggregation pipeline passes documents through an ordered series of stages ($match, $group, $project, $sort, and many more). The query planner can reorder and merge stages, push $match/$sort down to use indexes, and stream results through a cursor. It is the foundation for grouping, reporting, and analytics in MongoDB today. For broader query patterns, see our guide to advanced queries in MongoDB.
// Total sales, average, and distinct products per customer (mongosh)
db.orders.aggregate([
{ $match: { status: "complete" } }, // filter first - can use an index on "status"
{ $group: {
_id: "$custId", // the group key
totalSpent: { $sum: "$amount" },
avgOrder: { $avg: "$amount" },
maxOrder: { $max: "$amount" },
minOrder: { $min: "$amount" },
orderCount: { $sum: 1 }, // classic count idiom
products: { $addToSet: "$sku" } // distinct SKUs this customer bought
} },
{ $sort: { totalSpent: -1 } }
])How to group with the aggregation pipeline
Grouping centers on the $group stage, but a good pipeline does more than group - it filters first, reshapes after, and can compute analytics in the same pass.
Filter early with $match
Put $match before $group so the server can use an index and shrink the document set before the heavier grouping work. Early $match/$sort stages are the single biggest performance lever in a pipeline.
Group with $group and accumulators
The $group stage takes an _id (the group key - use null to aggregate the whole collection) and one or more accumulator expressions:
| Accumulator | What it does |
|---|---|
$sum |
Adds values; { $sum: 1 } counts documents |
$avg |
Mean of numeric values |
$min / $max |
Smallest / largest value |
$push |
Collects all values into an array |
$addToSet |
Collects distinct values into an array |
$first / $last |
First / last value (respect input sort order) |
$count |
Counts documents in the group (5.0+) |
$accumulator |
Custom accumulator written in JavaScript (4.4+) - an escape hatch |
Reshape with $project and $addFields
After grouping, use $project (whitelist or compute fields) or $addFields (add without dropping the rest) to format the output - rename _id, round numbers, or derive ratios.
Buckets, facets, and window functions
$bucket/$bucketAutobuild histograms by grouping documents into ranges (added in 3.4).$facetruns several sub-pipelines over the same input in one pass - perfect for dashboards that need totals plus a breakdown at once.$setWindowFields(added in 5.0) computes running totals, moving averages, and$rankover partitions without collapsing rows - the analytics that previously pushed people toward map-reduce.$countas a stage returns the number of documents reaching it.
Because every stage runs natively and can leverage indexes, the pipeline is the right home for grouping at scale. It pairs naturally with other server features such as full-text search in MongoDB.
// Monthly revenue per region with a running cumulative total, saved to a collection
db.sales.aggregate([
{ $match: { orderDate: { $gte: ISODate("2025-01-01") } } },
{ $group: {
_id: { region: "$region", month: { $month: "$orderDate" } },
revenue: { $sum: "$amount" },
orders: { $sum: 1 }
} },
{ $setWindowFields: {
partitionBy: "$_id.region",
sortBy: { "_id.month": 1 },
output: {
cumulativeRevenue: {
$sum: "$revenue",
window: { documents: ["unbounded", "current"] }
}
}
} },
{ $sort: { "_id.region": 1, "_id.month": 1 } },
// Persist the rollup (replaces map-reduce's output-to-collection modes):
{ $merge: { into: "monthly_region_revenue", whenMatched: "replace", whenNotMatched: "insert" } }
])Migrating a map-reduce job to an aggregation pipeline
Most map-reduce jobs map directly onto the pipeline. The common pattern - emit(key, value) then a reduce that sums the values - is just $group with $sum. Map-reduce's out collection becomes $out (replace) or $merge (upsert/merge).
The example below computes per-customer order totals. The pipeline version is shorter, runs natively, and uses an index on status for the initial filter. If you are modernizing a legacy MongoDB data layer or moving between databases, our database migration services can help plan and execute the change safely.
// BEFORE - deprecated map-reduce (works, but slow and discouraged)
db.orders.mapReduce(
function () { emit(this.cust_id, this.price); }, // map
function (key, values) { return Array.sum(values); }, // reduce
{ query: { status: "A" }, out: "order_totals" }
);
// AFTER - equivalent aggregation pipeline (recommended)
db.orders.aggregate([
{ $match: { status: "A" } },
{ $group: { _id: "$cust_id", total: { $sum: "$price" } } },
{ $merge: "order_totals" } // use { $out: "order_totals" } to fully replace instead
]);Grouping with Python (PyMongo)
From Python, call collection.aggregate() with the same list of stages you would use in mongosh - keys become strings and ISODate becomes a datetime. The driver returns a cursor you can iterate. For the basics of connecting and reading data, see MongoDB CRUD operations with Python (PyMongo).
from pymongo import MongoClient
client = MongoClient("mongodb://localhost:27017/")
db = client.shop
pipeline = [
{"$match": {"status": "complete"}}, # filter first (index-friendly)
{"$group": {
"_id": "$custId",
"totalSpent": {"$sum": "$amount"},
"avgOrder": {"$avg": "$amount"},
"orderCount": {"$sum": 1},
"products": {"$addToSet": "$sku"},
}},
{"$sort": {"totalSpent": -1}},
{"$limit": 10},
]
for doc in db.orders.aggregate(pipeline):
print(doc["_id"], doc["totalSpent"], doc["orderCount"])
# Persist results to a collection instead of streaming them:
# pipeline.append({"$merge": "top_customers"})
# db.orders.aggregate(pipeline)When (if ever) should you still use map-reduce?
Almost never on a modern deployment. The aggregation pipeline now covers grouping, bucketing, faceting, and window analytics, and the $accumulator and $function operators (added in 4.4) let you drop into custom JavaScript inside a pipeline for the rare cases that genuinely need it. Practically, the only reasons to touch map-reduce today are maintaining an old job you have not yet rewritten, or supporting a server old enough that a stage you need is unavailable. New work should use the pipeline.
If you are building data-heavy MongoDB features in a Python backend, our Django development services team builds and tunes these pipelines in production.
Frequently Asked Questions
Is db.collection.group() still available in MongoDB?
No. db.collection.group() was deprecated in MongoDB 3.4 and removed entirely in MongoDB 4.2 (2019). It does not exist on any currently supported server, so any code calling it will fail. Use the aggregation pipeline's $group stage instead.
Is MongoDB map-reduce deprecated?
Yes. Map-reduce has been deprecated since MongoDB 5.0 (2021). It still runs for backward compatibility, but it is slower than the aggregation pipeline because it interprets JavaScript and converts documents between BSON and JSON. MongoDB recommends rewriting map-reduce jobs as aggregation pipelines.
What replaced map-reduce in MongoDB?
The aggregation pipeline replaced map-reduce. Grouping is done with $group and accumulators like $sum and $avg, and results can be written to a collection with $out (replace) or $merge (insert/update/merge), which cover map-reduce's output modes.
How do I group documents in the aggregation pipeline?
Use the $group stage with an _id field as the group key and one or more accumulators - for example { $group: { _id: "$custId", total: { $sum: "$amount" }, count: { $sum: 1 } } }. Add a $match stage before $group to filter early and let the query use indexes.
Is the aggregation pipeline faster than map-reduce?
Yes, usually by a wide margin. The aggregation pipeline runs natively in compiled C++, can use indexes, and streams documents, while map-reduce interprets JavaScript and serializes each document to JSON and back. For the same grouping task, the pipeline is typically several times faster.
How do I save aggregation results to a collection?
Add a $out or $merge stage at the end of the pipeline. $out replaces the target collection with the pipeline's output, while $merge can insert, replace, or merge documents into an existing collection (including a sharded one), making it the flexible successor to map-reduce's output-to-collection behavior.