Map-Reduce in MongoDB

25 / Oct / 2014 by Puneet Behl

Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. In MongoDB, a map-reduce operation is built from two JavaScript functions, map and reduce.

  • The map function emits key-value pairs.
  • For keys that have multiple values, the reduce function collects the values and condenses them into a single aggregated value.
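The two steps can be sketched in plain JavaScript without a database. This is only an illustration of the paradigm, not MongoDB's implementation: simulateMapReduce and the orders data are hypothetical, and the map function here receives the document and emit as arguments, whereas in MongoDB it uses this and a built-in emit.

```javascript
// Plain-JavaScript sketch of the map-reduce flow (not MongoDB's implementation).
function simulateMapReduce(docs, map, reduce) {
  const groups = new Map(); // key -> array of emitted values

  // emit() is what the map function calls to output a key-value pair.
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };

  // Map phase: run map once per document.
  for (const doc of docs) map(doc, emit);

  // Reduce phase: only keys with multiple values need to be reduced.
  const result = {};
  for (const [key, values] of groups) {
    result[key] = values.length > 1 ? reduce(key, values) : values[0];
  }
  return result;
}

// Hypothetical sample data, for illustration only.
const orders = [
  { item: "pen", qty: 2 },
  { item: "book", qty: 1 },
  { item: "pen", qty: 3 },
];

const totals = simulateMapReduce(
  orders,
  (doc, emit) => emit(doc.item, doc.qty),
  (key, values) => values.reduce((a, b) => a + b, 0)
);

console.log(totals); // { pen: 5, book: 1 }
```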

Let’s look at a problem from one of our projects to better understand how and where map-reduce can be used.

The Problem
We need to find the number of patients visiting a medical practice (hospital or clinic), grouped by visit reason. A single visit can list several reasons separated by commas. The Visit document structure is:

[sourcecode language="javascript"]
{
    "_id" : NumberLong(1),
    "name" : "Amelia Watson",
    "visitReason" : "Broken leg, Fever"
}
[/sourcecode]

Insert some documents into the visit collection:

[sourcecode language="javascript"]
db.visit.insert({name:"John Doe", visitReason:"Cold, Cough"});
db.visit.insert({name: "Amenda", visitReason:"Regular Health Check-up"});
db.visit.insert({name: "William", visitReason:"Regular Health Check-up"});
db.visit.insert({name: "Mark", visitReason:"High Blood Pressure, Fever"});
db.visit.insert({name: "Milly", visitReason:"Bleeding gums"});
db.visit.insert({name: "Bosco", visitReason:"Food poisoning"});
db.visit.insert({name: "Bart", visitReason:"Migrane"});
db.visit.insert({name: "Harry", visitReason:"Headache"});
db.visit.insert({name: "Nova", visitReason:"Skin test"});
db.visit.insert({name: "Sunny", visitReason:"Cold"});
db.visit.insert({name: "Leelo", visitReason:"Fever"});
db.visit.insert({name: "Martin", visitReason:"Joint problem"});
[/sourcecode]

Now that the collection contains some data, let’s see how to write a MongoDB map-reduce query that solves the problem above. Below is the map function that emits the data:

[sourcecode language="javascript"]
function() {
    // Skip documents that have no visit reason
    if (this.visitReason != undefined) {
        // One visit can list several comma-separated reasons
        this.visitReason.split(',').forEach(function (v) {
            // Emit each reason as the key with a count of 1
            emit(v.trim(), 1);
        });
    }
}
[/sourcecode]
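To see what the map function produces for a single document, it can be run in plain JavaScript against the example Visit document shown earlier. This is a sketch under one assumption: the emit function below is a stand-in that collects pairs into an array, since MongoDB's built-in emit is not available outside the database.

```javascript
// Stand-in for MongoDB's emit(): collect the pairs into an array.
const emitted = [];
function emit(key, value) {
  emitted.push([key, value]);
}

// Same logic as the map function above.
function map() {
  if (this.visitReason != undefined) {
    this.visitReason.split(',').forEach(function (v) {
      emit(v.trim(), 1);
    });
  }
}

// MongoDB binds `this` to the current document; call() imitates that here.
const visit = { _id: 1, name: "Amelia Watson", visitReason: "Broken leg, Fever" };
map.call(visit);

console.log(emitted); // [ [ 'Broken leg', 1 ], [ 'Fever', 1 ] ]
```

The trim() call matters: without it the second key would be " Fever" with a leading space and would be counted separately from "Fever".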

The map function above emits each visit reason as a key with a count of 1 as the value. MongoDB then groups the emitted values by key, and the reduce function sums the counts for each visit reason. Here is the reduce function:

[sourcecode language="javascript"]
function (key, values) {
    // Array.sum is a MongoDB-provided helper, not standard JavaScript
    return Array.sum(values);
}
[/sourcecode]
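One property worth keeping in mind: MongoDB may call reduce more than once for the same key, passing earlier results back in as values, so the function should give the same answer however the values are batched. The sketch below checks this for the summing reduce in plain JavaScript. The local sum helper stands in for Array.sum, which is assumed not to exist outside MongoDB's JavaScript environment, and the "Fever" values are made up for illustration.

```javascript
// Stand-in for Array.sum, which is not part of standard JavaScript.
const sum = (values) => values.reduce((a, b) => a + b, 0);

function reduce(key, values) {
  return sum(values);
}

// Five hypothetical emissions for the same key.
const values = [1, 1, 1, 1, 1];

// Reduce everything in one call...
const allAtOnce = reduce("Fever", values);

// ...or reduce the first three, then feed that partial result back in.
const partial = reduce("Fever", values.slice(0, 3));
const inTwoSteps = reduce("Fever", [partial, ...values.slice(3)]);

console.log(allAtOnce, inTwoSteps); // 5 5
```

Summing works because the reduced value has the same type as the emitted values (a number) and addition does not depend on grouping or order.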

The complete MongoDB query is:

[sourcecode language="javascript"]
db.visit.mapReduce(
    // map: emit each visit reason with a count of 1
    function() {
        if (this.visitReason != undefined) {
            this.visitReason.split(',').forEach(function (v) {
                emit(v.trim(), 1);
            });
        }
    },
    // reduce: sum the counts for each visit reason
    function (key, values) {
        return Array.sum(values);
    },
    // write the result to the dbStats collection
    {out: "dbStats"}
).find();
[/sourcecode]
Note that dbStats in {out: "dbStats"} is the name of the collection to which the operation writes its result.

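The output of the above MongoDB query has the following form. Note that the keys and counts shown here do not correspond to the twelve sample documents inserted earlier; they appear to come from a larger data set:
[sourcecode language="javascript"]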
{ "_id" : "Bleeding gums", "value" : 15 }
{ "_id" : "Broken knee", "value" : 5 }
{ "_id" : "Broken leg", "value" : 12 }
{ "_id" : "Cough", "value" : 9 }
{ "_id" : "Fever with Pain", "value" : 8 }
{ "_id" : "Health Checkup", "value" : 18 }
{ "_id" : "High fever", "value" : 8 }
{ "_id" : "Low Blood Pressure", "value" : 1 }
[/sourcecode]
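The counts for the twelve sample documents inserted earlier can be checked without a database. The following is a plain-JavaScript sketch that applies the same split, trim, and sum logic to the visitReason strings copied from the insert statements; it imitates the shape of the output collection but does not call MongoDB.

```javascript
// visitReason values copied verbatim from the sample insert statements.
const reasons = [
  "Cold, Cough",
  "Regular Health Check-up",
  "Regular Health Check-up",
  "High Blood Pressure, Fever",
  "Bleeding gums",
  "Food poisoning",
  "Migrane",
  "Headache",
  "Skin test",
  "Cold",
  "Fever",
  "Joint problem",
];

// Map step (split and trim) and reduce step (sum) folded into one loop.
const counts = {};
for (const reason of reasons) {
  for (const part of reason.split(",")) {
    const key = part.trim();
    counts[key] = (counts[key] || 0) + 1;
  }
}

// Print in the same shape as the output collection, sorted by key.
for (const key of Object.keys(counts).sort()) {
  console.log(JSON.stringify({ _id: key, value: counts[key] }));
}
```

For this data, "Cold", "Fever", and "Regular Health Check-up" each get a count of 2, and the remaining eight reasons get a count of 1.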
