Map-Reduce in MongoDB

25 / Oct / 2014 by Puneet Behl

Map-reduce is a data processing paradigm for condensing large volumes of data into useful aggregated results. In MongoDB, a map-reduce operation is built from two JavaScript functions, map and reduce.

  • The map function emits key-value pairs.
  • For keys that have multiple values, the reduce function collects those values and condenses them into a single aggregated result.

Let’s look at a problem from one of our projects to better understand how and where map-reduce can be used.

The Problem
We need to find the number of patients visiting a Medical Practice (Hospital/Clinic), grouped by visit reason. A single visit can list several reasons separated by commas. The Visit document structure is:

{
	"_id" : NumberLong(1),
	"name" : "Amelia Watson",
	"visitReason" : "Broken leg, Fever"
}

Insert some documents into the database:

db.visit.insert({name:"John Doe", visitReason:"Cold, Cough"});
db.visit.insert({name: "Amenda", visitReason:"Regular Health Check-up"});
db.visit.insert({name: "William", visitReason:"Regular Health Check-up"});
db.visit.insert({name: "Mark", visitReason:"High Blood Pressure, Fever"});
db.visit.insert({name: "Milly", visitReason:"Bleeding gums"});
db.visit.insert({name: "Bosco", visitReason:"Food poisoning"});
db.visit.insert({name: "Bart", visitReason:"Migraine"});
db.visit.insert({name: "Harry", visitReason:"Headache"});
db.visit.insert({name: "Nova", visitReason:"Skin test"});
db.visit.insert({name: "Sunny", visitReason:"Cold"});
db.visit.insert({name: "Leelo", visitReason:"Fever"});
db.visit.insert({name: "Martin", visitReason:"Joint problem"});

Now that we’ve populated the database with some data, let’s see how to write a MongoDB map-reduce query that solves the problem above. Below is the map function that emits the data:

function() {
    if(this.visitReason != undefined){
      this.visitReason.split(',').forEach(function (v) {
        emit(v.trim(), 1);
      });
    }
}
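
To make the behaviour of this map function concrete, here is a small sketch that runs it outside MongoDB in plain JavaScript. The emit function and the emitted array are stand-ins written for this illustration (in MongoDB, emit is supplied by the server), and call is used to bind this to a document, mimicking how MongoDB invokes map once per document.

```javascript
// Stand-in for MongoDB's emit(): records each key-value pair in an array.
// (Assumption for illustration; the real emit is provided by the server.)
var emitted = [];
function emit(key, value) {
  emitted.push({ key: key, value: value });
}

// The map function from above, unchanged apart from being given a name.
var map = function () {
  if (this.visitReason != undefined) {
    this.visitReason.split(',').forEach(function (v) {
      emit(v.trim(), 1);
    });
  }
};

// MongoDB calls map once per document, with `this` bound to that document.
map.call({ _id: 1, name: "Amelia Watson", visitReason: "Broken leg, Fever" });
// A document without a visitReason field emits nothing.
map.call({ _id: 2, name: "No reason recorded" });

console.log(emitted);
// [ { key: 'Broken leg', value: 1 }, { key: 'Fever', value: 1 } ]
```

One document with two comma-separated reasons produces two key-value pairs, which is why the final counts are per reason rather than per visit.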

The map function above emits each visit reason as a key, with a count of 1 as its value. The reduce function then receives all the values emitted for a given visit reason and returns their sum. Here is the reduce function:

function (key, values) {
    return Array.sum(values);
}
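
Two details of this reduce function are worth illustrating. Array.sum is a helper provided by MongoDB's JavaScript environment rather than standard JavaScript, and MongoDB may call reduce more than once for the same key, passing the result of an earlier call back in as one of the values. The sketch below, in plain JavaScript, uses a hypothetical sum function as a stand-in for Array.sum and checks that reducing in two steps gives the same total as reducing in one.

```javascript
// Hypothetical stand-in for MongoDB's Array.sum helper, which is not
// part of standard JavaScript.
function sum(values) {
  return values.reduce(function (total, v) { return total + v; }, 0);
}

// Same shape as the reduce function above.
var reduce = function (key, values) {
  return sum(values);
};

// Reducing all five counts for "Fever" at once...
var allAtOnce = reduce("Fever", [1, 1, 1, 1, 1]);

// ...must match reducing a partial batch first and then re-reducing
// that partial result together with the remaining counts.
var partial = reduce("Fever", [1, 1, 1]);
var inTwoSteps = reduce("Fever", [partial, 1, 1]);

console.log(allAtOnce, inTwoSteps); // 5 5
```

This works because the value returned by reduce is a count of the same kind that map emits, so it can safely be used as an input to a later reduce call.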

The complete MongoDB query would be:

db.visit.mapReduce(
  function() {
    if(this.visitReason != undefined){
      this.visitReason.split(',').forEach(function (v) {
        emit(v.trim(), 1);
      });
    }
  },
  function (key, values) {
    return Array.sum(values);
  },
  {out: "dbStats"}
).find();

Please note that dbStats in {out: "dbStats"} is the name of the collection to which the operation writes its result.

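The output of the above MongoDB query looks like this (the values shown come from a larger data set than the sample documents inserted above):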

 { "_id" : "Bleeding gums", "value" : 15 }
 { "_id" : "Broken knee", "value" : 5 }
 { "_id" : "Broken leg", "value" : 12 }
 { "_id" : "Cough", "value" : 9 }
 { "_id" : "Fever with Pain", "value" : 8 }
 { "_id" : "Health Checkup", "value" : 18 }
 { "_id" : "High fever", "value" : 8 }
 { "_id" : "Low Blood Pressure", "value" : 1 }
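
The counts for the sample documents inserted earlier can be checked without a MongoDB server. The following is a minimal sketch in plain JavaScript, not MongoDB's implementation: the groups object, the emit stand-in and the results array are names invented for this illustration, only a subset of the sample documents is used, and values are summed directly in place of Array.sum. Like MongoDB, the sketch skips the reduce call for keys that received only one value.

```javascript
// A subset of the sample documents inserted earlier.
var docs = [
  { name: "John Doe", visitReason: "Cold, Cough" },
  { name: "Amenda", visitReason: "Regular Health Check-up" },
  { name: "William", visitReason: "Regular Health Check-up" },
  { name: "Mark", visitReason: "High Blood Pressure, Fever" },
  { name: "Sunny", visitReason: "Cold" },
  { name: "Leelo", visitReason: "Fever" }
];

// Stand-in for emit(): collects the values emitted for each key.
var groups = {};
function emit(key, value) {
  if (!Object.prototype.hasOwnProperty.call(groups, key)) {
    groups[key] = [];
  }
  groups[key].push(value);
}

var map = function () {
  if (this.visitReason != undefined) {
    this.visitReason.split(',').forEach(function (v) {
      emit(v.trim(), 1);
    });
  }
};

var reduce = function (key, values) {
  // Plain-JavaScript replacement for Array.sum(values).
  return values.reduce(function (a, b) { return a + b; }, 0);
};

// Map phase: call map once per document with `this` bound to it.
docs.forEach(function (doc) { map.call(doc); });

// Reduce phase: only keys with more than one value are reduced;
// a key with a single value keeps that value as-is.
var results = Object.keys(groups).sort().map(function (key) {
  var values = groups[key];
  return { _id: key, value: values.length > 1 ? reduce(key, values) : values[0] };
});

console.log(results);
// [ { _id: 'Cold', value: 2 },
//   { _id: 'Cough', value: 1 },
//   { _id: 'Fever', value: 2 },
//   { _id: 'High Blood Pressure', value: 1 },
//   { _id: 'Regular Health Check-up', value: 2 } ]
```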
