A use case and implementation of MongoDB Aggregation queries.

08 / Jun / 2015 by Manish Kapoor

In any DBMS, aggregations are operations that process data records and return computed results. MongoDB provides a rich set of aggregation operations that examine and perform calculations on the data sets. Running data aggregation on the mongod instance simplifies application code and limits resource requirements.

In my current project, I needed to show which fans liked the posts on a user’s Facebook fan page. Here is the expected output:

John, Tom, Tim liked Post1 on Page1
Josh, Tim liked Post2 on Page2

We fetched the likes on the user’s fan page from the Facebook REST API and saved the information in a collection named Notification. Here is some sample data from the collection:

{
    "_id": ObjectId("55ae0efasdfasdfa7b7a6"),
    "postId": "14332423434_4645747538607955",
    "senderName": "John Doe",
    "postText": "Post text 1",
    "fanpageName": "Page 1",
    "type" : "LIKE"
},
{
    "_id": ObjectId("55ae0ef46ffdgdfaab7a6"),
    "senderName": "Tim",
    "postText": "Some inbox message",
    "fanpageName": "Page 1",
    "type" : "INBOX_MESSAGE"
},
{
    "_id": ObjectId("556c295b64a3862c6e36e61d"),
    "postId": "1433234256523434_464574753658607955",
    "senderName": "John Doe",
    "postText": "Post text 2",
    "fanpageName": "Page 1",
    "type" : "POST"
},
{
    "_id": ObjectId("55ae0ef46ffasdfa7bsdfasdf"),
    "postId" : "14332423434_4645747538607955",
    "senderName" : "Kim",
    "postText" : "Posttext1",
    "type" : "LIKE"
}

For all like notifications, type was set to “LIKE”. (The documents had some other fields as well, which were used for other requirements; for simplicity, I am not listing those fields here.)

This was a case of grouping the ‘LIKE’ notifications by their postId.

Out of all the fields in the collection, we needed to project only postId, postText, senderName, and fanpageName.

So, we used the following aggregation pipeline:

[
    { $match   : { type: "LIKE" } },
    { $project : { postId: 1, postText: 1, senderName: 1, fanpageName: 1 } },
    { $group   : { _id: "$postId", senderList: { $push: "$$ROOT" } } }
]
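
To make the three stages concrete, here is a minimal sketch in plain JavaScript that mimics them on an in-memory array. It is not MongoDB code and not what ran in the project; the shortened ids and postId values are made up for illustration.

```javascript
// In-memory stand-ins for Notification documents (ids and postIds are illustrative).
const notifications = [
  { _id: 1, postId: "p1", senderName: "John Doe", postText: "Post text 1", fanpageName: "Page 1", type: "LIKE" },
  { _id: 2, senderName: "Tim", postText: "Some inbox message", fanpageName: "Page 1", type: "INBOX_MESSAGE" },
  { _id: 3, postId: "p2", senderName: "John Doe", postText: "Post text 2", fanpageName: "Page 1", type: "POST" },
  { _id: 4, postId: "p1", senderName: "Kim", postText: "Post text 1", fanpageName: "Page 1", type: "LIKE" },
];

// $match: keep only documents whose type is "LIKE".
const matched = notifications.filter((doc) => doc.type === "LIKE");

// $project: keep _id (included by default) plus the four listed fields.
const projected = matched.map(({ _id, postId, postText, senderName, fanpageName }) =>
  ({ _id, postId, postText, senderName, fanpageName }));

// $group: one output document per postId; pushing "$$ROOT" appends
// the whole (already projected) document to senderList.
const groups = new Map();
for (const doc of projected) {
  if (!groups.has(doc.postId)) {
    groups.set(doc.postId, { _id: doc.postId, senderList: [] });
  }
  groups.get(doc.postId).senderList.push(doc);
}
const result = [...groups.values()];

console.log(JSON.stringify(result, null, 2));
```

Because the projection runs before the grouping, the sub-documents in senderList no longer carry the type field.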

In the $group stage, we used $push: "$$ROOT". $$ROOT refers to the whole document entering the stage, so this populates an array, senderList, of sub-documents that contain all the fields kept by the $project stage.
Also, when a $project stage precedes $group, it must keep the field the results are grouped on (‘postId’ in this case), because $group only sees the fields that earlier stages pass along.
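
One way to sidestep that concern is to drop the $project stage and pick the wanted fields inside $group itself. The following is only a sketch of that alternative, not the pipeline used in the project. Note that the pushed sub-documents then contain just the three listed fields, without _id or postId.

```javascript
// Alternative sketch: no $project stage, so the grouping key
// cannot be dropped by accident before $group runs.
const pipeline = [
  { $match: { type: "LIKE" } },
  {
    $group: {
      _id: "$postId",
      senderList: {
        // Push a hand-built sub-document instead of "$$ROOT".
        $push: {
          senderName: "$senderName",
          postText: "$postText",
          fanpageName: "$fanpageName",
        },
      },
    },
  },
];

// In the mongo shell, with a live database, this would be run as:
//   db.Notification.aggregate(pipeline)
console.log(JSON.stringify(pipeline, null, 2));
```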
Here is sample output from the aggregation pipeline:

{
    "result" : [ 
         {
            "_id" : "14332423489_4645747538607932",
            "senderList" : [
                {
                    "_id" : ObjectId("55ae0ef46ffasdfafdgdergre"),
                    "postId" : "14332423489_4645747538607932",
                    "senderName" : "John Doe",
                    "postText" : "Post text 2",
                    "fanpageName" : "Page 1"
                },
                {
                    "_id" : ObjectId("55ae0ef46ffasfasdfasdf"),
                    "postId" : "14332423489_4645747538607932",
                    "senderName" : "Matt",
                    "postText" : "Post text 2",
                    "fanpageName" : "Page 1"
                }
                
            ]
        },
        {
            "_id" : "143324442442343489_464574767538607932",
            "senderList" : [
                {
                    "_id" : ObjectId("55ae0ef46ffasdfaf33dgdergre"),
                    "postId" : "143324442442343489_464574767538607932",
                    "senderName" : "John Doe",
                    "postText" : "Post text 22",
                    "fanpageName" : "Page 2"
                }              
                
            ]
        }                      
    ],
    "ok" : 1.0000000000000000
}

This gave me the data I needed for my requirement.
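
From there, turning each group into a line of the expected output takes only a little application code. Here is a minimal sketch in plain JavaScript; the grouped array and the describeLikes function are hypothetical stand-ins, with names and texts borrowed from the expected output shown earlier.

```javascript
// Hypothetical aggregation result: one entry per post, each with a senderList array.
const grouped = [
  {
    _id: "post-1",
    senderList: [
      { senderName: "John", postText: "Post1", fanpageName: "Page1" },
      { senderName: "Tom", postText: "Post1", fanpageName: "Page1" },
      { senderName: "Tim", postText: "Post1", fanpageName: "Page1" },
    ],
  },
  {
    _id: "post-2",
    senderList: [
      { senderName: "Josh", postText: "Post2", fanpageName: "Page2" },
      { senderName: "Tim", postText: "Post2", fanpageName: "Page2" },
    ],
  },
];

// Build "<names> liked <postText> on <fanpageName>" for one group.
// A group always has at least one sender, so senderList[0] is safe to read.
function describeLikes(group) {
  const names = group.senderList.map((s) => s.senderName).join(", ");
  const { postText, fanpageName } = group.senderList[0];
  return `${names} liked ${postText} on ${fanpageName}`;
}

const lines = grouped.map(describeLikes);
lines.forEach((line) => console.log(line));
// John, Tom, Tim liked Post1 on Page1
// Josh, Tim liked Post2 on Page2
```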

Hope this helps.
Thanks.
