We at IntelliGrape divide Big Data into four major sectors - as we commonly refer as 4C's of Big Data. These 4C's are:- Capture (Data Ingestion) Contain (Data Persistence (NoSQL) Compute (Data Processing) Comprehend (Data Analytics and Visualization) Within this blog, I'll be focusing on the last pointer i.e....
Using GroupBy and JOIN is often very challenging. Recently in one of the POCs of MEAN project, I used groupBy and join in apache spark. I had two datasets in hdfs, one for the sales and other for the product. Sales Datasets column : Sales Id, Version, Brand Name, Product Id, No of Item Purchased, Purchased Date Product...