Data ingestion is the movement of data from many sources into a storage medium where an organization can access, use, and analyze it. The destination is typically a data warehouse, data mart, database, or document store. Sources can include relational databases (RDBMS) such as MySQL, Oracle, and Postgres. The data ingestion layer...
At IntelliGrape, we divide Big Data into four major areas, which we commonly refer to as the 4 C's of Big Data: Capture (Data Ingestion), Contain (Data Persistence, NoSQL), Compute (Data Processing), and Comprehend (Data Analytics and Visualization). In this blog, I'll be focusing on the last point, i.e....
Using groupBy and join is often very challenging. Recently, in one of the POCs for a MEAN project, I used groupBy and join in Apache Spark. I had two datasets in HDFS, one for sales and the other for products. Sales dataset columns: Sales Id, Version, Brand Name, Product Id, No of Item Purchased, Purchased Date. Product...
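
To make the join-then-group idea concrete, here is a minimal sketch in plain Python rather than Spark itself. It is not the code from the post: the sample rows are invented, the function name `join_and_group` is hypothetical, and because the product columns are not listed above, the sketch assumes a product row is just (Product Id, Product Name). It joins sales to products on Product Id and then sums the items purchased per product.

```python
from collections import defaultdict

# Invented sample rows. Sales columns follow the list in the post:
# (sales_id, version, brand_name, product_id, items_purchased, purchased_date)
sales = [
    (1, 1, "BrandA", 101, 2, "2015-01-01"),
    (2, 1, "BrandA", 102, 1, "2015-01-02"),
    (3, 1, "BrandB", 101, 5, "2015-01-02"),
]

# Assumed product columns (not given in the post): (product_id, product_name)
products = [(101, "Widget"), (102, "Gadget")]


def join_and_group(sales_rows, product_rows):
    """Inner-join sales to products on product_id, then total items per product name."""
    name_by_id = dict(product_rows)  # lookup table: product_id -> product_name
    totals = defaultdict(int)
    for _sales_id, _version, _brand, product_id, items, _date in sales_rows:
        if product_id in name_by_id:  # inner join: skip sales with no matching product
            totals[name_by_id[product_id]] += items  # group by product name and sum
    return dict(totals)


result = join_and_group(sales, products)
print(result)
```

In Spark, the same shape would typically be expressed with the DataFrame `join` and `groupBy` operations; the dictionary lookup here plays the role of the join, and the running totals play the role of the grouped aggregation.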