Tag Archives: HDFS

No Code Data Ingestion Framework Using Apache-Flink

The conveyance of data from many sources to a storage medium where it may be accessed, utilized, and analyzed by an organization is known as data ingestion. Typically, the destination is a data warehouse, data mart, database, or document storage. Sources can include RDBMS such as MySQL, Oracle, and Postgres. The data ingestion layer...

by Vikas Duvedi
Tag: HDFS

27-Jun-2023

Big Data

Visualization using R and googleVis

We at IntelliGrape divide Big Data into four major sectors, which we commonly refer to as the 4C's of Big Data. These 4C's are: Capture (Data Ingestion), Contain (Data Persistence (NoSQL)), Compute (Data Processing), and Comprehend (Data Analytics and Visualization). Within this blog, I'll be focusing on the last point, i.e....

by Mohit Garg
Tag: HDFS

08-Dec-2014

Big Data

Usage of GroupBy and Join in Apache Spark

Using groupBy and join is often very challenging. Recently, in one of the POCs of a MEAN project, I used groupBy and join in Apache Spark. I had two datasets in HDFS, one for the sales and the other for the product. Sales dataset columns: Sales Id, Version, Brand Name, Product Id, No of Item Purchased, Purchased Date Product...

by Mohit Garg
Tag: HDFS

15-Sep-2014

Blogs
