AWS, Big Data, Corporate

Amazon Awards IntelliGrape For Customer Obsession

IntelliGrape (now TO THE NEW Digital) is an Advanced Consulting Partner with Amazon and specialises in large-scale implementations and managed services for web-based applications. We have been working with the Amazon team in India and abroad on multiple initiatives for more than two years. Amazon conducted its series of APN (AWS Partner Network) exclusive […]

Big Data

Apache Flume : Setup & Best Practices

Apache Flume is an open source project aimed at providing a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large volumes of data. Moving data in large volumes is a complex task. We try to minimize transfer latency, which is achieved by tuning the Flume configuration. […]

Rishabh Jain

Big Data

Consistent Hashing? What the heck is that…..

Hashing is one of the main concepts we are introduced to when we start out as programmers. Be it ‘data structures’ or the simple ‘object’ notion, hashing has a role to play everywhere. But when it comes to Big Data, like everything else, the hashing mechanism is also exposed to […]

Big Data

Spark 1O3 – Understanding Spark Internals

In this post, I will present a technical “deep-dive” into Spark internals, including RDD and Shared Variables. If you want to know more about Spark and setting up Spark on a single node, please refer to the previous posts in the Spark series, Spark 1O1 and Spark 1O2. Resilient Distributed Datasets (RDD) – An RDD is the primary abstraction […]

Big Data

Predictive Analysis in R using Rattle

R is one of the most common platforms for predictive analysis. The Rattle library is an extension of R that takes predictive analysis to another level. This blog is aimed at people who have some experience in R. Rattle is a library for the R language that is used for data mining, where you can apply […]

Mohit Garg

Big Data

Prediction Analysis using Knime

Prediction Analysis is the practice of extracting information from existing data sets in order to determine patterns and predict future outcomes and trends. There are various analytic and machine learning tools available in the market for predictive analysis. This post includes an introduction to Knime followed by a sample use case of clustering using Knime and […]

Big Data

Predictive Analysis – Introduction

Big Data in itself brings many challenges, as is the case with anything related to data. Predictive Analysis is one part that takes up much effort and attention as well. One of the foremost challenges one comes across is how to get started with the “subject”. I would first like to highlight the basic […]

Rishabh Jain

Big Data

Spark 1o2 – “Hello World”

This is the second blog of the Spark series. This blog post includes the setup of a Spark environment followed by a small word count program. The idea behind the blog is to get hands-on with Spark setup and with running a simple program on Spark. If you want to know more about Spark history and its comparison […]

Big Data

Realtime Event processing with Esper

In a recent use case, we had to implement complex event processing in real time. Storm was used as the real-time processing engine, but since it doesn’t provide batching of events, we took up Esper to do the required job. Esper can be thought of as a complex event processing (CEP) […]

Mohit Garg