Blog posts around Big Data | TO THE NEW Blog

Amazon Awards IntelliGrape For Customer Obsession

Intelligrape (now TO THE NEW Digital) is an Advanced Consulting Partner with Amazon and specialises in large-scale implementations and managed services for web-based applications. We have been working with the Amazon team in India and abroad on multiple initiatives for more than 2 years now. Amazon conducted its series of APN (AWS Partner Network) exclusive […]

Aman Aggarwal May 5, 2015

Big Data

Apache Flume : Setup & Best Practices

Apache Flume is an open source project aimed at providing a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large volumes of data. Moving data in large volumes is a complex task. We try to minimize transfer latency; this is achieved by tuning Flume's configuration. […]

Rishabh Jain February 28, 2015

Big Data

Consistent Hashing? What the heck is that…

Hashing is one of the main concepts we are introduced to when we start off as programmers. Be it ‘data structures’ or the simple ‘object’ notion, hashing has a role to play everywhere. But when it comes to Big Data, as with everything else, the hashing mechanism is also exposed to […]

Surendra Pratap Singh February 27, 2015

Big Data

Spark 1O3 – Understanding Spark Internals

In this post, I will present a technical “deep-dive” into Spark internals, including RDDs and shared variables. If you want to know more about Spark and setting it up on a single node, please refer to the previous posts in the Spark series, Spark 1O1 and Spark 1O2. Resilient Distributed Datasets (RDD): An RDD is the primary abstraction […]

Surendra Pratap Singh February 13, 2015

Big Data

Predictive Analysis in R using Rattle

R is the most common platform for predictive analysis. The Rattle library is an extension of R that takes predictive analysis to another level. This blog is aimed at people who have some experience in R. Rattle is a library for the R language that is used for data mining, where you can apply […]

Mohit Garg February 12, 2015

Big Data

Prediction Analysis using Knime

Prediction Analysis is the practice of extracting information from existing data sets in order to determine patterns and predict future outcomes and trends. There are various analytics and machine learning tools available on the market for predictive analysis. This post includes an introduction to Knime followed by a sample use case of clustering using Knime and […]

Surendra Pratap Singh February 5, 2015

Big Data

Predictive Analysis – Introduction

Big Data in itself brings many challenges, as is the case with anything related to data. Predictive Analysis is one part that takes up much effort and attention as well. One of the foremost challenges one comes across is how to get started with the “subject”. I would first like to highlight the basic […]

Rishabh Jain February 4, 2015

Big Data

Spark 1o2 – “Hello World”

This is the second blog in the Spark series. This post covers setting up a Spark environment, followed by a small word count program. The idea behind the blog is to get hands-on with Spark setup and with running a simple program on Spark. If you want to know more about Spark history and its comparison […]

Surendra Pratap Singh January 21, 2015

Big Data

Realtime Event processing with Esper

In a recent use case, we had to implement complex event processing in real time. Storm was used as the real-time processing engine, but since it doesn’t provide batching of events, we turned to Esper to do the required job. Esper can be thought of as a complex event processing (CEP) […]

Mohit Garg January 21, 2015

