This is the second post in the Spark series. It covers setting up a Spark environment, followed by a small word count program. The idea behind the post is to get hands-on with Spark setup and with running a simple program on Spark. If you want to know more about Spark history and its comparison […]
We at IntelliGrape divide Big Data into four major sectors, which we commonly refer to as the 4C’s of Big Data. These 4C’s are: Capture (Data Ingestion), Contain (Data Persistence (NoSQL)), Compute (Data Processing), and Comprehend (Data Analytics and Visualization). In this blog, I’ll be focusing on the last point, i.e. the Comprehend part of Big Data – precisely […]
Overview: The big data space has been evolving continuously, and each day more technologies are added to the ecosystem. Hadoop Hive is one of the technologies that has been around for a long time. It provides a SQL-like wrapper for running queries on Hadoop. Inherently, it has some optimization techniques. Through this blog, I thought […]
A Brief History of Hadoop: Hadoop was created by Doug Cutting, the creator of Apache Lucene, the widely used text search library. Hadoop has its origins in Apache Nutch, an open source web search engine, itself a part of the Lucene project. The Origin of the Name “Hadoop”: Hadoop is not an acronym; it’s a […]
Big Data is a collection of high-volume, high-velocity and high-variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making. Big data is data that exceeds the processing capacity of conventional database systems. The data is too big to fit traditional data-stores, moving too fast to be churned by […]