Hadoop Archives - Page 4 of 6 - Big Data In Real World

Hadoop

October 5, 2015

Hadoop Archives (HAR)

Hadoop Archives (HAR) offers an effective way to deal with the small files problem. This post will explain – The problem with small […]
October 5, 2015

Pig vs. Hive

Apache Pig takes in a set of instructions written in Pig Latin, compiles them, produces a set of MapReduce jobs and executes […]
September 8, 2015

Datanode Block Scanner

In this blog post we saw how HDFS handles and corrects data corruption using checksums. During a write operation, the datanode […]
August 30, 2015

HDFS Federation

What is HDFS Federation? The Namenode is responsible for the successful operation of HDFS. It holds the entire metadata of HDFS, which includes information about files and […]
August 4, 2015

InputSplit vs Block

The central idea behind MapReduce is distributed processing, and hence the most important thing is to divide the dataset into chunks and […]
