BLOG - Page 12 of 14 - Big Data In Real World

BLOG

December 20, 2015

Apache Pig Tutorial – Executing as a Script

The goal of this tutorial is to learn Apache Pig concepts at a fast pace. So don’t expect lengthy […]
December 20, 2015

Apache Pig Tutorial – Ordering Records

The goal of this tutorial is to learn Apache Pig concepts at a fast pace. So don’t expect lengthy posts. All […]
December 19, 2015

Apache Pig Tutorial – Grouping Records

The goal of this tutorial is to learn Apache Pig concepts at a fast pace. So don’t expect lengthy posts. All […]
December 18, 2015

Apache Pig Tutorial – Filter Records

The goal of this tutorial is to learn Apache Pig concepts at a fast pace. So don’t expect lengthy posts. All […]
December 16, 2015

Apache Pig Tutorial – Project and Manipulate Columns

The goal of this tutorial is to learn Apache Pig concepts at a fast pace. So don’t expect lengthy […]
December 14, 2015

Apache Pig Tutorial – Load Variations

The goal of this tutorial is to learn Apache Pig concepts at a fast pace. So don’t expect lengthy posts. All posts […]
December 7, 2015

Apache Pig Tutorial – Loading Datasets

The goal of this tutorial is to learn Apache Pig concepts at a fast pace. So don’t expect lengthy posts. All […]
October 13, 2015

Is Hive Good At Everything?

Hive is an awesome tool that takes in SQL-like queries and translates them into MapReduce. Hive is very helpful […]
October 11, 2015

How much memory your Namenode need?

This is going to be a very short post. When you are building a cluster from scratch, Hadoop developers and […]
October 5, 2015

Hadoop Archives (HAR)

Hadoop Archives (HAR) offer an effective way to deal with the small files problem. This post will explain: the problem with small […]
October 5, 2015

Pig vs. Hive

Apache Pig takes in a set of instructions written in Pig Latin, compiles them, produces a set of MapReduce jobs and executes […]
September 8, 2015

Datanode Block Scanner

In this blog post we saw how HDFS handles and corrects data corruption using checksums. During a write operation, the datanode […]
September 6, 2015

Dealing With Data Corruption In HDFS

Hadoop is designed to store and analyze huge volumes of data, and with huge volumes of data stored in HDFS […]
September 1, 2015

Can Reducer always be reused for Combiner?

A Combiner function is an optional intermediary function that is executed in the Map phase right after the execution […]
August 30, 2015

HDFS Federation

What is HDFS Federation? The Namenode is responsible for the successful operation of HDFS. The Namenode holds the entire metadata of HDFS, which includes information about files and […]
August 26, 2015

Reading A File From HDFS – Java Program

In the last post we saw how to write a file to HDFS by writing our own Java […]
August 23, 2015

Writing A File To HDFS – Java Program

Writing a file to HDFS is very easy: we can simply execute the hadoop fs -copyFromLocal command to copy […]
August 16, 2015

Speculative Execution

What is Speculative Execution? Sometimes you will notice that a Job which has 3 input splits executed 4 mappers and killed the 4th mapper. The job […]
August 11, 2015

Changing Number Of Reducers

In this blog post we saw how we can change the number of mappers in a MapReduce execution. In this post, we […]
August 9, 2015

Changing Number Of Mappers

The number of mappers always equals the number of splits. Having said that, it is possible to control the number of splits […]