BLOG – Page 3 – Hadoop In Real World

February 9, 2017

Working with HDFS

In the "HDFS – Why another file system?" post, we were introduced to HDFS. Now it's time to try some HDFS commands. You are probably thinking why […]
February 6, 2017

HDFS – Why another file system?

In the Understanding Big Data Problem post, we saw that HDFS, the Hadoop Distributed File System, takes care of all the storage-related complexities in Hadoop. In this […]
February 2, 2017

Finding the MAX tuple with Pig

Here is a sample dataset. Our goal is to find the record with the maximum record_value. […]
January 30, 2017

How to find directories in HDFS which are older than N days?

Cleaning up older or obsolete files in HDFS is important. Even if you have […]
January 26, 2017

How to use multi character delimiter in a Hive table?

Sometimes your data is too complex to delimit the individual columns with a single character like […]
January 23, 2017

Change field termination value in Hive

This blog post describes how to change the field termination value in Hive. Assume that when you created the Hive table, […]