How to use multi character delimiter in a Hive table? - Big Data In Real World

How to use multi character delimiter in a Hive table?

Change field termination value in Hive
January 23, 2017
How to find directories in HDFS which are older than N days?
January 30, 2017
Change field termination value in Hive
January 23, 2017
How to find directories in HDFS which are older than N days?
January 30, 2017

How to use multi character delimiter in a Hive table?

Sometimes your data is slightly complex to delimit the individual columns with a single character like delimiter comma, pipe symbol etc. 

Problem

What if you have multi character delimiter like below ?

In the below sample record the delimiter is @#

NYSE@#B7J@#2009-12-16@#8.80@#8.84@#8.77@#8.83@#69700@#8.83

Solution

Well if you are using Hive version 0.14.0 then you are in luck. You could use MultiDelimitSerDe  and specify the multi character delimiter for field.delim  property.

hive> CREATE EXTERNAL TABLE IF NOT EXISTS stocks_multi (
exch STRING,
symbol STRING,
ymd STRING,
price_open FLOAT,
price_high FLOAT,
price_low FLOAT,
price_close FLOAT,
volume INT,
price_adj_close FLOAT)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.MultiDelimitSerDe'
WITH SERDEPROPERTIES ("field.delim"="@#")
LOCATION '/user/hirw/stocks-multi';

 

Big Data In Real World
Big Data In Real World
We are a group of Big Data engineers who are passionate about Big Data and related Big Data technologies. We have designed, developed, deployed and maintained Big Data applications ranging from batch to real time streaming big data platforms. We have seen a wide range of real world big data problems, implemented some innovative and complex (or simple, depending on how you look at it) solutions.

Comments are closed.

How to use multi character delimiter in a Hive table?
This website uses cookies to improve your experience. By using this website you agree to our Data Protection Policy.

Hadoop In Real World is now Big Data In Real World!

X