Ingesting data from Apache Hive

Hi team,

I am very new to Apache Druid but keen to learn! I have already ingested some data from our HDFS using flat files and got everything working.

However, our batch data is imported into Apache Hive. Is it possible to ingest data from Apache Hive, ideally using the web GUI?

Thank you in advance for your help.

Hi CpuMonsta,

Welcome to the Druid Forum.

I don’t believe there is a direct way to ingest data from Hive.
But you can use Hive to export the data in comma-delimited (CSV) form on HDFS:

-- create the target CSV table
-- (Hive does not allow CREATE EXTERNAL TABLE ... AS SELECT, so declare the columns here)
CREATE EXTERNAL TABLE csv_formatted_data (
  <column definitions>
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '<HDFS path>'
;


-- load the csv data
INSERT INTO csv_formatted_data
SELECT <columns> FROM <source_table>
;

You can then ingest the data directly from its HDFS LOCATION specified in the CREATE EXTERNAL TABLE command.
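
As a rough sketch of that last step, the Python snippet below builds a native batch (index_parallel) ingestion spec that reads the exported files through Druid's HDFS input source. The datasource name, HDFS URI, column names, and timestamp column are all hypothetical placeholders, and it assumes the druid-hdfs-storage extension is loaded. Because Hive text files have no header row, the column names are listed explicitly, in the same order as in the Hive table.

```python
import json


def build_hdfs_csv_spec(datasource, hdfs_path, columns, timestamp_column):
    """Build an index_parallel spec for headerless CSV files on HDFS.

    All argument values are supplied by the caller; nothing here is
    specific to any real cluster.
    """
    # Every column except the timestamp is ingested as a dimension.
    dimensions = [c for c in columns if c != timestamp_column]
    return {
        "type": "index_parallel",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                "timestampSpec": {"column": timestamp_column, "format": "auto"},
                "dimensionsSpec": {"dimensions": dimensions},
                "granularitySpec": {
                    "segmentGranularity": "day",
                    "queryGranularity": "none",
                    "rollup": False,
                },
            },
            "ioConfig": {
                "type": "index_parallel",
                "inputSource": {"type": "hdfs", "paths": hdfs_path},
                "inputFormat": {
                    "type": "csv",
                    # Hive text output has no header, so name the columns here.
                    "findColumnsFromHeader": False,
                    "columns": columns,
                },
            },
            "tuningConfig": {"type": "index_parallel"},
        },
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    spec = build_hdfs_csv_spec(
        datasource="hive_export",
        hdfs_path="hdfs://namenode:8020/path/to/csv_formatted_data/",
        columns=["event_time", "col_a", "col_b"],
        timestamp_column="event_time",
    )
    print(json.dumps(spec, indent=2))
```

The printed JSON could then be submitted as a task, for example by POSTing it to the Overlord at /druid/indexer/v1/task. Adjust the granularity settings to suit your data.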

Hope this helps,
Sergio
