Druid indexes Hive data

Dear all,

  • I have a 23 GB Hive table.

  • I try to index the Hive data into Druid with the following command:

CREATE TABLE hiveto_druid
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.segment.granularity" = "HOUR", "druid.query.granularity" = "minute")
AS
SELECT
  cast(from_unixtime(unix_timestamp(time_stamp, 'yyyyMMddHHmmss'), 'yyyy-MM-dd HH:mm:ss') as timestamp) as __time,
  cast(msisdn as string) msisdn,
  cast(imei as string) imei,
  cast(imsi as string) imsi,
  cast(location as string) location,
  cast(client_ip as string) client_ip,
  cast(application_category as string) application_category,
  cast(application_name as string) application_name,
  cast(rat_type as string) rat_type,
  vol_in,
  vol_out,
  record_duration,
  rxmit_vol_in,
  rxmit_vol_out,
  pkt_in,
  pkt_out,
  rxmit_pkt_in,
  rxmit_pkt_out,
  reorder_pkt,
  rxmit_pkt,
  client_delay,
  first_data_delay,
  std,
  network_delay
FROM hivetable;

I receive the following error. Could you please help me understand why?

ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Dag received [DAG_TERMINATE, SERVICE_PLUGIN_ERROR] in RUNNING state.Error reported by TaskScheduler [[2:LLAP]][SERVICE_UNAVAILABLE] No LLAP Daemons are runningVertex killed, vertexName=Reducer 2, vertexId=vertex_1581307135273_1361_3_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to DAG_TERMINATED, failedTasks:0 killedTasks:1009, Vertex vertex_1581307135273_1361_3_01 [Reducer 2] killed/failed due to:DAG_TERMINATED]Vertex killed, vertexName=Map 1, vertexId=vertex_1581307135273_1361_3_00, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to DAG_TERMINATED, failedTasks:0 killedTasks:26, Vertex vertex_1581307135273_1361_3_00 [Map 1] killed/failed due to:DAG_TERMINATED]DAG did not succeed due to SERVICE_PLUGIN_ERROR. failedVertices:0 killedVertices:2

INFO : Resetting the caller context to HIVE_SSN_ID:c107acfb-b2d7-4ae5-9fc5-5126d7fb3ca6

INFO : Completed executing command(queryId=hive_20200212090300_b3966025-8673-43dd-8b13-521dd83b99b1); Time taken: 263.245 seconds

Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Dag received [DAG_TERMINATE, SERVICE_PLUGIN_ERROR] in RUNNING state.Error reported by TaskScheduler [[2:LLAP]][SERVICE_UNAVAILABLE] No LLAP Daemons are runningVertex killed, vertexName=Reducer 2, vertexId=vertex_1581307135273_1361_3_01, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to DAG_TERMINATED, failedTasks:0 killedTasks:1009, Vertex vertex_1581307135273_1361_3_01 [Reducer 2] killed/failed due to:DAG_TERMINATED]Vertex killed, vertexName=Map 1, vertexId=vertex_1581307135273_1361_3_00, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to DAG_TERMINATED, failedTasks:0 killedTasks:26, Vertex vertex_1581307135273_1361_3_00 [Map 1] killed/failed due to:DAG_TERMINATED]DAG did not succeed due to SERVICE_PLUGIN_ERROR. failedVertices:0 killedVertices:2 (state=08S01,code=2)

It seems the Hive LLAP daemon has crashed; I wonder whether this is due to the size of the job.
First, we will need more logs from HiveServer2, plus the YARN application logs.

Also, try using Tez container mode; it provides more fault tolerance.
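
As a rough sketch of the container-mode suggestion, the session settings below ask Hive to run the query in ordinary Tez containers instead of LLAP daemons. This assumes the cluster allows these properties to be changed per session; some HiveServer2 Interactive setups restrict them, in which case connecting to the non-interactive HiveServer2 endpoint may be needed instead.

```sql
-- Assumption: these properties can be overridden at session level on this cluster.
-- Run the query in Tez containers rather than in LLAP daemons.
SET hive.execution.mode=container;
-- Do not schedule any part of the query on LLAP.
SET hive.llap.execution.mode=none;

-- Then re-run the CREATE TABLE ... AS SELECT statement in the same session.
```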

FYI, it is better to send questions like this to the Hive user mailing list or the Hive Jira.

Thanks very much, Slim.
I’ll look into your suggestions.