Transforming Hive data to Druid: about 6 lakh records are missing

Hi Team,

We are not able to transform Hive Data to Druid.

Hive has about 8 lakh (8L) records in total, but only about 2 lakh are transferred to Druid, so roughly 6 lakh records are missing.

This happens even after adding a CAST to every Hive column.

If I run count(*) on both tables, about 6 lakh records are missing from the Druid table.

Count of Druid table - 200392
Count of Hive table - 794705
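
For anyone trying to reproduce the comparison, it could look roughly like the sketch below. The Druid-backed table name druid_tab_sales is a hypothetical placeholder (the real name is not shown in this post), and it is an assumption that the Hive count should use the same filter as the insert query.

```sql
-- Source side: rows in the Hive table that the insert query would select.
SELECT COUNT(*) FROM tab_sales WHERE billtocode IS NOT NULL;

-- Target side: rows visible through the Druid-backed external table.
-- druid_tab_sales is a hypothetical name used only for illustration.
SELECT COUNT(*) FROM druid_tab_sales;
```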

Below is my Hive external table.


STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'




current_timestamp() as __time,
CAST(transactiondate as STRING) transactiondate,
CAST(airlinecode AS STRING) airlinecode,
CAST(airlinename AS STRING) airlinename,
CAST(billtocode AS STRING) billtocode,
CAST(shiptocode AS STRING) shiptocode,
CAST(shiptoname AS STRING) shiptoname,
CAST(locationid AS STRING) locationid,
CAST(customertype AS STRING) customertype,
CAST(loc_name AS STRING) loc_name,
CAST(loc_code AS STRING) loc_code,
CAST(aircarfttype AS STRING) aircarfttype,
CAST(ftd AS STRING) ftd,
CAST(mtd AS STRING) mtd,
CAST(ytd AS STRING) ytd,
CAST(year AS STRING) year,
CAST(month AS STRING) month,
CAST(quantitydispended AS STRING) quantitydispended,
CAST(quantitydispended_litres AS STRING) quantitydispended_litres
from tab_sales where billtocode is not NULL;

Please help me solve this issue.

Hey Md!

Can you also include your ingestion spec?

We just want to transform our Hive data to Druid, but the total row count in the Druid table does not match the Hive table. I have also included the Hive external table definition above, which is what we use to store the data in Druid.
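
One check that might help narrow this down, offered only as a sketch: every row in the query above gets the same current_timestamp() as __time, and all other columns are strings. If rollup is enabled on the Druid side (an assumption, nothing in this thread confirms it), rows that are identical in every column could be merged into one. Counting the distinct rows in Hive would show whether that number is close to the Druid count of 200392.

```sql
-- Count distinct combinations of the columns that are sent to Druid.
-- If this is close to the Druid row count, merged duplicates are a
-- likely explanation; if it is close to the Hive count, look elsewhere.
SELECT COUNT(*) AS distinct_rows
FROM (
  SELECT DISTINCT
    transactiondate, airlinecode, airlinename, billtocode,
    shiptocode, shiptoname, locationid, customertype,
    loc_name, loc_code, aircarfttype, ftd, mtd, ytd,
    year, month, quantitydispended, quantitydispended_litres
  FROM tab_sales
  WHERE billtocode IS NOT NULL
) t;
```

If the distinct count does match, using a real event time column for __time instead of current_timestamp() may be worth trying, again as an assumption to verify rather than a confirmed fix.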