Hi there,
I have a 2 TB Druid data source named "my_ds".
I am using Hive to create an external table for this data source and then query "my_ds" through it.
When I use Hive to count the rows of this data source, e.g., select count(*) from my_ds,
I get 7710906278 rows.
However, when I use Hive to copy the table "my_ds" into another table stored in ORC format, the copy has only 22085632 rows.
E.g., create table my_ds_dump as select * from my_ds
Then select count(*) from my_ds_dump returns only 22085632 rows.
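For reference, the steps look roughly like this in HiveQL. The external table definition is only a sketch that assumes the standard Hive Druid storage handler; the exact table properties in my setup may differ. The counts in the comments are the ones reported above.

```sql
-- External table backed by the existing Druid data source.
-- Assumed definition: columns are discovered from Druid by the storage handler.
CREATE EXTERNAL TABLE my_ds
STORED BY 'org.apache.hadoop.hive.druid.DruidStorageHandler'
TBLPROPERTIES ("druid.datasource" = "my_ds");

-- Count on the Druid-backed table
SELECT COUNT(*) FROM my_ds;        -- 7710906278

-- Copy into a native Hive table (ORC is assumed to be the default format here;
-- otherwise STORED AS ORC would go before AS SELECT), then count the copy
CREATE TABLE my_ds_dump AS SELECT * FROM my_ds;
SELECT COUNT(*) FROM my_ds_dump;   -- 22085632
```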
Do you know what the problem is?
How can I select all the rows from the original source?
Thanks
Tas