Coordinator UI Console - "x% to load until available" for a data source & Inconsistency in record count in Druid

Hi,

We have one data source in a Druid 0.15 cluster with the configuration below.

Linux, 16 GB RAM, 300 GB hard disk, 8 cores

6 data nodes, 1 query server, 1 master

We are using local storage as deep storage.

We are ingesting data into Druid in batches through Kafka, with the number of tasks set to 6 in the supervisor spec.

We see that the availability of the data source is **<= 99% available** in the Coordinator UI console, with 18% to load until available, even after waiting 8 hours once the data is pushed to Druid (I am not sure whether it sits in a tmp directory before being loaded to deep storage). We are using the default retention (load forever).

We also notice inconsistency in the record count in Druid for dates that are already loaded. The "% to load" message shown when we hover over the data source is different each time: sometimes it says 9%, and five minutes later it shows 30% to load. We have given a high value for the segment location cache, ~300 GB. We see this inconsistency every time we ingest into Druid, at a frequency of a day or two, around 500 million records.

Though we have ingested 500M records, the complete data is not available when compared to the source, and the record count varies each time. We are not sure if this is because of the “% to load until available” message. Sometimes the count even decreases, and we have never been able to see the full data available to query.

Any help is much appreciated.

Regards,

Sunita

Hi Sunita,

A couple of questions:

  1. Do you have 6 data nodes, each with 16 GB RAM, 300 GB hard disk, and 8 cores?

  2. What do you mean by "We are ingesting data in batch mode into druid through Kafka"? Are you ingesting data into Druid using both the Druid Kafka indexing service and batch ingestion?

The inconsistency in the row count is because the segments for the data source are not yet fully available for querying, as it is <= 99% available.

Please check the Coordinator and Historical logs to get more insight into what is happening.

FYI: The Druid Coordinator process is primarily responsible for segment management and distribution. It communicates with Historical processes to load or drop segments based on configuration, and is responsible for loading new segments, dropping outdated segments, managing segment replication, and balancing segment load.
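
If it helps, the same load percentage can be watched outside the UI. Below is a minimal sketch, not a definitive tool. It assumes the Coordinator is reachable at http://localhost:8081 (adjust for your cluster) and uses the Coordinator's /druid/coordinator/v1/loadstatus endpoint, which reports the percentage of segments loaded per data source. The function names and the data source name "events" are made up for illustration.

```python
import json
from urllib.request import urlopen

# Assumption: Coordinator host and port; change to match your cluster.
COORDINATOR_URL = "http://localhost:8081"


def fetch_load_status(base_url=COORDINATOR_URL):
    """Return {data_source: percent_loaded} as reported by the Coordinator."""
    with urlopen(f"{base_url}/druid/coordinator/v1/loadstatus") as resp:
        return json.load(resp)


def percent_left_to_load(load_status):
    """Keep only data sources below 100% and report how much is still to load."""
    return {
        name: round(100.0 - loaded, 2)
        for name, loaded in load_status.items()
        if loaded < 100.0
    }


# Example with a hand-written response (hypothetical data source "events"):
sample = {"events": 82.0}
print(percent_left_to_load(sample))
# On a live cluster: percent_left_to_load(fetch_load_status())
```

Polling this every few minutes and logging the result would show whether the percentage really moves up and down, which you can then line up with the Coordinator and Historical logs.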

Thanks and Regards,

Vaibhav

Hi Vaibhav,

Please find answers to your questions inline below.

We are ingesting data into Druid using the Kafka indexing service, three times a day, around 400 million records at a time. Although we are using Kafka, ingestion is not continuous; we leave a gap of around 3 hours between ingestions.

The Coordinator UI is not coming up as expected; it shows "Request failed with status code 500". Please find the screenshot and the logs below.

Router logs:

Hi Sunita,

May I know which version of MySQL you are using? Also, please check whether the driver version matches it.

Can you check that the version of the JDBC driver under druid/extensions/mysql-metadata-storage matches the MySQL version you are using?

Thanks

[1] Looking at your Coordinator logs:

java.lang.NoClassDefFoundError: com/mysql/jdbc/exceptions/MySQLTransientException

It seems the MySQL driver jar is missing.

https://druid.apache.org/docs/latest/development/extensions-core/mysql.html

Could you cross-verify the configuration and make sure you have the MySQL jar in place?
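
One way to check this is to look inside the jars in the extension directory for the class named in the error. The sketch below is only an illustration: the function name is made up, and the directory passed in should be your own extensions/mysql-metadata-storage path. An empty result would be consistent with the jar being absent, or with a driver release that does not ship this class.

```python
import zipfile
from pathlib import Path

# Class reported in the NoClassDefFoundError above.
MISSING_CLASS = "com/mysql/jdbc/exceptions/MySQLTransientException"


def jars_providing_class(ext_dir, class_name=MISSING_CLASS):
    """Return the names of jars in ext_dir that contain class_name."""
    entry = class_name + ".class"
    found = []
    for jar in sorted(Path(ext_dir).glob("*.jar")):
        # A jar is a zip archive, so its entries can be listed directly.
        with zipfile.ZipFile(jar) as archive:
            if entry in archive.namelist():
                found.append(jar.name)
    return found


# Usage (path is an assumption; point it at your Druid install):
# print(jars_providing_class("druid/extensions/mysql-metadata-storage"))
```

If this returns an empty list on the Coordinator host, that would match the error in the log.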

[2] From the attached screenshot, I see that clicking on the data server returns status code 500, which generally means the server could not process the request.

Could you check your data server (Historical) logs to see whether everything is fine?

Also, could you confirm point 1 from my last response?

Thanks,

Vaibhav