Druid segments become unavailable after data ingestion

After data ingestion, the Druid cluster shows certain segments of a datasource as unavailable. Example: 72.4% available (2352 segments, 647 segments unavailable). We have a clustered deployment with 3 nodes: a master node (coordinator and overlord), a data node (historical and middlemanager), and a query node (broker and router). Is there any specific reason why this is happening?
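
For anyone reading the example figure: the numbers are consistent if 2352 is the total segment count and the percentage is truncated rather than rounded. Here is a quick check under that assumption (the function name is made up for illustration, and the truncation is a guess at how the console derives the value):

```python
import math


def availability_percent(total_segments: int, unavailable_segments: int) -> float:
    """Share of segments that are available, truncated to one decimal place."""
    available = total_segments - unavailable_segments
    # Assumption: the figure is truncated (floor), not rounded.
    return math.floor(available / total_segments * 1000) / 10


# 2352 - 647 = 1705 segments are available out of 2352.
print(availability_percent(2352, 647))  # prints 72.4
```

With ordinary rounding the same inputs would give 72.5, so the reported 72.4% suggests truncation.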

Hi Shashank,

It may be that the indexing process has generated x segments and the historical has loaded only some of them. It may still be loading the rest, or there may be errors or issues preventing the load.

You should look into:

  • The historical and coordinator logs.
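
Besides the logs, the coordinator's HTTP API can show which datasources still have segments waiting to load. This is only a rough sketch: the host name "master-node" and port 8081 (the coordinator default) are assumptions to adjust for your cluster, and the helper names are made up for illustration.

```python
import json
import urllib.request

# Assumption: coordinator reachable at this host on the default port 8081.
COORDINATOR_URL = "http://master-node:8081"


def fetch_json(url: str):
    """GET a URL and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def datasources_still_loading(load_status: dict) -> dict:
    """Keep only the datasources that still have segments left to load."""
    return {ds: left for ds, left in load_status.items() if left > 0}


def check_load_status(base_url: str = COORDINATOR_URL) -> dict:
    # With ?simple the coordinator returns {datasource: segments left to load}.
    status = fetch_json(base_url + "/druid/coordinator/v1/loadstatus?simple")
    return datasources_still_loading(status)
```

Calling check_load_status() a few minutes apart shows whether the remaining count is going down (still loading) or stuck (something is blocking the load).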

Thanks and Regards,

Vaibhav

Thanks Vaibhav!

I see in the logs that there is a problem with segment handoff from the coordinator to the historical.

2020-02-18T03:54:48,513 INFO [coorrinator_handoff_scheduled_0] org.apache.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments:

We have enabled only error logging in our environment. As of now we don't see any errors in either the coordinator or the historical log. We have raised a request to enable info logging and will keep you posted.

Thanks
Shashank

More insight into the configuration details:

1. We are using a local storage mount for deep storage.
2. We are connected to Postgres for metadata storage, and we see the metadata getting populated in the Druid tables.
3. We also see the historical and middlemanager showing up in the router user interface exposed on port 8888, which means all nodes are connected through ZooKeeper.

We need to check whether the historical node is able to read segment info from the load queue and store the segments in the segment cache.
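
One way to check the load queue side of this is the coordinator's load queue endpoint, which reports per historical how many segments are still waiting to be loaded. Below is a minimal sketch; the coordinator URL passed in is whatever applies to your cluster, and the function names are hypothetical.

```python
import json
import urllib.request


def fetch_load_queue(coordinator_url: str) -> dict:
    """Fetch the per-historical load queue summary from the coordinator."""
    url = coordinator_url.rstrip("/") + "/druid/coordinator/v1/loadqueue?simple"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def pending_loads(load_queue: dict) -> dict:
    """Map each historical to its count of segments still waiting to load."""
    pending = {}
    for server, info in load_queue.items():
        to_load = info.get("segmentsToLoad", 0)
        if to_load > 0:
            pending[server] = to_load
    return pending
```

For example, pending_loads(fetch_load_queue("http://master-node:8081")) with an assumed host name. If a historical keeps a non-zero count over repeated calls, it is probably not working through its queue, and its log plus the deep storage and segment cache paths are the next things to look at.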

Thanks
Shashank

Hi Team,

The issue was resolved after a clean restart of the master and data nodes in the cluster.

However, just restarting the nodes without cleaning the data did not work.

Regards
Shashank