Druid realtime ingestion tasks are failing at the handoff stage

I frequently see the following lines in the task log:

2019-06-24T13:33:01,891 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing commit metadata for segment[druid_datasourcce_2019-06-24T13:00:00.000Z_2019-06-24T14:00:00.000Z_2019-06-24T13:00:00.809Z].
2019-06-24T13:33:01,893 INFO [appenderator_persist_0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[druid_datasourcce_2019-06-24T13:00:00.000Z_2019-06-24T14:00:00.000Z_2019-06-24T13:00:00.809Z] at path[/druid/segments/druid_ingestion_peon:8103/druid_ingestion_peon:8103_indexer-executor__default_tier_2019-06-24T12:25:04.559Z_62ab1b83f7244e078ca093d012d783f40]
2019-06-24T13:33:01,893 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[druid_datasourcce_2019-06-24T13:00:00.000Z_2019-06-24T14:00:00.000Z_2019-06-24T13:00:00.809Z].
2019-06-24T13:33:01,901 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Deleting Index File[/tmp/druid/persistent/task/index_kafka_druid_datasourcce_2c99376abff20af_njnlmbcm/work/persist/druid_datasourcce_2019-06-24T13:00:00.000Z_2019-06-24T14:00:00.000Z_2019-06-24T13:00:00.809Z]
2019-06-24T13:34:01,677 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:34:01,684 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:35:01,676 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:35:01,681 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:36:01,676 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:36:01,685 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:37:01,676 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:37:01,683 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:38:01,676 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:38:01,682 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:39:01,677 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:39:01,684 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:40:01,676 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:40:01,681 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:41:01,677 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:41:01,682 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:42:01,676 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:42:01,681 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:43:01,676 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:43:01,682 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:44:01,676 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:44:01,682 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:45:01,677 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:45:01,736 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:46:01,676 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:46:01,680 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:47:01,676 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:47:01,681 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:48:01,676 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:48:01,681 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:49:01,677 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:49:01,682 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:50:01,676 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:50:01,681 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:51:01,676 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:51:01,681 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:52:01,677 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:52:01,683 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:53:01,676 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:53:01,682 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:54:01,676 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:54:01,681 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]
2019-06-24T13:55:01,676 INFO [coordinator_handoff_scheduled_0] io.druid.java.util.http.client.pool.ChannelResourceFactory - Generating: http://druid_coordinator:8081
2019-06-24T13:55:01,682 INFO [coordinator_handoff_scheduled_0] io.druid.segment.realtime.plumber.CoordinatorBasedSegmentHandoffNotifier - Still waiting for Handoff for Segments : [[SegmentDescriptor{interval=2019-03-05T10:00:00.000Z/2019-03-05T11:00:00.000Z, version=‘2019-03-05T10:00:00.353Z’, partitionNumber=68}]]

I have observed that the tasks fail after waiting at the handoff stage for 20-24 minutes.
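
To put a number on that wait, the "Still waiting for Handoff" lines can be grouped by segment descriptor and the first and last timestamps compared. Below is a minimal sketch, assuming the log format shown above; `handoff_wait` is a hypothetical helper name, not part of Druid.

```python
import re
from datetime import datetime

# Captures the timestamp (to the second) and the segment descriptor text
# from a "Still waiting for Handoff" line in the format shown in the log.
WAITING = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}),\d{3} "
    r".*Still waiting for Handoff for Segments : (.*)$"
)


def handoff_wait(lines):
    """Map each segment descriptor to (first_seen, last_seen, minutes_waited)."""
    seen = {}
    for line in lines:
        match = WAITING.match(line.strip())
        if not match:
            continue  # not a handoff-wait line
        ts = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
        key = match.group(2)
        first, _ = seen.get(key, (ts, ts))
        seen[key] = (first, ts)
    return {
        key: (first, last, (last - first).total_seconds() / 60)
        for key, (first, last) in seen.items()
    }
```

Calling `handoff_wait(open(path))` on a saved task log (the path is whatever file the log was saved to) returns one entry per segment descriptor still waiting for handoff.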

Is there a way to increase the timeout at the handoff stage?

I also don’t see any heavy segments being loaded on the Coordinator.
How should I go about debugging this issue?

What metrics should I collect to monitor this?

Hi Sharath,

This might help. Look at the section "My stream ingest is not handing segments off" in the FAQ: https://druid.apache.org/docs/latest/ingestion/faq.html.
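
Alongside the FAQ, it may be worth checking what the Coordinator itself reports about segment loading. Here is a rough sketch, assuming the Coordinator address from your log (http://druid_coordinator:8081) and the Coordinator's `loadstatus` and `loadqueue` HTTP endpoints. The helper names are made up for illustration, and the exact response shape can differ between Druid versions.

```python
import json
from urllib.request import urlopen


def coordinator_urls(base):
    """Build the Coordinator status URLs for a given base address."""
    base = base.rstrip("/")
    return {
        # Percentage of published segments loaded, per datasource.
        "loadstatus": base + "/druid/coordinator/v1/loadstatus",
        # Segments queued to load or drop, per Historical.
        "loadqueue": base + "/druid/coordinator/v1/loadqueue?simple",
    }


def fetch_json(url, timeout=10):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def report(base):
    """Print both status documents for a quick manual check."""
    for name, url in coordinator_urls(base).items():
        print(name, json.dumps(fetch_json(url), indent=2))
```

Running `report("http://druid_coordinator:8081")` while a task is stuck shows whether the datasource is below 100% loaded and whether any Historical has segments queued.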

Thanks,

Sashi