Kafka indexing service can't create segment when task ended with not realtime data

Hi everyone,
Right now I am experiencing push not realtime data into a datasource. Currently, have some data by using kafka-indexing-service.

I create a supervisor to pull data from kafka then I wait for 1h20 minutes for task ended and I can’t query after that before I can.

Surprisingly the tasks were succeeded.

2016-11-10T11:23:19,307 INFO [qtp1723810021-195] io.druid.indexing.kafka.KafkaIndexTask - Stopping gracefully.
2016-11-10T11:23:19,307 INFO [qtp1723810021-195] io.druid.indexing.kafka.KafkaIndexTask - Interrupting run thread (status: [PUBLISHING])
2016-11-10T11:23:19,315 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Shutting down...
2016-11-10T11:23:19,320 INFO [appenderator_persist_0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[ad_statistic_history_2016-11-09T07:00:00.000Z_2016-11-09T08:00:00.000Z_2016-11-10T09:05:24.457Z_31] at path[/druid/prod/segments/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105_indexer-executor__default_tier_2016-11-10T10:03:30.126Z_7b540319d7114725bfb5aa8fedc15e100]
2016-11-10T11:23:19,321 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[ad_statistic_history_2016-11-09T07:00:00.000Z_2016-11-09T08:00:00.000Z_2016-11-10T09:05:24.457Z_31].
2016-11-10T11:23:19,327 INFO [appenderator_persist_0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[ad_statistic_history_2016-11-09T07:00:00.000Z_2016-11-09T08:00:00.000Z_2016-11-10T09:05:24.457Z_30] at path[/druid/prod/segments/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105_indexer-executor__default_tier_2016-11-10T10:03:30.126Z_7b540319d7114725bfb5aa8fedc15e100]
2016-11-10T11:23:19,328 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[ad_statistic_history_2016-11-09T07:00:00.000Z_2016-11-09T08:00:00.000Z_2016-11-10T09:05:24.457Z_30].
2016-11-10T11:23:19,329 INFO [appenderator_persist_0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[ad_statistic_history_2016-11-09T08:00:00.000Z_2016-11-09T09:00:00.000Z_2016-11-10T09:09:28.745Z_25] at path[/druid/prod/segments/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105_indexer-executor__default_tier_2016-11-10T10:03:30.126Z_7b540319d7114725bfb5aa8fedc15e100]
2016-11-10T11:23:19,330 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[ad_statistic_history_2016-11-09T08:00:00.000Z_2016-11-09T09:00:00.000Z_2016-11-10T09:09:28.745Z_25].
2016-11-10T11:23:19,331 INFO [appenderator_persist_0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[ad_statistic_history_2016-11-09T08:00:00.000Z_2016-11-09T09:00:00.000Z_2016-11-10T09:09:28.745Z_24] at path[/druid/prod/segments/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105_indexer-executor__default_tier_2016-11-10T10:03:30.126Z_7b540319d7114725bfb5aa8fedc15e100]
2016-11-10T11:23:19,332 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[ad_statistic_history_2016-11-09T08:00:00.000Z_2016-11-09T09:00:00.000Z_2016-11-10T09:09:28.745Z_24].
2016-11-10T11:23:19,333 INFO [appenderator_persist_0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[ad_statistic_history_2016-11-09T06:00:00.000Z_2016-11-09T07:00:00.000Z_2016-11-10T09:21:51.740Z_21] at path[/druid/prod/segments/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105_indexer-executor__default_tier_2016-11-10T10:03:30.126Z_7b540319d7114725bfb5aa8fedc15e100]
2016-11-10T11:23:19,334 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[ad_statistic_history_2016-11-09T06:00:00.000Z_2016-11-09T07:00:00.000Z_2016-11-10T09:21:51.740Z_21].
2016-11-10T11:23:19,335 INFO [appenderator_persist_0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[ad_statistic_history_2016-11-09T06:00:00.000Z_2016-11-09T07:00:00.000Z_2016-11-10T09:21:51.740Z_23] at path[/druid/prod/segments/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105_indexer-executor__default_tier_2016-11-10T10:03:30.126Z_7b540319d7114725bfb5aa8fedc15e100]
2016-11-10T11:23:19,335 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[ad_statistic_history_2016-11-09T06:00:00.000Z_2016-11-09T07:00:00.000Z_2016-11-10T09:21:51.740Z_23].
2016-11-10T11:23:19,336 INFO [appenderator_persist_0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[ad_statistic_history_2016-11-09T08:00:00.000Z_2016-11-09T09:00:00.000Z_2016-11-10T09:09:28.745Z_28] at path[/druid/prod/segments/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105_indexer-executor__default_tier_2016-11-10T10:03:30.126Z_7b540319d7114725bfb5aa8fedc15e100]
2016-11-10T11:23:19,336 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[ad_statistic_history_2016-11-09T08:00:00.000Z_2016-11-09T09:00:00.000Z_2016-11-10T09:09:28.745Z_28].
2016-11-10T11:23:19,337 INFO [appenderator_persist_0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[ad_statistic_history_2016-11-09T06:00:00.000Z_2016-11-09T07:00:00.000Z_2016-11-10T09:21:51.740Z_25] at path[/druid/prod/segments/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105_indexer-executor__default_tier_2016-11-10T10:03:30.126Z_7b540319d7114725bfb5aa8fedc15e100]
2016-11-10T11:23:19,337 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[ad_statistic_history_2016-11-09T06:00:00.000Z_2016-11-09T07:00:00.000Z_2016-11-10T09:21:51.740Z_25].
2016-11-10T11:23:19,338 INFO [appenderator_persist_0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[ad_statistic_history_2016-11-09T06:00:00.000Z_2016-11-09T07:00:00.000Z_2016-11-10T09:21:51.740Z_26] at path[/druid/prod/segments/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105_indexer-executor__default_tier_2016-11-10T10:03:30.126Z_7b540319d7114725bfb5aa8fedc15e100]
2016-11-10T11:23:19,338 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[ad_statistic_history_2016-11-09T06:00:00.000Z_2016-11-09T07:00:00.000Z_2016-11-10T09:21:51.740Z_26].
2016-11-10T11:23:19,339 INFO [appenderator_persist_0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[ad_statistic_history_2016-11-09T08:00:00.000Z_2016-11-09T09:00:00.000Z_2016-11-10T09:09:28.745Z_26] at path[/druid/prod/segments/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105_indexer-executor__default_tier_2016-11-10T10:03:30.126Z_7b540319d7114725bfb5aa8fedc15e100]
2016-11-10T11:23:19,339 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[ad_statistic_history_2016-11-09T08:00:00.000Z_2016-11-09T09:00:00.000Z_2016-11-10T09:09:28.745Z_26].
2016-11-10T11:23:19,340 INFO [appenderator_persist_0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[ad_statistic_history_2016-11-09T07:00:00.000Z_2016-11-09T08:00:00.000Z_2016-11-10T09:05:24.457Z_25] at path[/druid/prod/segments/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105_indexer-executor__default_tier_2016-11-10T10:03:30.126Z_7b540319d7114725bfb5aa8fedc15e100]
2016-11-10T11:23:19,340 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[ad_statistic_history_2016-11-09T07:00:00.000Z_2016-11-09T08:00:00.000Z_2016-11-10T09:05:24.457Z_25].
2016-11-10T11:23:19,340 INFO [appenderator_persist_0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[ad_statistic_history_2016-11-09T07:00:00.000Z_2016-11-09T08:00:00.000Z_2016-11-10T09:05:24.457Z_26] at path[/druid/prod/segments/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105_indexer-executor__default_tier_2016-11-10T10:03:30.126Z_7b540319d7114725bfb5aa8fedc15e100]
2016-11-10T11:23:19,340 INFO [appenderator_persist_0] io.druid.curator.announcement.Announcer - unannouncing [/druid/prod/segments/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105/adx-pro-vdchcm-ants20-ip-10-199-0-20:8105_indexer-executor__default_tier_2016-11-10T10:03:30.126Z_7b540319d7114725bfb5aa8fedc15e100]
         vmop                    [threads: total initially_running wait_to_block]    [time: spin block sync cleanup vmop] page_trap_count
5429.534: RevokeBias                       [     201          0              1    ]      [     0     0     0     0     0    ]  0  
Total time for which application threads were stopped: 0.0008518 seconds, Stopping threads took: 0.0001839 seconds
2016-11-10T11:23:19,430 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[ad_statistic_history_2016-11-09T07:00:00.000Z_2016-11-09T08:00:00.000Z_2016-11-10T09:05:24.457Z_26].
         vmop                    [threads: total initially_running wait_to_block]    [time: spin block sync cleanup vmop] page_trap_count
5429.612: RevokeBias                       [     200          0              1    ]      [     0     0     0     0     0    ]  0  
Total time for which application threads were stopped: 0.0006482 seconds, Stopping threads took: 0.0001711 seconds
2016-11-10T11:23:19,432 INFO [task-runner-0-priority-0] io.druid.indexing.kafka.KafkaIndexTask - The task was asked to stop before completing
2016-11-10T11:23:19,432 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_kafka_ad_statistic_history_617b498088a2c6a_hgmkghjk] status changed to [SUCCESS].
2016-11-10T11:23:19,435 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_kafka_ad_statistic_history_617b498088a2c6a_hgmkghjk",
  "status" : "SUCCESS",
  "duration" : 5424721
}

``

I attached the full logs.

log.txt.zip (903 KB)

It seems datasource only accepted data late under 8h in our situation but this is weird because before we can push data late for few days.
Our datasource now have data from Oct-20 to now and it was segment by hour.

Hey Chanh,

Looks like the task was killed by the supervisor because it took longer than expected to handoff (and handoff never happened which is why the data was unqueryable once the task ended). Try increasing completionTimeout in the supervisor IOConfig to something greater than the default (30 minutes).

Hi David,

Thanks for relying.

I tried to push data into a new datasource it was ok and it used the same configuration.

Any ideas why it takes so long to hand off? What is the different between a datasource create by supervisor and existing datasource and use superior to push data into in hand off period?

Regards,

Chanh

Hey Chanh,

There’s no performance difference between segments being added to an existing datasource vs. a new one being created. The variance in publish/handoff time can be attributed to many factors. The most significant is the volume/interval/complexity of data which affects indexing time, but other factors like CPU load, number/complexity of queries happening at the same time, coordinator configuration can all factor in.