Indexing task not shutting down for one particular datasource

I am using the indexing service (overlord) to ingest data into Druid via Tranquility.
Deep storage in my setup is HDFS, and there is one instance each of the middle manager, historical, broker, and overlord running.

I am pushing events to two datasources, let's say dataSource1 and dataSource2. My Druid window period is set to 15 minutes and the segment granularity is set to 1 minute for both datasources.

Indexing works fine for dataSource1, but for dataSource2 the indexing task never shuts down.

In the indexing task logs I can see these suspicious lines:

2015-09-08T11:12:00,003 INFO [demand-overseer-0] io.druid.segment.realtime.plumber.RealtimePlumber - Starting merge and push.
2015-09-08T11:12:00,003 INFO [demand-overseer-0] io.druid.segment.realtime.plumber.RealtimePlumber - Found [1] sinks. minTimestamp [1970-01-01T00:00:00.000Z]
2015-09-08T11:12:00,003 WARN [demand-overseer-0] io.druid.segment.realtime.plumber.RealtimePlumber - [2015-09-08T10:45:00.000Z] < [1970-01-01T00:00:00.000Z] Skipping persist and merge.
2015-09-08T11:12:00,003 INFO [demand-overseer-0] io.druid.segment.realtime.plumber.RealtimePlumber - Found [0] sinks to persist and merge
2015-09-08T11:13:00,003 INFO [demand-overseer-0] io.druid.segment.realtime.plumber.RealtimePlumber - Starting merge and push.
2015-09-08T11:13:00,003 INFO [demand-overseer-0] io.druid.segment.realtime.plumber.RealtimePlumber - Found [1] sinks. minTimestamp [1970-01-01T00:00:00.000Z]
2015-09-08T11:13:00,003 WARN [demand-overseer-0] io.druid.segment.realtime.plumber.RealtimePlumber - [2015-09-08T10:45:00.000Z] < [1970-01-01T00:00:00.000Z] Skipping persist and merge.
2015-09-08T11:13:00,003 INFO [demand-overseer-0] io.druid.segment.realtime.plumber.RealtimePlumber - Found [0] sinks to persist and merge
2015-09-08T11:14:00,003 INFO [demand-overseer-0] io.druid.segment.realtime.plumber.RealtimePlumber - Starting merge and push.
2015-09-08T11:14:00,003 INFO [demand-overseer-0] io.druid.segment.realtime.plumber.RealtimePlumber - Found [1] sinks. minTimestamp [1970-01-01T00:00:00.000Z]
2015-09-08T11:14:00,003 WARN [demand-overseer-0] io.druid.segment.realtime.plumber.RealtimePlumber - [2015-09-08T10:45:00.000Z] < [1970-01-01T00:00:00.000Z]

From these log lines it looks like there is some issue with the persist and merge phase. The last time I saw this exception there was a configuration issue with deep storage, but this time that doesn’t seem to be the case, as indexing is working fine for one of the datasources.

In the historical logs I am seeing exceptions about not being able to load segments for dataSource1 (for which indexing works fine) and no exceptions for dataSource2 (the culprit datasource). These exceptions occur because I deleted a few data files from HDFS to recover space without removing those segments from Druid.

Any idea what could be wrong here? Can deleting files from HDFS cause these kinds of errors, and why do the indexing logs say "[2015-09-08T10:45:00.000Z] < [1970-01-01T00:00:00.000Z] Skipping persist and merge", which is not true?

Hey Rohit,

That’s actually normal with Tranquility. Periodic handoff is generally disabled by Tranquility; instead, handoff happens when the timed shutoff occurs. You should see a log message like “Shutting down…” in your logs, followed by some attempts to hand off segments. I would guess that something is wrong with the handoff: either the tasks can’t merge and push to deep storage, or they can do that but the coordinator is not picking up the segments.
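If the task log is large, a quick way to look for that shutdown message is to scan the file and print everything from the first match onward. This is only a rough sketch: it assumes the message contains the substring "Shutting down" (the exact wording may differ in your Druid version), and the function name is made up for illustration.

```python
from typing import Iterable, List


def tail_from_shutdown(lines: Iterable[str], marker: str = "Shutting down") -> List[str]:
    """Return the first line containing `marker` and every line after it.

    An empty list means the marker never appeared in the input.
    `marker` is an assumed substring, not a guaranteed Druid log format.
    """
    tail: List[str] = []
    seen = False
    for line in lines:
        # Start collecting at the first line that mentions the marker.
        if not seen and marker in line:
            seen = True
        if seen:
            tail.append(line.rstrip("\n"))
    return tail
```

You would call it on an open file handle for the task log (the path depends on your setup) and print the returned lines. An empty result suggests the timed shutoff has not started yet; a non-empty one shows the handoff attempts that follow it.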

Btw, that log message was changed in more recent versions of Druid to be correct (< should have been >= in the message) and also clearer about what’s actually going on.
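To make the operator mix-up concrete, here is a small Python sketch using the two timestamps from the WARN line above. It only illustrates the comparison as printed versus as corrected; it is not Druid's code, and the variable names are made up.

```python
from datetime import datetime, timezone

# Timestamps copied from the WARN line in the task log.
# Variable names are hypothetical, not Druid identifiers.
left = datetime(2015, 9, 8, 10, 45, tzinfo=timezone.utc)
right = datetime(1970, 1, 1, tzinfo=timezone.utc)

# The comparison as the old message prints it.
printed_as_logged = left < right
# The comparison with the operator the message should have used.
corrected_operator = left >= right

print(printed_as_logged)
print(corrected_operator)
```

The first comparison is false and the second is true, which is why the old message looks wrong even though the "skipping" behavior itself is expected.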