Ingestion error after upgrade to v0.10.0

Hi,

Data ingestion suddenly started failing after our Druid cluster was upgraded to the latest v0.10.0 release. From /druid/logs/tasks/index_stats_2017-04-21T14:39:53.795Z.log:

ERROR [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Sink for segment[stats_2017-04-20T00:00:00.000Z_2017-04-21T00:00:00.000Z_2017-04-21T14:39:53.804Z] was unexpectedly full!
io.druid.segment.incremental.IndexSizeExceededException: Maximum number of rows [0] reached
	at io.druid.segment.incremental.OnheapIncrementalIndex.addToFacts(OnheapIncrementalIndex.java:200) ~[druid-services-0.10.0-selfcontained.jar:0.10.0]
	at io.druid.segment.incremental.IncrementalIndex.add(IncrementalIndex.java:383) ~[druid-services-0.10.0-selfcontained.jar:0.10.0]
	at io.druid.segment.realtime.plumber.Sink.add(Sink.java:152) ~[druid-services-0.10.0-selfcontained.jar:0.10.0]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl.add(AppenderatorImpl.java:201) [druid-services-0.10.0-selfcontained.jar:0.10.0]
	at io.druid.segment.realtime.appenderator.FiniteAppenderatorDriver.add(FiniteAppenderatorDriver.java:205) [druid-services-0.10.0-selfcontained.jar:0.10.0]
	at io.druid.indexing.common.task.IndexTask.generateAndPublishSegments(IndexTask.java:435) [druid-services-0.10.0-selfcontained.jar:0.10.0]
	at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:207) [druid-services-0.10.0-selfcontained.jar:0.10.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-services-0.10.0-selfcontained.jar:0.10.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-services-0.10.0-selfcontained.jar:0.10.0]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_121]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]

ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[IndexTask{id=index_stats_2017-04-21T14:39:53.795Z, type=index, dataSource=stats}]
java.lang.IllegalArgumentException: fromIndex(0) > toIndex(-1)
	at java.util.ArrayList.subListRangeCheck(ArrayList.java:1006) ~[?:1.8.0_121]
	at java.util.ArrayList.subList(ArrayList.java:996) ~[?:1.8.0_121]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl.persistAll(AppenderatorImpl.java:363) ~[druid-services-0.10.0-selfcontained.jar:0.10.0]
	at io.druid.segment.realtime.appenderator.FiniteAppenderatorDriver.persist(FiniteAppenderatorDriver.java:232) ~[druid-services-0.10.0-selfcontained.jar:0.10.0]
	at io.druid.indexing.common.task.IndexTask.generateAndPublishSegments(IndexTask.java:453) ~[druid-services-0.10.0-selfcontained.jar:0.10.0]
	at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:207) ~[druid-services-0.10.0-selfcontained.jar:0.10.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-services-0.10.0-selfcontained.jar:0.10.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-services-0.10.0-selfcontained.jar:0.10.0]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_121]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_121]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_121]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_121]

Any idea what could be going wrong?

If there’s anything else I can provide, please just drop a line here. Thanks!

Regards,

Shinesun

It looks like somehow maxRowsInMemory got set to 0. Are you overriding it in your task json?

Hi Gian,

Thanks for the follow up.

You’re right, it’s being overridden in a batch ingestion task by rowFlushBoundary, which I’ve only just discovered is deprecated. However, "tuningConfig": { "type": "index", "rowFlushBoundary": 0 } was working fine until this Druid upgrade.

Now that I’m about to remove rowFlushBoundary from the task entirely (and fall back to the default maxRowsInMemory value), are there any parts of the Druid cluster I should pay additional attention to, e.g. JVM heap size configs that need updating? Is there any reason to use a maxRowsInMemory value other than the default?

Thanks!

P.S. Maybe the upgrade instructions should be updated accordingly, to help anyone else who might run into this.

Regards,

Shinesun

I guess rowFlushBoundary=0 used to be ignored (or treated the same as null) but now it’s respected. I bet if you remove rowFlushBoundary, the behavior in 0.10.0 will be the same as you got in 0.9.2.
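For anyone else hitting this, the fix is simply to drop the deprecated rowFlushBoundary key from the tuningConfig so that maxRowsInMemory takes effect instead. A minimal corrected tuningConfig would look something like this (the maxRowsInMemory value of 75000 is illustrative; omit the key entirely to use the default):

```json
{
  "tuningConfig": {
    "type": "index",
    "maxRowsInMemory": 75000
  }
}
```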

Hello,

I’m having a very similar issue, but with maxRowsInMemory set to a non-zero value (75000); can you please help me investigate? Below is the relevant part of the logs:

2017-08-17T09:24:25,144 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.FiniteAppenderatorDriver - New segment[my_events_druid_test_2017-08-15T00:00:00.000Z_2017-08-16T00:00:00.000Z_2017-08-17T09:23:33.297Z] for sequenceName[index_2017-08-15T00:00:00.000Z/2017-08-16T00:00:00.000Z_2017-08-17T09:23:33.297Z_0].
2017-08-17T09:24:25,144 INFO [main] com.sun.jersey.server.impl.application.WebApplicationImpl - Initiating Jersey application, version 'Jersey: 1.19 02/11/2015 03:25 AM'
2017-08-17T09:24:25,180 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.FiniteAppenderatorDriver - Persisting data.
2017-08-17T09:24:25,185 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Shutting down...
2017-08-17T09:24:25,187 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[my_events_druid_test_2017-08-15T00:00:00.000Z_2017-08-16T00:00:00.000Z_2017-08-17T09:23:33.297Z].
2017-08-17T09:24:25,192 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[IndexTask{id=index_my_events_druid_test_2017-08-17T09:23:33.293Z, type=index, dataSource=my_events_druid_test}]
java.lang.IllegalArgumentException: fromIndex(0) > toIndex(-1)
	at java.util.ArrayList.subListRangeCheck(ArrayList.java:1006) ~[?:1.8.0_131]
	at java.util.ArrayList.subList(ArrayList.java:996) ~[?:1.8.0_131]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl.persistAll(AppenderatorImpl.java:363) ~[druid-server-0.10.0.jar:0.10.0]
	at io.druid.segment.realtime.appenderator.FiniteAppenderatorDriver.persist(FiniteAppenderatorDriver.java:232) ~[druid-server-0.10.0.jar:0.10.0]
	at io.druid.indexing.common.task.IndexTask.generateAndPublishSegments(IndexTask.java:453) ~[druid-indexing-service-0.10.0.jar:0.10.0]
	at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:207) ~[druid-indexing-service-0.10.0.jar:0.10.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.10.0.jar:0.10.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.10.0.jar:0.10.0]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_131]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_131]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_131]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
2017-08-17T09:24:25,196 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_my_events_druid_test_2017-08-17T09:23:33.293Z] status changed to [FAILED].

Many thanks in advance,
Loïc

Some additional information: it only occurs for an Index Task (with the data stored locally on disk); on the same cluster, we can successfully run Hadoop Index Tasks and Kafka indexing.
Attached is the task specification.
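For context, the general shape of such a local-disk index task spec is roughly as follows (the dataSource matches the logs above, but the paths, schema fields, and values here are placeholders, not the contents of the attached file):

```json
{
  "type": "index",
  "spec": {
    "dataSchema": {
      "dataSource": "my_events_druid_test",
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "DAY"
      }
    },
    "ioConfig": {
      "type": "index",
      "firehose": {
        "type": "local",
        "baseDir": "/path/to/data",
        "filter": "*.json"
      }
    },
    "tuningConfig": {
      "type": "index",
      "maxRowsInMemory": 75000
    }
  }
}
```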

Thanks,

Loïc

local_index_task_specs.json (2.26 KB)