Indexing task fails with error "java.lang.IllegalArgumentException: fromIndex(0) > toIndex(-1)"

Hi,

I am trying to load a sample data set into Druid, but my indexing task fails with the error below.

I am not using Hadoop; I am reading the data from local storage. Could someone please help me resolve this issue?

I have looked through a post with a similar issue, but it was not of much help.

Loading a JSON file works fine, so I am not sure whether there is an issue with my spec file (attached for reference).

2018-02-07T12:36:34,705 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.AppenderatorDriver - New segment[retail_2010-12-01T00:00:00.000Z_2010-12-02T00:00:00.000Z_2018-02-07T12:36:29.193Z] for sequenceName[index_retail_2018-02-07T12:36:29.191Z].
2018-02-07T12:36:34,760 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.AppenderatorDriver - Persisting data.
2018-02-07T12:36:34,766 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Shutting down...
2018-02-07T12:36:34,770 INFO [appenderator_persist_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Removing sink for segment[retail_2010-12-01T00:00:00.000Z_2010-12-02T00:00:00.000Z_2018-02-07T12:36:29.193Z].
2018-02-07T12:36:34,776 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[IndexTask{id=index_retail_2018-02-07T12:36:29.191Z, type=index, dataSource=retail}]
java.lang.IllegalArgumentException: fromIndex(0) > toIndex(-1)
	at java.util.ArrayList.subListRangeCheck(ArrayList.java:1014) ~[?:1.8.0_152]
	at java.util.ArrayList.subList(ArrayList.java:1004) ~[?:1.8.0_152]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl.persist(AppenderatorImpl.java:381) ~[druid-server-0.11.0.jar:0.11.0]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl.persistAll(AppenderatorImpl.java:462) ~[druid-server-0.11.0.jar:0.11.0]
	at io.druid.segment.realtime.appenderator.AppenderatorDriver.persist(AppenderatorDriver.java:258) ~[druid-server-0.11.0.jar:0.11.0]
	at io.druid.indexing.common.task.IndexTask.generateAndPublishSegments(IndexTask.java:695) ~[druid-indexing-service-0.11.0.jar:0.11.0]
	at io.druid.indexing.common.task.IndexTask.run(IndexTask.java:233) ~[druid-indexing-service-0.11.0.jar:0.11.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.11.0.jar:0.11.0]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.11.0.jar:0.11.0]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_152]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_152]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_152]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_152]
2018-02-07T12:36:34,783 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_retail_2018-02-07T12:36:29.191Z] status changed to [FAILED].
2018-02-07T12:36:34,787 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_retail_2018-02-07T12:36:29.191Z",
  "status" : "FAILED",
  "duration" : 354
}

Thanks

Atul S

retail.json (2.4 KB)

Hi Atul,

I think this exception occurs when the segment to be persisted has no rows. Can you try running your job with reportParseExceptions=true in the tuningConfig? It may also be worth double-checking your timestamp formats.
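For reference, a minimal sketch of where that flag goes in a native index task spec. Only the reportParseExceptions line is the point here; every other value is a placeholder, not taken from your attached spec:

```json
{
  "type": "index",
  "spec": {
    "dataSchema": { "...": "your existing dataSchema" },
    "ioConfig": { "...": "your existing ioConfig" },
    "tuningConfig": {
      "type": "index",
      "reportParseExceptions": true
    }
  }
}
```

With that set, the task should fail fast on the first unparseable row and log it, instead of silently dropping rows and ending up with an empty segment.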

- Jon

Hi Jon,

I see the same issue with Druid 0.12.1-rc2. I also notice some data loss after this error is hit. Is this going to be resolved in 0.12.1 or 0.13.0?

2018-05-29T20:06:46,138 INFO [task-runner-0-priority-0] io.druid.server.coordination.CuratorDataSegmentServerAnnouncer - Unannouncing self[DruidServerMetadata{name='10.1.1.1:8102', hostAndPort='10.1.1.1:8102', hostAndTlsPort='null', maxSize=0, tier='_default_tier', type=indexer-executor, priority=0}] at [/druid/announcements/10.1.1.1:8102]
2018-05-29T20:06:46,138 INFO [task-runner-0-priority-0] io.druid.curator.announcement.Announcer - unannouncing [/druid/announcements/10.1.1.1:8102]
2018-05-29T20:06:46,142 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[KafkaIndexTask{id=index_kafka_topic_8e372287495ab31_aijnmgpe, type=index_kafka, dataSource=topic}]
java.lang.IllegalArgumentException: fromIndex(0) > toIndex(-1)
	at java.util.ArrayList.subListRangeCheck(ArrayList.java:1014) ~[?:1.8.0_162]
	at java.util.ArrayList.subList(ArrayList.java:1004) ~[?:1.8.0_162]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl.persistAll(AppenderatorImpl.java:408) ~[druid-server-0.12.1-rc2.jar:0.12.1-rc2]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl.push(AppenderatorImpl.java:519) ~[druid-server-0.12.1-rc2.jar:0.12.1-rc2]
	at io.druid.segment.realtime.appenderator.BaseAppenderatorDriver.pushInBackground(BaseAppenderatorDriver.java:351) ~[druid-server-0.12.1-rc2.jar:0.12.1-rc2]
	at io.druid.segment.realtime.appenderator.StreamAppenderatorDriver.publish(StreamAppenderatorDriver.java:268) ~[druid-server-0.12.1-rc2.jar:0.12.1-rc2]
	at io.druid.indexing.kafka.KafkaIndexTask.lambda$createAndStartPublishExecutor$1(KafkaIndexTask.java:364) ~[?:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_162]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_162]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_162]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_162]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
2018-05-29T20:06:46,143 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_kafka_topic_8e372287495ab31_aijnmgpe] status changed to [FAILED].
2018-05-29T20:06:46,146 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "index_kafka_topic_8e372287495ab31_aijnmgpe",
  "status" : "FAILED",
  "duration" : 583724
}

Thanks,

Avinash

Hi Avinash,

The resolution for "java.lang.IllegalArgumentException: fromIndex(0) > toIndex(-1)" is most likely to fix whatever errors in the input data lead to unparseable rows, which is something that must be done on the user side.
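As an illustration of how one might hunt for such rows before ingesting, here is a small standalone sketch. The field name, timestamp format, and CSV input are assumptions for the example, not taken from this thread:

```python
import csv
import io
from datetime import datetime


def find_unparseable_rows(lines, ts_field="timestamp", ts_format="%Y-%m-%d %H:%M:%S"):
    """Return (line_number, row) pairs whose timestamp field fails to parse.

    `lines` is any iterable of CSV lines (a file object works). Field name
    and format are hypothetical defaults; adjust to match your spec.
    """
    bad = []
    reader = csv.DictReader(lines)
    for line_no, row in enumerate(reader, start=2):  # header is line 1
        try:
            datetime.strptime(row[ts_field], ts_format)
        except (KeyError, TypeError, ValueError):
            bad.append((line_no, row))
    return bad


# Tiny usage example with one good and one bad row.
sample = io.StringIO(
    "timestamp,item\n"
    "2010-12-01 08:26:00,WHITE HANGING HEART\n"
    "not-a-date,RED METAL LANTERN\n"
)
print(find_unparseable_rows(sample))  # reports line 3 as unparseable
```

Rows flagged by a check like this are the ones Druid would count as unparseable and drop (or fail on, with reportParseExceptions=true).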

0.13.0 will have better error reporting for ingestion tasks (https://github.com/druid-io/druid/pull/5418), which should help with identifying such parsing errors.

Thanks,
Jon

We observed this issue as well, but I don't think it is a parse problem, since the segments were regenerated by the following task and were eventually published successfully.
One difference in my case is that the exception is reported while serializing the metadata:

2019-01-29T13:30:48,059 INFO [access_log-incremental-persist] io.druid.segment.realtime.appenderator.AppenderatorImpl - Committing metadata[AppenderatorDriverMetadata{segments={****}].
2019-01-29T13:30:48,060 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.StreamAppenderatorDriver - Persisted pending data in 48ms.
2019-01-29T13:30:48,064 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Shutting down immediately...
2019-01-29T13:30:48,065 INFO [task-runner-0-priority-0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[***] at path[/druid/segments/emr-worker-4.cluster-72321:8101/emr-worker-4.cluster-72321:8101_indexer-executor__default_tier_2019-01-29T12:50:13.319Z_51df8fd2617a48868262f617e05efb170]
2019-01-29T13:30:48,066 INFO [task-runner-0-priority-0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[***] at path[/druid/segments/emr-worker-4.cluster-72321:8101/emr-worker-4.cluster-72321:8101_indexer-executor__default_tier_2019-01-29T12:50:13.319Z_51df8fd2617a48868262f617e05efb170]
2019-01-29T13:30:48,067 INFO [task-runner-0-priority-0] io.druid.server.coordination.BatchDataSegmentAnnouncer - Unannouncing segment[***] at path[/druid/segments/emr-worker-4.cluster-72321:8101/emr-worker-4.cluster-72321:8101_indexer-executor__default_tier_2019-01-29T12:50:13.319Z_51df8fd2617a48868262f617e05efb170]
2019-01-29T13:30:48,067 INFO [task-runner-0-priority-0] io.druid.curator.announcement.Announcer - unannouncing [/druid/segments/emr-worker-4.cluster-72321:8101/emr-worker-4.cluster-72321:8101_indexer-executor__default_tier_2019-01-29T12:50:13.319Z_51df8fd2617a48868262f617e05efb170]
2019-01-29T13:30:48,088 INFO [task-runner-0-priority-0] io.druid.segment.realtime.firehose.ServiceAnnouncingChatHandlerProvider - Unregistering chat handler[***]
2019-01-29T13:30:48,088 INFO [task-runner-0-priority-0] io.druid.curator.discovery.CuratorDruidNodeAnnouncer - Unannouncing [DiscoveryDruidNode{druidNode=DruidNode{serviceName='druid/middleManager', host='emr-worker-4.cluster-72321', port=-1, plaintextPort=8101, enablePlaintextPort=true, tlsPort=-1, enableTlsPort=false}, nodeType='peon', services={dataNodeService=DataNodeService{tier='_default_tier', maxSize=0, type=indexer-executor, priority=0}, lookupNodeService=LookupNodeService{lookupTier='__default'}}}].
2019-01-29T13:30:48,088 INFO [task-runner-0-priority-0] io.druid.curator.announcement.Announcer - unannouncing [/druid/internal-discovery/peon/emr-worker-4.cluster-72321:8101]
2019-01-29T13:30:48,092 INFO [task-runner-0-priority-0] io.druid.curator.discovery.CuratorDruidNodeAnnouncer - Unannounced [DiscoveryDruidNode{druidNode=DruidNode{serviceName='druid/middleManager', host='emr-worker-4.cluster-72321', port=-1, plaintextPort=8101, enablePlaintextPort=true, tlsPort=-1, enableTlsPort=false}, nodeType='peon', services={dataNodeService=DataNodeService{tier='_default_tier', maxSize=0, type=indexer-executor, priority=0}, lookupNodeService=LookupNodeService{lookupTier='__default'}}}].
2019-01-29T13:30:48,092 INFO [task-runner-0-priority-0] io.druid.server.coordination.CuratorDataSegmentServerAnnouncer - Unannouncing self[DruidServerMetadata{name='emr-worker-4.cluster-72321:8101', hostAndPort='emr-worker-4.cluster-72321:8101', hostAndTlsPort='null', maxSize=0, tier='_default_tier', type=indexer-executor, priority=0}] at [/druid/announcements/emr-worker-4.cluster-72321:8101]
2019-01-29T13:30:48,092 INFO [task-runner-0-priority-0] io.druid.curator.announcement.Announcer - unannouncing [/druid/announcements/emr-worker-4.cluster-72321:8101]
2019-01-29T13:30:48,095 ERROR [task-runner-0-priority-0] io.druid.indexing.overlord.ThreadPoolTaskRunner - Exception while running task[KafkaIndexTask{id=index_kafka_access_log_5855cca6d6ce7d7_ojahebcn, type=index_kafka, dataSource=access_log}]
java.lang.IllegalArgumentException: fromIndex(0) > toIndex(-1)
	at java.util.ArrayList.subListRangeCheck(ArrayList.java:1012) ~[?:1.8.0_151]
	at java.util.ArrayList.subList(ArrayList.java:1002) ~[?:1.8.0_151]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl.persistAll(AppenderatorImpl.java:408) ~[druid-server-0.12.1.jar:0.12.1]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl.push(AppenderatorImpl.java:519) ~[druid-server-0.12.1.jar:0.12.1]
	at io.druid.segment.realtime.appenderator.BaseAppenderatorDriver.pushInBackground(BaseAppenderatorDriver.java:351) ~[druid-server-0.12.1.jar:0.12.1]
	at io.druid.segment.realtime.appenderator.StreamAppenderatorDriver.publish(StreamAppenderatorDriver.java:268) ~[druid-server-0.12.1.jar:0.12.1]
	at io.druid.indexing.kafka.KafkaIndexTask.lambda$createAndStartPublishExecutor$1(KafkaIndexTask.java:364) ~[?:?]
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_151]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_151]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_151]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_151]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_151]
2019-01-29T13:30:48,096 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [***] status changed to [FAILED].
2019-01-29T13:30:48,097 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "***",
  "status" : "FAILED",
  "duration" : 2436516
}