[0.9.2-rc1] Getting Exception while running Kafka Indexing Service

Hey,

I have upgraded my Druid version to 0.9.2-rc1, but I am getting the following error while persisting a segment directory to deep storage (Google Cloud Storage). I am using the GCS connector with the druid-hdfs-storage extension to store the data in deep storage.
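
For reference, my deep storage is configured roughly like the sketch below in common.runtime.properties. The bucket matches the gs:// path in the trace; treat the rest as an approximation of my setup rather than exact values:

druid.extensions.loadList=["druid-hdfs-storage", "druid-kafka-indexing-service"]
druid.storage.type=hdfs
druid.storage.storageDirectory=gs://nis-druid-old-data/segments

The gcs-connector jar is on the classpath of each node, along with a core-site.xml that registers the gs:// filesystem (fs.gs.impl=com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem).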

Exception trace:

2016-10-13T16:40:08,261 WARN [appenderator_merge_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Failed to push merged index for segment[segment-data-final_2016-10-11T00:00:00.000Z_2016-10-12T00:00:00.000Z_2016-10-13T12:55:26.590Z_1].
java.io.IOException: Failed to rename temp directory[gs://nis-druid-old-data/segments/df0f893f4b064f8cb30ae5fdc0ffb03b/index.zip] and segment directory[gs://nis-druid-old-data/segments/segment-data-final/20161011T000000.000Z_20161012T000000.000Z/2016-10-13T12_55_26.590Z/1] is not present.
	at io.druid.storage.hdfs.HdfsDataSegmentPusher.push(HdfsDataSegmentPusher.java:125) ~[druid-hdfs-storage-0.9.2-rc1.jar:0.9.2-rc1]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl.mergeAndPush(AppenderatorImpl.java:571) [druid-server-0.9.2-rc1.jar:0.9.2-rc1]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl.access$600(AppenderatorImpl.java:93) [druid-server-0.9.2-rc1.jar:0.9.2-rc1]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl$3.apply(AppenderatorImpl.java:467) [druid-server-0.9.2-rc1.jar:0.9.2-rc1]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl$3.apply(AppenderatorImpl.java:455) [druid-server-0.9.2-rc1.jar:0.9.2-rc1]
	at com.google.common.util.concurrent.Futures$1.apply(Futures.java:713) [guava-16.0.1.jar:?]
	at com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:861) [guava-16.0.1.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_101]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_101]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101]
2016-10-13T16:40:08,262 WARN [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.FiniteAppenderatorDriver - Failed publishAll (try 104), retrying in 57,967ms.
java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.io.IOException: Failed to rename temp directory[gs://nis-druid-old-data/segments/df0f893f4b064f8cb30ae5fdc0ffb03b/index.zip] and segment directory[gs://nis-druid-old-data/segments/segment-data-final/20161011T000000.000Z_20161012T000000.000Z/2016-10-13T12_55_26.590Z/1] is not present.
	at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299) ~[guava-16.0.1.jar:?]
	at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286) ~[guava-16.0.1.jar:?]
	at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[guava-16.0.1.jar:?]
	at io.druid.segment.realtime.appenderator.FiniteAppenderatorDriver.publishAll(FiniteAppenderatorDriver.java:417) [druid-server-0.9.2-rc1.jar:0.9.2-rc1]
	at io.druid.segment.realtime.appenderator.FiniteAppenderatorDriver.finish(FiniteAppenderatorDriver.java:256) [druid-server-0.9.2-rc1.jar:0.9.2-rc1]
	at io.druid.indexing.kafka.KafkaIndexTask.run(KafkaIndexTask.java:503) [druid-kafka-indexing-service-0.9.2-rc1.jar:0.9.2-rc1]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:436) [druid-indexing-service-0.9.2-rc1.jar:0.9.2-rc1]
	at io.druid.indexing.overlord.ThreadPoolTaskRunner$ThreadPoolTaskRunnerCallable.call(ThreadPoolTaskRunner.java:408) [druid-indexing-service-0.9.2-rc1.jar:0.9.2-rc1]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_101]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_101]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_101]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101]
Caused by: java.lang.RuntimeException: java.io.IOException: Failed to rename temp directory[gs://nis-druid-old-data/segments/df0f893f4b064f8cb30ae5fdc0ffb03b/index.zip] and segment directory[gs://nis-druid-old-data/segments/segment-data-final/20161011T000000.000Z_20161012T000000.000Z/2016-10-13T12_55_26.590Z/1] is not present.
	at com.google.common.base.Throwables.propagate(Throwables.java:160) ~[guava-16.0.1.jar:?]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl.mergeAndPush(AppenderatorImpl.java:585) ~[druid-server-0.9.2-rc1.jar:0.9.2-rc1]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl.access$600(AppenderatorImpl.java:93) ~[druid-server-0.9.2-rc1.jar:0.9.2-rc1]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl$3.apply(AppenderatorImpl.java:467) ~[druid-server-0.9.2-rc1.jar:0.9.2-rc1]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl$3.apply(AppenderatorImpl.java:455) ~[druid-server-0.9.2-rc1.jar:0.9.2-rc1]
	at com.google.common.util.concurrent.Futures$1.apply(Futures.java:713) ~[guava-16.0.1.jar:?]
	at com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:861) ~[guava-16.0.1.jar:?]
	... 3 more
Caused by: java.io.IOException: Failed to rename temp directory[gs://nis-druid-old-data/segments/df0f893f4b064f8cb30ae5fdc0ffb03b/index.zip] and segment directory[gs://nis-druid-old-data/segments/segment-data-final/20161011T000000.000Z_20161012T000000.000Z/2016-10-13T12_55_26.590Z/1] is not present.
	at io.druid.storage.hdfs.HdfsDataSegmentPusher.push(HdfsDataSegmentPusher.java:125) ~[druid-hdfs-storage-0.9.2-rc1.jar:0.9.2-rc1]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl.mergeAndPush(AppenderatorImpl.java:571) ~[druid-server-0.9.2-rc1.jar:0.9.2-rc1]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl.access$600(AppenderatorImpl.java:93) ~[druid-server-0.9.2-rc1.jar:0.9.2-rc1]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl$3.apply(AppenderatorImpl.java:467) ~[druid-server-0.9.2-rc1.jar:0.9.2-rc1]
	at io.druid.segment.realtime.appenderator.AppenderatorImpl$3.apply(AppenderatorImpl.java:455) ~[druid-server-0.9.2-rc1.jar:0.9.2-rc1]
	at com.google.common.util.concurrent.Futures$1.apply(Futures.java:713) ~[guava-16.0.1.jar:?]
	at com.google.common.util.concurrent.Futures$ChainingListenableFuture.run(Futures.java:861) ~[guava-16.0.1.jar:?]
	... 3 more
2016-10-13T16:40:26,483 INFO [MonitorSch

Hey Saurabh,

This looks to be related to this bug: https://github.com/druid-io/druid/pull/3547

It was backported into 0.9.2 a few days ago, so you can either try building the extension from that branch or wait for 0.9.2-rc2 to be released. Either way, let us know if the patch fixes your issue.

I had the same issue, and building the current 0.9.2 branch and using that version of druid-hdfs-storage fixed the problem.
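
Roughly what I did to rebuild it, in case it helps (the module path is from memory, so double-check it against your checkout):

git clone https://github.com/druid-io/druid.git
cd druid
git checkout 0.9.2
mvn clean package -DskipTests
# copy the rebuilt jar from extensions-core/hdfs-storage/target/
# over the 0.9.2-rc1 druid-hdfs-storage jar on each node, then restart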

Hey,

The previous error is fixed in the task, but now the following exception occurs on the historical while pulling segments from deep storage. I have checked the GCS bucket and the file is present there.

2016-10-14T13:03:21,202 WARN [ZkCoordinator-1] com.metamx.common.RetryUtils - Failed on try 1, retrying in 1,123ms.
java.io.FileNotFoundException: File does not exist: /segments/segment-data-final-2/20161011T000000.000Z_20161012T000000.000Z/2016-10-14T11_58_12.714Z/2/index.zip
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1828)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1799)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1712)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:587)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:365)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_101]
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_101]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_101]
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_101]
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) ~[hadoop-common-2.3.0.jar:?]
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73) ~[hadoop-common-2.3.0.jar:?]
at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1133) ~[hadoop-hdfs-2.3.0.jar:?]
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1121) ~[hadoop-hdfs-2.3.0.jar:?]
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1111) ~[hadoop-hdfs-2.3.0.jar:?]
at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:272) ~[hadoop-hdfs-2.3.0.jar:?]
at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:239) ~[hadoop-hdfs-2.3.0.jar:?]
at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:232) ~[hadoop-hdfs-2.3.0.jar:?]
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1279) ~[hadoop-hdfs-2.3.0.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:296) ~[hadoop-hdfs-2.3.0.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:292) ~[hadoop-hdfs-2.3.0.jar:?]
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-2.3.0.jar:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:292) ~[hadoop-hdfs-2.3.0.jar:?]
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:765) ~[hadoop-common-2.3.0.jar:?]
at io.druid.storage.hdfs.HdfsDataSegmentPuller$1.openInputStream(HdfsDataSegmentPuller.java:107) ~[druid-hdfs-storage-0.9.2-rc2-SNAPSHOT.jar:0.9.2-rc2-SNAPSHOT]
at io.druid.storage.hdfs.HdfsDataSegmentPuller.getInputStream(HdfsDataSegmentPuller.java:298) ~[druid-hdfs-storage-0.9.2-rc2-SNAPSHOT.jar:0.9.2-rc2-SNAPSHOT]
at io.druid.storage.hdfs.HdfsDataSegmentPuller$3.openStream(HdfsDataSegmentPuller.java:241) ~[druid-hdfs-storage-0.9.2-rc2-SNAPSHOT.jar:0.9.2-rc2-SNAPSHOT]
at com.metamx.common.CompressionUtils$1.call(CompressionUtils.java:138) ~[java-util-0.27.10.jar:?]
at com.metamx.common.CompressionUtils$1.call(CompressionUtils.java:134) ~[java-util-0.27.10.jar:?]
at com.metamx.common.RetryUtils.retry(RetryUtils.java:60) [java-util-0.27.10.jar:?]
at com.metamx.common.RetryUtils.retry(RetryUtils.java:78) [java-util-0.27.10.jar:?]
at com.metamx.common.CompressionUtils.unzip(CompressionUtils.java:132) [java-util-0.27.10.jar:?]
at io.druid.storage.hdfs.HdfsDataSegmentPuller.getSegmentFiles(HdfsDataSegmentPuller.java:235) [druid-hdfs-storage-0.9.2-rc2-SNAPSHOT.jar:0.9.2-rc2-SNAPSHOT]
at io.druid.storage.hdfs.HdfsLoadSpec.loadSegment(HdfsLoadSpec.java:62) [druid-hdfs-storage-0.9.2-rc2-SNAPSHOT.jar:0.9.2-rc2-SNAPSHOT]
at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegmentFiles(SegmentLoaderLocalCacheManager.java:143) [druid-server-0.9.2-rc2-SNAPSHOT.jar:0.9.2-rc2-SNAPSHOT]
at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegment(SegmentLoaderLocalCacheManager.java:95) [druid-server-0.9.2-rc2-SNAPSHOT.jar:0.9.2-rc2-SNAPSHOT]
at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:152) [druid-server-0.9.2-rc2-SNAPSHOT.jar:0.9.2-rc2-SNAPSHOT]
at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:306) [druid-server-0.9.2-rc2-SNAPSHOT.jar:0.9.2-rc2-SNAPSHOT]
at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:351) [druid-server-0.9.2-rc2-SNAPSHOT.jar:0.9.2-rc2-SNAPSHOT]
at io.druid.server.coordination.SegmentChangeRequestLoad.go(SegmentChangeRequestLoad.java:44) [druid-server-0.9.2-rc2-SNAPSHOT.jar:0.9.2-rc2-SNAPSHOT]
at io.druid.server.coordination.ZkCoordinator$1.childEvent(ZkCoordinator.java:153) [druid-server-0.9.2-rc2-SNAPSHOT.jar:0.9.2-rc2-SNAPSHOT]
at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:522) [curator-recipes-2.11.0.jar:?]
at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:516) [curator-recipes-2.11.0.jar:?]
at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:93) [curator-framework-2.11.0.jar:?]
at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) [guava-16.0.1.jar:?]
at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:84) [curator-framework-2.11.0.jar:?]
at org.apache.curator.framework.recipes.cache.PathChildrenCache.callListeners(PathChildrenCache.java:513) [curator-recipes-2.11.0.jar:?]
at org.apache.curator.framework.recipes.cache.EventOperation.invoke(EventOperation.java:35) [curator-recipes-2.11.0.jar:?]
at org.apache.curator.framework.recipes.cache.PathChildrenCache$9.run(PathChildrenCache.java:773) [curator-recipes-2.11.0.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_101]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_101]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_101]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_101]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_101]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_101]
Caused by: org.apache.hadoop.ipc.RemoteException: File does not exist: /segments/segment-data-final-2/20161011T000000.000Z_20161012T000000.000Z/2016-10-14T11_58_12.714Z/2/index.zip
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:71)
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:61)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1828)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1799)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1712)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:587)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:365)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
at org.apache.hadoop.ipc.Client.call(Client.java:1406) ~[hadoop-common-2.3.0.jar:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1359) ~[hadoop-common-2.3.0.jar:?]
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) ~[hadoop-common-2.3.0.jar:?]
at com.sun.proxy.$Proxy59.getBlockLocations(Unknown Source) ~[?:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_101]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_101]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_101]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_101]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:186) ~[hadoop-common-2.3.0.jar:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) ~[hadoop-common-2.3.0.jar:?]
at com.sun.proxy.$Proxy59.getBlockLocations(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:206) ~[hadoop-hdfs-2.3.0.jar:?]
at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1131) ~[hadoop-hdfs-2.3.0.jar:?]
… 43 more

It looks to me like it is trying to access an HDFS location rather than the GCS bucket when loading the segment.
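
One way to confirm what path the historical is being handed is to look at the segment's loadSpec in the metadata store. A rough query, assuming a MySQL metadata store and the default druid_segments table (column names from memory, so adjust as needed):

SELECT CAST(payload AS CHAR) FROM druid_segments
WHERE dataSource = 'segment-data-final-2'
ORDER BY created_date DESC
LIMIT 1;

If the loadSpec path in the payload starts with /segments/... rather than gs://nis-druid-old-data/segments/..., the pusher wrote the path without the gs:// scheme, which would explain why the puller falls back to the default (HDFS) filesystem.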

Hey Saurabh,

Is this the same issue you reported in https://groups.google.com/d/topic/druid-user/K3CM-jXNOCY/discussion? I raised this issue for that thread: https://github.com/druid-io/druid/issues/3576

Yup, that was the same issue.