[druid-user] org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager - Failed to load segment

Hi,

I am facing an issue after the active and standby NameNodes of HDFS were switched. Indexing is now failing and I am unable to access the segments in HDFS. I am using Postgres as the metadata storage. The segment references are present in Postgres, but the files do not physically exist in HDFS when I browse the directory.
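For reference, this is how I am checking for the segment files in HDFS. This is just a sketch using the standard HDFS CLI; the path is the one reported in the task logs below.

```shell
# List the segment directory referenced in the metadata.
hdfs dfs -ls "hdfs://plcluster/druid/segments/pts_session_summary/20220301T000000.000Z_20220401T000000.000Z/2022-03-01T00_00_08.525Z/"

# Or test for the specific file; a non-zero exit code means it is missing.
hdfs dfs -test -e "hdfs://plcluster/druid/segments/pts_session_summary/20220301T000000.000Z_20220401T000000.000Z/2022-03-01T00_00_08.525Z/345_594a571f-b840-4aaf-87fe-7be854063a28_index.zip" \
  || echo "file missing"
```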

Please find the logs below.

java.io.FileNotFoundException: File does not exist: /druid/segments/pts_session_summary/20220301T000000.000Z_20220401T000000.000Z/2022-03-01T00_00_08.525Z/345_594a571f-b840-4aaf-87fe-7be854063a28_index.zip
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:156)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1990)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:768)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:442)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1029)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:957)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2957)

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_281]
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_281]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_281]
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_281]
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121) ~[?:?]
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88) ~[?:?]
at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:849) ~[?:?]
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:836) ~[?:?]
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:825) ~[?:?]
at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:325) ~[?:?]
at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:285) ~[?:?]
at org.apache.hadoop.hdfs.DFSInputStream.(DFSInputStream.java:270) ~[?:?]
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1064) ~[?:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:328) ~[?:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:325) ~[?:?]
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[?:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:325) ~[?:?]
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:787) ~[?:?]
at org.apache.druid.storage.hdfs.HdfsDataSegmentPuller$2.openInputStream(HdfsDataSegmentPuller.java:124) ~[?:?]
at org.apache.druid.storage.hdfs.HdfsDataSegmentPuller.getInputStream(HdfsDataSegmentPuller.java:298) ~[?:?]
at org.apache.druid.storage.hdfs.HdfsDataSegmentPuller$3.openStream(HdfsDataSegmentPuller.java:249) ~[?:?]
at org.apache.druid.utils.CompressionUtils.lambda$unzip$1(CompressionUtils.java:188) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:87) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:115) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:105) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.utils.CompressionUtils.unzip(CompressionUtils.java:187) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.storage.hdfs.HdfsDataSegmentPuller.getSegmentFiles(HdfsDataSegmentPuller.java:243) ~[?:?]
at org.apache.druid.storage.hdfs.HdfsLoadSpec.loadSegment(HdfsLoadSpec.java:57) ~[?:?]
at org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager.loadInLocation(SegmentLoaderLocalCacheManager.java:304) ~[druid-server-0.21.0.jar:0.21.0]
at org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager.loadInLocationWithStartMarker(SegmentLoaderLocalCacheManager.java:292) ~[druid-server-0.21.0.jar:0.21.0]
at org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager.loadSegmentWithRetry(SegmentLoaderLocalCacheManager.java:253) ~[druid-server-0.21.0.jar:0.21.0]
at org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegmentFiles(SegmentLoaderLocalCacheManager.java:225) ~[druid-server-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.input.DruidSegmentInputEntity.fetch(DruidSegmentInputEntity.java:74) ~[druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.input.DruidSegmentReader.intermediateRowIterator(DruidSegmentReader.java:95) ~[druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.data.input.IntermediateRowParsingReader.read(IntermediateRowParsingReader.java:44) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.data.input.impl.InputEntityIteratingReader.lambda$read$0(InputEntityIteratingReader.java:78) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.parsers.CloseableIterator$2.findNextIteratorIfNecessary(CloseableIterator.java:84) [druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.parsers.CloseableIterator$2.(CloseableIterator.java:69) [druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.parsers.CloseableIterator.flatMap(CloseableIterator.java:67) [druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.data.input.impl.InputEntityIteratingReader.createIterator(InputEntityIteratingReader.java:103) [druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.data.input.impl.InputEntityIteratingReader.read(InputEntityIteratingReader.java:74) [druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.segment.transform.TransformingInputSourceReader.read(TransformingInputSourceReader.java:43) [druid-processing-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.inputSourceReader(AbstractBatchIndexTask.java:195) [druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.generateAndPushSegments(SinglePhaseSubTask.java:330) [druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.runTask(SinglePhaseSubTask.java:197) [druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:152) [druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:451) [druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:423) [druid-indexing-service-0.21.0.jar:0.21.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_281]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_281]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_281]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_281]
Caused by: org.apache.hadoop.ipc.RemoteException: File does not exist: /druid/segments/pts_session_summary/20220301T000000.000Z_20220401T000000.000Z/2022-03-01T00_00_08.525Z/345_594a571f-b840-4aaf-87fe-7be854063a28_index.zip
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:156)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1990)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:768)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:442)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1029)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:957)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2957)

at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1489) ~[?:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1435) ~[?:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1345) ~[?:?]
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227) ~[?:?]
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116) ~[?:?]
at com.sun.proxy.$Proxy69.getBlockLocations(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:259) ~[?:?]
at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_281]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_281]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409) ~[?:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163) ~[?:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155) ~[?:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) ~[?:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346) ~[?:?]
at com.sun.proxy.$Proxy70.getBlockLocations(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:847) ~[?:?]
… 45 more
2022-03-23T08:25:12,989 ERROR [task-runner-0-priority-0] org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager - Failed to load segment in current location [/home/release/release_independent/druid/task/single_phase_sub_task_pts_session_summary_temp_hamfjkbb_2022-03-23T08:24:58.160Z/work/indexing-tmp], try next location if any: {class=org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager, exceptionType=class org.apache.druid.segment.loading.SegmentLoadingException, exceptionMessage=Error loading [hdfs://plcluster/druid/segments/pts_session_summary/20220301T000000.000Z_20220401T000000.000Z/2022-03-01T00_00_08.525Z/345_594a571f-b840-4aaf-87fe-7be854063a28_index.zip], location=/home/release/release_independent/druid/task/single_phase_sub_task_pts_session_summary_temp_hamfjkbb_2022-03-23T08:24:58.160Z/work/indexing-tmp}
org.apache.druid.segment.loading.SegmentLoadingException: Error loading [hdfs://plcluster/druid/segments/pts_session_summary/20220301T000000.000Z_20220401T000000.000Z/2022-03-01T00_00_08.525Z/345_594a571f-b840-4aaf-87fe-7be854063a28_index.zip]
at org.apache.druid.storage.hdfs.HdfsDataSegmentPuller.getSegmentFiles(HdfsDataSegmentPuller.java:292) ~[?:?]
at org.apache.druid.storage.hdfs.HdfsLoadSpec.loadSegment(HdfsLoadSpec.java:57) ~[?:?]
at org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager.loadInLocation(SegmentLoaderLocalCacheManager.java:304) ~[druid-server-0.21.0.jar:0.21.0]
at org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager.loadInLocationWithStartMarker(SegmentLoaderLocalCacheManager.java:292) ~[druid-server-0.21.0.jar:0.21.0]
at org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager.loadSegmentWithRetry(SegmentLoaderLocalCacheManager.java:253) ~[druid-server-0.21.0.jar:0.21.0]
at org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegmentFiles(SegmentLoaderLocalCacheManager.java:225) ~[druid-server-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.input.DruidSegmentInputEntity.fetch(DruidSegmentInputEntity.java:74) ~[druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.input.DruidSegmentReader.intermediateRowIterator(DruidSegmentReader.java:95) ~[druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.data.input.IntermediateRowParsingReader.read(IntermediateRowParsingReader.java:44) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.data.input.impl.InputEntityIteratingReader.lambda$read$0(InputEntityIteratingReader.java:78) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.parsers.CloseableIterator$2.findNextIteratorIfNecessary(CloseableIterator.java:84) [druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.parsers.CloseableIterator$2.(CloseableIterator.java:69) [druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.parsers.CloseableIterator.flatMap(CloseableIterator.java:67) [druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.data.input.impl.InputEntityIteratingReader.createIterator(InputEntityIteratingReader.java:103) [druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.data.input.impl.InputEntityIteratingReader.read(InputEntityIteratingReader.java:74) [druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.segment.transform.TransformingInputSourceReader.read(TransformingInputSourceReader.java:43) [druid-processing-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.inputSourceReader(AbstractBatchIndexTask.java:195) [druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.generateAndPushSegments(SinglePhaseSubTask.java:330) [druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.runTask(SinglePhaseSubTask.java:197) [druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:152) [druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:451) [druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:423) [druid-indexing-service-0.21.0.jar:0.21.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_281]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_281]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_281]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_281]
Caused by: java.io.FileNotFoundException: File does not exist: /druid/segments/pts_session_summary/20220301T000000.000Z_20220401T000000.000Z/2022-03-01T00_00_08.525Z/345_594a571f-b840-4aaf-87fe-7be854063a28_index.zip
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:156)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1990)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:768)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:442)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1029)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:957)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2957)

at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_281]
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_281]
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_281]
at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_281]
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:121) ~[?:?]
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:88) ~[?:?]
at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:849) ~[?:?]
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:836) ~[?:?]
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:825) ~[?:?]
at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:325) ~[?:?]
at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:285) ~[?:?]
at org.apache.hadoop.hdfs.DFSInputStream.(DFSInputStream.java:270) ~[?:?]
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1064) ~[?:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:328) ~[?:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:325) ~[?:?]
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[?:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:325) ~[?:?]
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:787) ~[?:?]
at org.apache.druid.storage.hdfs.HdfsDataSegmentPuller$2.openInputStream(HdfsDataSegmentPuller.java:124) ~[?:?]
at org.apache.druid.storage.hdfs.HdfsDataSegmentPuller.getInputStream(HdfsDataSegmentPuller.java:298) ~[?:?]
at org.apache.druid.storage.hdfs.HdfsDataSegmentPuller$3.openStream(HdfsDataSegmentPuller.java:249) ~[?:?]
at org.apache.druid.utils.CompressionUtils.lambda$unzip$1(CompressionUtils.java:188) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:87) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:115) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:105) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.utils.CompressionUtils.unzip(CompressionUtils.java:187) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.storage.hdfs.HdfsDataSegmentPuller.getSegmentFiles(HdfsDataSegmentPuller.java:243) ~[?:?]
… 25 more
Caused by: org.apache.hadoop.ipc.RemoteException: File does not exist: /druid/segments/pts_session_summary/20220301T000000.000Z_20220401T000000.000Z/2022-03-01T00_00_08.525Z/345_594a571f-b840-4aaf-87fe-7be854063a28_index.zip
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:86)
at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:76)
at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:156)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1990)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:768)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:442)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1086)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1029)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:957)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1762)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2957)

at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1489) ~[?:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1435) ~[?:?]
at org.apache.hadoop.ipc.Client.call(Client.java:1345) ~[?:?]
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227) ~[?:?]
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116) ~[?:?]
at com.sun.proxy.$Proxy69.getBlockLocations(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:259) ~[?:?]
at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_281]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_281]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409) ~[?:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163) ~[?:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155) ~[?:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) ~[?:?]
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346) ~[?:?]
at com.sun.proxy.$Proxy70.getBlockLocations(Unknown Source) ~[?:?]
at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:847) ~[?:?]
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:836) ~[?:?]
at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:825) ~[?:?]
at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:325) ~[?:?]
at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:285) ~[?:?]
at org.apache.hadoop.hdfs.DFSInputStream.(DFSInputStream.java:270) ~[?:?]
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1064) ~[?:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:328) ~[?:?]
at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:325) ~[?:?]
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[?:?]
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:325) ~[?:?]
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:787) ~[?:?]
at org.apache.druid.storage.hdfs.HdfsDataSegmentPuller$2.openInputStream(HdfsDataSegmentPuller.java:124) ~[?:?]
at org.apache.druid.storage.hdfs.HdfsDataSegmentPuller.getInputStream(HdfsDataSegmentPuller.java:298) ~[?:?]
at org.apache.druid.storage.hdfs.HdfsDataSegmentPuller$3.openStream(HdfsDataSegmentPuller.java:249) ~[?:?]
at org.apache.druid.utils.CompressionUtils.lambda$unzip$1(CompressionUtils.java:188) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:87) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:115) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.RetryUtils.retry(RetryUtils.java:105) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.utils.CompressionUtils.unzip(CompressionUtils.java:187) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.storage.hdfs.HdfsDataSegmentPuller.getSegmentFiles(HdfsDataSegmentPuller.java:243) ~[?:?]
… 25 more
2022-03-23T08:25:13,019 INFO [task-runner-0-priority-0] org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager - Deleting directory[/home/release/release_independent/druid/task/single_phase_sub_task_pts_session_summary_temp_hamfjkbb_2022-03-23T08:24:58.160Z/work/indexing-tmp/pts_session_summary/2022-03-01T00:00:00.000Z_2022-04-01T00:00:00.000Z/2022-03-01T00:00:08.525Z/345]
2022-03-23T08:25:13,021 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Exception while running task[AbstractTask{id='single_phase_sub_task_pts_session_summary_temp_hamfjkbb_2022-03-23T08:24:58.160Z', groupId='index_parallel_pts_session_summary_temp_hfpmllec_2022-03-23T08:21:50.948Z', taskResource=TaskResource{availabilityGroup='single_phase_sub_task_pts_session_summary_temp_hamfjkbb_2022-03-23T08:24:58.160Z', requiredCapacity=1}, dataSource='pts_session_summary_temp', context={forceTimeChunkLock=true}}]
java.lang.RuntimeException: org.apache.druid.segment.loading.SegmentLoadingException: Failed to load segment pts_session_summary_2022-03-01T00:00:00.000Z_2022-04-01T00:00:00.000Z_2022-03-01T00:00:08.525Z_345 in all locations.
at org.apache.druid.indexing.input.DruidSegmentInputEntity.fetch(DruidSegmentInputEntity.java:77) ~[druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.input.DruidSegmentReader.intermediateRowIterator(DruidSegmentReader.java:95) ~[druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.data.input.IntermediateRowParsingReader.read(IntermediateRowParsingReader.java:44) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.data.input.impl.InputEntityIteratingReader.lambda$read$0(InputEntityIteratingReader.java:78) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.parsers.CloseableIterator$2.findNextIteratorIfNecessary(CloseableIterator.java:84) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.parsers.CloseableIterator$2.(CloseableIterator.java:69) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.java.util.common.parsers.CloseableIterator.flatMap(CloseableIterator.java:67) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.data.input.impl.InputEntityIteratingReader.createIterator(InputEntityIteratingReader.java:103) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.data.input.impl.InputEntityIteratingReader.read(InputEntityIteratingReader.java:74) ~[druid-core-0.21.0.jar:0.21.0]
at org.apache.druid.segment.transform.TransformingInputSourceReader.read(TransformingInputSourceReader.java:43) ~[druid-processing-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.inputSourceReader(AbstractBatchIndexTask.java:195) ~[druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.generateAndPushSegments(SinglePhaseSubTask.java:330) ~[druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.runTask(SinglePhaseSubTask.java:197) ~[druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:152) ~[druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:451) [druid-indexing-service-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:423) [druid-indexing-service-0.21.0.jar:0.21.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_281]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_281]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_281]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_281]
Caused by: org.apache.druid.segment.loading.SegmentLoadingException: Failed to load segment pts_session_summary_2022-03-01T00:00:00.000Z_2022-04-01T00:00:00.000Z_2022-03-01T00:00:08.525Z_345 in all locations.
at org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager.loadSegmentWithRetry(SegmentLoaderLocalCacheManager.java:271) ~[druid-server-0.21.0.jar:0.21.0]
at org.apache.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegmentFiles(SegmentLoaderLocalCacheManager.java:225) ~[druid-server-0.21.0.jar:0.21.0]
at org.apache.druid.indexing.input.DruidSegmentInputEntity.fetch(DruidSegmentInputEntity.java:74) ~[druid-indexing-service-0.21.0.jar:0.21.0]
… 19 more
2022-03-23T08:25:13,026 INFO [task-runner-0-priority-0] org.apache.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "single_phase_sub_task_pts_session_summary_temp_hamfjkbb_2022-03-23T08:24:58.160Z",
  "status" : "FAILED",
  "duration" : 9553,
  "errorMsg" : "java.lang.RuntimeException: org.apache.druid.segment.loading.SegmentLoadingException: Failed to load…",
  "location" : {
    "host" : null,
    "port" : -1,
    "tlsPort" : -1
  }
}

Hi Harshit,

The segments table (druid.metadata.storage.tables.segments) might give you a little more insight into your existing segments, but I'm wondering if you might end up having to re-ingest your data. I'm including How do I get HDFS to work?, Hadoop-based ingestion, and Hadoop ingestion in case another reader hasn't seen these resources.
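For example, assuming the default table name druid_segments (substitute yours if you changed druid.metadata.storage.tables.segments) and placeholder connection details, a query like this would show where each segment's loadSpec points, so you can compare it against the HDFS paths you actually see:

```shell
# <postgres-host>, user, and database are placeholders for your setup.
# In Postgres the segment payload is stored as bytea, so decode it to JSON
# before pulling out the loadSpec.
psql -h <postgres-host> -U druid -d druid -c "
  SELECT id, used,
         convert_from(payload, 'UTF8')::json -> 'loadSpec' AS load_spec
  FROM druid_segments
  WHERE datasource = 'pts_session_summary'
  ORDER BY created_date DESC
  LIMIT 10;"
```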

Best,

Mark

If the segments don't exist in HDFS, it sounds like a problem with the active/standby setup there. I'd expect Druid is trying to pull from the once-standby, now-active NameNode and not finding the files.
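A quick way to confirm which NameNode HDFS currently considers active (a sketch; nn1/nn2 are placeholder NameNode IDs, use whatever dfs.ha.namenodes.plcluster lists in your hdfs-site.xml):

```shell
# List the NameNodes configured for the cluster.
hdfs getconf -namenodes

# Ask each NameNode for its HA state; each prints "active" or "standby".
# nn1/nn2 are placeholder IDs from dfs.ha.namenodes.plcluster.
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
```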


Hi Ben,

Yes, there was a flip in the active and standby nodes. Will this be resolved if I flip them back?

Thanks,

Harshit

Hi Mark,

I have checked the segments table. The segment metadata does exist there, but after manually checking in HDFS, I cannot find the segment itself.

Thanks,
Harshit