index_hadoop: Failed to find any Kerberos TGT

I am using Druid 0.22.1 with a Kerberized Hadoop cluster. I have set the keytab properties as usual:

druid.hadoop.security.kerberos.principal
druid.hadoop.security.kerberos.keytab
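For reference, this is roughly how they appear in my runtime properties (the principal and path below are placeholders, not my actual values):

```
druid.hadoop.security.kerberos.principal=druid@EXAMPLE.COM
druid.hadoop.security.kerberos.keytab=/etc/security/keytabs/druid.keytab
```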

Deep storage and index_parallel tasks work normally.
With index_hadoop, however, the task itself succeeds, but the Kerberos keytab does not seem to be recognized during the final cleanup job.

Looking at the log, index_hadoop's determine_partitions and index_generator jobs run without any problem, and the delete that runs after the first MapReduce job finishes is also fine.

However, when all the jobs are finished and the last delete is executed, the Kerberos ticket error below occurs. Segment creation and the results themselves are fine, but I would like to track down the cause and fix the error.

2022-03-25T09:28:23,896 INFO [task-runner-0-priority-0] org.apache.druid.indexer.DetermineHashedPartitionsJob - Job job-determine_partitions_hashed-Optional.absent() submitted, status available at: ..../
2022-03-25T09:28:23,897 INFO [task-runner-0-priority-0] org.apache.druid.indexer.JobHelper - MR job id [job_aa] is written to the file [path_2022-03-25T09:27:52.331Z/mapReduceJobId.json]
2022-03-25T09:28:23,898 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Running job: job_aa
2022-03-25T09:28:41,866 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_aa running in uber mode : false
2022-03-25T09:28:41,867 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 0% reduce 0%
2022-03-25T09:29:09,847 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 24% reduce 0%
2022-03-25T09:29:12,878 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 35% reduce 0%
2022-03-25T09:29:14,907 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 49% reduce 0%
2022-03-25T09:29:15,918 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 55% reduce 0%
2022-03-25T09:29:16,928 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 65% reduce 0%
2022-03-25T09:29:18,949 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 77% reduce 0%
2022-03-25T09:29:19,958 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 82% reduce 0%
2022-03-25T09:29:22,988 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 0%
2022-03-25T09:30:09,456 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 100%
2022-03-25T09:30:30,683 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_aa completed successfully
....
2022-03-25T09:30:33,753 INFO [task-runner-0-priority-0] org.apache.druid.indexer.JobHelper - Deleting path[path/2022-03-25T092752.331Z_6745fbc8036a47de8d693a8c90c8b480]
....
2022-03-25T09:32:26,513 INFO [task-runner-0-priority-0] org.apache.druid.indexer.IndexGeneratorJob - Job job-index-generator-Optional.of([2022-03-22T06:00:00.000Z/2022-03-22T07:00:00.000Z]) submitted, status available at .../
2022-03-25T09:32:26,513 INFO [task-runner-0-priority-0] org.apache.druid.indexer.JobHelper - MR job id [job_bb] is written to the file [path_2022-03-25T09:27:52.331Z/mapReduceJobId.json]
2022-03-25T09:32:26,513 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Running job: job_bb
2022-03-25T09:32:45,453 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_bb running in uber mode : false
2022-03-25T09:32:45,453 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 0% reduce 0%
2022-03-25T09:33:15,710 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 10% reduce 0%
2022-03-25T09:33:18,738 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 15% reduce 0%
2022-03-25T09:33:20,756 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 20% reduce 0%
2022-03-25T09:33:23,784 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 22% reduce 0%
2022-03-25T09:33:27,822 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 33% reduce 0%
2022-03-25T09:33:35,894 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 35% reduce 0%
2022-03-25T09:33:38,922 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 39% reduce 0%
2022-03-25T09:33:41,949 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 47% reduce 0%
2022-03-25T09:33:44,976 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 55% reduce 0%
2022-03-25T09:33:48,003 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 62% reduce 0%
2022-03-25T09:33:51,030 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 69% reduce 0%
2022-03-25T09:33:54,058 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 75% reduce 0%
2022-03-25T09:33:57,085 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 89% reduce 0%
2022-03-25T09:34:02,129 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 0%
2022-03-25T09:34:09,192 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 49%
2022-03-25T09:34:12,219 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 59%
2022-03-25T09:34:15,247 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 67%
2022-03-25T09:34:18,273 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 68%
2022-03-25T09:34:21,299 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 78%
2022-03-25T09:34:24,325 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 87%
2022-03-25T09:34:27,351 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 88%
2022-03-25T09:34:46,521 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 96%
2022-03-25T09:34:49,549 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job -  map 100% reduce 100%
2022-03-25T09:35:19,826 INFO [task-runner-0-priority-0] org.apache.hadoop.mapreduce.Job - Job job_bb completed successfully

2022-03-25T09:35:23,363 INFO [task-runner-0-priority-0] org.apache.druid.indexer.JobHelper - Deleting path[path../2022-03-25T092752.331Z_6745fbc8036a47de8d693a8c90c8b480]
2022-03-25T09:35:23,459 WARN [task-runner-0-priority-0] org.apache.hadoop.ipc.Client - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2022-03-25T09:35:23,475 WARN [task-runner-0-priority-0] org.apache.hadoop.ipc.Client - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
..
java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "xxxx"; destination host is: "xxx; 
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:782) ~[hadoop-common-2.8.5.jar:?]
	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1493) ~[hadoop-common-2.8.5.jar:?]
	at org.apache.hadoop.ipc.Client.call(Client.java:1435) ~[hadoop-common-2.8.5.jar:?]
	at org.apache.hadoop.ipc.Client.call(Client.java:1345) ~[hadoop-common-2.8.5.jar:?]
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:227) ~[hadoop-common-2.8.5.jar:?]
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116) ~[hadoop-common-2.8.5.jar:?]
	at com.sun.proxy.$Proxy326.delete(Unknown Source) ~[?:?]
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:568) ~[hadoop-hdfs-client-2.8.5.jar:?]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:409) [hadoop-common-2.8.5.jar:?]
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:163) ~[hadoop-common-2.8.5.jar:?]
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:155) ~[hadoop-common-2.8.5.jar:?]
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) [hadoop-common-2.8.5.jar:?]
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:346) [hadoop-common-2.8.5.jar:?]
	at com.sun.proxy.$Proxy327.delete(Unknown Source) [?:?]
	at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:1591) [hadoop-hdfs-client-2.8.5.jar:?]
	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:798) [hadoop-hdfs-client-2.8.5.jar:?]
	at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:795) [hadoop-hdfs-client-2.8.5.jar:?]
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) [hadoop-common-2.8.5.jar:?]
	at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:795) [hadoop-hdfs-client-2.8.5.jar:?]
	at org.apache.druid.indexer.JobHelper.maybeDeleteIntermediatePath(JobHelper.java:422) [druid-indexing-hadoop-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.common.task.HadoopIndexTask$HadoopIndexerGeneratorCleanupRunner.runTask(HadoopIndexTask.java:981) [druid-indexing-service-0.22.1.jar:0.22.1]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_112]
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_112]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_112]
	at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
	at org.apache.druid.indexing.common.task.HadoopIndexTask.indexerGeneratorCleanupJob(HadoopIndexTask.java:616) [druid-indexing-service-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.common.task.HadoopIndexTask.runInternal(HadoopIndexTask.java:497) [druid-indexing-service-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.common.task.HadoopIndexTask.runTask(HadoopIndexTask.java:284) [druid-indexing-service-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:159) [druid-indexing-service-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:471) [druid-indexing-service-0.22.1.jar:0.22.1]
	at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:443) [druid-indexing-service-0.22.1.jar:0.22.1]
	at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_112]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]

...
2022-03-25T09:35:23,583 WARN [task-runner-0-priority-0] org.apache.hadoop.ipc.Client - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
2022-03-25T09:35:23,584 ERROR [task-runner-0-priority-0] org.apache.druid.indexer.JobHelper - Failed to cleanup path[path/2022-03-25T092752.331Z_6745fbc8036a47de8d693a8c90c8b480]
....
....
2022-03-25T09:35:23,589 INFO [task-runner-0-priority-0] org.apache.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {
  "id" : "job.._2022-03-25T09:27:52.331Z",
  "status" : "SUCCESS",
  "duration" : 444551,
  "errorMsg" : null,
  "location" : {
    "host" : null,
    "port" : -1,
    "tlsPort" : -1
  }
}

I wonder if you encountered this:

This generally happens when the Kerberos ticket that is being used to authenticate with the Hadoop cluster has expired. If the Kerberos credentials are configured in Druid using druid.hadoop.security.kerberos.principal and druid.hadoop.security.kerberos.keytab, Druid takes care of refreshing the ticket before expiry. If not, you might be relying on an external service to renew the ticket cache, and it might be worth checking whether that renewal is working as expected.
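If renewal is handled outside Druid, one quick sanity check is to inspect the ticket cache on the node running the task (the keytab path and principal below are illustrative, not taken from this thread):

```shell
# Show the current ticket cache; check the "Expires" and "renew until" columns
klist

# Obtain a fresh ticket directly from the keytab to rule out cache problems
kinit -kt /etc/security/keytabs/druid.keytab druid@EXAMPLE.COM
klist
```

If the ticket shown by `klist` is expired (or absent) around the time the cleanup runs, the external renewal is the likely culprit.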

@Mark_Herrera

druid.hadoop.security.kerberos.principal
druid.hadoop.security.kerberos.keytab

I do not renew the ticket myself; the settings above are configured.
Nevertheless, the Kerberos error occurs only in the last delete operation.
Of course, if I renew the ticket manually before running the job, there is no problem.
Currently I am using a different version of Hadoop as an external dependency for index_hadoop.

I'm sorry I can't speak more directly to this issue, but I'm glad to hear that your ingestion job is working.

Regarding the Kerberos error, I came across an interesting article entitled Error Messages to Fear. When you find the *No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)* section, the culprit might be kinit. If this ends up being the solution to your problem, note that the domain (realm) must be capitalized in the kinit command.
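To make the capitalization point concrete, here is what that looks like in practice (the keytab path and principal are illustrative):

```shell
# The realm after "@" must be uppercase; Kerberos realms are case-sensitive
kinit -kt /etc/security/keytabs/druid.keytab druid/host.example.com@EXAMPLE.COM

# This form may fail with "Cannot find KDC for realm" if the realm is lowercased
# kinit -kt /etc/security/keytabs/druid.keytab druid/host.example.com@example.com
```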

Let us know how it goes.