Kafka-Druid connection error in a Kerberized Hortonworks cluster

Hi team,

I tried the steps below:

  1. Edit "wikipedia1-kafka-supervisor.json" as below:

{
  "type": "kafka",
  "dataSchema": {
    "dataSource": "kafkawikipedia21",
    "parser": {
      "type": "string",
      "parseSpec": {
        "format": "json",
        "timestampSpec": {
          "column": "time",
          "format": "iso"
        },
        "dimensionsSpec": {
          "dimensions": [
            "channel",
            "cityName",
            "comment",
            "countryIsoCode",
            "countryName",
            "isAnonymous",
            "isMinor",
            "isNew",
            "isRobot",
            "isUnpatrolled",
            "metroCode",
            "namespace",
            "page",
            "regionIsoCode",
            "regionName",
            "user",
            { "name": "added", "type": "long" },
            { "name": "deleted", "type": "long" },
            { "name": "delta", "type": "long" }
          ]
        }
      }
    }
  },
  "ioConfig": {
    "topic": "test4",
    "replicas": 1,
    "taskDuration": "PT10M",
    "completionTimeout": "PT20M",
    "consumerProperties": {
      "bootstrap.servers": "IPADRESS:6667,IPADRESS2:6667",
      "security.protocol": "SASL_PLAINTEXT",
      "group.id": "kafkawikigroup",
      "sasl.kerberos.service.name": "kafka"
    }
  }
}

  2. curl -ikv --negotiate -X 'POST' -H 'Content-Type:application/json' -d @/root/druidkafkatest/from_hwx/wikipedia1-kafka-supervisor.json http://IPADRESS:8090/druid/indexer/v1/supervisor

  3. cat /root/druidkafkatest/from_hwx/wikiticker-2015-09-12-sampled.json | /usr/hdp/current/kafka-broker/bin/kafka-console-producer.sh --broker-list IPADRESS:6667,IPADRESS:6667 --topic test4 --security-protocol SASL_PLAINTEXT

(See the Kerberos client note after these steps.)
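Both the console producer in step 3 and Druid's Kafka indexing tasks connect over SASL_PLAINTEXT, which in a Kerberized cluster normally requires a Kerberos JAAS configuration on the client JVM. A minimal sketch of such a file, where the keytab path and principal are placeholders rather than values from this cluster:

    KafkaClient {
      com.sun.security.auth.module.Krb5LoginModule required
      useKeyTab=true
      storeKey=true
      keyTab="/etc/security/keytabs/some_client.keytab"
      principal="someuser@EXAMPLE.COM";
    };

It is usually handed to the JVM with -Djava.security.auth.login.config=/path/to/kafka_client_jaas.conf (for Druid, on the overlord and middleManager processes); on HDP, Ambari normally generates an equivalent file for the Kafka console clients when Kerberos is enabled.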

The Kafka producer and consumer are working fine, but the data is not being ingested into Druid. I am getting the error below in overlord.log. Can you please help me with this?

– io.druid.indexing.kafka.supervisor.KafkaSupervisor - Unable to compute Kafka lag: io.druid.java.util.common.ISE: Latest offsets from Kafka have not been fetched

– No such topic [test4] found, list of discovered topics [[ATLAS_HOOK, ATLAS_ENTITIES]]

– WARN [KafkaSupervisor-kafkawikipedia21-Reporting-0] io.druid.indexing.kafka.supervisor.KafkaSupervisor - Exception while getting current/latest offsets

io.druid.java.util.common.ISE: Could not retrieve partitions for topic [test4]


at io.druid.indexing.kafka.supervisor.KafkaSupervisor.updateLatestOffsetsFromKafka(KafkaSupervisor.java:2082) ~[?:?]

at io.druid.indexing.kafka.supervisor.KafkaSupervisor.lambda$updateCurrentAndLatestOffsets$24(KafkaSupervisor.java:2190) ~[?:?]

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_112]

at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_112]

at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_112]

at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_112]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]

at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]

2019-03-04T08:10:32,466 INFO [TaskQueue-StorageSync] io.druid.indexing.overlord.TaskQueue - Synced 0 tasks from storage (0 tasks added, 0 tasks removed).

2019-03-04T08:10:37,651 WARN [KafkaSupervisor-kafkawikipedia21] io.druid.indexing.kafka.supervisor.KafkaSupervisor - No such topic [test4] found, list of discovered topics [[ATLAS_HOOK, ATLAS_ENTITIES]]

2019-03-04T08:10:37,652 INFO [KafkaSupervisor-kafkawikipedia21] io.druid.indexing.kafka.supervisor.KafkaSupervisor - {id='kafkawikipedia21', generationTime=2019-03-04T08:10:37.652Z, payload={dataSource='kafkawikipedia21', topic='test4', partitions=0, replicas=1, durationSeconds=600, active=, publishing=}}

2019-03-04T08:10:52,651 WARN [KafkaSupervisor-kafkawikipedia21-Reporting-0] io.druid.indexing.kafka.supervisor.KafkaSupervisor - Exception while getting current/latest offsets

io.druid.java.util.common.ISE: Could not retrieve partitions for topic [test4]

at io.druid.indexing.kafka.supervisor.KafkaSupervisor.updateLatestOffsetsFromKafka(KafkaSupervisor.java:2082) ~[?:?]

at io.druid.indexing.kafka.supervisor.KafkaSupervisor.lambda$updateCurrentAndLatestOffsets$24(KafkaSupervisor.java:2190) ~[?:?]

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_112]

at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_112]

at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_112]

at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_112]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]

at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]

2019-03-04T08:11:02,650 WARN [KafkaSupervisor-kafkawikipedia21-Reporting-0] io.druid.indexing.kafka.supervisor.KafkaSupervisor - Unable to compute Kafka lag

io.druid.java.util.common.ISE: Latest offsets from Kafka have not been fetched

at io.druid.indexing.kafka.supervisor.KafkaSupervisor.lambda$emitLag$19(KafkaSupervisor.java:2132) ~[?:?]

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_112]

at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_112]

at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_112]

at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_112]

at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_112]

at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_112]

at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]

2019-03-04T08:11:07,651 WARN [KafkaSupervisor-kafkawikipedia21] io.druid.indexing.kafka.supervisor.KafkaSupervisor - No such topic [test4] found, list of discovered topics [[ATLAS_HOOK, ATLAS_ENTITIES]]

2019-03-04T08:11:07,652 INFO [KafkaSupervisor-kafkawikipedia21] io.druid.indexing.kafka.supervisor.KafkaSupervisor - {id='kafkawikipedia21', generationTime=2019-03-04T08:11:07.652Z, payload={dataSource='kafkawikipedia21', topic='test4', partitions=0, replicas=1, durationSeconds=600, active=, publishing=}}
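The "No such topic [test4] found, list of discovered topics [[ATLAS_HOOK, ATLAS_ENTITIES]]" warning means the supervisor's consumer does not see the topic at all, so it is worth confirming that test4 exists on the same cluster that bootstrap.servers points at. A rough sketch of that check, where ZK_HOST is a placeholder for the ZooKeeper quorum:

    /usr/hdp/current/kafka-broker/bin/kafka-topics.sh --describe --topic test4 --zookeeper ZK_HOST:2181

If the topic shows up here but not in the supervisor log, the Druid side is most likely failing Kerberos authentication or pointing at the wrong brokers, rather than the topic being missing.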

Thanks

Alam

Can anyone help?

Hi Alam,

It looks like Druid is unable to get the offsets correctly.

Please review http://druid.io/docs/latest/development/extensions-core/kafka-ingestion.html and see whether resetOffsetAutomatically / useEarliestOffset in the KafkaSupervisorIOConfig section of the ingestion spec helps.
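A rough sketch of how that could look in the spec above; note that, if I recall the Kafka indexing service correctly, useEarliestOffset belongs in the ioConfig while resetOffsetAutomatically is a tuningConfig property:

    "ioConfig": {
      "topic": "test4",
      "replicas": 1,
      "taskDuration": "PT10M",
      "completionTimeout": "PT20M",
      "useEarliestOffset": true,
      "consumerProperties": { ... }
    },
    "tuningConfig": {
      "type": "kafka",
      "resetOffsetAutomatically": true
    }

These settings only control where reading starts, though, so they may not help while the supervisor cannot see the topic at all.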

Hi Venket,

It's not picking up the topic name from Kafka. Please let me know what steps I need to change to fix this.

Thanks