Issue with Druid Version 0.10.1

Hi All,

I am having trouble scheduling a Kafka indexing task on Druid 0.10.1.

It was working fine until recently, but after an incident on the cluster the Overlord behaves strangely whenever I try to schedule an indexing job. It looks like an Overlord leader election problem, but I have not been able to resolve it. Please let me know how I can fix this. We have two Overlord instances running on

ip-172-31-37-215.us-west-2.compute.internal and ip-172-31-32-141.us-west-2.compute.internal.

curl http://ip-172-31-37-215.us-west-2.compute.internal:8090/druid/indexer/v1/isLeader

{"leader":false}

[admin@ip-172-31-42-196 druid_tasks]$ curl http://ip-172-31-32-141.us-west-2.compute.internal:8090/druid/indexer/v1/isLeader

{"leader":false}

[admin@ip-172-31-42-196 druid_tasks]$ curl http://ip-172-31-32-141.us-west-2.compute.internal:8090/druid/indexer/v1/leader

ip-172-31-32-141.us-west-2.compute.internal:8090
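
The three checks above can be folded into one small script. This is only a sketch: it assumes bash, curl and sed are available, the host list is taken from the curl commands in this post, and the function names (`parse_is_leader`, `describe_leader_count`, `check_overlords`) are made up for illustration. It asks each Overlord whether it considers itself the leader and reports when the answers do not add up to exactly one leader.

```shell
#!/usr/bin/env bash
# Hosts copied from the curl commands in this post; override via OVERLORDS.
OVERLORDS="${OVERLORDS:-ip-172-31-37-215.us-west-2.compute.internal:8090 ip-172-31-32-141.us-west-2.compute.internal:8090}"

# Pull true/false out of a body such as {"leader":false}.
# Prints nothing if the body does not contain a "leader" field.
parse_is_leader() {
  printf '%s' "$1" | sed -nE 's/.*"leader"[[:space:]]*:[[:space:]]*(true|false).*/\1/p'
}

# Turn the number of self-declared leaders into a short verdict.
describe_leader_count() {
  case "$1" in
    0) echo "no leader elected" ;;
    1) echo "ok" ;;
    *) echo "multiple leaders" ;;
  esac
}

# Query /isLeader on every Overlord and summarise the result.
check_overlords() {
  local leaders=0 host body state
  for host in $OVERLORDS; do
    body=$(curl -s --max-time 5 "http://$host/druid/indexer/v1/isLeader")
    state=$(parse_is_leader "$body")
    echo "$host isLeader=${state:-unreachable}"
    if [ "$state" = "true" ]; then
      leaders=$((leaders + 1))
    fi
  done
  describe_leader_count "$leaders"
}

# Only touch the network when explicitly asked to.
if [ "${1:-}" = "--run" ]; then
  check_overlords
fi
```

Saved under any file name and run with `--run`, it prints one line per Overlord followed by the verdict. In the state shown above, where both instances answer false, the verdict would be "no leader elected" even though /leader still returns a host.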

curl -v -X 'POST' -H 'Content-Type:application/json' -d @druid_dev_indexing_task.json http://ip-172-31-32-141.us-west-2.compute.internal:8090/druid/indexer/v1/task

* About to connect() to ip-172-31-32-141.us-west-2.compute.internal port 8090 (#0)

*   Trying 172.31.32.141...

* Connected to ip-172-31-32-141.us-west-2.compute.internal (172.31.32.141) port 8090 (#0)

> POST /druid/indexer/v1/task HTTP/1.1

> User-Agent: curl/7.29.0

> Host: ip-172-31-32-141.us-west-2.compute.internal:8090

> Accept: */*

> Content-Type:application/json

> Content-Length: 5127

> Expect: 100-continue

< HTTP/1.1 307 Temporary Redirect

< Date: Thu, 11 Jun 2020 19:02:03 GMT

< Location: http://ip-172-31-32-141.us-west-2.compute.internal:8090/druid/indexer/v1/task

< Content-Length: 0

< Connection: close

< Server: Jetty(9.3.19.v20170502)

<

* Closing connection 0

Thanks & Regards,

Vikram

Hi Vikram,

It seems your Overlords are unstable and might be restarting. Do you see anything in the Overlord logs? If you see OOM errors, the cause may be an excessive amount of indexing task history held by the Overlords. If that's the case, you might need to clean the druid_tasks metadata table.

HTH

Surekha
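
A quick way to run the log check suggested above is to count out-of-memory errors in the Overlord log. This is a minimal sketch: the function name `count_oom_errors` and the log path in the usage line are hypothetical and depend on how the cluster is set up, and it assumes the JVM reports the error with the standard `java.lang.OutOfMemoryError` class name.

```shell
#!/usr/bin/env bash
# Print how many lines of the given log file mention an out-of-memory error.
# grep -c prints 0 when nothing matches, so the output is always a number
# as long as the file is readable.
count_oom_errors() {
  grep -c 'java.lang.OutOfMemoryError' "$1"
}

# Only read a log when a path is passed on the command line.
if [ -n "${1:-}" ]; then
  echo "OOM lines in $1: $(count_oom_errors "$1")"
fi
```

For example, `bash check_oom.sh /path/to/overlord.log`, where both the script name and the path are placeholders. A non-zero count on an Overlord that keeps restarting would support the task-history explanation.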