Hello Druid Users,
It takes > 30 secs to respond to GET/POST requests associated with submitting new Kafka Ingestion job, getting task status & logs.
Druid version: 0.13-incubating
Batch Ingestion (Hadoop): 6 jobs which run nightly
Reatime Ingestion (Kafka) : 18 Kafka & ~230 Running Tasks.
Meta data storage : Mysql RDS (Aurora). 15 open connections to RDS. CPU Util peaks every 15 mins to 30%.
2 Overlords in HA Mode
30 Nodes with historical & MiddelManager process co-located. (Attached config details)
No Errors/WARN in Overlord logs that seem indicative of the slowness. DEBUG logs too verbose and no obvious pattern indicating slowness.
Attaching configs, heap & thread dump of the overlord.
Would highly appreciate any pointers on how I can root cause the slowness and fix it.
Thanks a lot in advance !
jstack0122 (537 KB)
heap_dump.hprof (1.02 MB)
druid_Configs (4.57 KB)