I have a problem with the new Kafka indexing service. My setup and the current lag per topic:
1st druid machine with broker, coordinator & historical nodes running
2nd druid machine with overlord node (overlord running in “local” mode). (AWS EC2 c4.xlarge instance, 4 CPU, 7.5 RAM)
custom_activity topic with ~12 million messages of lag
push_sent topic with ~200 K messages of lag
~25 other topics with 50-100 K messages of lag each
When I start the overlord node, it successfully processes some tasks until it starts the tasks for the custom_activity and push_sent topics. Task durations are 3M and 30S.
Then the tasks hang and CPU usage goes down. In the task logs I see “org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Marking the coordinator 2147483434 dead” records. The network between the Kafka broker and the overlord is OK: from the Kafka broker machine I can see connections from the overlord node (using netstat).
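As far as I understand, the Kafka consumer identifies the group coordinator connection with an id computed as Integer.MAX_VALUE minus the broker id, so the number in that log line should point at a specific broker. A minimal sketch of that arithmetic, assuming this convention holds for the client version in use (the function name is only illustrative):

```python
# Java's Integer.MAX_VALUE. Assumption: the Kafka consumer uses
# (Integer.MAX_VALUE - broker id) as the coordinator's connection id.
INT_MAX = 2**31 - 1


def broker_id_from_coordinator_id(coordinator_id: int) -> int:
    """Recover the broker id from the id printed in the
    'Marking the coordinator ... dead' log line."""
    return INT_MAX - coordinator_id


if __name__ == "__main__":
    # Id taken from the task log quoted above.
    print(broker_id_from_coordinator_id(2147483434))  # prints 213
```

If that is right, the coordinator being marked dead is the broker with id 213.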
The task payloads, task logs, overlord logs, overlord properties, and CPU & memory usage logs are attached.
Thanks in advance!
overlord_log.txt (4.35 KB)
peon_custom_activity_log.txt (584 KB)
overlord.properties (824 Bytes)