[druid-user] Re: Druid is not working properly

Did you get this working? You may also want to ask in one of the Superset groups.

Hi all. Did you find a solution for this issue?
I see the same error, and after checking the master logs it seems there is a problem with a segment. Here is the log:

2022-01-31T08:34:29,133 INFO [LeaderSelector[/druid/overlord/_OVERLORD]] org.apache.druid.indexing.overlord.TaskLockbox - Cannot create a new taskLockPosse for request[TimeChunkLockRequest{lockType=EXCLUSIVE, groupId=‘coordinator-issued_compact_leadview-backend-enriched-v1_nlmakcep_2022-01-23T23:59:21.931Z’, dataSource=‘leadview-backend-enriched-v1’, interval=2022-01-04T00:00:00.000Z/2022-01-05T00:00:00.000Z, preferredVersion=‘2022-01-24T00:00:10.598Z’, priority=25, revoked=false}] because existing locks[[TaskLockPosse{taskLock=TimeChunkLock{type=EXCLUSIVE, groupId=‘coordinator-issued_compact_leadview-backend-enriched-v1_pkekhijg_2022-01-23T23:59:21.891Z’, dataSource=‘leadview-backend-enriched-v1’, interval=2022-01-04T00:00:00.000Z/2022-01-05T00:00:00.000Z, version=‘2022-01-23T23:59:21.924Z’, priority=25, revoked=false}, taskIds=[coordinator-issued_compact_leadview-backend-enriched-v1_pkekhijg_2022-01-23T23:59:21.891Z]}]] have same or higher priorities
2022-01-31T08:34:29,134 ERROR [LeaderSelector[/druid/overlord/_OVERLORD]] org.apache.druid.curator.discovery.CuratorDruidLeaderSelector - listener becomeLeader() failed. Unable to become leader: {class=org.apache.druid.curator.discovery.CuratorDruidLeaderSelector, exceptionType=class java.lang.RuntimeException, exceptionMessage=org.apache.druid.java.util.common.ISE: Could not reacquire lock on interval[2022-01-04T00:00:00.000Z/2022-01-05T00:00:00.000Z] version[2022-01-24T00:00:10.598Z] for task: coordinator-issued_compact_leadview-backend-enriched-v1_nlmakcep_2022-01-23T23:59:21.931Z}
java.lang.RuntimeException: org.apache.druid.java.util.common.ISE: Could not reacquire lock on interval[2022-01-04T00:00:00.000Z/2022-01-05T00:00:00.000Z] version[2022-01-24T00:00:10.598Z] for task: coordinator-issued_compact_leadview-backend-enriched-v1_nlmakcep_2022-01-23T23:59:21.931Z
    at org.apache.druid.indexing.overlord.TaskMaster$1.becomeLeader(TaskMaster.java:159) ~[druid-indexing-service-0.21.0.jar:0.21.0]
    at org.apache.druid.curator.discovery.CuratorDruidLeaderSelector$1.isLeader(CuratorDruidLeaderSelector.java:97) [druid-server-0.21.0.jar:0.21.0]
    at org.apache.curator.framework.recipes.leader.LeaderLatch$9.apply(LeaderLatch.java:702) [curator-recipes-4.3.0.jar:4.3.0]
    at org.apache.curator.framework.recipes.leader.LeaderLatch$9.apply(LeaderLatch.java:698) [curator-recipes-4.3.0.jar:4.3.0]
    at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:100) [curator-framework-4.3.0.jar:4.3.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_292]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_292]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
Caused by: org.apache.druid.java.util.common.ISE: Could not reacquire lock on interval[2022-01-04T00:00:00.000Z/2022-01-05T00:00:00.000Z] version[2022-01-24T00:00:10.598Z] for task: coordinator-issued_compact_leadview-backend-enriched-v1_nlmakcep_2022-01-23T23:59:21.931Z
    at org.apache.druid.indexing.overlord.TaskLockbox.syncFromStorage(TaskLockbox.java:190) ~[druid-indexing-service-0.21.0.jar:0.21.0]
    at org.apache.druid.indexing.overlord.TaskMaster$1.becomeLeader(TaskMaster.java:114) ~[druid-indexing-service-0.21.0.jar:0.21.0]
    ... 7 more


I also removed all data from ZooKeeper and tried restarting the service, but without success.
This cluster has been under load for two years, and the issue only appeared recently. We haven't been able to get it working so far; any clue would be highly appreciated.

Again, this looks like a ZooKeeper-related issue somewhere on the Druid side. If you killed and restarted ZK, could you check that your common.runtime.properties file, which references ZooKeeper, is set correctly on all the processes?

If I remember right, you can also view the config of each process in turn through the status APIs. That may show where the config is off.
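In case it helps, here is a rough sketch of that check in Python. It calls the `/status/properties` endpoint on each process and compares the ZooKeeper-related settings (`druid.zk.service.host` and `druid.zk.paths.base`). The host names and ports in `PROCESSES` are placeholders I made up, and the helper names are my own, so adjust them to your cluster. If your cluster has authentication enabled, the request will need credentials that this sketch does not handle.

```python
import json
from urllib.request import urlopen

# Hypothetical process URLs: replace with the hosts and ports of your own cluster.
PROCESSES = {
    "coordinator": "http://coordinator.example.internal:8081",
    "overlord": "http://overlord.example.internal:8090",
    "broker": "http://broker.example.internal:8082",
    "historical": "http://historical.example.internal:8083",
}

# ZooKeeper-related properties that should normally agree across processes.
ZK_KEYS = ("druid.zk.service.host", "druid.zk.paths.base")


def fetch_properties(base_url, timeout=5.0):
    """Return the runtime properties a Druid process reports about itself."""
    with urlopen(f"{base_url}/status/properties", timeout=timeout) as resp:
        return json.load(resp)


def find_mismatches(props_by_process, keys=ZK_KEYS):
    """Return {key: {process: value}} for every key whose value differs."""
    mismatches = {}
    for key in keys:
        values = {name: props.get(key) for name, props in props_by_process.items()}
        if len(set(values.values())) > 1:
            mismatches[key] = values
    return mismatches


if __name__ == "__main__":
    collected = {}
    for name, url in PROCESSES.items():
        try:
            collected[name] = fetch_properties(url)
        except OSError as exc:
            # A process that cannot be reached is worth noting on its own.
            print(f"{name}: could not fetch properties ({exc})")

    for key, values in find_mismatches(collected).items():
        print(f"Mismatch for {key}:")
        for name, value in values.items():
            print(f"  {name}: {value}")
```

A key that is missing on one process shows up as `None` next to the values from the others, which is usually a sign that the process is reading a different common.runtime.properties file.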