How to make Druid remove messages from RabbitMQ's queue after reading them

Hi,

We are trying to use RabbitMQ to ingest data into Druid.

This is part of our spec file for Druid’s realtime node, which we created by modifying the wikipedia.spec example.

"ioConfig" : {
  "type" : "realtime",
  "firehose": {
    "type": "rabbitmq",
    "connection" : {
        "host": "localhost",
        "port": "5672",
        "username": "guest",
        "password": "guest"
    },
    "config" : {
        "exchange": "amq.direct",
        "queue": "druid",
        "routingKey": "druid",
        "durable": "true",
        "exclusive": "false",
        "autoDelete": "false",
        "maxRetries": "10",
        "retryIntervalSeconds": "1",
        "maxDurationSeconds": "300"
    }
  },
  "plumber": {
    "type": "realtime"
  }
},


After placing a message into RabbitMQ’s queue, I can see that Druid receives it successfully. I verified this with the following command:

ubuntu@ip-172-30-1-252:~/druid$ curl -X POST 'http://localhost:8084/druid/v2/?pretty' -H 'content-type: application/json' -d @select.json
[ {
  "timestamp" : "2013-08-31T01:02:33.000Z",
  "result" : {
    "pagingIdentifiers" : {
      "wikipedia_2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z_2013-08-31T00:00:00.000Z" : 0
    },
    "events" : [ {
      "segmentId" : "wikipedia_2013-08-31T00:00:00.000Z_2013-09-01T00:00:00.000Z_2013-08-31T00:00:00.000Z",
      "offset" : 0,
      "event" : {
        "timestamp" : "2013-08-31T01:02:33.000Z",
        "continent" : "North America",
        "robot" : "false",
        "country" : "United States",
        "city" : "San Francisco",
        "newPage" : "true",
        "unpatrolled" : "true",
        "namespace" : "article",
        "anonymous" : "false",
        "language" : "en",
        "page" : "Gypsy Danger",
        "region" : "Bay Area",
        "user" : "doggy",
        "deleted" : 200.0,
        "added" : 57.0,
        "count" : 1,
        "delta" : -143.0
      }
    } ]
  }
} ]


However, when I examine RabbitMQ’s queue, the data is still sitting there. I expected Druid to acknowledge the message after reading it, so that RabbitMQ would remove it from the queue. Please see the attachment.

Is there anything we can do so that data already picked up by Druid is removed from RabbitMQ’s queue?

Thanks.
Cheok

Hi,

The RabbitMQ firehose is a community-contributed extension and is not supported by the committers. Unfortunately, we don’t have much expertise with RabbitMQ.

The rabbit extension is supposed to ack messages when Druid does an intermediate persist. The frequency of that is determined by the intermediatePersistPeriod, which is 10 minutes by default, so unacked messages could stay around that long. You could make it happen more often by setting the intermediatePersistPeriod shorter.
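
As a rough illustration of that setting, a minimal sketch of a tuningConfig section for the realtime spec might look like the fragment below. The "PT1M" value is only an assumed example (an ISO 8601 period meaning one minute), not a recommendation, and the rest of the tuningConfig is left out. Shorter persist periods mean more frequent, smaller intermediate persists, so it is worth testing the effect on your own workload.

```json
"tuningConfig": {
  "type": "realtime",
  "intermediatePersistPeriod": "PT1M"
}
```

This section sits alongside "ioConfig" in the same spec file.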

However, I would also recommend taking a look at one of the Tranquility-based options, such as Core or Server, as they are more commonly used in production.

Thank you Gian & Fangjin.

We have given up on RabbitMQ, after successfully running Tranquility (as packaged by Imply) on our very first attempt.

Thanks for the response anyway.