Is it possible for Druid to return less data during handoff, when the historical has yet to load everything?

We are using



with 4 nodes:

1) realtime, historical, zookeeper, kafka

2) historical, zookeeper, kafka

3) historical, zookeeper, kafka, coordinator, indexer


It happens around the 11th or 12th minute of some hours in the day.


Can you clarify how you are verifying the data transferred? This might be useful to read:

(Not all of my events were ingested)

We have written a job that queries Druid every 10 seconds for the total event count of one particular dimension, and stores that count in local memory.

There is another thread that continuously writes new events (thus increasing the above-mentioned counter).

We have then seen that the next query the job issues to Druid returns a lower number than the one stored in local memory.

From the logs, this occurs around the 10th or 11th minute of some hours in the day.
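The checker described above can be sketched roughly as follows (a minimal illustration, not the actual job; the broker address and the `build_query`/`total_count`/`check` names are assumptions, and the query mirrors the one posted later in this thread):

```python
import json
import urllib.request
from datetime import datetime, timezone

BROKER_URL = "http://localhost:8082/druid/v2"  # assumed broker address/port


def build_query(to_time: str) -> dict:
    """Build the groupBy count query; only the interval's end advances each cycle."""
    return {
        "queryType": "groupBy",
        "dataSource": "playerevent",
        "granularity": "all",
        "dimensions": ["action"],
        "aggregations": [{"type": "longSum", "name": "count", "fieldName": "count"}],
        "intervals": ["2015-07-29T00:00:00.000Z/" + to_time],
        "filter": {
            "type": "and",
            "fields": [
                {"type": "selector", "dimension": "publisherID",
                 "value": "8a80813a468241a701468245fab80000"},
                {"type": "selector", "dimension": "campaignID",
                 "value": "ff8080814e95e49d014e95e909b00000"},
                {"type": "or", "fields": [
                    {"type": "selector", "dimension": "action", "value": "Load"},
                    {"type": "selector", "dimension": "action", "value": "CountsAsView"},
                ]},
            ],
        },
    }


def total_count(rows) -> int:
    """Sum the 'count' metric over the groupBy result rows."""
    return sum(row["event"]["count"] for row in rows)


def check(last_seen: int) -> int:
    """Run one polling cycle; report if the count ever goes backwards."""
    to_time = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    req = urllib.request.Request(
        BROKER_URL,
        data=json.dumps(build_query(to_time)).encode(),
        headers={"Content-Type": "application/json"},
    )
    rows = json.load(urllib.request.urlopen(req))
    current = total_count(rows)
    if current < last_seen:
        print(f"count dropped: {current} < {last_seen}")  # the anomaly observed
    return max(current, last_seen)
```

The writer thread only ever increases the true count, so a strictly lower result from a later query (with a later `toTime`) indicates segments temporarily missing from query results.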


Hi Parampreet, what query are you issuing to check the total counts?

Hi Fangjin,

We have a count-type aggregator over the datasource playerevent, with action, publisherID, and campaignID as its dimensions.

The query is below (verified in the logs); it is identical on every cycle except that the interval's toTime advances with each run of the job.

{
  "queryType": "groupBy",
  "dataSource": "playerevent",
  "granularity": "all",
  "dimensions": ["action"],
  "aggregations": [{"type": "longSum", "name": "count", "fieldName": "count"}],
  "intervals": ["2015-07-29T00:00:00.000Z/2015-07-29T11:46:41.000Z"],
  "filter": {
    "type": "and",
    "fields": [
      {"type": "selector", "dimension": "publisherID", "value": "8a80813a468241a701468245fab80000"},
      {"type": "selector", "dimension": "campaignID", "value": "ff8080814e95e49d014e95e909b00000"},
      {"type": "or", "fields": [
        {"type": "selector", "dimension": "action", "value": "Load"},
        {"type": "selector", "dimension": "action", "value": "CountsAsView"}
      ]}
    ]
  }
}

We are querying the broker on the first node of the Druid cluster.

Attaching the cluster config files



druid.tar.gz (912 Bytes)


I wonder if Druid is self-throttling because persists are occurring too rapidly. If you are using 0.8.0, can you post the ingestion metrics, especially ingest/persists/backPressure?
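For reference, a sketch of how metric emission might be switched on in runtime.properties, assuming the logging emitter is used; the monitor class name is version-dependent and should be checked against your Druid release:

```properties
# Emit metrics to the service logs (an assumption; other emitters exist)
druid.emitter=logging
# Monitor class for realtime ingestion metrics; verify the exact
# class path for your Druid version before using
druid.monitoring.monitors=["io.druid.segment.realtime.RealtimeMetricsMonitor"]
```

With this in place, ingest/persists/* metrics (including backPressure, which reflects time spent blocked on persists) should appear in the realtime node's logs.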