Help! Druid is losing a little data every second!

We ingest 6k-7k events per second into Druid through Kafka, but when I query count(*) from Druid, it returns fewer rows than we pushed into Kafka: about 40-50 rows are missing, so we see 5.96k-6.96k rows per second. Over a day the gap grows large.

Here is the spec:

```json
[
  {
    "dataSchema" : {
      "dataSource" : "uve_stat_dim",
      "parser" : {
        "type" : "string",
        "parseSpec" : {
          "format" : "json",
          "timestampSpec" : {
            "column" : "reqtime",
            "format" : "posix"
          },
          "dimensionsSpec" : {
            "dimensions": ["mode","uid","loadmore","feedtype","from","unread_status","platform","version"],
            "dimensionExclusions" : [],
            "spatialDimensions" : []
          }
        }
      },
      "metricsSpec" : [
        {
          "type" : "count",
          "name" : "count"
        }
      ],
      "granularitySpec" : {
        "type" : "uniform",
        "segmentGranularity" : "HOUR",
        "queryGranularity" : "second"
      }
    },
    "ioConfig" : {
      "type" : "realtime",
      "firehose": {
        "type": "kafka-0.8",
        "consumerProps": {
          "zookeeper.connect": "10.77.96.56:2181",
          "zookeeper.connection.timeout.ms" : "15000",
          "zookeeper.session.timeout.ms" : "15000",
          "zookeeper.sync.time.ms" : "5000",
          "group.id": "druid-dim-yangyang21",
          "fetch.message.max.bytes" : "5242930",
          "auto.offset.reset": "largest",
          "auto.commit.enable": "false"
        },
        "feed": "uve_stat_handle_1"
      },
      "plumber": {
        "type": "realtime"
      }
    },
    "tuningConfig": {
      "type" : "realtime",
      "maxRowsInMemory": 500000,
      "intermediatePersistPeriod": "PT30m",
      "windowPeriod": "PT50m",
      "basePersistDirectory": "/data1/druid-0.8.2/basePersist",
      "rejectionPolicy": {
        "type": "messageTime"
      }
    }
  }
]
```

If I grep for 'ingest/events/' in the realtime node's log, the thrownAway and unparseable counts are all zero.

I compared the per-second counts in Druid and in Kafka over one minute:

```
 #  timestamp (UTC)            Druid   Kafka
 1  2015-11-24T05:05:00.000Z    6577    6644
 2  2015-11-24T05:05:01.000Z    6650    6705
 3  2015-11-24T05:05:02.000Z    6523    6563
 4  2015-11-24T05:05:03.000Z    6473    6523
 5  2015-11-24T05:05:04.000Z    6527    6588
 6  2015-11-24T05:05:05.000Z    6469    6509
 7  2015-11-24T05:05:06.000Z    6506    6542
 8  2015-11-24T05:05:07.000Z    6462    6504
 9  2015-11-24T05:05:08.000Z    6374    6424
10  2015-11-24T05:05:09.000Z    6664    6714
11  2015-11-24T05:05:10.000Z    6598    6639
12  2015-11-24T05:05:11.000Z    6566    6604
13  2015-11-24T05:05:12.000Z    6538    6570
14  2015-11-24T05:05:13.000Z    6460    6490
15  2015-11-24T05:05:14.000Z    6617    6673
16  2015-11-24T05:05:15.000Z    6480    6530
17  2015-11-24T05:05:16.000Z    6574    6632
18  2015-11-24T05:05:17.000Z    6559    6599
19  2015-11-24T05:05:18.000Z    6413    6457
20  2015-11-24T05:05:19.000Z    6459    6500
21  2015-11-24T05:05:20.000Z    6403    6469
22  2015-11-24T05:05:21.000Z    6514    6564
23  2015-11-24T05:05:22.000Z    6580    6630
24  2015-11-24T05:05:23.000Z    6471    6503
25  2015-11-24T05:05:24.000Z    6463    6502
26  2015-11-24T05:05:25.000Z    6655    6709
27  2015-11-24T05:05:26.000Z    6681    6726
28  2015-11-24T05:05:27.000Z    6626    6673
29  2015-11-24T05:05:28.000Z    6358    6402
30  2015-11-24T05:05:29.000Z    6616    6649
31  2015-11-24T05:05:30.000Z    6490    6543
32  2015-11-24T05:05:31.000Z    6589    6633
33  2015-11-24T05:05:32.000Z    6553    6591
34  2015-11-24T05:05:33.000Z    6455    6511
35  2015-11-24T05:05:34.000Z    6507    6565
36  2015-11-24T05:05:35.000Z    6447    6502
37  2015-11-24T05:05:36.000Z    6493    6531
38  2015-11-24T05:05:37.000Z    6452    6505
39  2015-11-24T05:05:38.000Z    6443    6482
40  2015-11-24T05:05:39.000Z    6511    6551
41  2015-11-24T05:05:40.000Z    6401    6448
42  2015-11-24T05:05:41.000Z    6546    6586
43  2015-11-24T05:05:42.000Z    6419    6452
44  2015-11-24T05:05:43.000Z    6470    6523
45  2015-11-24T05:05:44.000Z    6515    6558
46  2015-11-24T05:05:45.000Z    6468    6517
47  2015-11-24T05:05:46.000Z    6505    6549
48  2015-11-24T05:05:47.000Z    6493    6538
49  2015-11-24T05:05:48.000Z    6522    6561
50  2015-11-24T05:05:49.000Z    6565    6618
51  2015-11-24T05:05:50.000Z    6630    6679
52  2015-11-24T05:05:51.000Z    6428    6475
53  2015-11-24T05:05:52.000Z    6439    6471
54  2015-11-24T05:05:53.000Z    6666    6714
55  2015-11-24T05:05:54.000Z    6536    6581
56  2015-11-24T05:05:55.000Z    6544    6576
57  2015-11-24T05:05:56.000Z    6462    6521
58  2015-11-24T05:05:57.000Z    6462    6496
59  2015-11-24T05:05:58.000Z    6518    6564
60  2015-11-24T05:05:59.000Z    6406    6447
```

Thank you!

"queryGranularity" : "second"

Druid rolls up data at second granularity.
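With queryGranularity "second", rows whose timestamps truncate to the same second and whose dimension values are all identical are collapsed into one stored row. A minimal sketch of that behavior (hypothetical code, not Druid's implementation), where each event is reduced to a key of truncated timestamp plus dimensions:

```java
import java.util.HashMap;
import java.util.Map;

public class RollupSketch {
    // Group events by (truncated timestamp + dimension values); the value is
    // the ingestion-time "count" metric for the resulting rolled-up row.
    static Map<String, Integer> rollup(String[] events) {
        Map<String, Integer> rows = new HashMap<>();
        for (String e : events) {
            rows.merge(e, 1, Integer::sum);
        }
        return rows;
    }

    public static void main(String[] args) {
        String[] events = {
            "1448359100|incr|5766340798|main",  // hypothetical key: ts|mode|uid|feedtype
            "1448359100|incr|5766340798|main",  // identical -> rolled into the row above
            "1448359100|incr|1794192911|other"
        };
        Map<String, Integer> rows = rollup(events);
        // A query-time count aggregator counts stored rows (2 here), while
        // summing the ingestion-time count metric recovers the raw events (3).
        System.out.println("rows after rollup: " + rows.size());
        int raw = rows.values().stream().mapToInt(Integer::intValue).sum();
        System.out.println("raw events: " + raw);
    }
}
```

So if any two raw events in the same second share all eight dimension values, a query-time count comes out lower than the Kafka message count even though nothing was dropped.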

On Tuesday, November 24, 2015 at 2:06:05 PM UTC+8, sailin…@gmail.com wrote:

More about summarization/rollup in Druid: http://druid.io/docs/latest/design/index.html

Hi, I have read the document you gave.

Here is some example data; our record format is below.

```json
{"mode":"incr","uid":"5766340798","loadmore":"0","feedtype":"main","from":"1056095010","unread_status":"25","version":"5.6.0","platform":"android","reqtime":1448359100}
{"mode":"incr","uid":"1794192911","loadmore":"0","feedtype":"other","from":"1051393010","unread_status":"2","version":"5.1.3","platform":"iphone","reqtime":1448359100}
{"mode":"incr","uid":"5726838219","loadmore":"0","feedtype":"main","from":"1053095010","unread_status":"25","version":"5.3.0","platform":"android","reqtime":1448359100}
{"mode":"incr","uid":"5614310995","loadmore":"0","feedtype":"main","from":"1056193010","unread_status":"15","version":"5.6.1","platform":"iphone","reqtime":1448359100}
{"mode":"incr","uid":"1767898044","loadmore":"1","feedtype":"main","from":"1056193010","unread_status":"15","version":"5.6.1","platform":"iphone","reqtime":1448359100}
{"mode":"incr","uid":"5745039583","loadmore":"0","feedtype":"main","from":"1056193010","unread_status":"5","version":"5.6.1","platform":"iphone","reqtime":1448359100}
```

I took one second of data from Kafka and wrote it all to a file (test1). Using the Java code below, I get the same count as Druid.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import org.json.JSONException;
import org.json.JSONObject;

public class DistinctRecordCount {
    public static void main(String[] args) throws IOException, JSONException {
        BufferedReader br = new BufferedReader(new FileReader("test1"));
        // Group records by uid; within each group keep only distinct records.
        Map<String, List<JSONObject>> map = new HashMap<>();
        String tmp;
        while ((tmp = br.readLine()) != null) {
            JSONObject obj = new JSONObject(tmp);
            String uid = obj.getString("uid");
            List<JSONObject> j = map.computeIfAbsent(uid, k -> new LinkedList<>());
            boolean distinct = true;
            for (JSONObject i : j) {
                // JSONObject does not override equals(), so use similar()
                // for a field-by-field comparison.
                if (i.similar(obj)) {
                    distinct = false;
                    break;
                }
            }
            if (distinct) {
                j.add(obj);
            }
        }
        br.close();
        // Total number of distinct records across all uids.
        int count = 0;
        for (List<JSONObject> k : map.values()) {
            count += k.size();
        }
        System.out.println(count);
    }
}
```

It has only one metric (count). Does Druid aggregate data like the code above? How can I get the exact total number of my raw data rows? Thank you!

On Tuesday, November 24, 2015 at 3:02:34 PM UTC+8, Fangjin Yang wrote:

See http://druid.io/docs/latest/ingestion/schema-design.html, "Counting the number of ingested events".
Add a metric and set it to 1 for every record.
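Concretely, this means emitting a constant field in every record (here called "one"; both "one" and "rawCount" are hypothetical names) and summing it at ingestion. A sketch of the adjusted metricsSpec under that assumption:

```json
"metricsSpec" : [
  { "type" : "count",   "name" : "count" },
  { "type" : "longSum", "name" : "rawCount", "fieldName" : "one" }
]
```

Querying longSum over rawCount then returns the number of raw events, regardless of how many stored rows rollup collapsed them into.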

On Tuesday, November 24, 2015 at 6:28:01 PM UTC+8, sailin…@gmail.com wrote:

Thank you very much! I have done it: I added a metric set to 1 for every record, and it works well!

On Wednesday, November 25, 2015 at 2:14:41 PM UTC+8, 宾莉金 wrote: