I have the Kafka Indexing Service running against a Kafka topic that contains Avro messages. These Avro messages are JSON-encoded.
The schemas are published to the Confluent Schema Registry.
There are about 160K messages with fully populated payloads in the topic. The segments were created in S3, but none of the dimensions shows any cardinality.
The overlord console shows shards for the segment, but only 1 dimension is listed for each of them, even though there are more than 20 dimensions.
Any help is highly appreciated! I suspect it has to do with the JSON path mapping?
@Indexing Task
{
  "type" : "kafka",
  "dataSchema" : {
    "dataSource" : "",
    "parser" : {
      "type" : "avro_stream",
      "avroBytesDecoder": {
        "type": "schema_registry",
        "url": "http://"
      },
      "parseSpec" : {
        "format" : "avro",
        "flattenSpec": {
          "useFieldDiscovery": true,
          "fields": [
            {
              "type": "root",
              "name": "contentId",
              "expr": "$.contentId.string"
            },
            …
        "dimensionsSpec" : {},
        "timestampSpec": {
          "column": "eventTime",
          "format": "auto"
        }
      }
    },
    "metricsSpec" : ,
    "granularitySpec" : {
      "type" : "uniform",
      "segmentGranularity" : "DAY",
      "queryGranularity" : "NONE",
      "rollup" : false
    }
  },
  "ioConfig" : {
    "topic" : "",
    "consumerProperties": {
      "bootstrap.servers": "",
      "group.id": ""
    },
    "replicas": "2",
    "taskCount": "1",
    "taskDuration": "PT30M",
    "useEarliestOffset": true
  },
  "tuningConfig" : {
    "type" : "kafka",
    "resetOffsetAutomatically": true
  }
}
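To narrow down the JSON path suspicion: Avro's JSON encoding wraps a non-null union value in an object keyed by the branch type, so a `["null","string"]` field like `contentId` looks like `{"string": ...}` in JSON-encoded form, while a record decoded from Avro usually carries the value directly. The sketch below is not Druid code. It uses a hypothetical `resolve` helper and a made-up value `abc-123` purely to show which record shape an expression like `$.contentId.string` would match, assuming a simple dotted-path lookup.

```python
import json


def resolve(record, expr):
    """Resolve a simple dotted JSONPath such as '$.a.b' against nested dicts.

    Only the '$.key.key' subset is handled. Returns None if any step is missing.
    """
    node = record
    for key in expr.lstrip("$").strip(".").split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


# Avro JSON encoding: a non-null union value is wrapped as {"<branch type>": value}.
# "abc-123" is a made-up sample value, not taken from the real topic.
json_encoded = json.loads('{"contentId": {"string": "abc-123"}}')

# A decoded record (modelled here as a plain dict) holds the union value directly.
decoded = {"contentId": "abc-123"}

print(resolve(json_encoded, "$.contentId.string"))  # matches the wrapped shape
print(resolve(decoded, "$.contentId.string"))       # no match on the decoded shape
print(resolve(decoded, "$.contentId"))              # matches the decoded shape
```

If the parser sees the decoded shape, a path ending in `.string` would come back null, which would be consistent with the all-null row in the task log. Whether that is what actually happens inside the Avro parser is an assumption worth checking against the Druid documentation for the version in use.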
@AVRO Schema
{
"schema": "{"type":"record",
"name":"AtomicEvent",
"namespace":"<Masked",
"fields":[
{
…
{"name":"contentId","type":["null","string"],"default":null}
…
}
2018-06-06T13:47:13,228 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[userId] inverted with cardinality[0] in 51 millis.
2018-06-06T13:47:13,279 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[eventType] inverted with cardinality[0] in 51 millis.
2018-06-06T13:47:13,334 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[eventAction] inverted with cardinality[0] in 55 millis.
2018-06-06T13:47:13,392 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[userbrowser.name] inverted with cardinality[0] in 58 millis.
2018-06-06T13:47:13,443 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[userbrowser.category] inverted with cardinality[0] in 51 millis.
2018-06-06T13:47:13,498 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[userbrowser.os] inverted with cardinality[0] in 55 millis.
2018-06-06T13:47:13,547 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[userbrowser.version] inverted with cardinality[0] in 49 millis.
2018-06-06T13:47:13,596 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[userbrowser.vendor] inverted with cardinality[0] in 49 millis.
2018-06-06T13:47:13,647 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[userbrowser.os_version] inverted with cardinality[0] in 51 millis.
2018-06-06T13:47:13,704 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[userAgent] inverted with cardinality[0] in 57 millis.
2018-06-06T13:47:13,756 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[eventSubType] inverted with cardinality[0] in 52 millis.
2018-06-06T13:47:13,817 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[searchKeywords] inverted with cardinality[0] in 61 millis.
2018-06-06T13:47:13,882 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[actor.districtRefId] inverted with cardinality[0] in 65 millis.
2018-06-06T13:47:13,931 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[actor.districtId] inverted with cardinality[0] in 49 millis.
2018-06-06T13:47:13,985 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[actor.schoolPid] inverted with cardinality[0] in 54 millis.
2018-06-06T13:47:14,036 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[actor.districtPid] inverted with cardinality[0] in 51 millis.
2018-06-06T13:47:14,086 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[actor.stateId] inverted with cardinality[0] in 50 millis.
2018-06-06T13:47:14,152 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[actor.roles] inverted with cardinality[0] in 66 millis.
2018-06-06T13:47:14,203 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[actor.browserId] inverted with cardinality[0] in 51 millis.
2018-06-06T13:47:14,256 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[target.viewType] inverted with cardinality[0] in 53 millis.
2018-06-06T13:47:14,314 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[target.extensions.programId] inverted with cardinality[0] in 58 millis.
2018-06-06T13:47:14,368 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[target.extensions.programName] inverted with cardinality[0] in 54 millis.
2018-06-06T13:47:14,417 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[usageContext.id] inverted with cardinality[0] in 49 millis.
2018-06-06T13:47:14,469 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[usageContext.name] inverted with cardinality[0] in 52 millis.
2018-06-06T13:47:14,519 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[usageContext.type] inverted with cardinality[0] in 50 millis.
2018-06-06T13:47:14,577 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[usageContext.description] inverted with cardinality[0] in 58 millis.
2018-06-06T13:47:14,626 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[usageContext.extensions.assignmentId] inverted with cardinality[0] in 49 millis.
2018-06-06T13:47:14,679 INFO [caliper-sample-atomic-avro-kafka-int-incremental-persist] io.druid.segment.StringDimensionMergerV9 - Completed dim[usageContext.resourceId] inverted with cardinality[0] in 53 millis.
…
…
…
2018-06-06T13:47:17,407 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.BaseAppenderatorDriver - New segment[caliper-sample-atomic-avro-kafka-int_2018-06-05T00:00:00.000Z_2018-06-06T00:00:00.000Z_2018-06-06T13:47:17.390Z] for row[MapBasedInputRow{timestamp=2018-06-05T00:06:33.467Z, event={contentId=null, userId=null, eventType=null, eventAction=null, userbrowser.name=null, userbrowser.category=null, userbrowser.os=null, userbrowser.version=null, userbrowser.vendor=null, userbrowser.os_version=null, userAgent=null, eventSubType=null, searchKeywords=null, actor.districtRefId=null, actor.districtId=null, actor.schoolPid=null, actor.districtPid=null, actor.stateId=null, actor.roles=null, actor.browserId=null, target.viewType=null, target.extensions.programId=null, target.extensions.programName=null, usageContext.id=null, usageContext.name=null, usageContext.type=null, usageContext.description=null, usageContext.extensions.assignmentId=null, usageContext.resourceId=null}, dimensions=[contentId, userId, eventType, eventAction, userbrowser.name, userbrowser.category, userbrowser.os, userbrowser.version, userbrowser.vendor, userbrowser.os_version, userAgent, eventSubType, searchKeywords, actor.districtRefId, actor.districtId, actor.schoolPid, actor.districtPid, actor.stateId, actor.roles, actor.browserId, target.viewType, target.extensions.programId, target.extensions.programName, usageContext.id, usageContext.name, usageContext.type, usageContext.description, usageContext.extensions.assignmentId, usageContext.resourceId]}] sequenceName[index_kafka_caliper-sample-atomic-avro-kafka-int_3483c0c34065dd8_0].
…
…
…
…
…
2018-06-06T14:17:15,216 DEBUG [appenderator_merge_0] com.amazonaws.request - Received successful response: 200, AWS Request ID: 758B95442174AD4C
2018-06-06T14:17:15,217 DEBUG [appenderator_merge_0] com.amazonaws.requestId - x-amzn-RequestId: not available
2018-06-06T14:17:15,217 DEBUG [appenderator_merge_0] com.amazonaws.requestId - AWS Request ID: 758B95442174AD4C
2018-06-06T14:17:15,218 INFO [appenderator_merge_0] io.druid.storage.s3.S3DataSegmentPusher - Deleting temporary cached index.zip
2018-06-06T14:17:15,228 INFO [appenderator_merge_0] io.druid.storage.s3.S3DataSegmentPusher - Deleting temporary cached descriptor.json
2018-06-06T14:17:15,258 INFO [appenderator_merge_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Pushed merged index for segment[caliper-sample-atomic-avro-kafka-int_2018-05-29T00:00:00.000Z_2018-05-30T00:00:00.000Z_2018-06-06T11:03:08.775Z_2], descriptor is: DataSegment{size=114414, shardSpec=NumberedShardSpec{partitionNum=2, partitions=0}, metrics=, dimensions=, version='2018-06-06T11:03:08.775Z', loadSpec={type=>s3_zip, bucket=>hmheng-data-services/druid-segments/int, key=>segments/sample-data/caliper-sample-atomic-avro-kafka-int/2018-05-29T00:00:00.000Z_2018-05-30T00:00:00.000Z/2018-06-06T11:03:08.775Z/2/index.zip, S3Schema=>s3n}, interval=2018-05-29T00:00:00.000Z/2018-05-30T00:00:00.000Z, dataSource='caliper-sample-atomic-avro-kafka-int', binaryVersion='9'}
2018-06-06T14:17:15,264 INFO [appenderator_merge_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Pushing merged index for segment[caliper-sample-atomic-avro-kafka-int_2018-05-31T00:00:00.000Z_2018-06-01T00:00:00.000Z_2018-06-06T13:46:53.788Z].
2018-06-06T14:17:15,267 INFO [appenderator_merge_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=caliper-sample-atomic-avro-kafka-int_2018-05-31T00:00:00.000Z_2018-06-01T00:00:00.000Z_2018-06-06T13:46:53.788Z, count=0}]
2018-06-06T14:17:15,267 INFO [appenderator_merge_0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Adding hydrant[FireHydrant{, queryable=caliper-sample-atomic-avro-kafka-int_2018-05-31T00:00:00.000Z_2018-06-01T00:00:00.000Z_2018-06-06T13:46:53.788Z, count=1}]
2018-06-06T14:17:15,274 WARN [appenderator_merge_0] io.druid.segment.IndexMerger - Indexes have incompatible dimension orders, using lexicographic order.
2018-06-06T14:17:15,277 INFO [appenderator_merge_0] io.druid.segment.IndexMergerV9 - Using SegmentWriteOutMediumFactory[TmpFileSegmentWriteOutMediumFactory]
2018-06-06T14:17:15,312 INFO [appenderator_merge_0] io.druid.segment.IndexMergerV9 - Completed version.bin in 22 millis.
2018-06-06T14:17:15,336 INFO [appenderator_merge_0] io.druid.segment.IndexMergerV9 - Completed factory.json in 24 millis
2018-06-06T14:17:15,336 INFO [appenderator_merge_0] io.druid.segment.IndexMergerV9 - Completed dim conversions in 0 millis.
2018-06-06T14:17:15,532 INFO [appenderator_merge_0] io.druid.segment.IndexMergerV9 - completed walk through of 37,234 rows in 139 millis.
2018-06-06T14:17:15,562 INFO [appenderator_merge_0] io.druid.segment.IndexMergerV9 - Completed time column in 29 millis.
2018-06-06T14:17:15,562 INFO [appenderator_merge_0] io.druid.segment.IndexMergerV9 - Completed metric columns in 0 millis.
2018-06-06T14:17:15,567 INFO [appenderator_merge_0] io.druid.segment.IndexMergerV9 - Completed index.drd in 5 millis.