Facing issues with QuantilesDataSketches

I am new to quantiles data sketches, though I have used the approximate histogram aggregator. I tried the configuration below for a metric (resp_time) in my Kafka indexing supervisor task.

      {
        "type": "quantilesDoublesSketch",
        "name": "resp_time",
        "fieldName": "resp",
        "k": 64
      }


After submitting the KIS, I see that every task fails with the errors below. Can someone please help me understand whether I am missing something?

Thanks

2020-03-16T10:20:45,734 INFO [task-runner-0-priority-0] org.apache.druid.server.coordination.BatchDataSegmentAnnouncer - Announcing segment[quantile test_2020-01-24T19:00:00.000Z_2020-01-24T20:00:00.000Z_2020-03-16T10:20:45.531Z] at existing path[/druid-017-tte/segments/10.61.141.44:8101/10.61.141.44:8101_indexer-executor__default_tier_2020-03-16T10:20:33.982Z_504e92c957734d998e33346d3c125cbc0]
2020-03-16T10:20:45,748 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Encountered exception in run() before persisting.
org.apache.datasketches.SketchesArgumentException: Possible corruption: PreLongs must be 1 or 2: 53
at org.apache.datasketches.quantiles.DirectUpdateDoublesSketchR.checkPreLongs(DirectUpdateDoublesSketchR.java:225) ~[datasketches-java-1.1.0-incubating.jar:?]
at org.apache.datasketches.quantiles.DirectUpdateDoublesSketchR.wrapInstance(DirectUpdateDoublesSketchR.java:77) ~[datasketches-java-1.1.0-incubating.jar:?]
at org.apache.datasketches.quantiles.DoublesSketch.wrap(DoublesSketch.java:198) ~[datasketches-java-1.1.0-incubating.jar:?]
at org.apache.druid.query.aggregation.datasketches.quantiles.DoublesSketchOperations.deserializeFromByteArray(DoublesSketchOperations.java:55) ~[?:?]
at org.apache.druid.query.aggregation.datasketches.quantiles.DoublesSketchOperations.deserializeFromBase64EncodedString(DoublesSketchOperations.java:50) ~[?:?]
at org.apache.druid.query.aggregation.datasketches.quantiles.DoublesSketchOperations.deserialize(DoublesSketchOperations.java:37) ~[?:?]
at org.apache.druid.query.aggregation.datasketches.quantiles.DoublesSketchComplexMetricSerde$1.extractValue(DoublesSketchComplexMetricSerde.java:100) ~[?:?]
at org.apache.druid.segment.serde.ComplexMetricExtractor.extractValue(ComplexMetricExtractor.java:41) ~[druid-processing-0.17.0.jar:0.17.0]
at org.apache.druid.segment.incremental.IncrementalIndex$1IncrementalIndexInputRowColumnSelectorFactory$1.getObject(IncrementalIndex.java:197) ~[druid-processing-0.17.0.jar:0.17.0]
at org.apache.druid.query.aggregation.datasketches.quantiles.DoublesSketchMergeAggregator.updateUnion(DoublesSketchMergeAggregator.java:75) ~[?:?]
at org.apache.druid.query.aggregation.datasketches.quantiles.DoublesSketchMergeAggregator.aggregate(DoublesSketchMergeAggregator.java:45) ~[?:?]
at org.apache.druid.segment.incremental.OnheapIncrementalIndex.doAggregate(OnheapIncrementalIndex.java:252) ~[druid-processing-0.17.0.jar:0.17.0]
at org.apache.druid.segment.incremental.OnheapIncrementalIndex.addToFacts(OnheapIncrementalIndex.java:166) ~[druid-processing-0.17.0.jar:0.17.0]
at org.apache.druid.segment.incremental.IncrementalIndex.add(IncrementalIndex.java:607) ~[druid-processing-0.17.0.jar:0.17.0]
at org.apache.druid.segment.realtime.plumber.Sink.add(Sink.java:210) ~[druid-server-0.17.0.jar:0.17.0]
at org.apache.druid.segment.realtime.appenderator.AppenderatorImpl.add(AppenderatorImpl.java:259) ~[druid-server-0.17.0.jar:0.17.0]
at org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver.append(BaseAppenderatorDriver.java:408) ~[druid-server-0.17.0.jar:0.17.0]
at org.apache.druid.segment.realtime.appenderator.StreamAppenderatorDriver.add(StreamAppenderatorDriver.java:186) ~[druid-server-0.17.0.jar:0.17.0]
at org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner.runInternal(SeekableStreamIndexTaskRunner.java:681) [druid-indexing-service-0.17.0.jar:0.17.0]
at org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner.run(SeekableStreamIndexTaskRunner.java:278) [druid-indexing-service-0.17.0.jar:0.17.0]
at org.apache.druid.indexing.seekablestream.SeekableStreamIndexTask.run(SeekableStreamIndexTask.java:164) [druid-indexing-service-0.17.0.jar:0.17.0]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:419) [druid-indexing-service-0.17.0.jar:0.17.0]
at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:391) [druid-indexing-service-0.17.0.jar:0.17.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_242]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]

What is the type of the “resp” field?
I believe the quantiles sketch aggregator tries to detect, by field type, whether the input is raw data or sketches to be merged. If the field is a floating-point type, it must be raw data; otherwise the aggregator tries to interpret it as a sketch.
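
To illustrate what "interpret it as a sketch" can mean here: the stack trace above passes through deserializeFromBase64EncodedString, which suggests the value reached the aggregator as a string and was Base64-decoded. The sketch below uses only the JDK, not Druid or DataSketches. It assumes that the first decoded byte is what the library reports as "PreLongs" and that Druid's decoding behaves like java.util.Base64. The class name, method name, and the input "NaN" are hypothetical and chosen only for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class FirstByteSketch {
    // Base64-decodes the given string and returns the first decoded byte
    // as an unsigned value (0..255). Assumption: a sketch deserializer
    // would read this byte as its "PreLongs" field.
    static int firstDecodedByte(String value) {
        byte[] decoded = Base64.getDecoder().decode(value.getBytes(StandardCharsets.UTF_8));
        return decoded[0] & 0xFF;
    }

    public static void main(String[] args) {
        // Hypothetical field value that is not a serialized sketch.
        System.out.println(firstDecodedByte("NaN"));
    }
}
```

For this illustrative input the first byte is 53, which is outside the 1 or 2 that the error message expects. Other strings decode to the same first byte, so this does not establish what the actual value in the stream was; it only shows how a non-sketch string can lead to this kind of error.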

Hi Alex,

The field “resp” is a floating-point type. The quantiles sketch aggregator spec that I added to the supervisor reads its data from Kafka, and the data is completely raw (floating-point values for the resp field).

Thanks

manish!!

What version of Druid are you using?
According to the stack trace, Druid is trying to use the merge aggregator instead of the build aggregator, so it is trying to interpret the input as a sketch, not as raw data. Let us see what the condition is exactly:

metricFactory.getColumnCapabilities(fieldName) != null && ValueType.isNumeric(metricFactory.getColumnCapabilities(fieldName).getType())

Could you double check that the field exists and has a numeric type?
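
The condition quoted above can be read as a small decision rule. The following is a minimal sketch of that rule only, not Druid's actual code: the ValueType enum, the map standing in for column capabilities, and the chooseAggregator method are all hypothetical stand-ins, and the set of types treated as numeric is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

public class AggregatorChoiceSketch {
    // Hypothetical stand-in for Druid's ValueType; not the real enum.
    enum ValueType {
        FLOAT, DOUBLE, LONG, STRING, COMPLEX;

        // Assumption: only these three types count as numeric.
        static boolean isNumeric(ValueType type) {
            return type == FLOAT || type == DOUBLE || type == LONG;
        }
    }

    // Hypothetical stand-in for the capabilities lookup: a missing key
    // plays the role of getColumnCapabilities(fieldName) returning null.
    static String chooseAggregator(Map<String, ValueType> capabilities, String fieldName) {
        ValueType type = capabilities.get(fieldName);
        boolean rawNumeric = type != null && ValueType.isNumeric(type);
        // Known numeric column: build a sketch from raw values.
        // Anything else: treat the input as sketches to merge.
        return rawNumeric ? "build" : "merge";
    }

    public static void main(String[] args) {
        Map<String, ValueType> capabilities = new HashMap<>();
        capabilities.put("resp", ValueType.DOUBLE);
        capabilities.put("resp_as_string", ValueType.STRING);

        System.out.println(chooseAggregator(capabilities, "resp"));
        System.out.println(chooseAggregator(capabilities, "resp_as_string"));
        System.out.println(chooseAggregator(capabilities, "no_such_field"));
    }
}
```

Under these assumptions, a field that is missing from the capabilities or has a non-numeric type falls through to the merge path, which is the path that appears in the stack trace.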