Fallback not applied when values are all null in a segment

Hello,

We have found some odd behaviour when using a fallback (a lookup extraction function) on a dimension that may contain null values.

We load data into Druid with Hadoop batch ingestion. The data is stored in TSV files.

The datasource has daily segments and a 30-minute granularity.

Data:

  • 14th/7 has only empty values in the dimension.
  • 15th/7 has both empty and non-empty values in it.

Our query is:

{
  "queryType": "topN",
  "dataSource": "audience-breakdown",
  "intervals": "2017-07-15T00Z/2017-07-15T23:59:59.999Z",
  "granularity": "all",
  "context": {
    "timeout": 40000
  },
  "dimension": {
    "type": "extraction",
    "dimension": "smDistribution",
    "outputName": "smDistribution.fallback_N_A-da1",
    "extractionFn": {
      "type": "lookup",
      "retainMissingValue": true,
      "lookup": {
        "type": "map",
        "map": {
          "": "NA"
        }
      }
    }
  },
  "aggregations": [
    {
      "name": "sum_tls",
      "type": "doubleSum",
      "fieldName": "tls"
    }
  ],
  "metric": "sum_tls",
  "threshold": 5
}

When we query 14th/7 only, the result is:

[
  {
    "timestamp": "2017-07-14T00:00:00.000Z",
    "result": [
      {
        "smDistribution.fallback_N_A-da1": null,
        "sum_tls": 16819175
      }
    ]
  }
]

The fallback doesn’t work: the value stays null instead of becoming “NA”.

When we query 15th/7 only, the result is:

[
  {
    "timestamp": "2017-07-15T00:00:00.000Z",
    "result": [
      {
        "smDistribution.fallback_N_A-da1": "NA",
        "sum_tls": 1310586
      },
      {
        "smDistribution.fallback_N_A-da1": "XXX",
        "sum_tls": 69811
      }
    ]
  }
]

The fallback works correctly.

When we query both 14th/7 and 15th/7, the result is:

[
  {
    "timestamp": "2017-07-14T00:00:00.000Z",
    "result": [
      {
        "smDistribution.fallback_N_A-da1": null,
        "sum_tls": 16819175
      },
      {
        "smDistribution.fallback_N_A-da1": "NA",
        "sum_tls": 1310586
      },
      {
        "smDistribution.fallback_N_A-da1": "XXX",
        "sum_tls": 69811
      }
    ]
  }
]

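The fallback works only partially: the empty values from 15th/7 become “NA”, but the values from 14th/7 are still returned as a separate null row.

Is there a way to get a working fallback, or is this a bug?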

Thanks

We use the Imply 2.2.3 distribution.

We just upgraded to Imply 2.3.0, and the problem still exists.

Thanks for the report – it’s a bug in Druid. This patch should fix it: https://github.com/druid-io/druid/pull/4717

Thanks for fixing it.
As I said before, we use the Imply distribution. How can we integrate this bug fix until it’s released in Imply?

It will end up in the Imply distribution next time we sync with community Druid sources, since the patch is in Druid master. We usually do that concurrently with community Druid releases. You could also apply it yourself to the version of community Druid that your distro version is based on (the Imply release notes will tell you what that is) and replace the Imply distro’s bundled Druid with your custom version.
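
Until a patched build is in place, one possible interim workaround is to normalize the topN results on the client side. The following is only a minimal Python sketch, not part of Druid or Imply: the helper name `apply_fallback` and its parameters are hypothetical, and it assumes the response has the shape shown in the results earlier in this thread, with a single summed metric per row. It rewrites null or empty dimension values to the fallback label and merges rows that end up with the same label.

```python
def apply_fallback(response, dim_name, metric_name, fallback="NA"):
    """Hypothetical helper: map null/empty dimension values to `fallback`
    and merge rows that then share the same label.

    Assumes each row carries only `dim_name` and one additive metric.
    """
    fixed = []
    for bucket in response:
        merged = {}
        for row in bucket["result"]:
            value = row.get(dim_name)
            label = fallback if value is None or value == "" else value
            # Summing is only valid for additive metrics such as doubleSum.
            merged[label] = merged.get(label, 0) + row[metric_name]
        rows = [{dim_name: k, metric_name: v} for k, v in merged.items()]
        # Re-sort by the metric, descending, to mimic topN ordering.
        rows.sort(key=lambda r: r[metric_name], reverse=True)
        fixed.append({"timestamp": bucket["timestamp"], "result": rows})
    return fixed


# Example input: the combined 14th/7 + 15th/7 result reported above.
DIM = "smDistribution.fallback_N_A-da1"
sample = [
    {
        "timestamp": "2017-07-14T00:00:00.000Z",
        "result": [
            {DIM: None, "sum_tls": 16819175},
            {DIM: "NA", "sum_tls": 1310586},
            {DIM: "XXX", "sum_tls": 69811},
        ],
    }
]
fixed = apply_fallback(sample, DIM, "sum_tls")
```

With the sample above, the null row and the “NA” row are merged into a single “NA” row. Note that merging after the fact can leave fewer rows than the query's threshold, so the threshold may need to be raised slightly while this workaround is in use.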