Druid realtime tasks created by Tranquility Server run for an incorrect duration

Hello,

I have been trying out Druid over the past few days. I have some questions about the “realtime” tasks in Druid that are created by Tranquility Server.

Versions:

druid: 0.12.3

tranquility: 0.8.2

zookeeper: 3.4.10

My tranquility configuration is specified below:

{
  "dataSources" : {
    "pageviews" : {
      "spec" : {
        "dataSchema" : {
          "dataSource" : "pageviews",
          "parser" : {
            "type" : "string",
            "parseSpec" : {
              "timestampSpec" : {
                "column" : "time",
                "format" : "auto"
              },
              "dimensionsSpec" : {
                "dimensions" : ["url", "user", "os"],
                "dimensionExclusions" : [
                  "time"
                ]
              },
              "format" : "json"
            }
          },
          "granularitySpec" : {
            "type" : "uniform",
            "segmentGranularity" : "fifteen_minute",
            "queryGranularity" : "fifteen_minute"
          },
          "metricsSpec" : [
            {"name": "views", "type": "count"}
          ]
        },
        "ioConfig" : {
          "type" : "realtime"
        },
        "tuningConfig" : {
          "type" : "realtime",
          "maxRowsInMemory" : "100000",
          "intermediatePersistPeriod" : "PT10M",
          "windowPeriod" : "PT13M"
        }
      },
      "properties" : {
        "task.partitions" : "1",
        "task.replicants" : "1"
      }
    }
  },
  "properties" : {
    "zookeeper.connect" : "localhost",
    "druid.discovery.curator.path" : "/druid/discovery",
    "druid.selectors.indexing.serviceName" : "druid/overlord",
    "http.port" : "8200",
    "http.threads" : "8"
  }
}

As you can see:

windowPeriod : 13 Minutes

intermediatePersistPeriod: 10 Minutes

segmentGranularity & queryGranularity: 15 Minutes

I was able to send data over HTTP to Tranquility Server and query Druid without any issues.

However, I noticed that the realtime tasks that get created run for a longer duration than I would expect.

Following are the screenshots for a realtime task.

This task should receive data from 10:15 to 10:30 (segmentGranularity is 15 minutes).

When I checked the logs for this task, I noticed the following:

As can be seen, the above task will stop at **2018-11-02T10:41:00.000Z**.

Question:

  1. How does Druid arrive at this stop time? Shouldn’t the task stop at 10:30? Does this have anything to do with windowPeriod? Even then, windowPeriod is 13 minutes.

I also noticed that after a task completes, the datasource appears in the Druid console, as shown below. However, if we send more data, more realtime tasks are started (see the image: “Indexing tasks: 1 running”). Hovering over the “pageviews” datasource shows “no realtime tasks”, even though the running realtime indexing task is for the “pageviews” datasource.

  2. Why does the console show “no realtime tasks” while there are realtime tasks running for a datasource?

Thanks,

Prathamesh

There is a grace period that is applied on top of the window period. Check out druidBeam.firehoseGracePeriod in https://github.com/druid-io/tranquility/blob/master/docs/configuration.md
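
To make the stacking concrete, here is a rough Python sketch of how the periods might add up. It assumes the task stops at the segment end plus windowPeriod plus the grace period. That is only a reading of the note above, not something taken from the Tranquility source. The function name expected_shutoff is made up for illustration, and the PT5M grace period is an assumption (I believe it is the documented default, but please check the linked page). The times in the original post do not all line up with this formula, so treat it as an illustration of how the periods stack rather than a reproduction of the logged stop time.

```python
from datetime import datetime, timedelta, timezone


def expected_shutoff(segment_end, window_period, grace_period):
    """Hypothetical helper: segment end + window period + grace period."""
    return segment_end + window_period + grace_period


# End of the 10:15-10:30 segment from the post.
segment_end = datetime(2018, 11, 2, 10, 30, tzinfo=timezone.utc)
# windowPeriod PT13M, taken from the tuningConfig in the post.
window_period = timedelta(minutes=13)
# ASSUMPTION: druidBeam.firehoseGracePeriod of PT5M; not stated in this thread.
grace_period = timedelta(minutes=5)

shutoff = expected_shutoff(segment_end, window_period, grace_period)
print(shutoff.isoformat())  # 2018-11-02T10:48:00+00:00
```

Either way, a task for a 15-minute segment is expected to stay alive well past the segment end, so a stop time after 10:30 is not by itself a sign of a problem.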

Thanks Dan! I was puzzled by the shutdown time exceeding the window period. This cleared my doubt.