Druid console doesn't report real time tasks for an existing datasource

Hello,

I have noticed that the Druid console incorrectly reports “no realtime tasks” for a datasource while realtime tasks are actively running for that same datasource.

Versions:

druid: 0.12.3

tranquility: 0.8.2

zookeeper: 3.4.10

My tranquility configuration is specified below:

{
  "dataSources" : {
    "pageviews" : {
      "spec" : {
        "dataSchema" : {
          "dataSource" : "pageviews",
          "parser" : {
            "type" : "string",
            "parseSpec" : {
              "timestampSpec" : {
                "column" : "time",
                "format" : "auto"
              },
              "dimensionsSpec" : {
                "dimensions" : ["url", "user", "os"],
                "dimensionExclusions" : [
                  "time"
                ]
              },
              "format" : "json"
            }
          },
          "granularitySpec" : {
            "type" : "uniform",
            "segmentGranularity" : "fifteen_minute",
            "queryGranularity" : "fifteen_minute"
          },
          "metricsSpec" : [
            {"name": "views", "type": "count"}
          ]
        },
        "ioConfig" : {
          "type" : "realtime"
        },
        "tuningConfig" : {
          "type" : "realtime",
          "maxRowsInMemory" : "100000",
          "intermediatePersistPeriod" : "PT10M",
          "windowPeriod" : "PT13M"
        }
      },
      "properties" : {
        "task.partitions" : "1",
        "task.replicants" : "1"
      }
    }
  },
  "properties" : {
    "zookeeper.connect" : "localhost",
    "druid.discovery.curator.path" : "/druid/discovery",
    "druid.selectors.indexing.serviceName" : "druid/overlord",
    "http.port" : "8200",
    "http.threads" : "8"
  }
}

After specifying the above tranquility server configuration:

  1. I sent some data over HTTP to tranquility server for the datasource “pageviews”.

  2. A realtime task was created based on the timestamp in the data.

  3. After the realtime task completed, datasource “pageviews” can be seen in druid console and can be queried successfully.
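For anyone trying to reproduce step 1, here is a minimal Python sketch of how events could be posted to the tranquility server. It assumes the server is reachable on localhost at the `http.port` from the configuration above (8200) and that it accepts a JSON array on `/v1/post/pageviews`. The helper names `build_event`, `build_request` and `send` are made up for illustration and are not part of tranquility.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Assumption: tranquility server on localhost, port taken from "http.port" above.
TRANQUILITY_URL = "http://localhost:8200/v1/post/pageviews"


def build_event(url, user, os_name, now=None):
    """Build one event; "time" matches the timestampSpec column in the config."""
    now = now or datetime.now(timezone.utc)
    return {"time": now.isoformat(), "url": url, "user": user, "os": os_name}


def build_request(events, endpoint=TRANQUILITY_URL):
    """Prepare (but do not send) an HTTP POST carrying the events as a JSON array."""
    body = json.dumps(events).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(events):
    """Send the events and return the decoded JSON response from the server."""
    with urllib.request.urlopen(build_request(events)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Note that with `windowPeriod` set to PT13M, events whose timestamps are far from the current time are likely to be dropped, so the sketch stamps each event with the current UTC time by default.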

However, if I send more data after a while, more realtime tasks are created. These tasks can be seen running (see the image: “Indexing tasks: 1 running”). Yet if you hover over the “pageviews” datasource, it says “no realtime tasks”, even though the running realtime indexing task is for the “pageviews” datasource.

Questions:

  1. Why does the console show “no realtime tasks” even though there are realtime tasks running for an existing datasource (“pageviews”)?

  2. Is it the case that druid determines whether the data indexed by a real time task is for an existing datasource or not only when the task completes?

Thanks,

Prathamesh

Can anyone from the Druid dev team help with the above questions about realtime tasks?

Thanks,

Prathamesh

Hi Gian,

I have been looking for an answer to this question. Would you be able to help?

Thanks,

Prathamesh

Hi Prathamesh,

The best place to see running/pending/completed realtime tasks is the overlord console (https://:8290).
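If you would rather check from a script than from the console, the overlord also exposes the running tasks over HTTP at `/druid/indexer/v1/runningTasks`. Below is a rough Python sketch, not a definitive recipe. It assumes tranquility task ids start with `index_realtime_<datasource>_`, and that the `dataSource` field may or may not be present in the response depending on the Druid version. The function names and the `base_url` argument are hypothetical; pass in your own overlord address.

```python
import json
import urllib.request


def fetch_running_tasks(base_url):
    """Fetch the list of running tasks from the overlord at base_url."""
    with urllib.request.urlopen(base_url + "/druid/indexer/v1/runningTasks") as resp:
        return json.loads(resp.read().decode("utf-8"))


def realtime_tasks_for(tasks, datasource):
    """Return ids of realtime tasks that appear to belong to the datasource."""
    prefix = "index_realtime_" + datasource + "_"
    matches = []
    for task in tasks:
        task_id = task.get("id", "")
        # Assumption: realtime task ids begin with "index_realtime_".
        if not task_id.startswith("index_realtime_"):
            continue
        ds = task.get("dataSource")
        # Prefer the explicit field when present; otherwise fall back to the id prefix.
        if ds == datasource or (ds is None and task_id.startswith(prefix)):
            matches.append(task_id)
    return matches
```

Calling `realtime_tasks_for(fetch_running_tasks(base_url), "pageviews")` should then list the realtime tasks the overlord currently considers running for that datasource.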

Hope this helps,

Caroline

Hi Caroline,

Yes, we can see the running task in the overlord console. However, my question was about the Druid console running at port 8081, where one can see datasources.

I have recently moved to the latest Druid release, and I believe the issue has been fixed. I can now see the realtime tasks listed against a datasource, as can be seen below:

Thanks,

Prathamesh