Indexing Service: Submitted tasks but no data

Hey all,

Just a simple question on the basics of the indexing service:

So everything is set up correctly with Tranquility and my indexing service running in remote mode (1 overlord, 3 middle managers). I am using a Kafka stream to send data to Druid, and it looks like the task is popping up in the overlord console, but from what I see in the overlord logs it doesn't seem like data is being sent. Nothing gets intermediately persisted.

How would I correctly send data to the indexing service? Do I have to batch everything up and then send it? Could I send realtime events from Kafka one by one to the overlord, and would the indexing service figure out which task those events belong to? Sometimes a task gets created for each event I send to the overlord :confused:

Any help is appreciated,

Thanks,

Hey Nicholas,

The indexing service supports a realtime use case like you’re trying to do, so you don’t need to batch your events. Do you see any exceptions in the task log in the overlord console? A common reason for events not being ingested is timestamp-related issues - perhaps Tranquility is trying to extract a timestamp from the wrong column, or your events are too old and their timestamps fall outside of the window period, so they are being dropped. If a task is being created for each event sent to the overlord, then you have some issues related to your Tranquility config.
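
Just to make the window-period behavior concrete, here is a rough sketch (my own illustration, not Tranquility's actual code) of the check that decides whether an event's timestamp can still be routed to an open realtime task, using the same Joda-Time types you're already using:

    import org.joda.time.DateTime;
    import org.joda.time.Period;

    // Illustrative only: an event can reach a realtime task only if its timestamp falls
    // roughly within [now - windowPeriod, now + windowPeriod]; otherwise it is dropped.
    public class WindowPeriodCheck {
        static boolean withinWindow(DateTime eventTime, Period windowPeriod) {
            DateTime now = DateTime.now();
            return !eventTime.isBefore(now.minus(windowPeriod))
                    && !eventTime.isAfter(now.plus(windowPeriod));
        }

        public static void main(String[] args) {
            Period window = new Period("PT10M");
            System.out.println(withinWindow(DateTime.now().minusHours(2), window)); // false -> dropped
            System.out.println(withinWindow(DateTime.now(), window));               // true  -> accepted
        }
    }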

If you’re looking to push events from Kafka into Druid, you may be interested in this PR: https://github.com/druid-io/tranquility/pull/65

hey dude,

Thanks for the reply.

So timestamps were coming in crazy late from the Kafka stream. I increased the window period just to see if things would come through, and that worked. But now I'm running into the other problem I mentioned: the overlord logs say the indexing service is creating tasks for every single event I send. The log looks like this:

2016-01-05T15:57:09,607 INFO [qtp292856581-39] io.druid.indexing.overlord.MetadataTaskStorage - Inserting task index_realtime_test-doe_2016-01-05T15:47:00.000Z_0_0 with status: TaskStatus{id=index_realtime_test-doe_2016-01-05T15:47:00.000Z_0_0, status=RUNNING, duration=-1}

It's just this line over and over. The way I'm sending to Druid is using Spring’s ApplicationEventListener: I publish each event from my consumer, listen for that event in my Tranquility code, and then send it to Druid each time using druidService.apply(listOfEvents). Each list I send would only have one element, though. That wouldn't be the issue, would it?

These are the configs I have for Tranquility. Using schemaless dimensions for now.

    @Bean
    public Service<List<Map<String, Object>>, Integer> druidService() {

        // Aggregators computed at ingestion time.
        final List<AggregatorFactory> aggregators = ImmutableList.of(
            new CountAggregatorFactory("uuid"),
            new LongSumAggregatorFactory("pageURL", "pageURL")
        );

        // Curator client for the ZooKeeper ensemble Druid uses for service discovery.
        final CuratorFramework curator = CuratorFrameworkFactory
                .builder()
                .connectString(druidZookeeperHost + ":" + druidZookeeperPort)
                .retryPolicy(new ExponentialBackoffRetry(1000, 20, 30000))
                .build();
        curator.start();

        return DruidBeams
                // Timestamper pulls the event time used to route each message to a task.
                .builder(new Timestamper<Map<String, Object>>() {
                    private static final long serialVersionUID = -5195615126238524147L;

                    @Override
                    public DateTime timestamp(Map<String, Object> theMap) {
                        return new DateTime(theMap.get("timestamp"));
                    }
                })
                .curator(curator)
                .discoveryPath("/prod/discovery")
                .location(DruidLocation.create("druid/overlord", "test-doe"))
                .timestampSpec(new TimestampSpec("timestamp", "auto", null))
                .rollup(DruidRollup.create(DruidDimensions.schemaless(), aggregators, QueryGranularity.MINUTE))
                .tuning(ClusteredBeamTuning.builder()
                        .segmentGranularity(Granularity.MINUTE)
                        .windowPeriod(new Period("PT45M"))
                        .warmingPeriod(new Period("PT10M"))
                        .build())
                .buildJavaService();
    }

I answered my own question. I should be using Druid Tranquilizer. Found an example here: https://github.com/druid-io/tranquility/blob/master/core/src/test/java/com/metamx/tranquility/example/JavaExample.java
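
Roughly what the switch looks like in my case (just a sketch based on that example, assuming Tranquility built from master where the DruidBeams builder exposes buildTranquilizer(); the variable names are mine and the builder config is the same as in my druidService() bean above):

    // imports (from tranquility / twitter-util):
    //   com.metamx.tranquility.tranquilizer.Tranquilizer
    //   com.metamx.tranquility.tranquilizer.MessageDroppedException
    //   com.twitter.util.Future, com.twitter.util.FutureEventListener
    //   scala.runtime.BoxedUnit
    final Tranquilizer<Map<String, Object>> sender = DruidBeams
            .builder(timestamper)                  // same Timestamper as before
            .curator(curator)
            .discoveryPath("/prod/discovery")
            .location(DruidLocation.create("druid/overlord", "test-doe"))
            .timestampSpec(new TimestampSpec("timestamp", "auto", null))
            .rollup(DruidRollup.create(DruidDimensions.schemaless(), aggregators, QueryGranularity.MINUTE))
            .tuning(tuning)                        // same ClusteredBeamTuning as before
            .buildTranquilizer();                  // instead of buildJavaService()
    sender.start();

    // Send events one at a time; the returned Future reports whether each was accepted.
    final Future<BoxedUnit> future = sender.send(event);
    future.addEventListener(new FutureEventListener<BoxedUnit>() {
        @Override
        public void onSuccess(BoxedUnit value) {
            // event accepted by the realtime task
        }

        @Override
        public void onFailure(Throwable e) {
            if (e instanceof MessageDroppedException) {
                // timestamp fell outside the window period
            } else {
                // some other send failure; worth logging
            }
        }
    });

    // On shutdown:
    sender.flush();
    sender.stop();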

David,

Granted, I know that I should be using Tranquilizer, but for some reason I can't find the class in my project. Do you know what Maven dependency I need to use it?

Hey Nicholas,

Tranquilizer hasn’t been pushed to the Maven repo yet - expect it in the next few weeks, around the same time Druid 0.8.3 is released. In the meantime, it should be fairly straightforward to build it from master and install it to your local repo. Let me know if you have any other questions about this.

Sure thing.

Thanks a bunch.

Hey David,

Another issue:

I built from master and that's all good, but when I send events to Druid my overlord is still showing the same logs.

TaskStatus{id=index_realtime_some-test-doe_2016-01-05T19:40:00.000Z_0_0, status=RUNNING, duration=-1}

2016-01-05T19:40:24,183 INFO [qtp62610667-32] io.druid.indexing.overlord.MetadataTaskStorage - Inserting task index_realtime_some-test-doe_2016-01-05T19:41:00.000Z_0_0 with status: TaskStatus{id=index_realtime_some-test-doe_2016-01-05T19:41:00.000Z_0_0, status=RUNNING, duration=-1}

2016-01-05T19:40:24,524 INFO [qtp62610667-40] io.druid.indexing.overlord.MetadataTaskStorage - Inserting task index_realtime_some-test-doe_2016-01-05T19:42:00.000Z_0_0 with status: TaskStatus{id=index_realtime_some-test-doe_2016-01-05T19:42:00.000Z_0_0, status=RUNNING, duration=-1}

2016-01-05T19:40:24,874 INFO [qtp62610667-59] io.druid.indexing.overlord.MetadataTaskStorage - Inserting task index_realtime_some-test-doe_2016-01-05T19:43:00.000Z_0_0 with status: TaskStatus{id=index_realtime_some-test-doe_2016-01-05T19:43:00.000Z_0_0, status=RUNNING, duration=-1}

2016-01-05T19:40:25,142 INFO [qtp62610667-53] io.druid.indexing.overlord.MetadataTaskStorage - Inserting task index_realtime_some-test-doe_2016-01-05T19:44:00.000Z_0_0 with status: TaskStatus{id=index_realtime_some-test-doe_2016-01-05T19:44:00.000Z_0_0, status=RUNNING, duration=-1}

2016-01-05T19:40:25,478 INFO [qtp62610667-63] io.druid.indexing.overlord.MetadataTaskStorage - Inserting task index_realtime_some-test-doe_2016-01-05T19:45:00.000Z_0_0 with status: TaskStatus{id=index_realtime_some-test-doe_2016-01-05T19:45:00.000Z_0_0, status=RUNNING, duration=-1}

2016-01-05T19:40:25,808 INFO [qtp62610667-47] io.druid.indexing.overlord.MetadataTaskStorage - Inserting task index_realtime_some-test-doe_2016-01-05T19:46:00.000Z_0_0 with status: TaskStatus{id=index_realtime_some-test-doe_2016-01-05T19:46:00.000Z_0_0, status=RUNNING, duration=-1}

I have the same configs as I posted before. Any ideas?

Before, the Tranquility logs were saying that messages were getting dropped, which is a good sign that the Tranquilizer stuff is working. I got some data from Kafka with the current time and it doesn't look like it's getting dropped, but there is still no data being consumed.

Hey Nicholas,

Looks like you have your segment granularity configured to minute:

.segmentGranularity(Granularity.MINUTE)

So Druid is doing what it’s supposed to do and creating a new task every minute :slight_smile: Maybe try hourly segments?
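
For example, keeping everything else in your tuning the same and only changing the granularity (a sketch of the one line to change):

    .tuning(ClusteredBeamTuning.builder()
            .segmentGranularity(Granularity.HOUR)   // one realtime task per hour instead of per minute
            .windowPeriod(new Period("PT45M"))
            .warmingPeriod(new Period("PT10M"))
            .build())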

Hey,

OK, segmentGranularity has been changed and it's still inserting tasks in the log like before, except now it's inserting the same task over and over. Are these lines events being ingested by the overlord?

Another question: the task that does get run now has a duration of -1. How can I be sure it's running?

Hm, you should only be getting one of those logs when the task is first created, not on each event. It might be helpful to post your logs from the overlord, the middle manager, and the task. You can see your running tasks by going to the overlord console at http://{OVERLORD_IP}:{OVERLORD_PORT}/console.html.

yo,

So I attached the logs here.

Should I be sending the data to Druid differently using Tranquilizer? I kinda just followed the Java example from the link I posted before.

middlemanager.log (8.75 KB)

overlord.log (11.3 KB)

task.log (65.6 KB)

I'm gonna attach my runtime.properties for the overlord and middle manager too, if it helps.

middlemanager.runtime.properties.sh (786 Bytes)

overlord.runtime.properties.sh (486 Bytes)

So I tried just running JavaExample.java to see if that works.

Tranquility recognizes that the task is running but fails to send ‘requests’ to the task. It says it's caused by this twitter.finagle exception:

Caused by: com.twitter.finagle.FailedFastException: Endpoint firehose:druid:overlord:idx-doe-40-0000-0000 is marked down. For more details see: https://twitter.github.io/finagle/guide/FAQ.html#why-do-clients-see-com-twitter-finagle-failedfastexception-s

Is this an endpoint for the peon worker on my middle manager? The FAQ in that link says that a host must be down. I checked to make sure my app has inbound access to all my Druid ports, and that looks fine.

I also noticed this from Finagle:

Connection refused: /{{secret.ip}}:8087 from service: firehose:druid:overlord:idx-doe-40-0000-0000

where secret.ip is my middle manager. Does that mean the overlord doesn't have access to the middle manager's workers?

Hey Nicholas,

I’m not exactly sure what’s going on here, but the log message that’s getting spammed in the overlord is related to storing task information in the metadata store. The overlord will write information for task status, logs, and locks to a SQL database, and the fact that it keeps retrying suggests to me it’s not having success writing these entries. I noticed in your prop files that you didn’t configure an external postgres or mysql DB. By default, Druid will use an internal Derby database hosted on the coordinator if an external DB hasn’t been configured, and if the overlord is having trouble talking to the coordinator-hosted Derby, that might be the cause of these issues.
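
For reference, pointing Druid at an external metadata store goes in common.runtime.properties and looks roughly like this (MySQL shown; the host and credentials are placeholders, and you'd also need the mysql-metadata-storage extension loaded):

    # placeholder values - substitute your own host and credentials
    druid.metadata.storage.type=mysql
    druid.metadata.storage.connector.connectURI=jdbc:mysql://your-db-host:3306/druid
    druid.metadata.storage.connector.user=druid
    druid.metadata.storage.connector.password=your-password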

I’m hoping someone who’s seen similar issues will jump into this conversation, but in the meantime, it may not hurt to set up an external database and give that a try. It’s possible that the connection refused messages you’re seeing are a side-effect of the task never getting created in the first place because of the database write issues.

Hey David,

Thanks again for everything. You've been a great help so far.

I'm actually using Docker for my Druid cluster and have a base image that contains all the common.runtime.properties each node is supposed to have. I'm connecting to an RDS instance, and I remember having to configure that correctly because my overlord was throwing exceptions related to DB connections.

I'll recheck these configs and see if there is anything wrong there.

In the meantime, I am also seeing transient errors in my Tranquility logs. I remember seeing some Google Group discussions related to that, so I'll poke around there.

Thanks again,

Hey David,

I figured it out. It actually happened to be a Docker issue. Tranquility wasn't able to access the peon that spun up on the overlord host because I didn't expose the port that the peon was on in the docker run command (docker run -p port:port). I did that for all the runner ports starting from the startPort property on the middle manager.

I just need help querying. I can send a query straight to the worker and it returns results, but when I send it to the broker in the cluster it still returns nothing. How do you query the indexing service? lol

Thanks for everything,

Correction: Tranquility wasn't able to access the worker on the middle manager host.

Hey Nicholas,

Glad to hear you got it working!

That’s odd - segments being served by the worker should definitely be queryable through the broker. Is it possible that there are more blocked port issues preventing the different nodes from communicating with one another and with Zookeeper? When the task is submitted, you should see a log message like this on the broker:

INFO [ServerInventoryView-0] io.druid.client.BatchServerInventoryView - New Server[DruidServerMetadata{name='192.168.1.30:8300', host='192.168.1.30:8300', maxSize=0, tier='_default_tier', type='realtime', priority='0'}]

The ports for the peons will start at whatever druid.indexer.runner.startPort is set to and increment from there.
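
For example, with something like this in the middle manager's runtime.properties (values are only illustrative), the peons will come up on 8100, 8101, 8102, ..., and those are the ports that need to be reachable from Tranquility and the brokers (and published from your containers):

    # illustrative values - match these to your own setup
    druid.indexer.runner.startPort=8100
    druid.worker.capacity=3   # up to 3 concurrent peons, so roughly ports 8100-8102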

Ah Ha!

Yeah, I was looking for that exact log message in the broker. The issue was exactly as you said: the indexing service was announcing to a different ZooKeeper path than the broker. They are the same now and everything looks gravy.

I think I had some bad tasks from before, and periodically my broker returns this error in between results:

"error" : "Failure getting results from[http://{{worker}}:8088/druid/v2/] because of [org.jboss.netty.channel.ChannelException: Faulty channel in resource pool]"

When do peons usually disappear? I don't think any peon is running on this port.

Thanks again dude,