index_realtime ID is not repeatable

Hi all,
I run an index_realtime task through the overlord node. When I want to change its configuration and restart it,

I have to change my index_realtime task ID, and then I can't find the old data.

The only workaround I can think of is to create a new task_id, run mv old_task_id new_task_id, and then run the new task.

I don't think this is a good approach. Is there a more suitable way?

Thanks

You might be interested in looking at tranquility: https://github.com/druid-io/tranquility

It manages realtime tasks so you don’t have to. It handles the “changing configuration” case by allowing the current task to hand off its data and exit, and spawning a new, replacement task that starts fresh with the new configuration.
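
To illustrate the "replacement task with a fresh ID" idea in isolation, here is a minimal Python sketch. It is not Tranquility's code or naming scheme: the helper `with_fresh_task_id`, the `__` separator, and the placeholder spec are assumptions made up for this example. It only shows deriving a new, unique task ID from a base name each time the configuration changes, rather than reusing the old ID.

```python
import copy
import json
from datetime import datetime, timezone


def with_fresh_task_id(task_spec, now=None):
    """Return a copy of task_spec whose "id" carries a new timestamp suffix.

    Hypothetical helper: the "<base>__<timestamp>" layout is an assumption
    for illustration, not a convention required by Druid or Tranquility.
    """
    now = now or datetime.now(timezone.utc)
    # Drop any suffix added by a previous call so that IDs do not keep growing.
    base_id = task_spec["id"].split("__")[0]
    new_spec = copy.deepcopy(task_spec)
    new_spec["id"] = f"{base_id}__{now.strftime('%Y%m%dT%H%M%SZ')}"
    return new_spec


# Placeholder task JSON; a real index_realtime spec has much more inside "spec".
old_spec = {"type": "index_realtime", "id": "events_realtime", "spec": {}}
new_spec = with_fresh_task_id(
    old_spec, datetime(2015, 11, 11, 3, 12, 7, tzinfo=timezone.utc)
)
print(json.dumps(new_spec["id"]))
```

The updated JSON would then be submitted to the overlord as a new task, while the old task keeps its own ID until it has handed off its data and exited.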

Hi Gian,
As I understand it, Tranquility is for writing data to Druid from a custom program. But with as many business lines as I have, I would need to maintain that program in addition to maintaining Druid.

In fact, I don't want to do this. I want to run index_realtime tasks through the overlord so that business lines can be onboarded automatically. Then I only need to care about whether the index_realtime tasks are running.

So I think that if an index_realtime task ID could be reused, I would be able to manage my incoming data better. If I need to make changes, or a middleManager dies, or a server goes down, I would just need to find the missing index_realtime task and restart it, and my data would not be lost.

What do you think?

On Wednesday, November 11, 2015 at 11:12:07 AM UTC+8, Gian Merlino wrote:

Hi,

I think to correctly manage real-time index tasks, you are going to end up rewriting Tranquility and it might be easier just to reuse it. Tranquility is a library you can embed in your application. If you find the indexing service too much work to use, you can look into just using standalone realtime nodes.

Hi Fangjin,

**I am using standalone realtime nodes right now,** but there are too many of them to manage, and they cannot be automated.

Every time there is a new requirement, I have to get a server, write the configuration files, assign a port, and then start the realtime node.

It makes me feel very tired.

So I want to manage all of my realtime tasks through the overlord.

What do you think?

On Monday, November 16, 2015 at 12:43:22 AM UTC+8, Fangjin Yang wrote:

Hi Fangjin,
I am building my Druid cluster. If you have time, could you answer my question? The answer will decide what I do.
On Monday, November 16, 2015 at 12:43:22 AM UTC+8, Fangjin Yang wrote:

I agree. That’s the reason why we use the indexing service.

Hi,

Can you describe your requirements for realtime? Then I can make suggestions.

Hi Fangjin,

Thank you very much for your help.

I want to build a data warehouse with Druid. Data flows from Kafka into Druid, and Druid provides storage and querying. There is an ETL layer before the data reaches Kafka.

I want all of these steps to be automated, so that I only need to watch the load on the historical node cluster and the load on the worker nodes managed by the overlord.

We provide data center services. The current volume is 10T per day.

On Thursday, November 19, 2015 at 6:56:38 AM UTC+8, Fangjin Yang wrote:

Hi,

This setup is a common one we see where you have a message bus (Kafka) feed into a stream processor (Spark streaming, Storm, or Samza) and finally into Druid. Tranquility was designed specifically for this setup and I highly recommend looking into it. Tranquility is a library that lives on the stream processor.

Hi,

Do you mean I should use Spark to read from Kafka and use Tranquility to write to Druid?

On Sunday, November 22, 2015 at 1:44:46 AM UTC+8, Fangjin Yang wrote:

Hi,
If I use Tranquility, I also need to consider write concurrency. Is there any benchmark data on the volume Tranquility can handle when used this way, including the related indexing service configuration parameters?

Thanks

On Sunday, November 22, 2015 at 1:44:46 AM UTC+8, Fangjin Yang wrote:

Hi Fangjin,

I have a suggestion: could you add a set of test data to GitHub, along with the associated configuration, so that users in the group can run comparisons? That would give us a baseline for optimization.

On Sunday, November 22, 2015 at 1:44:46 AM UTC+8, Fangjin Yang wrote:

Yes, use Spark Streaming to read from Kafka, and Tranquility to write to the indexing service. Tranquility has support for every indexing service configuration. Strongly, strongly, strongly recommend Tranquility for realtime ingestion with the indexing service.