Does Druid support aliases for datasources?

Hi, in Elasticsearch (ES) each index can have an alias, so data can be updated without affecting the online service.
Does Druid have this feature?

Hi Zhenyuan, can you clarify a little more what you mean?

Thanks for your reply. For example:
A datasource named TESTA is serving online traffic. TESTA needs some changes (for example, fixing historical data), so we create a new datasource TESTB and, after testing, deploy it online. But the serving layer then also has to switch its access from TESTA to TESTB. If the datasource had an alias named TEST, the serving layer would just access TEST and would not need to change from TESTA to TESTB. The data layer would only need to change the alias from TEST -> TESTA to TEST -> TESTB.
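To make the request concrete, here is a minimal sketch of the alias idea implemented on the application side rather than in Druid. The `ALIASES` table and both helper functions are hypothetical names used for illustration only; the query body follows the shape of a Druid native timeseries query, and the interval is made up.

```python
# Hypothetical application-side alias table; Druid is not involved in the lookup.
ALIASES = {"TEST": "TESTA"}


def resolve(name, aliases=ALIASES):
    """Return the physical datasource for an alias, or the name unchanged."""
    return aliases.get(name, name)


def build_timeseries_query(datasource_alias, interval, aliases=ALIASES):
    """Build a native timeseries query dict against the resolved datasource."""
    return {
        "queryType": "timeseries",
        "dataSource": resolve(datasource_alias, aliases),
        "granularity": "all",
        "intervals": [interval],
        "aggregations": [{"type": "count", "name": "rows"}],
    }


# The serving layer only ever names the alias.
query = build_timeseries_query("TEST", "2016-10-01/2016-10-08")
print(query["dataSource"])

# Switching (or rolling back) is a one-entry change in the alias table.
ALIASES["TEST"] = "TESTB"
print(build_timeseries_query("TEST", "2016-10-01/2016-10-08")["dataSource"])
```

With this kind of indirection the serving code never mentions TESTA or TESTB directly, which is the behavior the alias would provide.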

On Saturday, October 8, 2016 at 4:31:17 AM UTC+8, Fangjin Yang wrote:

You can just update the segments in place in Druid, it will do the update correctly for you.

–Eric

It is dangerous to update an online datasource in place. If something goes wrong, there is no way to roll back, whereas with an alias you can roll back by pointing the alias at the old datasource again.

On October 8, 2016 at 2:47 PM, Eric Tschetter echeddar@gmail.com wrote:

We have an online datasource named funnel_analysis, which has 3 dimensions and has been running for 2 months.

Because the product changed, the datasource needs a new dimension. So what we do is create a new datasource named funnel_analysis_v2 and re-run 2 months of data into it. After fully testing funnel_analysis_v2, we deploy it online, and the application server deploys a new version that switches access from funnel_analysis to funnel_analysis_v2. funnel_analysis also keeps running. If the online environment has an error, we roll back by reverting the application server to the old version, which accesses funnel_analysis.

The problems we ran into:

  1. If I modify the online environment directly, the change can't be fully tested. So we create a new datasource for testing.

  2. After testing, we want to deploy the new data online without affecting the application server. For example, a pure data repair should not require deploying a new application version, which is why I came up with the alias idea.

Could you give me some suggestions on how to update a datasource's data easily while minimizing the impact on the online environment?

I look forward to your advice.

On Saturday, October 8, 2016 at 2:47:46 PM UTC+8, Eric Tschetter wrote:

Just to be clear, the change is to add a new dimension and that’s it, right? The old dimensions are still the same and the roll ups for the old dimensions are also still the same. Correct?

If that’s the case, you can just update in place. Your old client will only know about the old dimensions and, when aggregated, those numbers will be exactly the same. You can then still control when the new client starts using the new dimension.
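A minimal sketch of why this works, using toy rows in plain Python rather than Druid itself. The dimension names (`step`, `platform`, `channel`) and the counts are made up for illustration; the point is only that grouping by the old dimensions gives the same totals whether or not the extra dimension is present.

```python
from collections import defaultdict


def aggregate(rows, dims, metric="count"):
    """Sum `metric` grouped by the given dimension names."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in dims)
        totals[key] += row[metric]
    return dict(totals)


# Toy rows before the change: only the old dimensions.
old_rows = [
    {"step": "view", "platform": "ios", "count": 5},
    {"step": "view", "platform": "android", "count": 7},
    {"step": "buy", "platform": "ios", "count": 2},
]

# The same events re-indexed with a hypothetical extra dimension, "channel".
new_rows = [
    {"step": "view", "platform": "ios", "channel": "ad", "count": 3},
    {"step": "view", "platform": "ios", "channel": "organic", "count": 2},
    {"step": "view", "platform": "android", "channel": "ad", "count": 7},
    {"step": "buy", "platform": "ios", "channel": "organic", "count": 2},
]

# An old client that only groups by the old dimensions sees identical totals.
old_dims = ["step", "platform"]
assert aggregate(old_rows, old_dims) == aggregate(new_rows, old_dims)
print(aggregate(new_rows, old_dims))
```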

–Eric

Thanks. By the way, what if the number of dimensions does not change but the values of a dimension do? What should we do in that situation?

On Sunday, October 9, 2016 at 2:16:32 PM UTC+8, Eric Tschetter wrote:

In that case, you have two choices. You can just update in place, but then the switch isn’t entirely “atomic”. If you want something more atomic, you can model it the same as adding a new dimension. That is, add a new dimension with the new values and then switch to using that.
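As a rough illustration of the second option, the sketch below keeps the original values in one column and the corrected values in a new one, and lets the client choose which column backs a logical dimension. The column names (`city`, `city_v2`), the `ACTIVE_DIMENSION` setting, and the data are all hypothetical; this is toy Python, not a Druid API.

```python
# Toy rows holding both the original values and the corrected ones.
rows = [
    {"city": "Peking", "city_v2": "Beijing", "count": 4},
    {"city": "Peking", "city_v2": "Beijing", "count": 1},
    {"city": "Shanghai", "city_v2": "Shanghai", "count": 6},
]

# Hypothetical client setting: which physical column backs each logical dimension.
ACTIVE_DIMENSION = {"city": "city"}


def totals_by(rows, logical_dim, active=ACTIVE_DIMENSION):
    """Sum counts grouped by whichever column currently backs `logical_dim`."""
    column = active[logical_dim]
    out = {}
    for row in rows:
        out[row[column]] = out.get(row[column], 0) + row["count"]
    return out


before = totals_by(rows, "city")

# Switch: one setting flips every query to the corrected values at once.
ACTIVE_DIMENSION["city"] = "city_v2"
after = totals_by(rows, "city")

# Rollback is the same flip in reverse; the old column was never modified.
ACTIVE_DIMENSION["city"] = "city"
rolled_back = totals_by(rows, "city")

print(before, after, rolled_back)
```

Because the old column stays untouched, the switch and the rollback are both a single client-side change, which is what makes this approach closer to atomic than rewriting values in place.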

This strategy of updating in place is especially important when you are dealing with large data sets. If you only have a couple of terabytes of data, then the cost of duplicating it is low. But if your data source is hundreds of terabytes or petabytes, then the cost of replicating everything can become nontrivial.

–Eric