What's the meaning of each tranquility config field?

Hi,

Looking at the Scala sample code at https://github.com/druid-io/tranquility,

I'm not sure about the meaning of windowPeriod, partitions, replicants, and segmentGranularity.

Will the Tranquility app drop messages that fall outside the time window?

Could any expert share some details on these configs?

Hey Xuehui,

windowPeriod and segmentGranularity mean the same thing they do in Druid (see http://druid.io/docs/latest/ingestion/realtime-ingestion.html). Basically, segmentGranularity is the time range each segment will cover, and it affects how often segments will be handed off. windowPeriod is how late events can be before they are dropped, and it affects how long segments must be kept open to wait for late events before handing them off. Tranquility will drop events that are older than your windowPeriod.

“partitions” is the number of Druid ingestion tasks that will get created in parallel. This is a per-datasource config, so even if you have many instances of tranquility running, they will all talk to the same set of Druid tasks.

“replicants” is the number of instances you’ll get for each Druid ingestion task. “1” is ok for testing, usually you would want “2” in production to get some redundancy.
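For concreteness, here is a minimal sketch of how these four fields are set in the Tranquility Scala API's tuning config, modeled on the README linked above (the specific values are only illustrative, and the rest of the DruidBeams builder is assumed to be configured as in that sample):

```scala
import com.metamx.common.Granularity
import com.metamx.tranquility.beam.ClusteredBeamTuning
import org.joda.time.Period

// Tuning for one datasource: hourly segments, ten minutes of allowed lateness,
// one ingestion task per segment, and two replicas of each task.
val tuning = ClusteredBeamTuning(
  segmentGranularity = Granularity.HOUR, // each segment covers one hour; drives handoff frequency
  windowPeriod = new Period("PT10M"),    // events more than 10 minutes late are dropped
  partitions = 1,                        // Druid ingestion tasks created in parallel, per datasource
  replicants = 2                         // replicas of each task; 2 gives redundancy in production
)
```

This tuning object is passed to the DruidBeams builder via .tuning(...) in the linked sample.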

Hope this helps.

Hi Gian,

Thanks a lot for the explanation.

Some more questions:

  1. If I set segmentGranularity = 1h, can I set windowPeriod > PT1H to reduce message dropping? What will happen if I set a wide windowPeriod?

  2. Will Druid or the Tranquility app ensure that data/messages are never ingested in duplicate when “replicants” is 2?

  3. I have some statistics mismatch cases between Druid and our SQL storage; Druid's numbers can be higher or lower than SQL's. Do you have any hints?

Druid dropping messages that fall outside the time window could be one of the causes.

  4. The cold data size displayed in the coordinator console keeps growing over time. I have to keep increasing “druid.server.maxSize” in the historical nodes' runtime.properties to avoid running out of storage, which would cause realtime tasks to hang because they cannot hand off data to the historical nodes.

There is a P3D rule for my datasource, but it seems that the historical nodes still keep all historical data on local disk as a cache even though the deep storage is HDFS.

- loadByPeriod: P3D (3 days), 2 replicants in the “cold” tier
- loadForever (default rule): 2 replicants in the “cold” tier

On Thursday, October 15, 2015 at 3:23:51 AM UTC+8, Gian Merlino wrote:

[Screenshot attachment: QQ截图20151015160323.png]

To add some details on question 2:

Currently I have two Tranquility apps for realtime ingestion with replicants = 1 and partitions = 1. How do I change replicants to 2 for these two Tranquility apps? Is it okay to update the config one by one (stop one, update it, bring it up, then do the same for the second)?

Is it fine for multiple Tranquility apps to run with different configurations for a short period while the config is being updated?

On Thursday, October 15, 2015 at 4:18:51 PM UTC+8, Xuehui Chen wrote:

Inline. (See http://druid.io/docs/latest/operations/rule-configuration.html.)

Hi Gian,

Thanks a lot for the explanation.

Some more questions:

  1. If I set segmentGranularity = 1h, can I set windowPeriod > PT1H to reduce message dropping? What will happen if I set a wide windowPeriod?

This will increase the time taken before data is handed off, and it requires more resources on your realtime ingestion component. For example, with segmentGranularity = 1h and windowPeriod = PT2H, a task for a given hour must stay open for roughly three hours (the hour it covers plus the window) before it can hand off, so several hourly tasks can be open at once.

  2. Will Druid or the Tranquility app ensure that data/messages are never ingested in duplicate when “replicants” is 2?

Tranquility pushes the same data to multiple realtime consumers, so the replicas hold identical copies. With that said, Druid's realtime ingestion is best effort, and the steps to fix this are in progress.

  3. I have some statistics mismatch cases between Druid and our SQL storage; Druid's numbers can be higher or lower than SQL's. Do you have any hints?

Druid dropping messages that fall outside the time window could be one of the causes.

The most likely cause is that people do not use longSum to query for event counts and do not understand rollup in Druid.

http://druid.io/docs/latest/ingestion/faq.html (Not all my events were ingested)
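As a sketch of what such a query looks like (the datasource name, metric name, and interval here are hypothetical; the "count" fieldName must match the count aggregator defined in your ingestion spec):

```json
{
  "queryType": "timeseries",
  "dataSource": "my_datasource",
  "granularity": "all",
  "intervals": ["2015-10-01/2015-10-15"],
  "aggregations": [
    { "type": "longSum", "name": "numIngestedEvents", "fieldName": "count" }
  ]
}
```

Because rollup collapses many raw events into one stored row, a count aggregator at query time returns the number of stored rows, while a longSum over the ingestion-time count metric returns the number of raw events.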

  4. The cold data size displayed in the coordinator console keeps growing over time. I have to keep increasing “druid.server.maxSize” in the historical nodes' runtime.properties to avoid running out of storage, which would cause realtime tasks to hang because they cannot hand off data to the historical nodes.

There is a P3D rule for my datasource, but it seems that the historical nodes still keep all historical data on local disk as a cache even though the deep storage is HDFS.

- loadByPeriod: P3D (3 days), 2 replicants in the “cold” tier
- loadForever (default rule): 2 replicants in the “cold” tier

You have not configured any drop rules, so all of your data will always be retained. Druid scans the rules as a list, and each segment matches the first rule that applies to it. You need to explicitly set when data will be dropped.

Please see: http://druid.io/docs/latest/operations/rule-configuration.html
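For example, here is a sketch of a rule set (using the syntax from the linked page, and assuming your tier really is named "cold") that keeps the last three days loaded and explicitly drops everything older:

```json
[
  {
    "type": "loadByPeriod",
    "period": "P3D",
    "tieredReplicants": { "cold": 2 }
  },
  {
    "type": "dropForever"
  }
]
```

With the dropForever rule in place, segments older than three days no longer match the load rule and are unloaded from the historicals instead of accumulating under the default loadForever rule.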

Thanks so much for the detailed answers, Fangjin!

Will “drop rules” delete data across the Druid cluster? In fact, I want to retain data for at least two years (in HDFS) for year-over-year comparison, while keeping just the latest one month of data on the disks of the historical nodes.

On Monday, October 19, 2015 at 12:52:56 AM UTC+8, Fangjin Yang wrote:

Hey Xuehui,

“Drop rules” will mark the segments unused and unload them from the historical nodes. After that point the data will still be on deep storage, but you won’t be able to query it through Druid, since historical nodes do not load data on demand. If you want to be able to query the data after dropping it, you would have to mark the segments used once again, which would cause the historical nodes to load them back up.
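For the retention goal described above (everything stays in HDFS, but only the latest month stays on the historical nodes' disks), a sketch of the corresponding rule set, with the same syntax assumptions as the earlier example, would be:

```json
[
  { "type": "loadByPeriod", "period": "P1M", "tieredReplicants": { "cold": 2 } },
  { "type": "dropForever" }
]
```

Per Gian's explanation, the older data remains in deep storage but is not queryable through Druid until the segments are marked used again.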