How to check where the replicas for a given segment are?


We have a system with 8 historical nodes, and a data source with a rule like this:

  • {

    • tieredReplicants:


      • _default_tier: 2


    • type: “loadForever”


So, I’d expect to see a given segment on two different nodes in the cluster. Many segments, do in fact exist on two nodes. But, when I look at this particular Segment metadata, I only see it on one node:

  • metadata:


    • dataSource: “ignite_sopsod_cube_t35”,

    • interval: “2012-04-18T00:00:00.000Z/2012-04-19T00:00:00.000Z”,

    • version: “2015-04-29T11:47:47.213Z”,

    • loadSpec:


      • type: “s3_zip”,

      • bucket: “some-bucket”,

      • key: “druid/datasource/ignite_sopsod_cube_t35/ignite_sopsod_cube_t35/20120418T000000.000Z_20120419T000000.000Z/2015-04-29T11_47_47.213Z/0/”


    • dimensions:“is_pvr,has_searched,page_type_sk,is_mdpct,row_type_sk,subregion_sk,allocation_price_plan_id,custom_group_sk,test_id,content_type_sk,is_sims,is_jfk,ui_version_sk,device_category_sk,is_top10,test_cell_nbr,allocation_type_sk,is_allocation_price_plan_migrated,is_serialized”,

    • metrics: “sop_stream_secs,sop_stream_cnt,sod_stream_secs,sod_stream_cnt”,

    • shardSpec:


      • type: “none”


    • binaryVersion: 9,

    • size: 123456,

    • identifier: “ignite_sopsod_cube_t35_2012-04-18T00:00:00.000Z_2012-04-19T00:00:00.000Z_2015-04-29T11:47:47.213Z”


  • servers:



Am I misunderstanding how this is supposed to work? Or misunderstanding what that servers property is? I also see some segments that have 4 nodes in servers, which also doesn’t seem right.

Hi Ted, the coordinator endpoints should reflect a relatively up to date view of the cluster.

Druid tries to prioritize loading data that does not currently exist in the cluster first. Replicas of existing segments are loaded slower, and there is a throttle as to how many replicas can be loaded at a given time. This is dynamically configurable using the coordinator console.

It is possible that more than 2 copies of a segment exist in the cluster at a given time. Druid may choose to move segments which causes a historical node to download a segment from deep storage. Once a segment is downloaded on a new historical, the segment on the old historical is dropped.

Does that make sense?

– FJ

Thanks for the quick response.

Yeah, that makes sense. I did see another thread asking about slow ingestion performance from s3, and recommendations to possibly increase replicationThrottleLimit and maxSegmentsToMove.

If I look at the full load status page(I should have done this before asking the first question…), I see this line in there:

ignite_sopsod_cube_t35: 334

Simple load status page shows zero segments, so I guess it’s still replicating although it has been some hours(~12ish) since the load has been complete.

Would you recommend upping the replicationThrottleLimit?

Hmmm, 12ish hours is a very long time to wait for replication completion. Do you see any error messages about replicating segments being stuck?

In any case, we run with these numbers in production:

maxSegmentsToMove: 200

replicationThrottleLimit: 300

This has worked reasonably well for 500k-1M segments or so.

Increasing the replicationThrottleLimit seems to have resolved the issue. Full load status is now showing 0 remaining segments for all data sources.

Great to hear!