How to fill new historical nodes

Hello.

I have added 3 historical nodes to the cluster, but they still show up as empty half a day after I set them up. Or am I reading this chart incorrectly?

The coordinator should eventually balance the segments across all your historical nodes.

Assuming you have more than two segments, it should try to move existing segments and balance them across all historical nodes.

I suspect there might be a configuration error preventing segments from being loaded on those nodes.
Check the coordinator logs and the logs of the newly added historical nodes for more details.
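For reference, the settings that usually control whether a historical node can take segments are its segment cache location and capacity. A rough sketch of the relevant runtime.properties on each historical (the cache path matches your logs; the sizes are purely illustrative):

druid.server.maxSize=130000000000
druid.segmentCache.locations=[{"path":"/tmp/druid/indexCache","maxSize":130000000000}]

If druid.server.maxSize is missing or too small, the coordinator will not assign segments to that node at all.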

Thanks! I have checked the logs on the historical nodes and there is an error popping up there, but I'm confused by it…

2016-08-09T04:07:45,949 INFO [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - Loading segment tdr_v3_2014-08-19T00:00:00.000Z_2014-08-20T00:00:00.000Z_2015-10-17T18:57:32.066Z
2016-08-09T04:07:45,949 INFO [ZkCoordinator-0] io.druid.segment.loading.SegmentLoaderLocalCacheManager - Deleting directory[/tmp/druid/indexCache/tdr_v3/2014-08-19T00:00:00.000Z_2014-08-20T00:00:00.000Z/2015-10-17T18:57:32.066Z/0]
2016-08-09T04:07:45,949 INFO [ZkCoordinator-0] io.druid.segment.loading.SegmentLoaderLocalCacheManager - Deleting directory[/tmp/druid/indexCache/tdr_v3/2014-08-19T00:00:00.000Z_2014-08-20T00:00:00.000Z/2015-10-17T18:57:32.066Z]
2016-08-09T04:07:45,949 INFO [ZkCoordinator-0] io.druid.segment.loading.SegmentLoaderLocalCacheManager - Deleting directory[/tmp/druid/indexCache/tdr_v3/2014-08-19T00:00:00.000Z_2014-08-20T00:00:00.000Z]
2016-08-09T04:07:45,950 WARN [ZkCoordinator-0] io.druid.server.coordination.BatchDataSegmentAnnouncer - No path to unannounce segment[tdr_v3_2014-08-19T00:00:00.000Z_2014-08-20T00:00:00.000Z_2015-10-17T18:57:32.066Z]
2016-08-09T04:07:45,950 INFO [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - Completely removing [tdr_v3_2014-08-19T00:00:00.000Z_2014-08-20T00:00:00.000Z_2015-10-17T18:57:32.066Z] in [30,000] millis
2016-08-09T04:07:45,982 INFO [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - Completed request [LOAD: tdr_v3_2014-08-19T00:00:00.000Z_2014-08-20T00:00:00.000Z_2015-10-17T18:57:32.066Z]
2016-08-09T04:07:45,982 ERROR [ZkCoordinator-0] io.druid.server.coordination.ZkCoordinator - Failed to load segment for dataSource: {class=io.druid.server.coordination.ZkCoordinator, exceptionType=class io.druid.segment.loading.SegmentLoadingException, exceptionMessage=Exception loading segment[tdr_v3_2014-08-19T00:00:00.000Z_2014-08-20T00:00:00.000Z_2015-10-17T18:57:32.066Z], segment=DataSegment{size=157714840, shardSpec=NoneShardSpec, metrics=[count, debit, volume, debit_without_nds, record_count], dimensions=[address, base_bill_type, batch_tariff_plan, client_class, client_type, connect_type, customer, debit_not_zero, department, device, device_group, household_type, is_fr, provider_id, region, region_type, report_period_id, service, service_packet, source_group, tariff, tariff_plan, tariff_plan_association, tariff_plan_group, tax_type_id, tdr_group_id, town, town_status, town_type, trade_mark, unit], version='2015-10-17T18:57:32.066Z', loadSpec={type=local, path=/home/druid/localStorage/tdr_v3/2014-08-19T00:00:00.000Z_2014-08-20T00:00:00.000Z/2015-10-17T18:57:32.066Z/0/index.zip}, interval=2014-08-19T00:00:00.000Z/2014-08-20T00:00:00.000Z, dataSource='tdr_v3', binaryVersion='9'}}
io.druid.segment.loading.SegmentLoadingException: Exception loading segment[tdr_v3_2014-08-19T00:00:00.000Z_2014-08-20T00:00:00.000Z_2015-10-17T18:57:32.066Z]
at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:309) ~[druid-server-0.9.1.1.jar:0.9.1.1]
at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:350) [druid-server-0.9.1.1.jar:0.9.1.1]
at io.druid.server.coordination.SegmentChangeRequestLoad.go(SegmentChangeRequestLoad.java:44) [druid-server-0.9.1.1.jar:0.9.1.1]
at io.druid.server.coordination.ZkCoordinator$1.childEvent(ZkCoordinator.java:152) [druid-server-0.9.1.1.jar:0.9.1.1]
at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:522) [curator-recipes-2.10.0.jar:?]
at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:516) [curator-recipes-2.10.0.jar:?]
at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:93) [curator-framework-2.10.0.jar:?]
at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) [guava-16.0.1.jar:?]
at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:85) [curator-framework-2.10.0.jar:?]
at org.apache.curator.framework.recipes.cache.PathChildrenCache.callListeners(PathChildrenCache.java:514) [curator-recipes-2.10.0.jar:?]
at org.apache.curator.framework.recipes.cache.EventOperation.invoke(EventOperation.java:35) [curator-recipes-2.10.0.jar:?]
at org.apache.curator.framework.recipes.cache.PathChildrenCache$9.run(PathChildrenCache.java:772) [curator-recipes-2.10.0.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_91]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_91]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_91]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
Caused by: io.druid.segment.loading.SegmentLoadingException: /tmp/druid/indexCache/tdr_v3/2014-08-19T00:00:00.000Z_2014-08-20T00:00:00.000Z/2015-10-17T18:57:32.066Z/0/index.drd (No such file or directory)
at io.druid.segment.loading.MMappedQueryableIndexFactory.factorize(MMappedQueryableIndexFactory.java:52) ~[druid-server-0.9.1.1.jar:0.9.1.1]
at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegment(SegmentLoaderLocalCacheManager.java:96) ~[druid-server-0.9.1.1.jar:0.9.1.1]
at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:152) ~[druid-server-0.9.1.1.jar:0.9.1.1]
at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:305) ~[druid-server-0.9.1.1.jar:0.9.1.1]
… 18 more
Caused by: java.io.FileNotFoundException: /tmp/druid/indexCache/tdr_v3/2014-08-19T00:00:00.000Z_2014-08-20T00:00:00.000Z/2015-10-17T18:57:32.066Z/0/index.drd (No such file or directory)
at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_91]
at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_91]
at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[?:1.8.0_91]
at io.druid.segment.SegmentUtils.getVersionFromDir(SegmentUtils.java:43) ~[druid-api-0.9.1.1.jar:0.9.1.1]
at io.druid.segment.IndexIO.loadIndex(IndexIO.java:211) ~[druid-processing-0.9.1.1.jar:0.9.1.1]
at io.druid.segment.loading.MMappedQueryableIndexFactory.factorize(MMappedQueryableIndexFactory.java:49) ~[druid-server-0.9.1.1.jar:0.9.1.1]
at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegment(SegmentLoaderLocalCacheManager.java:96) ~[druid-server-0.9.1.1.jar:0.9.1.1]
at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:152) ~[druid-server-0.9.1.1.jar:0.9.1.1]
at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:305) ~[druid-server-0.9.1.1.jar:0.9.1.1]
… 18 more


At first it deletes the directory, and right after that it throws an exception that the file doesn't exist. What's wrong here?

Ah, it seems you are using the default local disk for deep storage.
FWIW, deep storage needs to be accessible by all the historical and indexing nodes.

Changing the deep storage to S3 or HDFS should fix the issue.

Refer to http://druid.io/docs/latest/dependencies/deep-storage.html for more details about deep storage and the related configs.
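As a rough sketch of what that looks like in common.runtime.properties (the bucket name, base key, and credentials below are placeholders, and the S3 extension must be available on all nodes):

druid.extensions.loadList=["druid-s3-extensions"]
druid.storage.type=s3
druid.storage.bucket=your-deep-storage-bucket
druid.storage.baseKey=druid/segments
druid.s3.accessKey=YOUR_ACCESS_KEY
druid.s3.secretKey=YOUR_SECRET_KEY

For HDFS the equivalent would be druid.storage.type=hdfs with druid.storage.storageDirectory pointing at an HDFS path, plus the druid-hdfs-storage extension.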

Hi Nishant,

I too set up local deep storage and was not able to get the data distributed.
I have a couple of questions related to S3 storage.

1. Does changing the deep storage option to S3 or HDFS ensure that data gets distributed evenly, or are there additional changes required?
2. We have been spinning up EC2 instances with pre-defined roles, which by themselves are sufficient to access the S3 bucket. Is there any way we can bypass the S3 access key and secret key in the common configuration file?

Sunil

Sunil, segment distribution in Druid has nothing to do with deep storage. The Druid coordinator uses a load-balancing algorithm to try to distribute segments as evenly as possible based on expected query load. This balancing does not happen instantaneously but rather occurs over time, depending on how the maxSegmentsToMove configuration is set on the coordinator. Moving too many segments at a time to rebalance can cause resource contention and potentially impact performance.
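For example, maxSegmentsToMove is part of the coordinator dynamic configuration and can be read and updated at runtime over HTTP. A rough sketch (the host, port, and the value 15 are illustrative; it is safest to GET the current config first and POST back the full object with your change):

curl http://coordinator-host:8081/druid/coordinator/v1/config

curl -X POST -H 'Content-Type: application/json' \
  http://coordinator-host:8081/druid/coordinator/v1/config \
  -d '{"maxSegmentsToMove": 15}'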