I’ve built a cluster with 2 datanodes, with the exact same configuration on both.
Each datanode runs a historical node and a middleManager node.
The segments are stored locally on each machine.
I’m indexing data using a Hadoop cluster.
The submitted task always goes to the same node, and the segments are also always saved on that same node.
How can I leverage the fact that I have 2 nodes?
When I query the data, of course only 1 node is working.
I’m using AWS machines; my deep storage is on S3, and index tasks are handled by Hadoop EMR.
Could you help me understand?