[0.9.1.1] Select query paging loop bug still occurs when querying historical nodes directly

Regarding the bug fix in https://github.com/druid-io/druid/pull/2480:

I had noted in the pull request that we were still observing the issue in 0.9.1.1.

We were observing the looping/cycling behavior in our integration tests. It turns out that this was an artifact of how these tests were set up. In our integration tests, we start only a single historical node along with a broker (and ZooKeeper), and we were querying the historical node directly, thinking that since there was only one node, it would be okay to bypass the broker. But these queries still showed the looping/cycling behavior (and made our tests fail). Eventually we figured out that if we pointed the tests at the broker instead, we saw the desired behavior: no loops.
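For anyone trying to reproduce this, here is a minimal sketch of the kind of paging loop involved. It assumes the select query JSON shape from the Druid docs (`pagingSpec` with `pagingIdentifiers` and `threshold`, and a response whose first element carries `result.pagingIdentifiers` and `result.events`). The function names (`build_select_query`, `page_through_select`, `make_http_poster`) are hypothetical helpers, not part of any Druid client. Whether the returned identifiers need adjusting before being sent back depends on the Druid version and is not handled here. The sketch simply resubmits them and raises if the server ever returns a set of identifiers it has already returned, which is the looping behavior described above.

```python
import json
import urllib.request


class PagingLoopError(RuntimeError):
    """Raised when the server returns paging identifiers we have already seen."""


def build_select_query(data_source, interval, paging_identifiers, threshold):
    # Query shape assumed from the Druid select query documentation.
    return {
        "queryType": "select",
        "dataSource": data_source,
        "granularity": "all",
        "intervals": [interval],
        "dimensions": [],  # assumed: an empty list selects all dimensions
        "metrics": [],     # assumed: an empty list selects all metrics
        "pagingSpec": {
            "pagingIdentifiers": dict(paging_identifiers),
            "threshold": threshold,
        },
    }


def make_http_poster(url):
    """Return a function that POSTs a query as JSON to `url` and decodes the reply."""
    def post_query(query):
        request = urllib.request.Request(
            url,
            data=json.dumps(query).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))
    return post_query


def page_through_select(post_query, data_source, interval, threshold=100, max_pages=1000):
    """Collect events page by page, failing fast if paging stops advancing."""
    identifiers = {}
    seen = set()
    events = []
    for _ in range(max_pages):
        response = post_query(
            build_select_query(data_source, interval, identifiers, threshold)
        )
        if not response:
            break
        result = response[0]["result"]
        if not result["events"]:
            break  # an empty page means we have reached the end
        events.extend(result["events"])
        identifiers = result["pagingIdentifiers"]
        key = tuple(sorted(identifiers.items()))
        if key in seen:
            # Same identifiers twice: the next request would repeat this page.
            raise PagingLoopError("paging identifiers repeated: %r" % (identifiers,))
        seen.add(key)
    return events
```

To compare the two paths, the same loop can be pointed at the broker and then at the historical, e.g. `page_through_select(make_http_poster("http://localhost:8082/druid/v2/"), "my_datasource", "2016-01-01/2016-02-01")`. The host, port, datasource name, and interval there are placeholders for whatever your test cluster uses.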

Querying historicals directly might not be recommended, but as long as it is allowed I would consider this to be a bug, albeit a low-priority one, since you can always work around it by querying the broker. Should I open a new defect for it?

Thanks,

Ryan

Thanks Ryan, this is something we’d probably want to fix. While not necessarily critical, querying historical nodes directly can be useful when debugging. Your issue may also be a symptom of a deeper problem somewhere else.

FYI, there are some things that only the broker does, like rewrite-based filter and dimension extraction optimizations, toolChest-based segment pruning, and dimension-sharding-based segment pruning. So in general, querying a historical node directly is not going to be totally equivalent to going through the broker, even if you only have one historical node.

In this case, it looks like segment pruning was involved in the bug fix (it had modifications to toolChest.filterSegments), so that’s probably related to why it doesn’t work on historicals.

I agree with Xavier that it would be good to be able to query historicals directly, so it would be good to find a way to make that work properly. I think the uses can go beyond debugging: communicating directly with historicals is important for parallel extraction of data into external processing engines, if you want to do that sort of thing.

Thanks for the info. To be sure, our integration tests are now going to go through the broker; it’s what our production system does, after all. :)