How can I improve concurrent query performance?

A community member recently posted this performance question. One of the founders replied with:

When doing a lot of concurrent queries it’s super important to minimize the CPU usage of each query. A small change in CPU use can make a big difference.

Some things that help here:

  • Make sure you are filtering efficiently: ensure your time filters are planned to intervals, and if there is a specific column you often filter on, consider applying secondary partitioning (aka clustering) on that column
  • Avoid unnecessary query-time expressions
  • Use the flame graph technique to find where time is being spent
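
As a rough illustration of the first two points, here is a minimal Python sketch that inspects a native query (for example, one taken from the output of EXPLAIN PLAN FOR and parsed from JSON) and flags two things: a time filter that did not end up in the query's intervals, and expression virtual columns that get evaluated at query time. The helper names (native_intervals, check_query), the datasource "events", and the column "country" are hypothetical, the two sample plans are hand-written for illustration rather than captured Druid output, and matching the "eternity" interval by its year prefix is an assumption.

```python
import json

# Distinctive prefix of the "eternity" interval that appears when no time
# bound was planned into the query's intervals (assumption: matched loosely).
ETERNITY_PREFIX = "-146136543"


def native_intervals(query):
    """Return interval strings from either the plain-list form or the
    {"type": "intervals", "intervals": [...]} form of a native query."""
    spec = query.get("intervals", [])
    if isinstance(spec, dict):
        spec = spec.get("intervals", [])
    return list(spec)


def check_query(query):
    """Return a list of human-readable warnings for one native query dict."""
    warnings = []
    intervals = native_intervals(query)
    if not intervals or any(i.startswith(ETERNITY_PREFIX) for i in intervals):
        warnings.append("time filter not planned to intervals")
    expressions = [
        str(vc.get("name"))
        for vc in query.get("virtualColumns", [])
        if vc.get("type") == "expression"
    ]
    if expressions:
        warnings.append("query-time expressions: " + ", ".join(expressions))
    return warnings


# Hand-written sample: time bound lives in "intervals", no virtual columns.
planned = json.loads("""{
  "queryType": "timeseries",
  "dataSource": "events",
  "intervals": {"type": "intervals",
                "intervals": ["2024-01-01T00:00:00.000Z/2024-01-02T00:00:00.000Z"]},
  "filter": {"type": "selector", "dimension": "country", "value": "NZ"},
  "aggregations": [{"type": "count", "name": "a0"}]
}""")

# Hand-written sample: eternity interval plus an expression virtual column.
unplanned = json.loads("""{
  "queryType": "timeseries",
  "dataSource": "events",
  "intervals": {"type": "intervals",
                "intervals": ["-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"]},
  "virtualColumns": [{"type": "expression", "name": "v0",
                      "expression": "upper(\\"country\\")", "outputType": "STRING"}],
  "filter": {"type": "selector", "dimension": "v0", "value": "NZ"},
  "aggregations": [{"type": "count", "name": "a0"}]
}""")

for label, query in (("planned", planned), ("unplanned", unplanned)):
    print(label, check_query(query) or "ok")
```

A check like this only looks at the shape of the plan. It says nothing about how much CPU a query actually uses, so it is a complement to profiling with flame graphs, not a replacement.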

That’s a great summary, @Mark_Herrera. Thank you for posting!

I once heard a talk from @Gian_Merlino2 on the performance of “scatter gather” queries. I recently posted this… Apache Druid Adoption – Aim for sub-second queries - YouTube