Is there a limit to the number of concurrent queries that can be fired on a Druid cluster?
Yes, the limit is set by the number of http threads and processing threads, but mostly processing threads… as well as the types of queries (some of which take more overall memory).
The cluster, overall, uses one processing thread per segment per query. So if the data you are querying covers 10,000 segments, then 10 nodes with 10 processing threads each means the 100 total processing threads will each process about 100 segments.
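The arithmetic above can be sketched in a few lines (illustrative only; the property name `druid.processing.numThreads` is the standard Druid knob for per-node processing threads):

```python
# Back-of-the-envelope estimate of per-thread segment load for one query.
segments = 10_000          # segments covered by the query's interval
nodes = 10                 # historical nodes in the cluster
threads_per_node = 10      # druid.processing.numThreads on each node

total_threads = nodes * threads_per_node
segments_per_thread = segments / total_threads

print(total_threads)        # total processing threads across the cluster
print(segments_per_thread)  # segments each thread handles for this query
```

The more segments a single query touches relative to total processing threads, the longer each thread stays busy and the fewer concurrent queries the cluster can absorb.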
The broker is also a limiting factor. The broker merges the results of the individual historical nodes, and the processing thread count comes into play there as well.
You can have multiple brokers answering queries on the same set of historical nodes and indexing service tasks.
Additionally the number of http threads should be set to something reasonable to make sure you can handle the number of concurrent queries you need both at the broker and historical level.
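As a rough sketch, the relevant knobs live in each node's `runtime.properties`. The values below are placeholders, not recommendations; tune them to your workload:

```properties
# Broker and historical: HTTP threads available to serve query requests
druid.server.http.numThreads=25

# Broker: connection pool size toward historicals (fan-out for merging)
druid.broker.http.numConnections=20
```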
What is the most important factor that determines how many simultaneous queries can be fired?
I’d say roughly this order: #1) Memory to disk ratio (how overcommitted is your memory) #2) size of backing memcached (but this depends on workload) #3) number of segments per CPU.
Is it possible that one query takes all the resources and other queries just have to wait for that query to complete?
Yes. There are multiple techniques to alleviate this, such as setting query priority or using interval chunking, whereby a large time-range query is broken up (internally) into multiple smaller queries.
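Both techniques are controlled through the query `context` object. A minimal sketch (the datasource name and interval are placeholders; `priority` and `chunkPeriod` are the context keys for query priority and interval chunking, though availability of `chunkPeriod` depends on your Druid version):

```json
{
  "queryType": "timeseries",
  "dataSource": "sample_datasource",
  "intervals": ["2015-01-01/2016-01-01"],
  "granularity": "day",
  "aggregations": [{ "type": "count", "name": "rows" }],
  "context": {
    "priority": 10,
    "chunkPeriod": "P1M"
  }
}
```

Here `"chunkPeriod": "P1M"` asks the broker to split the year-long query into month-sized chunks, and a higher `priority` lets this query jump ahead of lower-priority work in the processing queue.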
Is there any documentation by which I can understand these concepts (regarding capacity planning)?
Unfortunately capacity planning depends completely on what your data looks like and what your query patterns are. There are a LOT of tunings you can do depending on what your use cases are, and the defaults are "pretty good". If, for example, you do a LOT of regex queries, then Roaring bitmaps tend to outperform Concise, so setting the bitmap compression to Roaring typically does better, but at the cost of increased segment size (see the #1 most limiting consideration above).
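The bitmap type is set in the `indexSpec` of an ingestion task's tuning config. A minimal fragment (the surrounding spec is elided; `"roaring"` and `"concise"` are the two bitmap types):

```json
"tuningConfig": {
  "type": "index",
  "indexSpec": {
    "bitmap": { "type": "roaring" }
  }
}
```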
Additionally, there is the consideration of the indexing service, which is a whole other can of operational interestingness, though it is much simpler to capacity plan since you can specify the number of workers per MiddleManager.
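That worker count is a single property in the MiddleManager's `runtime.properties` (the value here is illustrative; size it to the node's CPU and memory):

```properties
# Number of task slots (workers) this MiddleManager offers
druid.worker.capacity=3
```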
I know some of the answers are kind of vague. If you want more explanation on some of them let me know.