I am running Druid as a single-server deployment (micro-quickstart) on GCP.
Druid config: CPU: 4
RAM: 32 GB
My data source for Druid is CSV files coming from Cloud Storage. Each file has about 2 million rows, and there are 20 such files per day.
Druid is connected to Superset for reports, which are used on a daily basis.
Issue: A single report has 15-20 charts. When the report is loading, or when the same dashboard is accessed by multiple users, multiple charts fail with 502 and 500 errors.
I need help with the configuration changes required in the cluster…
Are you seeing any errors in the logs?
How many rows are in the complete data source?
Quickstart configurations are generally good for small data and low concurrency. Try tuning the node with the following doc. You may also need to scale up your node.
Yes, there are some errors in the broker logs:
org.jboss.netty.channel.ChannelException: Channel Disconnected
org.apache.druid.QueryInterruptedException: Channel Disconnected
org.apache.druid.server.QueryLifecycle - Exception while processing queryID [–54f4—]
Thanks, I will surely follow the doc to tune the cluster.
Can you help me with how I can increase the number of queries Druid handles simultaneously?
Concurrency is defined by the number of segments the Historical can scan per second. At any given time, one core on the Historical can scan only one segment. The number of segments a query has to scan depends on the time interval (__time) in the query, so it can be calculated per query, but I cannot give you a single direct answer for this.
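To make the reasoning above concrete, here is a rough back-of-the-envelope sketch in Python. It only illustrates the "one core scans one segment at a time" rule; the function names and the example numbers (segments per day, concurrent users, number of processing threads) are hypothetical and not taken from your cluster. It also ignores caching, which can reduce the real load.

```python
import math


def segment_scans_for_dashboard(days_in_interval: int,
                                segments_per_day: int,
                                charts: int,
                                concurrent_users: int) -> int:
    """Total segment scans if every chart issues one query over the interval.

    Assumes (hypothetically) that every query touches every segment in the
    interval and that nothing is served from cache.
    """
    return days_in_interval * segments_per_day * charts * concurrent_users


def scan_waves(segment_scans: int, processing_threads: int) -> int:
    """Sequential 'waves' needed when each thread scans one segment at a time."""
    if processing_threads < 1:
        raise ValueError("processing_threads must be at least 1")
    return math.ceil(segment_scans / processing_threads)


if __name__ == "__main__":
    # Hypothetical inputs: 7-day dashboard, 8 segments per day,
    # 20 charts, 3 users loading the dashboard at the same time.
    scans = segment_scans_for_dashboard(7, 8, 20, 3)
    # Assumed 3 processing threads on a 4-core machine.
    print(scans, scan_waves(scans, 3))
```

The more waves a dashboard load needs, the longer queries wait for a free thread, which is where timeouts and dropped connections tend to show up.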
To optimize both performance and concurrency on the existing hardware, please follow the basic tuning doc which I shared in my previous reply. If the concurrency is still not enough after that, you may need to scale the cluster (CPU and RAM).
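As a starting point for that tuning, the small Python sketch below estimates the Historical's direct memory need using the sizing rule from the Druid basic cluster tuning guide: (druid.processing.numThreads + druid.processing.numMergeBuffers + 1) * druid.processing.buffer.sizeBytes. The helper names are hypothetical, the 500 MiB buffer size is an assumed value, and on a single server the Historical shares its cores and memory with the other Druid processes, so treat the result as an upper-level estimate rather than a recommendation.

```python
MIB = 1024 ** 2


def default_processing_threads(cores: int) -> int:
    """Druid's documented default for druid.processing.numThreads: cores - 1 (or 1)."""
    return max(1, cores - 1)


def default_merge_buffers(num_threads: int) -> int:
    """Druid's documented default for druid.processing.numMergeBuffers."""
    return max(2, num_threads // 4)


def direct_memory_bytes(num_threads: int,
                        num_merge_buffers: int,
                        buffer_size_bytes: int) -> int:
    """Direct memory estimate: (numThreads + numMergeBuffers + 1) * buffer size."""
    return (num_threads + num_merge_buffers + 1) * buffer_size_bytes


if __name__ == "__main__":
    cores = 4                      # from the question: 4 CPUs
    buffer_size = 500 * MIB        # assumed druid.processing.buffer.sizeBytes
    threads = default_processing_threads(cores)
    merge_buffers = default_merge_buffers(threads)
    needed = direct_memory_bytes(threads, merge_buffers, buffer_size)
    print(threads, merge_buffers, needed // MIB, "MiB")
```

If the estimate does not fit comfortably alongside the heap sizes of the other processes within 32 GB, that is a sign the node needs to be scaled up rather than tuned further.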