Re: [druid-user] Re: Data representation of huge amount of sensor data

Ingestion of that data shouldn’t be an issue… the rule of thumb is roughly 10k records / second / core… just make sure your Kafka topic is partitioned optimally for parallel ingest.
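
For reference, here's a minimal sketch of the relevant part of a Kafka supervisor spec — the datasource, topic, and broker names are placeholders, so swap in your own. taskCount is what controls the ingestion parallelism, and it works best when the topic's partition count is a multiple of it:

    {
      "type": "kafka",
      "spec": {
        "dataSchema": { "dataSource": "sensor_data" },
        "ioConfig": {
          "type": "kafka",
          "topic": "sensor-events",
          "consumerProperties": { "bootstrap.servers": "kafka-broker:9092" },
          "taskCount": 8,
          "taskDuration": "PT1H"
        }
      }
    }

Druid divides the topic's partitions among the tasks, so e.g. 16 Kafka partitions with taskCount 8 gives each task two partitions to read.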

I do not recommend splitting this into different datasources, unless you can UNION them at query time.
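
If you do end up with multiple datasources, a union can be expressed with a native query's "union" datasource — datasource and column names below are made up, and the schemas have to line up across datasources for this to work:

    {
      "queryType": "timeseries",
      "dataSource": {
        "type": "union",
        "dataSources": ["sensors_east", "sensors_west"]
      },
      "intervals": ["2023-01-01/2023-02-01"],
      "granularity": "hour",
      "aggregations": [
        { "type": "doubleSum", "name": "value_sum", "fieldName": "value" }
      ]
    }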

For the data model, the main consideration is what level of granularity you want. A simple model of timestamp, deviceid, and metric should work just fine IMO, though I have not built a system like this in the past. DeviceID can be used as a secondary partition dimension if you are putting everything in a single datasource. Druid is great up to a few hundred columns, and is excellent at table scans.
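
To make the secondary-partitioning idea concrete, here's a sketch of the relevant pieces of a batch ingestion (or auto-compaction) spec — column names are assumptions based on the simple model above, and note that single_dim partitioning applies at batch/compaction time rather than during streaming ingest, and requires forceGuaranteedRollup:

    {
      "dataSchema": {
        "dataSource": "sensor_data",
        "timestampSpec": { "column": "timestamp", "format": "iso" },
        "dimensionsSpec": { "dimensions": ["deviceid", "metric"] }
      },
      "tuningConfig": {
        "type": "index_parallel",
        "forceGuaranteedRollup": true,
        "partitionsSpec": {
          "type": "single_dim",
          "partitionDimension": "deviceid",
          "targetRowsPerSegment": 5000000
        }
      }
    }

Partitioning segments on deviceid means a query filtered to one device only has to touch the segments that actually contain it, which matters a lot at this kind of scale.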