[Upsert on existing data] Use case where one dimension changes periodically

Hi,

While using Druid, I came across a use case where a dimension value changes periodically and I always have to use the latest value. Can somebody share some thoughts on this use case?

For example: a data source has two dimensions, a and b, and two metrics, c and d. The value of a changes periodically. When someone queries this data source, the aggregation should use the latest values of a.

Thanks.

Hey Naveen!

I hope this finds you well. Many members of our Druid community are currently using lookups to solve these types of challenges. See here: https://druid.apache.org/docs/latest/querying/lookups.html
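
To illustrate the idea, here is a minimal sketch in plain Python (not Druid code). It assumes the stored rows carry a stable key for dimension a, here a hypothetical column named a_key, and that a separate key-to-value map holds the current value of a. The rows, keys, and values are made up for illustration. Only the map is updated when a changes; the stored rows are never rewritten.

```python
from collections import defaultdict

# Hypothetical stored rows: a stable key stands in for the changing value of `a`.
rows = [
    {"a_key": "k1", "b": "x", "c": 10, "d": 1},
    {"a_key": "k2", "b": "x", "c": 5, "d": 2},
    {"a_key": "k1", "b": "y", "c": 7, "d": 3},
]

# Lookup map: stable key -> current value of `a` (assumed to be refreshed externally).
a_lookup = {"k1": "red", "k2": "blue"}


def aggregate_by_latest_a(rows, lookup):
    """Sum metrics c and d, grouped by the current value of `a`."""
    totals = defaultdict(lambda: {"c": 0, "d": 0})
    for row in rows:
        # Resolve the key at query time; fall back to the key itself if unmapped.
        a = lookup.get(row["a_key"], row["a_key"])
        totals[a]["c"] += row["c"]
        totals[a]["d"] += row["d"]
    return {key: dict(value) for key, value in totals.items()}


before = aggregate_by_latest_a(rows, a_lookup)

# The value of `a` for key k1 changes; the stored rows stay untouched.
a_lookup["k1"] = "green"
after = aggregate_by_latest_a(rows, a_lookup)
```

Because the mapping is resolved at query time, the second aggregation groups the same rows under the new value without any re-ingestion.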

Happy Druiding!!

Thanks for the quick response, Daniel.

Apologies for not mentioning the complete details. The cardinality of that dimension is around 100 million, and lookups don't perform well on data that large. I also have to filter on that dimension.

Naveen,

That's definitely a lot of data to manage in lookups. Have you tried JDBC-based lookups? You can apply a filter there as well (it translates to a WHERE clause in the SQL). Please take a look at the documentation below:

https://druid.apache.org/docs/latest/development/extensions-core/lookups-cached-global.html
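
As a rough starting point, the sketch below builds a JDBC lookup spec as a Python dict and prints it as JSON. The field names follow the cached global lookup documentation linked above as I understand it, so please verify them against your Druid version. The connection URI, credentials, table name, column names, and filter expression are all placeholders, not values from this thread.

```python
import json

# Sketch of a cachedNamespace lookup backed by JDBC.
# All concrete values below are hypothetical placeholders.
jdbc_lookup_spec = {
    "type": "cachedNamespace",
    "extractionNamespace": {
        "type": "jdbc",
        "connectorConfig": {
            "connectURI": "jdbc:mysql://localhost:3306/druid",  # placeholder
            "user": "druid",                                    # placeholder
            "password": "changeme",                             # placeholder
        },
        "table": "dim_a_latest",          # hypothetical table with current values
        "keyColumn": "a_key",             # hypothetical stable key column
        "valueColumn": "a_value",         # hypothetical current-value column
        "filter": "a_value IS NOT NULL",  # becomes a WHERE clause when polling
        "pollPeriod": 600000,             # re-poll interval in milliseconds
    },
}

spec_json = json.dumps(jdbc_lookup_spec, indent=2)
print(spec_json)
```

Note that the filter only restricts which rows are loaded into the lookup. With around 100 million keys the whole map still has to fit in memory on every node that serves the lookup, so it is worth testing the memory footprint before relying on this.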