IngestionSpec that applies Regex to a dimension

Is it possible to define an IngestionSpec such that one of the dimensions has a regex applied to it during ingestion?

For example, I have a dimension for request_url, but I would like to ingest only a portion of that data for aggregation. Essentially, I want to transform “GET /eu_cookie_notice.html HTTP/1.1” -> “/eu_cookie_notice.html” during ingestion via regex.

Based on the documentation, it seems the DimensionSpec extraction functions are only applied after the data has been ingested.



Hi William,

There’s currently no support for applying extraction functions to dimensions at ingestion time. That functionality may be added in the future, but I don’t think there are concrete plans for it at present.



So the options for manipulating the data during ingestion are to write a custom IngestionSpec or to change the data before it hits Druid?

I have the same problem, and I think we have to change the data before it reaches Druid.
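
For the "change the data first" route, here is a minimal sketch in Python of what that preprocessing step might look like. It assumes the events are newline-delimited JSON and that the field is called request_url as in the question; the function names extract_path and transform_event are made up for illustration and are not part of Druid.

```python
import json
import re

# Matches an HTTP request line such as "GET /path HTTP/1.1"
# and captures only the path portion.
REQUEST_LINE = re.compile(r'^[A-Z]+\s+(\S+)\s+HTTP/\d(?:\.\d)?$')


def extract_path(request_url):
    """Return the path from a request line, or the input unchanged if it does not match."""
    match = REQUEST_LINE.match(request_url.strip())
    return match.group(1) if match else request_url


def transform_event(line):
    """Rewrite the (assumed) request_url field of one JSON event before ingestion."""
    event = json.loads(line)
    if "request_url" in event:
        event["request_url"] = extract_path(event["request_url"])
    return json.dumps(event)


if __name__ == "__main__":
    raw = '{"request_url": "GET /eu_cookie_notice.html HTTP/1.1", "status": 200}'
    print(transform_event(raw))
```

You would run something like this over the raw events and point the ingestion at the rewritten output. Values that do not look like a request line are passed through untouched, so nothing is dropped silently.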