Hiya,
I can’t seem to find this spelled out anywhere.
For hadoop ingestion, the granularity input spec, is the filePattern expected to be a regex?
https://github.com/apache/incubator-druid/blame/master/docs/content/ingestion/hadoop.md#L135
Thanks,
Dyana
I had thought the same initially, but that does not need to be. You can provide the directory path. In the below sample, i am having the data loaded from a AVRO based hive table…
“ioConfig”: {
"type" : "hadoop",
"inputSpec" : {
"type" : "static",
“inputFormat”: “io.druid.data.input.avro.AvroValueInputFormat”,
"paths" : "s3://my-s3-bucket/my-hive-table/"
}
},
Yes, filePattern
is a regex, it’s used as follows in GranularityPathSpec:
Pattern fileMatcher = Pattern.compile(filePattern);