Hadoop ingestion, granularity input spec

Hiya,

I can’t seem to find this spelled out anywhere.

For hadoop ingestion, the granularity input spec, is the filePattern expected to be a regex?

https://github.com/apache/incubator-druid/blame/master/docs/content/ingestion/hadoop.md#L135

Thanks,

Dyana

I had thought the same initially, but that does not need to be. You can provide the directory path. In the below sample, i am having the data loaded from a AVRO based hive table…

“ioConfig”: {

 "type" : "hadoop",

 "inputSpec" : {

   "type" : "static",

“inputFormat”: “io.druid.data.input.avro.AvroValueInputFormat”,

   "paths" : "s3://my-s3-bucket/my-hive-table/"

 }

},

Yes, filePattern is a regex, it’s used as follows in GranularityPathSpec:


Pattern fileMatcher = Pattern.compile(filePattern);

  • Jon