Parquet ingestion using static-s3 firehose

We currently use the static-s3 firehose with the index task to ingest data into our Druid cluster, but this requires us to convert our Parquet data to CSV before ingestion. I would like to use the Parquet ingestion extension without having to set up Hadoop, but the static-s3 firehose seems to require a StringInputRowParser, which causes an error with the Parquet extension. Is there a way to work around the firehose so that we can read the Parquet files we have on S3 without setting up a Hadoop cluster?
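For context, here is a rough sketch of the kind of index task spec this setup implies: a static-s3 firehose paired with a string parser reading CSV. It is only an illustration, not our actual spec. The helper name `build_index_task`, the bucket, the datasource, and the column names are all made up, and the granularity and metric settings are placeholder assumptions.

```python
import json


def build_index_task(datasource, s3_uris, columns, timestamp_column):
    """Build a native index task spec that reads CSV files via the static-s3 firehose.

    Hypothetical helper for illustration; all argument values are assumptions.
    """
    # Every column except the timestamp is treated as a dimension.
    dimensions = [c for c in columns if c != timestamp_column]
    return {
        "type": "index",
        "spec": {
            "dataSchema": {
                "dataSource": datasource,
                # The firehose hands over text rows, so the parser is a string parser.
                "parser": {
                    "type": "string",
                    "parseSpec": {
                        "format": "csv",
                        "columns": columns,
                        "timestampSpec": {"column": timestamp_column, "format": "auto"},
                        "dimensionsSpec": {"dimensions": dimensions},
                    },
                },
                # Placeholder metric and granularity settings (assumed).
                "metricsSpec": [{"type": "count", "name": "count"}],
                "granularitySpec": {
                    "type": "uniform",
                    "segmentGranularity": "day",
                    "queryGranularity": "none",
                },
            },
            "ioConfig": {
                "type": "index",
                "firehose": {"type": "static-s3", "uris": s3_uris},
            },
        },
    }


if __name__ == "__main__":
    task = build_index_task(
        "events",                            # hypothetical datasource
        ["s3://example-bucket/events.csv"],  # hypothetical CSV converted from Parquet
        ["ts", "user", "country"],           # hypothetical columns
        "ts",
    )
    print(json.dumps(task, indent=2))
```

The string parser in `dataSchema` is the piece that forces the CSV conversion step: swapping in the Parquet parser here is what triggers the error described above.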


As far as I know, using the Hadoop index task is the only available way to load Parquet files from S3. I raised



On Thu, Apr 5, 2018 at 7:08 AM, hcohen@triplelift.com wrote:

Hi Jihoon,

Thanks for opening the issue! I'll see if we can set up the Hadoop index task in the meantime.