I am storing my raw CSV data as bz2 files on HDFS. Is there any way I can get Druid to decompress and ingest these bz2 files on the fly instead of having to decompress them myself first?
I store my TSV files in HDFS with Snappy compression, and all I do is make sure the input files end with “.snappy”. The index task reads these files without any problem. If your Hadoop cluster supports Bzip2 compression, the index task should have no problem reading your files as long as they have a “.bz2” suffix.
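To illustrate, here is a minimal sketch of a Hadoop-based batch ingestion spec where the input paths point directly at compressed files; Hadoop's input format selects the decompression codec from the file suffix, so no explicit compression setting is needed. The HDFS host, port, and paths below are placeholders, and the surrounding `dataSchema`/`tuningConfig` sections are omitted:

```json
{
  "type": "index_hadoop",
  "spec": {
    "ioConfig": {
      "type": "hadoop",
      "inputSpec": {
        "type": "static",
        "paths": "hdfs://namenode:8020/data/raw/events-part1.csv.bz2,hdfs://namenode:8020/data/raw/events-part2.csv.bz2"
      }
    }
  }
}
```

The same spec works unchanged for Snappy-compressed files by swapping the suffix to “.snappy”, provided the cluster's Hadoop native libraries include that codec.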
That’s perfect! Thanks