Can I ingest compressed (zip/lz4) files from Amazon S3 bucket directly to druid?

Anyone?

If you’re using Druid’s Hadoop indexing, you can use any compression format supported by Hadoop: at least .gz and .lzo, and possibly others. If you’re using Druid’s native indexing, only .gz is supported.
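
For reference, here is a minimal sketch of what submitting such a Hadoop task against gzipped S3 files could look like, in Python. Everything below is illustrative: the bucket, paths, and Overlord host are placeholders, the requests client is assumed to be installed, and a real spec would also need a dataSchema and tuningConfig.

import json
import requests  # third-party HTTP client, assumed available

# Illustrative Hadoop index task: the static inputSpec "paths" point
# straight at gzipped objects in S3. Bucket, object paths, and the
# Overlord host below are placeholders; dataSchema and tuningConfig
# are omitted here but required in a real spec.
task = {
    "type": "index_hadoop",
    "spec": {
        "ioConfig": {
            "type": "hadoop",
            "inputSpec": {
                "type": "static",
                "paths": "s3n://my-bucket/events/2017-01-01/part-00000.gz"
            }
        }
    }
}

# Submit the task to the Overlord's task endpoint.
resp = requests.post(
    "http://overlord-host:8090/druid/indexer/v1/task",
    data=json.dumps(task),
    headers={"Content-Type": "application/json"},
)
print(resp.status_code, resp.text)

Hadoop infers the codec from the file extension, which is why .gz inputs are decompressed transparently during the MapReduce job.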

Hi,
I tried Hadoop ingestion with gz and lz4 files.

While the gz task finished fine, the lz4 task failed with the following stack trace:

2017-01-03T13:13:09,116 ERROR [task-runner-0-priority-0] io.druid.indexer.IndexGeneratorJob - [File /tmp/druid-indexing/mydatasource/2017-01-03T131146.016Z/bf1aaf5eb303432dbb996a72435beab2/segmentDescriptorInfo does not exist.] SegmentDescriptorInfo is not found usually when indexing process did not produce any segments meaning either there was no input data to process or all the input events were discarded due to some error
Caused by: java.io.FileNotFoundException: File /tmp/druid-indexing/mydatasource/2017-01-03T131146.016Z/bf1aaf5eb303432dbb996a72435beab2/segmentDescriptorInfo does not exist.
	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:795) ~[?:?]
	at org.apache.hadoop.hdfs.DistributedFileSystem.access$700(DistributedFileSystem.java:106) ~[?:?]
	at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:853) ~[?:?]
	at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:849) ~[?:?]
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[?:?]
	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:860) ~[?:?]
	at io.druid.indexer.IndexGeneratorJob.getPublishedSegments(IndexGeneratorJob.java:109) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexer.HadoopDruidIndexerJob$1.run(HadoopDruidIndexerJob.java:87) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexer.JobHelper.runJobs(JobHelper.java:323) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexer.HadoopDruidIndexerJob.run(HadoopDruidIndexerJob.java:94) ~[druid-indexing-hadoop-0.9.1.1.jar:0.9.1.1]
	at io.druid.indexing.common.task.HadoopIndexTask$HadoopIndexGeneratorInnerProcessing.runTask(HadoopIndexTask.java:261) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_92]
	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) ~[?:1.8.0_92]
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) ~[?:1.8.0_92]
	at java.lang.reflect.Method.invoke(Unknown Source) ~[?:1.8.0_92]
	at io.druid.indexing.common.task.HadoopTask.invokeForeignLoader(HadoopTask.java:201) ~[druid-indexing-service-0.9.1.1.jar:0.9.1.1]
	... 7 more

FWIW, the MR job completed successfully. How can I see what the MR job produced, so that I can confirm the output was indeed empty?
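
One way to check, assuming you have shell access to the cluster, is to list Druid's Hadoop working directory for the datasource (the path below is taken from the stack trace above; the default working path is /tmp/druid-indexing, so adjust it for your setup). A quick Python sketch:

import subprocess

# Recursively list Druid's Hadoop working directory to see whether the
# MR job wrote any segment output at all. The path comes from the stack
# trace above; change it to match your hadoopWorkingPath/datasource.
subprocess.run(
    ["hadoop", "fs", "-ls", "-R", "/tmp/druid-indexing/mydatasource"],
    check=True,
)

The MapReduce job's counters (map and reduce output records) in the ResourceManager UI should also show whether any rows survived parsing.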

I am facing the same issue. How did you fix it?