Kinesis GZIP Compression

I'm using the druid-kinesis-indexing-service to read a Kinesis stream whose records are GZIP-compressed. Is there a way in Druid to decompress the GZIP data? In Python I’ve used the zlib library for the decompression: decompressed_str = zlib.decompress(value_zip, 16+zlib.MAX_WBITS)
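For reference, the Python approach mentioned above can be sketched as a self-contained round trip. The sample payload and the helper name decompress_gzip_record are made up for illustration; only the zlib.decompress call with 16 + zlib.MAX_WBITS comes from the snippet in the question.

```python
import gzip
import json
import zlib


def decompress_gzip_record(value_zip: bytes) -> str:
    """Decompress one GZIP-framed record and return it as UTF-8 text.

    Passing 16 + zlib.MAX_WBITS tells zlib to expect a gzip header and
    trailer instead of a raw zlib stream.
    """
    return zlib.decompress(value_zip, 16 + zlib.MAX_WBITS).decode("utf-8")


# Hypothetical record, compressed the way a producer might do it.
sample = json.dumps({"timestamp": "2020-01-01T00:00:00Z", "value": 1})
value_zip = gzip.compress(sample.encode("utf-8"))

decompressed_str = decompress_gzip_record(value_zip)
print(decompressed_str)
```

This only shows the client-side decompression step; it says nothing about whether the Druid Kinesis indexing service can do the same.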

These are the only compression settings that I’m seeing. Not sure if there is something else that I may be missing.

https://druid.apache.org/docs/latest/development/extensions-core/kinesis-ingestion.html

dimensionCompression (String): Compression format for dimension columns. Choose from LZ4, LZF, or uncompressed. Required: no (default == LZ4)
metricCompression (String): Compression format for metric columns. Choose from LZ4, LZF, uncompressed, or none. Required: no (default == LZ4)

Any help is appreciated.

Hi Huey,
Not sure if I am getting this correctly.

You have gzip data in your Kinesis stream. If you are able to ingest data into Druid from the Kinesis stream successfully, why are you worried about decompressing the data in Druid?

Not sure if I am missing any important point here. You should be able to query your data in Druid once you have ingested it successfully.

Are you getting any error related to gzip compression when you try to ingest data into Druid?

Thanks,

–siva

Hi Siva,

Yes, I’m receiving errors when trying to parse the data. I’m able to connect to the stream, but the data coming through shows up as special characters. In Python, I was able to decompress the data when reading the stream in order to see the information.

Attached is a screenshot of what I see when connecting. I also get an error message stating "Unable to parse row" when clicking on parse data, along with more special characters being shown.
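One way to confirm that the special characters really are GZIP-compressed bytes, and not some other encoding problem, is to look at the first two bytes of a raw record: a gzip stream always starts with 0x1f 0x8b. A minimal sketch, assuming the raw record bytes are available in Python; the helper name looks_like_gzip and the sample payload are hypothetical.

```python
import gzip

# Every gzip stream begins with these two magic bytes.
GZIP_MAGIC = b"\x1f\x8b"


def looks_like_gzip(raw: bytes) -> bool:
    """Return True if the payload starts with the gzip magic number."""
    return raw[:2] == GZIP_MAGIC


# Hypothetical payloads standing in for records read from the stream.
plain = b'{"value": 1}'
compressed = gzip.compress(plain)

print(looks_like_gzip(compressed))
print(looks_like_gzip(plain))
```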