Building extension

Hi All,

I’m planning to use the Druid Parquet extension to feed Parquet data directly into Druid. While setting this up, I noticed that Druid 0.9.1.1 uses Parquet 1.8. However, we wrote that Parquet data in the past with Spark 1.6.1, which nominally uses parquet-mr 1.6; although the error complains about 1.6, it is actually using 1.7 (https://issues.apache.org/jira/browse/SPARK-10954).
Now the Druid extension complains about the version when reading the files. I have built the Parquet extension against Parquet 1.7 and it works fine. I also noticed that the Parquet extension depends on the Avro extension, which is part of core. Is there a better way to build just this one extension without rebuilding the whole tarball from source?

org.apache.parquet.VersionParser$VersionParseException: Could not parse created_by: parquet-mr version 1.6.0 using format: (.+) version ((.*) )?\(build ?(.*)\)
    at org.apache.parquet.VersionParser.parse(VersionParser.java:112)

Thank you

Regards

Biswajit

To avoid version conflicts, I usually build a “shaded” jar. You just need to configure Maven to use the shade plugin and specify which packages should be shaded.
Shading renames the package so that it doesn’t collide with a package in core that has the same name but a different version.

Here is an example.

https://github.com/knoguchi/druid/blob/protobuf/extensions-core/protobuf-extensions/pom.xml#L71-L91

“com.google.protobuf” package in my extension was renamed to “shaded.com.google.protobuf”
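
For the Parquet case in the question, the same idea might look roughly like the fragment below. This is only a sketch and is not copied from the linked pom: the “shaded.” prefix is an arbitrary choice, the plugin version is left out on the assumption that it is managed elsewhere (for example in a parent pom), and I have not checked that relocating “org.apache.parquet” alone is enough for the Druid Parquet extension.

```xml
<!-- Sketch only: goes into the extension's own pom.xml. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <!-- Version omitted here; assumed to be managed by a parent pom. -->
      <executions>
        <execution>
          <!-- Run the shade goal when the extension jar is packaged. -->
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <relocations>
              <relocation>
                <!-- Package bundled with the extension (Parquet 1.7 classes). -->
                <pattern>org.apache.parquet</pattern>
                <!-- Hypothetical new name; any unused prefix would do. -->
                <shadedPattern>shaded.org.apache.parquet</shadedPattern>
              </relocation>
            </relocations>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

With a relocation like this, the extension's classes are rewritten to refer to the renamed package inside the shaded jar, so they should no longer pick up a different Parquet version from the core classpath.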

Please note that the resulting jar file will be much bigger because it includes all the dependencies. Such a jar is sometimes called a “fat” jar.