Any experience running druid on MapR hadoop as deep storage?

Would greatly appreciate it if someone could share their experience running Druid on MapR as deep storage. Any known issues or successful deployments would be helpful.

Hi Pushkar, MapR’s proprietary file store sounds like it should be fine for Druid. You may need to create a deep storage extension for this store, though.

Thanks. Just to confirm, does Druid already have such an indexer for open-source Hadoop?

Hi Pushkar, yes it does. See: http://druid.io/docs/latest/ingestion/batch-ingestion.html
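For reference, a Hadoop batch ingestion job is submitted as an "index_hadoop" task spec. A minimal sketch follows; the datasource name, column names, interval, and input path are all hypothetical placeholders, not anything from this thread:

```json
{
  "type": "index_hadoop",
  "spec": {
    "dataSchema": {
      "dataSource": "example_datasource",
      "parser": {
        "type": "hadoopyString",
        "parseSpec": {
          "format": "json",
          "timestampSpec": { "column": "timestamp", "format": "auto" },
          "dimensionsSpec": { "dimensions": ["page", "country"] }
        }
      },
      "metricsSpec": [ { "type": "count", "name": "count" } ],
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "DAY",
        "queryGranularity": "NONE",
        "intervals": ["2015-01-01/2015-01-02"]
      }
    },
    "ioConfig": {
      "type": "hadoop",
      "inputSpec": {
        "type": "static",
        "paths": "hdfs://namenode:9000/path/to/data.json"
      }
    }
  }
}
```

The "paths" value is whatever your Hadoop file system layer resolves, which is the piece that would differ on MapR.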

A quick follow up.
MapR does support the Hadoop APIs on top of its own file system implementation, so I wanted to understand specifically which portion of the current Hadoop deep storage extension you suspect might not be compatible with MapR.

Regards,

Pushkar

Hey Pushkar,

There are sometimes dependency conflicts with non-Apache distributions of Hadoop. We have selected versions of libraries like Guava and Jackson that work well with the official Apache distribution, but sometimes other vendors elect to use older or newer versions. You’re more likely to have issues there than with the actual Hadoop API. If you do have dependency issues, you can usually fix them by playing around with which jars Druid is using.

See here for some more tips: http://druid.io/docs/latest/operations/other-hadoop.html
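As a concrete illustration of the knobs that page describes, a task can pin the Hadoop client version via "hadoopDependencyCoordinates" and isolate classloaders through "jobProperties". This is a sketch only; the coordinate version shown is illustrative, so substitute whatever matches your cluster:

```json
{
  "type": "index_hadoop",
  "hadoopDependencyCoordinates": ["org.apache.hadoop:hadoop-client:2.7.0"],
  "spec": {
    "tuningConfig": {
      "type": "hadoop",
      "jobProperties": {
        "mapreduce.job.classloader": "true"
      }
    }
  }
}
```

Setting "mapreduce.job.classloader" to true asks MapReduce to use a separate classloader for job classes, which often sidesteps the Guava/Jackson clashes mentioned above.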

I doubt that you need a new extension. MapR-FS implements standard file system semantics. Just treat it like any ordinary shared storage … think of it like an uber NetApp. MapR’s platform exposes other APIs as well, such as HDFS, NoSQL, and streaming, but files should be all you really need.
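If MapR-FS is reached through its HDFS-compatible API, the deep storage configuration could look like the existing HDFS setup. A hedged sketch, assuming the "maprfs:///" scheme and path are placeholders for your environment:

```properties
# common.runtime.properties — assumes MapR-FS is exposed through its
# HDFS-compatible API; the maprfs:/// scheme and path are assumptions
druid.extensions.loadList=["druid-hdfs-storage"]

druid.storage.type=hdfs
druid.storage.storageDirectory=maprfs:///druid/segments
```

You would also need the MapR client jars and Hadoop configuration files on the Druid classpath so the file system scheme resolves.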