Druid-spark connector

It seems the druid-spark connector is very old. What are the alternatives for loading data from druid into spark?

Hey Rajiv,

For Druid -> Spark, try checking out https://github.com/implydata/druid-hadoop-inputformat, an unofficial Druid InputFormat. Spark can use it to read Druid data into an RDD; see the example in the README. Be aware that it is unofficial and currently unmaintained, so you would be taking on some maintenance effort if you adopt it.
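For reference, reading through a custom Hadoop InputFormat generally follows Spark's `newAPIHadoopRDD` pattern. A rough Scala sketch of that pattern; note that `DruidInputFormat`, `DruidRow`, and the configuration keys below are placeholders I made up, and the real class and key names are in that project's README:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.NullWritable
import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("druid-read"))

// Placeholder configuration keys: the actual keys (datasource name,
// interval, coordinator address, etc.) are defined by the InputFormat.
val conf = new Configuration()
conf.set("druid.datasource", "my_datasource")
conf.set("druid.interval", "2019-01-01/2019-02-01")

// DruidInputFormat / DruidRow stand in for whatever classes the
// library actually exposes; substitute the real ones from the README.
val rdd = sc.newAPIHadoopRDD(
  conf,
  classOf[DruidInputFormat],
  classOf[NullWritable],
  classOf[DruidRow]
)

println(rdd.count())
```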

For Spark -> Druid, there's an unofficial Druid/Spark adapter at https://github.com/metamx/druid-spark-batch. If you want to stick with officially supported components, the best approach is to use Spark to write data to HDFS or S3, and then ingest it into Druid using Druid's Hadoop-based or native batch ingestion. (Or write it to Kafka using Spark Streaming and ingest from Kafka into Druid using Druid's Kafka indexing service.)
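To make the official route concrete, here is a sketch of a Druid native batch (`index_parallel`) ingestion spec for JSON files Spark has written to S3. The bucket path, datasource name, and column names are made-up examples, and the `inputSource`/`inputFormat` layout assumes a reasonably recent Druid version (older releases used a firehose-based ioConfig instead):

```json
{
  "type": "index_parallel",
  "spec": {
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": {
        "type": "s3",
        "prefixes": ["s3://my-bucket/spark-output/"]
      },
      "inputFormat": { "type": "json" }
    },
    "dataSchema": {
      "dataSource": "my_datasource",
      "timestampSpec": { "column": "ts", "format": "iso" },
      "dimensionsSpec": { "dimensions": ["user", "country"] },
      "granularitySpec": {
        "segmentGranularity": "day",
        "queryGranularity": "none"
      }
    },
    "tuningConfig": { "type": "index_parallel" }
  }
}
```

You submit a spec like this to the Overlord (`POST /druid/indexer/v1/task`), and Druid handles segment creation and publishing from there.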

For Spark -> Druid, there's this new library (unofficial).

Sorry to bump an old thread, but this was one of the top results when searching for “druid spark ingest”.