Druid and machine learning

Hey guys, I want to use data from Druid for machine learning.
I want to use Spark for that. How can I import data from Druid/HDFS into a Spark RDD/DataFrame?

Should I use the SQL/query API? The total size of the training sample can be very large.

Hi,

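You may consider using https://github.com/himanshug/druid-hadoop-utils to read Druid segments stored on HDFS. It contains a Hadoop InputFormat and a Pig loader. Spark can read data through a Hadoop InputFormat (for example via SparkContext.newAPIHadoopRDD), so this should let you load the segments into an RDD without going through the query API.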

PS: This code is at a very early stage and things might change in the future.
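
If you still want to try the SQL/query API route, one way to keep each response manageable with a very large training set is to split the export into time windows and issue one query per window. Below is a minimal sketch of that idea, not a tested pipeline. It only builds the request bodies; it assumes a Druid version that exposes the SQL endpoint (POST /druid/v2/sql) and supports the "objectLines" result format. The datasource name "events", the dates, and the helper names time_windows and sql_payload are made up for illustration.

```python
import json
from datetime import datetime, timedelta


def time_windows(start, end, step):
    """Yield half-open (lo, hi) windows that together cover [start, end)."""
    lo = start
    while lo < end:
        hi = min(lo + step, end)
        yield lo, hi
        lo = hi


def sql_payload(datasource, lo, hi):
    """Build a JSON body for Druid's SQL endpoint (assumed: POST /druid/v2/sql)."""
    fmt = "%Y-%m-%d %H:%M:%S"
    query = (
        f'SELECT * FROM "{datasource}" '
        f"WHERE __time >= TIMESTAMP '{lo.strftime(fmt)}' "
        f"AND __time < TIMESTAMP '{hi.strftime(fmt)}'"
    )
    # "objectLines" asks for newline-delimited JSON, one row per line.
    return {"query": query, "resultFormat": "objectLines"}


# Hypothetical datasource and date range: one day split into 6-hour windows.
payloads = [
    sql_payload("events", lo, hi)
    for lo, hi in time_windows(
        datetime(2016, 1, 1), datetime(2016, 1, 2), timedelta(hours=6)
    )
]

# Each payload would be POSTed separately and its response written to HDFS.
print(json.dumps(payloads[0]))
```

Each response could then be saved as a file on HDFS and loaded with spark.read.json, since newline-delimited JSON is a format Spark reads directly. For really large samples, reading the segments from HDFS is probably still the better option, because every query here goes through the broker.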

Thanks,
Akash