Data migration in Druid

Hi guys,

I have been trying to migrate some old data to Druid by publishing it to a Kafka queue that Druid listens to. However, the data is not reaching Druid, even though I can confirm that it is being added to the Kafka queue. Any thoughts on this? Please note that the timestamps on this data are old.

Thanks,
Suresh

It is likely that the events are being dropped because of "windowTime".
If you are working with older data, it is currently recommended to
ingest data via Hadoop jobs instead of directly from Kafka.
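
To make the window check concrete, here is a minimal Python sketch of
a wall-clock window test. It is a simplified model and not Druid's
actual code: the helper name accepts_server_time, the dates, and the
10-minute window are all assumptions chosen for illustration. (What
this thread calls "windowTime" is, I believe, the windowPeriod setting
in the realtime configuration.)

```python
from datetime import datetime, timedelta, timezone


def accepts_server_time(event_time, now, window):
    """Return True if the event is no older than `window` relative to `now`.

    Hypothetical helper modelling a wall-clock window check;
    not Druid's implementation.
    """
    return event_time >= now - window


now = datetime(2015, 1, 10, 12, 0, tzinfo=timezone.utc)  # made-up "current" time
window = timedelta(minutes=10)  # assumed value, for illustration only

fresh_event = now - timedelta(minutes=2)
old_event = now - timedelta(days=30)  # e.g. migrated historical data

print(accepts_server_time(fresh_event, now, window))  # True
print(accepts_server_time(old_event, now, window))    # False
```

Under this model, any event with a timestamp far in the past falls
outside the window and is discarded, no matter how recently it was
written to Kafka.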

If you don't run Hadoop and don't want to set it up, some people have
successfully created setups where they ingest data directly from Kafka
topics. This requires setting the "rejectionPolicy" to "messageTime"
and ensuring that your data is being delivered in time order
(http://druid.io/docs/latest/Realtime-ingestion.html).
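
The following Python sketch shows why both time order and the window
size matter with a message-time policy. It is a simplified model under
the assumption that the policy accepts an event when its timestamp is
within the window of the newest timestamp seen so far. The class
MessageTimeWindow and the function replay are hypothetical names, not
Druid APIs.

```python
from datetime import datetime, timedelta


class MessageTimeWindow:
    """Accept events within `window` of the newest event timestamp seen so far.

    Hypothetical class; a simplified model of a message-time rejection policy.
    """

    def __init__(self, window):
        self.window = window
        self.max_seen = None

    def accept(self, event_time):
        # Advance the high-water mark whenever a newer event arrives.
        if self.max_seen is None or event_time > self.max_seen:
            self.max_seen = event_time
        return event_time >= self.max_seen - self.window


def replay(events, window):
    """Return the events that would be kept, in arrival order."""
    policy = MessageTimeWindow(window)
    return [e for e in events if policy.accept(e)]


base = datetime(2014, 1, 1)  # made-up start of the historical data
hours = [base + timedelta(hours=h) for h in range(6)]  # six hourly events

in_order = hours
out_of_order = list(reversed(hours))  # newest event arrives first

print(len(replay(in_order, timedelta(hours=1))))      # 6
print(len(replay(out_of_order, timedelta(hours=1))))  # 2
print(len(replay(out_of_order, timedelta(hours=6))))  # 6
```

Events delivered in time order are all kept, because each one becomes
the new high-water mark. Out-of-order delivery loses whatever falls
behind the window, and a wider window tolerates more disorder.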

You should also look into enabling metrics logging. This will
provide some log lines that tell you whether messages are being
ingested or dropped on the floor (events/processed, events/thrownAway,
events/unparseable). This can be done by setting
druid.emitter=logging
(http://druid.io/docs/latest/Configuration.html). Also, in order to
avoid lots of log spew, make sure you don't have the other loggers
configured (set druid.monitoring.monitors=).
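
As a rough guide to reading those three counters, here is a small
Python sketch. The diagnose helper is hypothetical and assumes you
have already collected the counts yourself (by hand or with your own
tooling); it does not parse Druid's log output, whose exact format is
not covered here.

```python
def diagnose(metrics):
    """Return a short hint based on ingestion event counters.

    `metrics` maps metric names to counts. Hypothetical helper; the
    interpretation is a rule of thumb, not an official diagnostic.
    """
    processed = metrics.get("events/processed", 0)
    thrown_away = metrics.get("events/thrownAway", 0)
    unparseable = metrics.get("events/unparseable", 0)

    if unparseable > 0:
        return "some events failed to parse; check the data format"
    if thrown_away > 0:
        return "events were rejected; check the rejection policy and window"
    if processed > 0:
        return "events are being ingested"
    return "no events seen; check that the consumer is reading the topic"


# Example: nothing ingested, everything thrown away.
print(diagnose({"events/processed": 0, "events/thrownAway": 5000}))
```

For old data sent through Kafka, a high events/thrownAway count with
events/processed at zero would point at the window settings discussed
above.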

--Eric

Thanks Eric. We had set messageTime, but the windowTime duration was too short. We have modified the duration, and it works now.

-Suresh