Picking up indexing after a successful job (but the peon disconnected)


I got Hadoop indexing working and both jobs succeeded: determine_partitions and the indexing job, which generates segments/partitions in HDFS.

The issue is that the peon lost connectivity to ZooKeeper and other parts of the cluster, and restarted the whole process. I killed the newly created job.

The indexing job is now finishing, but apart from HDFS, the rest of the system (e.g. ZooKeeper, MySQL) will probably not be populated with the segment metadata and announcements.

Is there a way to continue the indexing process by creating metadata for the data already “imported” into HDFS, i.e. the segment files created by the Hadoop jobs?

Can I write a script to insert into MySQL and/or ZooKeeper so that the coordinator knows there are more segments to distribute to the historical nodes?

Thank you,


Hi Nicolae, you can just reindex the data. Druid is smart enough to use only the most recent version of the indexed data for a given interval of time.
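
To illustrate the "most recent version wins" idea, here is a rough sketch in Python. It is not Druid's actual code: the `Segment` tuple, the `latest_per_interval` helper, and the sample paths are all made up for illustration, and it assumes versions are ISO-8601 timestamps in a single format (so they compare correctly as strings) and that reindexed segments cover exactly the same interval as the old ones.

```python
from collections import namedtuple

# Hypothetical record for illustration; not Druid's real segment descriptor.
Segment = namedtuple("Segment", ["interval", "version", "path"])


def latest_per_interval(segments):
    """Keep only the segment with the highest version for each interval.

    Assumes versions are ISO-8601 timestamps in the same format, so plain
    string comparison orders them chronologically.
    """
    newest = {}
    for seg in segments:
        current = newest.get(seg.interval)
        if current is None or seg.version > current.version:
            newest[seg.interval] = seg
    return newest


segments = [
    # First indexing run (the one whose metadata may be missing).
    Segment("2014-01-01/2014-01-02", "2014-02-01T00:00:00.000Z", "hdfs://old/a"),
    # Reindex of the same interval, with a later version.
    Segment("2014-01-01/2014-01-02", "2014-02-03T00:00:00.000Z", "hdfs://new/a"),
    Segment("2014-01-02/2014-01-03", "2014-02-03T00:00:00.000Z", "hdfs://new/b"),
]

for interval, seg in sorted(latest_per_interval(segments).items()):
    print(interval, seg.version, seg.path)
```

In other words, a reindex of the same interval produces segments with a newer version, and only those get used, so leftover files from the earlier run should not cause duplicate data.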