We have been using Druid for some time to handle real-time ingestion of data, and that has been going well.
Now I need to add previously missing historical data for one of our development partners. We have taken the partner's data and reformatted it to match the “real-time messages” we have been ingesting.
This old data:
- belongs to an existing datasource
- spans several months of data
- is not strictly ordered (not in 100% chronological order)
- only creates earlier segments (no overlap with the data already ingested)
- is stored as json files (array of entries)
Can someone please outline what needs to be done to index this data so that it is added to the existing datasource?
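
In case the file layout matters: my understanding (an assumption, not something I have confirmed in the Druid docs) is that batch JSON ingestion generally expects one JSON object per line rather than one top-level array per file, so the files may need flattening first. A minimal sketch of that step is below. The function names and the example field names are hypothetical and only illustrate the shape of the data.

```python
import json
from pathlib import Path


def entries_to_ndjson(entries):
    """Serialize a list of entries as newline-delimited JSON, one object per line."""
    # Compact separators keep every entry on a single physical line.
    return "".join(
        json.dumps(entry, separators=(",", ":")) + "\n" for entry in entries
    )


def convert_file(src, dst):
    """Read a file holding one top-level JSON array and rewrite it as NDJSON.

    Returns the number of entries written. Entry order is left unchanged.
    """
    entries = json.loads(Path(src).read_text(encoding="utf-8"))
    if not isinstance(entries, list):
        raise ValueError(f"{src}: expected a top-level JSON array")
    Path(dst).write_text(entries_to_ndjson(entries), encoding="utf-8")
    return len(entries)


if __name__ == "__main__":
    # Hypothetical sample entries; the real field names come from our messages.
    sample = [
        {"timestamp": "2016-02-01T10:00:00Z", "value": 1},
        {"timestamp": "2016-01-15T08:30:00Z", "value": 2},
    ]
    print(entries_to_ndjson(sample), end="")
```

Each input file would go through `convert_file` once, and the resulting files would then be what the indexing task reads.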