I’m currently trying to ingest a file that has a nested JSON structure. The timestamp is an attribute of a nested property, and I’m only interested in day-level aggregation. If I understand correctly, I need to use a flattenSpec so that I can reference the nested timestamp in the timestampSpec.
However, I noticed that when I add the timestamp to the flattenSpec, the disk space on my EC2 instance gets consumed much faster than when I leave it out. I’m not sure how else I can specify the nested timestamp property in the timestampSpec.
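To make the setup concrete, here is a rough sketch of the kind of spec I mean, written as a small Python script that prints the JSON. The nested path `$.event.timestamp` and the column name `event_ts` are placeholders I made up for illustration, not my real schema, and the dimensionsSpec and intervals are left out for brevity.

```python
import json

# Hypothetical names, not taken from the real data:
NESTED_TS_PATH = "$.event.timestamp"  # placeholder JsonPath to the nested timestamp
FLAT_TS_COLUMN = "event_ts"           # placeholder name for the flattened column


def build_parse_spec(nested_path, flat_column):
    """Return a parseSpec dict whose timestampSpec reads a flattened nested field."""
    return {
        "format": "json",
        "flattenSpec": {
            "useFieldDiscovery": True,
            "fields": [
                # A "path" field pulls the nested value up into a top-level column.
                {"type": "path", "name": flat_column, "expr": nested_path},
            ],
        },
        # The timestampSpec then refers to the flattened column by name.
        "timestampSpec": {"column": flat_column, "format": "auto"},
        # dimensionsSpec omitted here; a real spec would list its dimensions.
    }


parse_spec = build_parse_spec(NESTED_TS_PATH, FLAT_TS_COLUMN)

# Day-level rollup, since only the day aggregation matters here.
# "intervals" is omitted; a real batch spec would normally set it.
granularity_spec = {
    "type": "uniform",
    "segmentGranularity": "DAY",
    "queryGranularity": "DAY",
}

print(json.dumps({"parseSpec": parse_spec, "granularitySpec": granularity_spec}, indent=2))
```

The point of the sketch is only that the timestampSpec `column` matches the `name` of the flattened field; whether the timestampSpec can skip the flattenSpec entirely is exactly what I’m unsure about.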
With regard to this, I have a couple of questions:
- Is there a way to specify the nested timestamp path in the timestampSpec without adding it to the flattenSpec?
- During ingestion, does Druid download and save the flattened JSON on the node’s local disk? If I have thousands of files, does that mean each one gets written as flattened JSON to the nodes’ local disk?
I’m currently using the quickstart version of Druid 0.9.2.