Ingesting data with a nested timestamp for day-level aggregation


I’m trying to ingest a file that has a nested JSON structure. The timestamp is an attribute of a nested property, and I’m only interested in aggregating by day. If I understand this correctly, I need to use a flattenSpec so that I can refer to the nested timestamp in the timestampSpec.
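
To make the setup concrete, here is a minimal sketch of the relevant spec fragments, written as a small Python script that prints the JSON. The nested path `$.event.timestamp` and the flattened column name `event_timestamp` are hypothetical placeholders, the helper function names are made up for illustration, and the rest of the ingestion spec (dataSource, intervals, ioConfig, and so on) is omitted. The field layout reflects my understanding of the JSON parseSpec and uniform granularitySpec, so treat it as an assumption rather than a verified spec.

```python
import json

# Hypothetical names, for illustration only: replace the path and the
# flattened column name with the ones that match the real data.
NESTED_TIMESTAMP_PATH = "$.event.timestamp"
FLATTENED_TIMESTAMP_NAME = "event_timestamp"


def build_parse_spec(path, name):
    """Return a JSON parseSpec whose timestampSpec reads a flattened nested field."""
    return {
        "format": "json",
        "flattenSpec": {
            # Keep automatic discovery of top-level fields.
            "useFieldDiscovery": True,
            "fields": [
                # JsonPath expression that lifts the nested value to the top level.
                {"type": "path", "name": name, "expr": path},
            ],
        },
        # The timestampSpec refers to the flattened name, not the nested path.
        "timestampSpec": {"column": name, "format": "auto"},
        # Dimension list left empty in this sketch.
        "dimensionsSpec": {"dimensions": []},
    }


def build_granularity_spec():
    """Return a uniform granularitySpec that truncates timestamps to the day."""
    return {
        "type": "uniform",
        "segmentGranularity": "DAY",
        "queryGranularity": "day",
        # "intervals" is omitted here; it depends on the data being ingested.
    }


if __name__ == "__main__":
    fragment = {
        "parser": {
            "type": "string",
            "parseSpec": build_parse_spec(
                NESTED_TIMESTAMP_PATH, FLATTENED_TIMESTAMP_NAME
            ),
        },
        "granularitySpec": build_granularity_spec(),
    }
    print(json.dumps(fragment, indent=2))
```

The point of the sketch is that the timestampSpec `column` matches the `name` given in the flattenSpec field, which is the only way I currently know to reach the nested value.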

However, I noticed that when I add the timestamp to the flattenSpec, the disk space on my EC2 instance is consumed much faster than when I leave it out. I’m not sure how else I can specify the nested timestamp property in the timestampSpec.

Regarding this, I have a couple of questions:

  • Is there a way to specify the nested timestamp path in the timestampSpec without adding it to the flattenSpec?
  • During ingestion, does Druid download the input and save the flattened JSON on the node’s local disk? If I have thousands of files, does that mean each one will be written as flattened JSON to the nodes’ local disks?

I’m currently using the quickstart version of Druid 0.9.2.