Ingestion for columns containing lists

Hi all

I am trying to ingest data from a Kafka stream (input format JSON). A single entry could look like this:
{"ID": "X", "industry": "Farmers", "region": "A", "timestamp": [1, 2, 3, 4, 5], "volumes": [0.6, 0.3, 0.8, 0.2, 0.9]}

The ingestion spec should produce the following rows, one per list element:
[X, Farmers, A, 1, 0.6]
[X, Farmers, A, 2, 0.3]
[X, Farmers, A, 3, 0.8]
[X, Farmers, A, 4, 0.2]
[X, Farmers, A, 5, 0.9]

I have tried several ways of flattening the two list columns, but most examples produce only a single row per ingested JSON message, built from the first element of each list ([X, Farmers, A, 1, 0.6] in this case). I have also tried flattening the lists before pushing them to Kafka, but that results in a large number of (avoidable) messages. The lists in this example are short, but in reality I am working with a large number of very long time series.

Could anyone help me with the jq/path expression needed to get the desired result?

Welcome @Sergej_Jurev! I’ve been discussing this with a colleague, and he’s wondering if you can pre-process your data into something like this:

{
  "ID": "X",
  "industry": "Farmers",
  "region": "A",
  "combined": [
    {"timestamp": 1, "volume": 0.6},
    {"timestamp": 2, "volume": 0.3},
    {"timestamp": 3, "volume": 0.8},
    {"timestamp": 4, "volume": 0.2},
    {"timestamp": 5, "volume": 0.9}
  ]
}

If so, then he’s pretty close to having a solution. I’ll try to get him to post. If he can’t, I’ll post his solution.
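
In case it helps with the pre-processing step, here is a minimal Python sketch of how a producer might reshape the original message into the "combined" structure before publishing to Kafka. It assumes the message has parallel "timestamp" and "volumes" lists of equal length, as in the example above. The function name combine_series is hypothetical, and this covers only the reshaping, not the ingestion spec itself.

```python
import json


def combine_series(message: dict) -> dict:
    """Zip the parallel 'timestamp' and 'volumes' lists into one list of objects.

    Hypothetical helper: field names follow the example message in this thread.
    """
    timestamps = message["timestamp"]
    volumes = message["volumes"]
    if len(timestamps) != len(volumes):
        raise ValueError("timestamp and volumes must have the same length")

    # Copy every field except the two parallel lists.
    out = {k: v for k, v in message.items() if k not in ("timestamp", "volumes")}
    # Pair up the i-th timestamp with the i-th volume.
    out["combined"] = [
        {"timestamp": t, "volume": v} for t, v in zip(timestamps, volumes)
    ]
    return out


raw = {
    "ID": "X",
    "industry": "Farmers",
    "region": "A",
    "timestamp": [1, 2, 3, 4, 5],
    "volumes": [0.6, 0.3, 0.8, 0.2, 0.9],
}

restructured = combine_series(raw)
print(json.dumps(restructured, indent=2))
```

This still sends one Kafka message per time series, so it avoids the message blow-up of flattening on the producer side.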


Hi @Mark_Herrera. Thank you for your suggestion. This is indeed a possible structure for our messages. Do you have more information regarding the ingestion spec for your proposed message? Thanks again for your help!