Segment to Segment on non-aggregate data

Hi,

I have attached sample data, segment_to_segment_raw.csv.

Approach 1:

segment_to_segment_agg_approach_1.json (854 Bytes)

segment_to_segment_agg_approach_2.json (957 Bytes)

segment_to_segment_raw.csv (174 Bytes)

segment_to_segment_raw_approach_1.json (980 Bytes)

segment_to_segment_raw_approach_2.json (1.07 KB)

The following spec, adjusted from segment_to_segment_agg_approach_1.json, should work:


{
  "dataSchema": {
    "dataSource": "test1",
    "parser": {
      "parseSpec": {
        "format": "timeAndDims",
        "dimensionsSpec": {
          "dimensions": [
            "category"
          ]
        }
      }
    },
    "metricsSpec": [
      {
        "name": "Total_Quantity",
        "type": "longSum",
        "fieldName": "qty"
      },
      {
        "name": "Total_Amount",
        "type": "longSum",
        "fieldName": "amount"
      }
    ],
    "granularitySpec": {
      "type": "uniform",
      "segmentGranularity": "DAY",
      "queryGranularity": "DAY",
      "rollup": true
    }
  },
  "ioConfig": {
    "type": "index",
    "firehose": {
      "type": "ingestSegment",
      "dataSource": "segment_to_segment_raw_approach_1",
      "interval": "2019-01-01/2019-01-04",
      "metrics": [
        "qty",
        "amount"
      ]
    },
    "appendToExisting": false
  },
  "tuningConfig": {
    "type": "index"
  },
  "type": "index"
}

This adds a “metrics” list containing “qty” and “amount” to the ingest segment firehose.

If the ingest segment firehose is not given a list of dimensions, it uses the list of dimensions specified in the timeAndDims parseSpec.

If no metric column names are specified in the ingest segment firehose, it uses the list of metrics from the existing segments.

In this case, the original datasource was ingested with only dimensions and no metrics, and only the “category” column was specified as a dimension. The “qty” and “amount” columns were therefore specified as neither dimensions nor metrics, so they are not visible to the reindexing task.
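To make the fallback rules above concrete, here is a small Python sketch. It is only a toy model of the behaviour as described in this thread, not Druid's actual code; the function name `resolve_columns` and its parameters are hypothetical, and it assumes an empty list is treated the same as a missing one.

```python
def resolve_columns(firehose, parse_spec_dimensions, segment_metrics):
    """Toy model of the column-selection rules described above (not Druid code)."""
    # Dimensions: use the firehose's list if given, else the parseSpec's list.
    dimensions = firehose.get("dimensions") or parse_spec_dimensions
    # Metrics: use the firehose's list if given, else the metrics already
    # stored in the existing segments.
    metrics = firehose.get("metrics") or segment_metrics
    return list(dimensions), list(metrics)


parse_spec_dimensions = ["category"]
segment_metrics = []  # the original datasource was ingested with no metrics

# Without a "metrics" list in the firehose, qty and amount are never read.
print(resolve_columns({}, parse_spec_dimensions, segment_metrics))

# With the "metrics" list from the spec above, both columns are read.
print(resolve_columns({"metrics": ["qty", "amount"]},
                      parse_spec_dimensions, segment_metrics))
```

In this model the first call yields only “category”, which matches the problem in the original spec, while the second call also yields “qty” and “amount”, so the longSum aggregators have columns to read.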

Hi Jonathan,

Apologies for the delay in updating. Thanks for your detailed response. Your solution works.

Regards, Chari.