Can we use thetaSketch more easily?

Hi guys,
I find thetaSketch very useful, but queries for user-set operations (intersection/difference/union) get very complex, and sometimes it may be impossible to build such a query at all.

Example 1 (complex query):

intersect(user_ids that visited yesterday in datasourceA, user_ids that visited today in datasourceA)

According to the docs this is a complex query: it needs multiple intervals and multiple filtered aggregators.
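To show what I mean by complex, here is a rough sketch in Python that builds the JSON for example 1, following the filtered-aggregator plus thetaSketchSetOp pattern from the thetaSketch docs. The output names (yesterday_users, today_users, returning_users), the dates, and the use of an "interval" filter on __time are my own assumptions for illustration; whether that filter type is available depends on your Druid version.

```python
import json


def build_example1_query(datasource, sketch_field, yesterday, today):
    """Build a timeseries query that intersects two days of user_id sketches.

    All output names below are made up for illustration.
    """

    def day_sketch(name, interval):
        # One filtered thetaSketch aggregator per day.
        # Assumption: the "interval" filter on __time exists in your Druid version.
        return {
            "type": "filtered",
            "filter": {
                "type": "interval",
                "dimension": "__time",
                "intervals": [interval],
            },
            "aggregator": {
                "type": "thetaSketch",
                "name": name,
                "fieldName": sketch_field,
            },
        }

    return {
        "queryType": "timeseries",
        "dataSource": datasource,
        "granularity": "all",
        # The query interval has to cover both days.
        "intervals": [yesterday, today],
        "aggregations": [
            day_sketch("yesterday_users", yesterday),
            day_sketch("today_users", today),
        ],
        "postAggregations": [
            {
                "type": "thetaSketchEstimate",
                "name": "returning_users",
                "field": {
                    "type": "thetaSketchSetOp",
                    "name": "returning_users_sketch",
                    "func": "INTERSECT",
                    "fields": [
                        {"type": "fieldAccess", "fieldName": "yesterday_users"},
                        {"type": "fieldAccess", "fieldName": "today_users"},
                    ],
                },
            }
        ],
    }


# Placeholder dates, only to make the example concrete.
query = build_example1_query(
    "datasourceA",
    "user_id",
    "2016-02-18T00:00:00Z/2016-02-19T00:00:00Z",
    "2016-02-19T00:00:00Z/2016-02-20T00:00:00Z",
)
print(json.dumps(query, indent=2))
```

That is two filtered aggregators and a nested post-aggregator for what is conceptually a single intersect call.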

Example 2 (maybe impossible query):

intersect(user_ids that visited yesterday in datasourceA, user_ids that visited yesterday in datasourceB)

(datasourceA and datasourceB may have different schemas or different names for the user_id dimension.)

To support queries like example 2, could Druid provide a new query type that computes on thetaSketches, with a grammar something like:

{
  queryType: "thetaOp",
  operation: "intersect",
  fields: [
    { dataSource: "datasourceA", intervals: (yesterday), targetAggregator: { type: "thetaSketch", fieldName: "user_id" } },
    { dataSource: "datasourceA", intervals: (today), targetAggregator: { type: "thetaSketch", fieldName: "user_id" } }
  ]
}

{
  queryType: "thetaOp",
  operation: "intersect",
  fields: [
    { dataSource: "datasourceA", intervals: (yesterday), targetAggregator: { type: "thetaSketch", fieldName: "user_id" } },
    { dataSource: "datasourceB", intervals: (yesterday), targetAggregator: { type: "thetaSketch", fieldName: "user_id" } }
  ]
}

And if you are not considering such a new query type, do you have any suggestions for the "example 2" case? Maybe I could use the thetaSketch library to compute the intersection myself?
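To make the client-side idea concrete, here is a toy illustration in pure Python of why two sketches built independently (one per datasource) can still be intersected afterwards. This is NOT the DataSketches library and not its API: ToyThetaSketch, its methods, and the choice of SHA-256 as the hash are all made up for this sketch. It only shows the principle, assuming both sides hash user_id the same way.

```python
import hashlib

MAX_HASH = 2 ** 64  # hashes are 64-bit integers in [0, MAX_HASH)


def _hash64(item: str) -> int:
    """Map an item to a 64-bit integer (toy choice: first 8 bytes of SHA-256)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class ToyThetaSketch:
    """Simplified theta sketch: keeps every hash strictly below a threshold theta."""

    def __init__(self, theta=MAX_HASH, hashes=()):
        self.theta = theta
        self.hashes = set(hashes)

    @classmethod
    def build(cls, items, k=1024):
        hashes = sorted({_hash64(x) for x in items})
        if len(hashes) <= k:
            # Few distinct items: keep them all, the sketch is exact.
            return cls(MAX_HASH, hashes)
        # Otherwise keep the k smallest hashes; theta is the (k+1)-th smallest.
        return cls(hashes[k], hashes[:k])

    def estimate(self) -> float:
        # Retained hashes are a uniform sample with sampling rate theta / MAX_HASH.
        return len(self.hashes) * MAX_HASH / self.theta

    def intersect(self, other: "ToyThetaSketch") -> "ToyThetaSketch":
        # Use the smaller threshold and keep hashes seen on both sides below it.
        theta = min(self.theta, other.theta)
        common = {h for h in self.hashes & other.hashes if h < theta}
        return ToyThetaSketch(theta, common)


# Hypothetical data: users seen yesterday in datasourceA and in datasourceB.
yesterday_a = ToyThetaSketch.build(f"user_{i}" for i in range(100))
yesterday_b = ToyThetaSketch.build(f"user_{i}" for i in range(50, 150))
both = yesterday_a.intersect(yesterday_b)
print(both.estimate())
```

So in principle I could run one thetaSketch query per datasource, fetch the raw sketches, and do the intersection on the client with the real library. I have not verified how Druid serializes the sketches it returns, so that part is still an open question.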

Yes, it is hard to do look-back queries.
At Yahoo we are currently working on a new type of query that will do something similar. It is still in development and I am not sure when it will be released, but it should come soon!

Thanks for the response, glad to hear that.

On Friday, February 19, 2016 at 10:39:27 PM UTC+8, Slim Bouguerra wrote: