Subtract two column from two different table: Apache Druid- Showing Error. "Can not build plan for mentioned Query"

I have two different table having timestamp column. I wanted to simply subtract these two column but getting error:

Select (select TIMESTAMP_TO_MILLIS(HD_start_time) FROM t ) - (Select TIMESTAMP_TO_MILLIS(HD_End_time) from f) as my1

Error: druid error: Unknown exception (org.apache.druid.java.util.common.ISE): Cannot build plan for query:.

Any input would be highly appreciated.

Why not use a JOIN? The query as you have written it is non-deterministic, so each run could bring back different values for start_time from the table.

I have tried to make join but its making cross join kind of thing because we dont have any other common column to make join on that.
Even tried to generate some index column to make join but unable to generate index column as well in druid. Now Row_number supports here.

Even tried join with inequality to generate index but join with inequality also doesn’t support in druid.

So I have tried all possible ways to just subtract two table with one to one column subtract without any success.

Please suggest if any other method you know to generate at least index column in Druid .

I tried below query to generate index but it does not support in druid:

I tried this alternative but it also reflecting error : I think druid also does not support other than equality in where condition here.
SELECT
(SELECT COUNT(*) FROM table i WHERE o.__time >= i.__time )rowNum
, __time
FROM table o

Regards
Vikram

Hey Vikram!

Could you expand a little more? I am not clear why there’s a subquery if you only need to subtract one timestamp from another in each row?

Hello Peter,
Let me explain to you:

I have two table:
Table1 - col1- Start_time(timestamp) and col2(with text value “true”)
Table2- col1-End_time(timestamp) and col2(with text value “false”)

I need to calculate time difference between these two timestamp corresponding to one to one mapping .

I dont have any primary key to join these two table with one to one .

When I am trying to simple join its making cross join and creating one to many columns.

So if you can suggest something so that I can simply join these two table with one to one mapping and then make difference.
Major thing is I dont have any primary key in both tables so join them based on that.

I thought to generate index column in both table so that I can join by using that and thats why I used subquery to generate that index column.

Regards
vikram

AHAAA

So table A:

start_time something
09:00 true
09:10 true
09:15 true

Table B:

end_time something
09:06 false
09:15 false
09:23 false

And what you want is:

result
00:06
00:15
00:23

Would that be right? And that is why you want ROW_NUMBER?

This sounds very much like a situation that occurs when working with sessions – ie one table is when the session started, the other is when the session ends.

I am still thinking… but I would say that this will be not very performant in Druid even if you can find a suitable key relationship between the tables. The merge of the two tables will occur in the Broker, and this is resource constrained. Druid’s design expects that the rows returned by the historical servers during the fan-in to the Broker will be aggregates, rather than raw rows, so that the calculations and the merges will be on a very small number of returned rows – I think the 100,000 row (soft) limit will apply, for example.

If possible – and to get the very highest performance out of Druid – I would suggest some up-stream processing that can put these two stamps into the same row on one table – and if this is a calculation that will be very common in queries, to front-load the calculation either in the source data or as a transform expression at ingestion time. For example, using a processor in Kafka that will output a new topic of events emitted when the end time is known.

Hello Peter,

Yes you understood problem correctly.

Basically these two information comes in one table only and timstamp value comes in single column and one after another in below fashion.

__Time Event
9:00 TRUE
9:06 FALSE
9:10 TRUE
9:15 FALSE

So very first time I was trying to implement lead/lag function(window func) but that does not support in druid so I decided to make two separate table one for true and another for false and subtract them so that I can calculate duration of one session and so on.

As you suggested to generate some event id corresponding to one session but for now its not feasible for me to implement this. This will require other team involvement and will take more time.

So if anything you can suggest at query level as temp solution would be great help to develop report.

Regards
Vikram

I suspect you may need to go back to the data model. Druid loves selects from single tables, and loves row-wise calculations – that is where I would start in this instance.