Introspect Query Troubles

Hi Guys,

I’ve been using facetjs for the last few weeks and it is awesome. However, the one piece that has been flaky for me is the introspect query that facet always opens with. It has worked in some Druid setups, but in others it has returned empty dimensions/metrics even though there is clearly data in Druid. My current thought is that when I spoof data to push into the realtime node, I’m often just changing the timestamp and maybe a couple of other dimensions (just for the purpose of running some HDFS experiments). This seems to leave the datasource with inconsistent dimensions across its segments, which I can see in the coordinator console. Is this the root of my introspect query problems, or am I on the wrong path?

Thanks,

Michael Capeloto

I'm not sure which endpoint it is hitting, but I'm guessing it is hitting the

/druid/v2/datasources

endpoint to introspect. If that is the case, then that endpoint
cannot report any dimensions/metrics until at least one handoff has
occurred (while the segment is in the real-time world, it doesn't
know what dimensions it is going to get...). So, most likely this
happens when you hit a datasource that doesn't have any segments
handed off yet.
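For anyone who wants to check this condition by hand, here is a minimal sketch in Python. It assumes the route answers with a JSON object holding "dimensions" and "metrics" lists; the broker address, the datasource name and the helper names are all made up for illustration.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def introspection_url(broker, datasource):
    """Build the GET URL for the datasource introspection route."""
    return f"{broker.rstrip('/')}/druid/v2/datasources/{quote(datasource, safe='')}"


def is_empty_introspection(payload):
    """True when the response lists neither dimensions nor metrics."""
    return not payload.get("dimensions") and not payload.get("metrics")


def introspect(broker, datasource):
    """Fetch the introspection JSON (needs a reachable broker)."""
    with urlopen(introspection_url(broker, datasource)) as resp:
        return json.load(resp)
```

Against a live cluster, something like is_empty_introspection(introspect("http://localhost:8082", "my_datasource")) coming back True would point at the no-handoff case described above, though it does not rule out other causes.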

--Eric

Hey Eric, that is the endpoint it was hitting, but there had definitely been a handoff to the historical node, based on the coordinator console and logs. I was thinking the same thing before the handoff happened on that setup. Unfortunately I already killed those boxes, but I was wondering whether the inconsistent dimensions shown in the coordinator could be a related issue.

Thanks,

Michael

Hi,

I can confirm that facet currently uses GET /druid/v2/datasources/ for introspection.

Sometimes that route returns empty results, and Eric has pointed out the condition under which that happens.

The plan is to switch facet introspection to use the segmentMetadata query instead. Eric, would that help?

(I plan to do it anyway because segmentMetadata returns more data that facet can make use of)
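As a rough illustration of what segmentMetadata gives you, here is a sketch in Python that builds the query body and then looks for columns that are missing from some segments, which is the "inconsistent dimensions" situation Michael described. It assumes each result entry carries a "columns" object keyed by column name; the helper names, the datasource and the sample data are invented, and this is not how facet itself does it.

```python
def segment_metadata_query(datasource, intervals):
    """Build a segmentMetadata query body (intervals are ISO 8601 'start/end')."""
    return {
        "queryType": "segmentMetadata",
        "dataSource": datasource,
        "intervals": list(intervals),
    }


def column_coverage(segments):
    """Map each column name to the number of segments that contain it."""
    counts = {}
    for segment in segments:
        for name in segment.get("columns", {}):
            counts[name] = counts.get(name, 0) + 1
    return counts


def inconsistent_columns(segments):
    """Columns that appear in at least one segment but not in all of them."""
    counts = column_coverage(segments)
    return sorted(name for name, n in counts.items() if n < len(segments))


# Made-up response with two segments; only the second one has "user".
sample = [
    {"id": "seg_a", "columns": {"page": {"type": "STRING"}, "count": {"type": "LONG"}}},
    {"id": "seg_b", "columns": {"page": {"type": "STRING"}, "user": {"type": "STRING"},
                                "count": {"type": "LONG"}}},
]
print(inconsistent_columns(sample))
```

The query body would be POSTed as JSON to the broker's /druid/v2/ route. Taking the union of columns across segments (the keys of column_coverage) is one way an introspection step could stay useful even when segments disagree.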

Also, I want to make the initial introspection step optional.

Michael, if you are using facetjs via JavaScript (as opposed to the CLI) you can currently provide an explicit list of attributes to the DruidDataset and avoid the introspection step.

Let me know if you want more info on that.

Vadim

Hi Vadim,

I am using facetjs via JavaScript, but when we run with actual data, rather than the spoofed, repetitive data I was using for some testing, everything works great. I’ll let you know if introspection gets buggy again, but for now I think we’re fine. I was mainly just curious. I’m loving facetjs, though; it’s making life a lot easier.

Cheers,

Michael