Hello, my setup is as follows:
- Druid 0.15.1 on a private cloud
- Hadoop EMR on AWS
I have successfully submitted an ingestion job from the Druid console, and I can see that it completed successfully in Hadoop.
However, in S3 I only see logs for the job, not the segments. Hence, segments are never loaded into Druid, and I get the error:
Caused by: java.lang.RuntimeException: No buckets?? seems there is no data to index.
Attaching my job spec and common properties:
Since the job in Hadoop terminates successfully, how can I further debug why the segments are not showing up in S3?
spec.json (2.28 KB)
common.properties (975 Bytes)
Can you check that your data is coherent with your spec interval?
If you load data from 2018 but your spec specifies an interval in 2019, the ingestion won’t fail, but you won’t have any segments.
I don’t remember exactly what to search for, but in the ingestion logs there is a line that shows how many rows were ingested.
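For reference, this is the part of the spec to check: the `intervals` field of the `granularitySpec`. Rows whose timestamps fall outside these intervals are silently thrown away. A sketch (the interval value here is illustrative, not taken from the attached spec.json):

```json
"granularitySpec": {
  "type": "uniform",
  "segmentGranularity": "DAY",
  "queryGranularity": "NONE",
  "intervals": ["2019-08-01/2019-09-01"]
}
```

If every row lands outside the listed intervals, the indexer finds nothing to build, which matches the "No buckets?? seems there is no data to index" error.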
Yes, I’ve checked the sample file I am using, and all the timestamps fall within the ingestion interval I have set in the spec.
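One way to double-check this outside of Druid is a quick script that compares the raw timestamps against the spec interval. A minimal sketch, assuming epoch-seconds timestamps and an illustrative interval (both the sample values and the interval are hypothetical, not taken from the attached files):

```python
from datetime import datetime, timezone

# Hypothetical epoch-seconds timestamps pulled from the data file
sample_timestamps = [1565000000, 1565086400, 1565172800]

# Hypothetical spec interval, matching granularitySpec "intervals"
interval_start = datetime(2019, 8, 1, tzinfo=timezone.utc)
interval_end = datetime(2019, 9, 1, tzinfo=timezone.utc)

def in_interval(epoch_seconds: int) -> bool:
    """Return True if the timestamp falls inside [start, end)."""
    ts = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return interval_start <= ts < interval_end

out_of_range = [t for t in sample_timestamps if not in_interval(t)]
print(f"{len(out_of_range)} of {len(sample_timestamps)} rows outside the interval")
```

If this reports rows outside the interval while the ingestion log shows rows thrown away, the interval (or the timestamp parsing) is the culprit.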
Are the ingestion logs on hadoop yarn?
This is the Druid task report:
“errorMsg”: “java.lang.RuntimeException: No buckets?? seems there is no data to index.”
I am guessing rowsThrownAway is the problem?
OK, the problem was that I had “auto” in my timestampSpec, but my data was in POSIX (epoch seconds). I mistakenly thought that “auto” meant it would automatically detect the timestamp format.
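For anyone hitting the same issue: Druid’s “auto” format only distinguishes ISO-8601 strings from epoch milliseconds; it does not detect epoch seconds. For POSIX data the format must be set explicitly. A sketch of the fix (the column name here is illustrative):

```json
"timestampSpec": {
  "column": "timestamp",
  "format": "posix"
}
```

With “auto”, epoch-seconds values get parsed as milliseconds, so every row lands far outside the spec interval and is thrown away, which produces exactly the “No buckets” error above.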