Druid Ingestion Job failed with no error

Relates to Apache Druid <apache-druid-0.17.0>

PUSHKAR JOSHI

to Druid User

Hello Everyone,

Recently I created a Druid cluster with the server configuration below.

  • Master Node - m5a.xlarge

  • Data Node (Middle Manager) - a1.metal

  • Data Node (Historical) - m5a.2xlarge (Mounted EBS volume 200.00 GB)

  • Query Node - m5a.xlarge

  • ZooKeeper - a1.metal

All EC2 nodes have at least 100 GB of storage.

I followed the official documentation for the setup instructions.

Druid version: apache-druid-0.17.0

We use S3 as deep storage and a MySQL database on RDS as the metadata store. The respective configuration and access are in place and working fine.

Below are the extensions used:

druid.extensions.loadList=["druid-parquet-extensions", "druid-avro-extensions", "druid-basic-security", "druid-google-extensions", "druid-protobuf-extensions", "druid-lookups-cached-global", "mysql-metadata-storage", "druid-s3-extensions", "druid-kafka-indexing-service", "druid-datasketches"]
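
A property line like this can pick up typographic quotes when it passes through a mail client or word processor, and the value is expected to be a JSON array of strings. The following is a minimal sketch for sanity-checking such a line before it goes into a properties file. The function name `parse_load_list` is hypothetical and not part of Druid.

```python
import json

# Typographic quotes that mail clients and word processors substitute for '"'.
CURLY_QUOTES = {"\u201c": '"', "\u201d": '"'}


def parse_load_list(line):
    """Parse a `druid.extensions.loadList=[...]` line into a list of names.

    Hypothetical helper: normalizes curly quotes, then requires the value
    to be a JSON array of strings.
    """
    key, _, value = line.partition("=")
    if key.strip() != "druid.extensions.loadList":
        raise ValueError(f"unexpected property: {key.strip()!r}")
    for curly, straight in CURLY_QUOTES.items():
        value = value.replace(curly, straight)
    extensions = json.loads(value)
    if not isinstance(extensions, list) or not all(
        isinstance(name, str) for name in extensions
    ):
        raise ValueError("loadList must be a JSON array of strings")
    return extensions
```

Calling `parse_load_list` on the line above should return the ten extension names; a `ValueError` or `json.JSONDecodeError` points at a malformed entry.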

The problem is that when we try to ingest data, the job fails every time. There are no specific errors on the MiddleManager or the master server.

The segment data is not being saved to S3. However, some small files are being created there, so this is not an access issue.

I tried a few troubleshooting steps:

With all ports open to the public, the jobs still failed, so we concluded that this is probably not a network issue and moved the cluster behind a private VPN.

To find the issue we took the steps below. After these steps we deleted all data from S3 and from the RDS database and restarted all services for a fresh start.

  • Changed the log level to debug to get more details.

  • Changed all file permissions to 755.

  • Changed the owner of all files to root.

  • RDS and S3 access works fine.

  • Telnet can connect on all ports between the servers.
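
The manual telnet check can be repeated from each server with a short script. This is a rough sketch only: the port numbers are the defaults of a stock Druid install plus ZooKeeper and would need adjusting if they were overridden, and the helper names are hypothetical.

```python
import socket

# Default ports of a stock Druid install plus ZooKeeper (assumption:
# none of them were overridden in runtime.properties).
DEFAULT_PORTS = {
    "coordinator": 8081,
    "broker": 8082,
    "historical": 8083,
    "overlord": 8090,
    "middleManager": 8091,
    "router": 8888,
    "zookeeper": 2181,
}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


def check_ports(targets, timeout=3.0):
    """Check an iterable of (host, port) pairs; return {(host, port): bool}."""
    return {(host, port): port_open(host, port, timeout) for host, port in targets}
```

For example, `check_ports([("master.internal", DEFAULT_PORTS["overlord"])])` run from the MiddleManager host would show whether the Overlord port is reachable; `master.internal` is a placeholder hostname.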

Can anyone please help us with this?

Please find attached some screenshots for reference. They show segments with availability false, only 1% of the data available to load, and some failed tasks. Please let me know if you need more information.

Hey @PUSHKAR_JOSHI - are there task logs being generated? Do they have any information in them?

The temporary log file is located at ${druid.indexer.task.baseTaskDir}/${taskId}/log
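
To pull the last lines of that file for a failed task, something like the sketch below could be run on the MiddleManager host. It only joins the path pattern given above and reads the end of the file; the function names are hypothetical, and the base directory and task ID have to be taken from your own configuration and task list.

```python
from collections import deque
from pathlib import Path


def task_log_path(base_task_dir, task_id):
    """Build <baseTaskDir>/<taskId>/log, following the pattern above."""
    return Path(base_task_dir) / task_id / "log"


def tail(path, max_lines=50):
    """Return the last max_lines lines of a text file, without newlines."""
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        # deque with maxlen keeps only the most recent lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=max_lines)]
```

Usage would look like `print("\n".join(tail(task_log_path(base_dir, task_id))))`, where `base_dir` is the value of `druid.indexer.task.baseTaskDir` and `task_id` is the ID of a failed task.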