I tried more batch ingestion from Hadoop yesterday, and there seems to be some data loss even though the Overlord console shows that all the tasks succeeded. I fired 10 tasks at once, each with about 1 GB of data to process, and all of them ingesting into the same datasource. I'm not sure whether running them concurrently caused the issue.
I'm splitting the ingestion this way because a single task that ingests the whole dataset fails, since the dataset is too large.
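For reference, each task is submitted with a Hadoop index task spec roughly along these lines (trimmed for brevity; the datasource name, input path, and interval are placeholders, and each of the 10 tasks points at a different input path but the same datasource):

```json
{
  "type": "index_hadoop",
  "spec": {
    "dataSchema": {
      "dataSource": "my_datasource",
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "DAY",
        "intervals": ["2015-01-01/2015-01-02"]
      }
    },
    "ioConfig": {
      "type": "hadoop",
      "inputSpec": {
        "type": "static",
        "paths": "hdfs://namenode/data/events/part-00"
      }
    }
  }
}
```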
Is this a known issue?