Task logs disappear

Hi all,

I’m having a lot of trouble understanding why some tasks fail, because when I try to view the log file I get this: “No log was found for this task. The task may not exist, or it may not have begun running yet.”

Even when the task has SUCCESS status!

I don’t know what to do… Could you please help me? It’s getting very frustrating and I can’t move forward with the job.

In the overlord logs I see this:

2016-08-09T11:02:32,560 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_hadoop_datasource-test_2016-08-09T11:02:32.531Z] location changed to [TaskLocation{host='localhost', port=8106}].
2016-08-09T11:02:36,572 INFO [qtp377356799-87] io.druid.indexing.common.actions.LocalTaskActionClient - Performing action for task[index_hadoop_datasource-test_2016-08-09T11:02:32.531Z]: LockTryAcquireAction{interval=2016-01-01T00:00:00.000Z/2016-01-02T00:00:00.000Z}
2016-08-09T11:02:36,572 INFO [qtp377356799-87] io.druid.indexing.overlord.TaskLockbox - Task[index_hadoop_datasource-test_2016-08-09T11:02:32.531Z] already present in TaskLock[index_hadoop_datasource-test_2016-08-09T11:02:32.531Z]
2016-08-09T11:02:37,589 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.RemoteTaskRunner - Worker[localhost:8091] wrote FAILED status for task [index_hadoop_datasource-test_2016-08-09T11:02:32.531Z] on [TaskLocation{host='localhost', port=8106}]
2016-08-09T11:02:37,589 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.RemoteTaskRunner - Worker[localhost:8091] completed task[index_hadoop_datasource-test_2016-08-09T11:02:32.531Z] with status[FAILED]
2016-08-09T11:02:37,590 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.TaskQueue - Received FAILED status for task: index_hadoop_datasource-test_2016-08-09T11:02:32.531Z
2016-08-09T11:02:37,590 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.RemoteTaskRunner - Cleaning up task[index_hadoop_datasource-test_2016-08-09T11:02:32.531Z] on worker[localhost:8091]
2016-08-09T11:02:37,592 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.TaskLockbox - Removing task[index_hadoop_datasource-test_2016-08-09T11:02:32.531Z] from activeTasks
2016-08-09T11:02:37,592 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.TaskLockbox - Removing task[index_hadoop_datasource-test_2016-08-09T11:02:32.531Z] from TaskLock[index_hadoop_datasource-test_2016-08-09T11:02:32.531Z]
2016-08-09T11:02:37,592 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.TaskLockbox - TaskLock is now empty: TaskLock{groupId=index_hadoop_datasource-test_2016-08-09T11:02:32.531Z, dataSource=datasource-test, interval=2016-01-01T00:00:00.000Z/2016-01-02T00:00:00.000Z, version=2016-08-09T11:02:32.534Z}
2016-08-09T11:02:37,594 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.MetadataTaskStorage - Deleting TaskLock with id[1525]: TaskLock{groupId=index_hadoop_datasource-test_2016-08-09T11:02:32.531Z, dataSource=datasource-test, interval=2016-01-01T00:00:00.000Z/2016-01-02T00:00:00.000Z, version=2016-08-09T11:02:32.534Z}
2016-08-09T11:02:37,596 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.MetadataTaskStorage - Updating task index_hadoop_datasource-test_2016-08-09T11:02:32.531Z to status: TaskStatus{id=index_hadoop_datasource-test_2016-08-09T11:02:32.531Z, status=FAILED, duration=5033}
2016-08-09T11:02:37,598 INFO [Curator-PathChildrenCache-0] io.druid.indexing.overlord.TaskQueue - Task done: HadoopIndexTask{id=index_hadoop_datasource-test_2016-08-09T11:02:32.531Z, type=index_hadoop, dataSource=datasource-test}

Thank you.

Hi,
You need to configure task log storage so that your task logs are persisted somewhere.

Please refer to the Task Logging section here - http://druid.io/docs/latest/configuration/indexing-service.html
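
For example, a minimal sketch of the relevant common runtime properties for HDFS task logs (the directory value below is just a placeholder, adjust it for your deployment and see the linked docs for the other storage types):

druid.indexer.logs.type=hdfs
druid.indexer.logs.directory=/druid/indexing-logs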

Thank you Nishant, I’m already on it!

Nishant,
I have configured task logging with type=hdfs and druid.indexer.logs.directory=hdfs_path

Still, sometimes logs go missing. It is strange that logs are not missing in all cases, only in certain ones (more often for failed tasks than for successful tasks). Is any other configuration required, or are there any checks I can do?

Thanks

I’m having the same issue. Logs are shown OK while the peon task is running, via this indexer console HTTP endpoint: /druid/indexer/v1/task/index_kafka_<my_task_id>/log
At that point in time the log is read from the middle manager’s local dir: /tmp/druid/task/index_kafka_<my_task_id>/log
After a task finishes, its log is transferred to the parameterized shared folder: ${druid.indexer.logs.directory}

After that, the HTTP endpoint starts showing this in the response: No log was found for this task. The task may not exist, or it may not have begun running yet.

Even though the log file actually exists in the shared deep storage folder, named like this: ${druid.indexer.logs.directory}/index_kafka_<my_task_id>.log

To me this seems like a bug in the overlord, because maybe it’s expecting the log file in deep storage at this path: ${druid.indexer.logs.directory}/index_kafka_<my_task_id>/log

Imply team, can you please check this? Thanks…

I have the same issue, did you fix it?

On Thu, Oct 25, 2018 at 12:04 AM Davor Poldrugo dpoldrugo@gmail.com wrote:

To me this seems like a bug in the overlord, because maybe it’s expecting the log file in deep storage at this path: ${druid.indexer.logs.directory}/index_kafka_<my_task_id>/log

If you’re using local file task logs, this is how the path for the log file is constructed; it doesn’t create a separate directory per task:

private File fileForTask(final String taskid, String filename)
{
  return new File(config.getDirectory(), StringUtils.format("%s.%s", taskid, filename));
}

Likewise for HDFS task logs:


/**
 * Due to https://issues.apache.org/jira/browse/HDFS-13 ":" are not allowed in
 * path names. So we format paths differently for HDFS.
 */
private Path getTaskLogFileFromId(String taskId)
{
  return new Path(mergePaths(config.getDirectory(), taskId.replaceAll(":", "_")));
}
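
For example, here is a small standalone sketch (not Druid code; the directory value is made up for illustration, and the task id is the one from the original post) of how those two snippets map a task id onto its final log path. Note that a task id containing ":" ends up with underscores in the HDFS path:

public class TaskLogPathDemo
{
  // Mirrors the file-based fileForTask above: "<directory>/<taskId>.<filename>", with filename "log".
  static String fileTaskLogPath(String directory, String taskId)
  {
    return directory + "/" + taskId + ".log";
  }

  // Mirrors the HDFS getTaskLogFileFromId above: "<directory>/<taskId>" with ":" replaced by "_".
  static String hdfsTaskLogPath(String directory, String taskId)
  {
    return directory + "/" + taskId.replaceAll(":", "_");
  }

  public static void main(String[] args)
  {
    // Hypothetical value of druid.indexer.logs.directory, plus the task id quoted earlier in this thread.
    String directory = "/druid/indexing-logs";
    String taskId = "index_hadoop_datasource-test_2016-08-09T11:02:32.531Z";

    System.out.println(fileTaskLogPath(directory, taskId));
    // /druid/indexing-logs/index_hadoop_datasource-test_2016-08-09T11:02:32.531Z.log

    System.out.println(hdfsTaskLogPath(directory, taskId));
    // /druid/indexing-logs/index_hadoop_datasource-test_2016-08-09T11_02_32.531Z
  }
}

So neither pusher writes a per-task directory with a "log" file inside; if the file shows up under ${druid.indexer.logs.directory} but the endpoint still reports it missing, it’s worth comparing the exact file name on disk against what these methods would produce.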

I would double-check that druid.indexer.logs.directory is set consistently on the overlords and middle managers, and also check that the directory is accessible from your overlord.