The log of a Hadoop index task is always enormous in my environment. Even with only a few megabytes of data to index, the index log exceeds 1 GB. The vast majority of the space in the log appears to be taken up by a giant JSON structure that details every possible segment in the interval specified by the task. If the task operates on a large number of segments (for example, one segment per minute over 3 months), the log output is huge.
While this may not be a very common use case, I would like to know if there is a way to quiet this logging a bit without resorting to per-class log levels. I think there is some potentially useful info coming from these classes; I just don't think outputting the massive JSON structure every time is useful.
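For context, the per-class workaround I'd rather avoid looks something like the following log4j2 override. The logger name here is only a guess at the package involved (adjust it to whichever class actually emits the JSON), and the appender name is a placeholder for whatever appender the deployment already defines:

```xml
<!-- Hypothetical log4j2.xml fragment: raises the threshold for the
     indexer package to WARN so its INFO-level JSON dump is suppressed,
     while the rest of the process keeps logging at INFO. -->
<Loggers>
  <!-- "org.apache.druid.indexer" and "FileAppender" are assumptions. -->
  <Logger name="org.apache.druid.indexer" level="warn" additivity="false">
    <AppenderRef ref="FileAppender"/>
  </Logger>
  <Root level="info">
    <AppenderRef ref="FileAppender"/>
  </Root>
</Loggers>
```

The drawback, as noted above, is that this silences the genuinely useful INFO messages from those classes along with the giant JSON structure, which is why a more targeted option would be preferable.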