Realtime java.util.common.ISE: Can not combine streams for version 2

Hi Team,

We are using Druid 0.11.0 and have been facing issues with the indexing job.

We tried to look at the code but haven’t been able to figure out anything yet.

The Druid datasource has one big string column that contains the log message.

When we lowered the segmentGranularity from thirty_minute to five_minute, Druid mostly ran fine but sometimes threw this exception.

It would be great if you could take a look at this and advise. Thanks in advance.

druid.service=druid/middleManager
druid.port=8091

# Number of tasks per middleManager
druid.worker.capacity=9

# Task launch parameters
druid.indexer.runner.javaOpts=-server -Xmx3g -Xss1m -XX:MaxDirectMemorySize=5g -XX:+UseG1GC -XX:MaxGCPauseMillis=100 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Duser.timezone=GMT+8 -Dfile.encoding=UTF-8 -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager

druid.indexer.task.baseTaskDir=var/druid/task

# HTTP server threads
#druid.server.http.numThreads=20

# Processing threads and buffers on Peons
druid.indexer.fork.property.druid.processing.buffer.sizeBytes=536870912
druid.indexer.fork.property.druid.processing.numThreads=5
druid.indexer.fork.property.druid.server.http.numThreads=100

# Hadoop indexing
druid.indexer.task.hadoopWorkingPath=var/druid/hadoop-tmp
druid.indexer.task.defaultHadoopCoordinates=["org.apache.hadoop:hadoop-client:2.7.3"]

"dataSources" : {
  "app_log": {
    "spec": {
      "dataSchema": {
        "dataSource": "app_log",
        "parser": {
          "type": "string",
          "parseSpec": {
            "timestampSpec": {
              "column": "timestamp",
              "format": "auto"
            },
            "dimensionsSpec": {
              "dimensions": [],
              "dimensionExclusions": [
                "timestamp"
              ]
            },
            "format": "json"
          }
        },
        "metricsSpec": [
          {
            "type": "longSum",
            "name": "response_time",
            "fieldName": "response_time"
          }
        ],
        "granularitySpec": {
          "type": "uniform",
          "segmentGranularity": "five_minute",
          "queryGranularity": "none"
        }
      },
      "ioConfig": {
        "type": "realtime"
      },
      "tuningConfig": {
        "type": "realtime",
        "maxRowsInMemory": "100000",
        "intermediatePersistPeriod": "PT10M",
        "windowPeriod": "PT10M"
      }
    },
    "properties": {
      "task.partitions": "1",
      "task.replicants": "1"
    }
  },

Hey Mylinushr,

This error can happen when you have a single string column that is too big for one segment (>2GB for a column). You should be able to work around it by increasing your "task.partitions". The idea there is that having more partitions means that each individual partition is smaller.
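For example, a minimal sketch of that change, assuming two partitions are enough for this data volume (the exact count depends on how large the column gets), would be to adjust the "properties" block in the spec above:

"properties": {
  "task.partitions": "2",
  "task.replicants": "1"
}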

Gian

Hi Gian Merlino,

Thank you very much, you saved me, amen! I’ve been working on this problem for a long time.

On Tuesday, October 2, 2018 at 12:06:58 AM UTC+8, Gian Merlino wrote: