realtime node query speed

Hi,

we've been experimenting with Druid (0.7.3) for a few weeks and have set up a small cluster. Our daily raw data is roughly 30-40 GB and is ingested by one realtime node (m4.xlarge) with 4 CPUs and 16 GB of memory. The daily segments are around 700 MB. The response from our historical node is fine (with memcached), but I have problems getting the realtime node queries up to speed. Our realtime node setup is below. We use only timeseries and topN queries at the moment. Could you point me to a better utilisation of the machine to speed up queries?

Thanks

Thorsten

Here is our setup:

realtime runtime.properties:

druid.host=IP_ADDR:8080
druid.port=8080
druid.service=realtime
druid.processing.buffer.sizeBytes=1073741824
druid.processing.numThreads=3
druid.realtime.chathandler.type=announce
druid.realtime.specFile=/server/apps/druid/realtime/conf/mydata.spec
druid.monitoring.monitors=["io.druid.segment.realtime.RealtimeMetricsMonitor"]

realtime server JVM settings:

java -Duser.timezone=UTC -Dfile.encoding=UTF-8 -server -Xmx2g -Xms2g -XX:NewSize=1024m -XX:MaxNewSize=1024m -XX:MaxDirectMemorySize=4g -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Duser.timezone=UTC -Dfile.encoding=UTF-8 -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -Djava.io.tmpdir=/external/tmp -classpath /server/apps/druid/common/conf:/server/apps/druid/realtime/conf:lib/* io.druid.cli.Main server realtime

mydata.spec:

[
  {
    "dataSchema": {
      "dataSource": "mydata",
      "parser": {
        "type": "map",
        "parseSpec": {
          "format": "json",
          "timestampSpec": {
            "column": "time",
            "format": "posix"
          },
          "dimensionsSpec": {
            "dimensions": ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15", "d16", "d16"],
            "dimensionExclusions": [],
            "spatialDimensions": []
          }
        }
      },
      "metricsSpec": [
        {
          "type": "longSum",
          "name": "m1",
          "fieldName": "m1"
        },
        {
          "type": "longSum",
          "name": "m1",
          "fieldName": "m1"
        }
      ],
      "granularitySpec": {
        "type": "uniform",
        "segmentGranularity": "DAY",
        "queryGranularity": "HOUR"
      }
    },
    "ioConfig": {
      "type": "realtime",
      "firehose": {
        "type": "receiver",
        "serviceName": "eventReceiverServiceName",
        "bufferSize": 10000
      },
      "plumber": {
        "type": "realtime"
      }
    },
    "tuningConfig": {
      "type": "realtime",
      "maxRowsInMemory": 100000,
      "windowPeriod": "PT180m",
      "basePersistDirectory": "/external/data/druid/basePersist",
      "rejectionPolicy": {
        "type": "serverTime"
      }
    }
  }
]

Can you elaborate on what kinds of times you are seeing and what you
are hoping to see?

If you could also provide the "query/time" metrics emitted by the
process at query time (set druid.emitter=logging in your
runtime.properties), that would be helpful.
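(For reference, a minimal sketch of that change in the realtime runtime.properties; only the first line is strictly needed, and the logLevel line is an assumption about a reasonable default:)

druid.emitter=logging
druid.emitter.logging.logLevel=info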

--Eric

Hi,

The query (a topN on a dimension called 'domain') takes ~30 seconds. I would need a response time of 1-2 seconds.

Here are the logs:

2015-07-21T01:59:09,987 INFO [topN_connections_[2015-07-20T00:00:00.000Z/2015-07-20T23:00:00.000Z]] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2015-07-21T01:59:09.987Z","service":"realtime","host":"172.31.23.32:8080","metric":"query/time","value":34286,"user2":"connections","user4":"topN/1000/domain","user5":["2015-07-20T00:00:00.000Z/2015-07-20T23:00:00.000Z"],"user6":"true","user7":"2 aggs","user8":"f79044a6-2815-46e8-8510-9be9f42d3034","user9":"PT1380M"}]

2015-07-21T01:59:09,987 INFO [topN_connections_[2015-07-20T00:00:00.000Z/2015-07-20T23:00:00.000Z]] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2015-07-21T01:59:09.987Z","service":"realtime","host":"172.31.23.32:8080","metric":"query/wait","value":0,"user2":"connections","user4":"topN/1000/domain","user5":["2015-07-20T00:00:00.000Z/2015-07-20T23:00:00.000Z"],"user6":"true","user7":"2 aggs","user8":"f79044a6-2815-46e8-8510-9be9f42d3034","user9":"PT1380M"}]

2015-07-21T01:59:10,434 INFO [qtp1593333077-24] com.metamx.emitter.core.LoggingEmitter - Event [{"feed":"metrics","timestamp":"2015-07-21T01:59:10.434Z","service":"realtime","host":"172.31.23.32:8080","metric":"request/time","value":34734,"user2":"connections","user3":"{"finalize":false,"queryId":"f79044a6-2815-46e8-8510-9be9f42d3034","timeout":300000}","user4":"topN","user5":["2015-07-20T00:00:00.000Z/2015-07-20T23:00:00.000Z"],"user6":"true","user7":"172.31.26.137","user8":"f79044a6-2815-46e8-8510-9be9f42d3034","user9":"PT1380M"}]

Cheers

Thorsten

Wow, yeah, that's pretty crazy slow. Can you double-check that you
aren't running into GC issues? It's possible that 2 GB is not quite
enough to hold the in-memory buffer and the data in memory at the same
time. The simplest way to check is to turn on verbose GC logging and
see if there is a ton of GC going on while the query is happening.
If that's the case, try pushing the heap size up to something like
-Xmx4g -Xms4g, or even higher.
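(As a rough sketch of that change against the startup command quoted earlier; the existing -XX:+PrintGCDetails and -XX:+PrintGCTimeStamps flags already provide the GC logging, and the rest of the command is unchanged and elided here:)

java -server -Xmx4g -Xms4g -XX:NewSize=1024m -XX:MaxNewSize=1024m -XX:MaxDirectMemorySize=4g -XX:+UseConcMarkSweepGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps ... io.druid.cli.Main server realtime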

Also, I'm going to venture a guess: does it run relatively OK at the
beginning of the day and then start slowing down towards the end of
the day? If so, you might also consider switching to hour
segmentGranularity or pushing the intermediatePersistPeriod up to
something like PT30m or PT1h. (For more information on this one, it
would be interesting to know how many partitions you are dealing with.
If you look in the segment persist directory, there should be a dir
for the interval, then the version, and then 0/ 1/ 2/ 3/ 4/ and so on.
What number is that going up to?)
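(To make the second option concrete, a sketch of the tuningConfig change against the spec above; PT30m is just the example period mentioned, not a tuned recommendation:)

"tuningConfig": {
  "type": "realtime",
  "maxRowsInMemory": 100000,
  "intermediatePersistPeriod": "PT30m",
  "windowPeriod": "PT180m",
  "basePersistDirectory": "/external/data/druid/basePersist",
  "rejectionPolicy": {
    "type": "serverTime"
  }
}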

--Eric

Hi Eric,

Yeah, I can see a lot of these GC messages during query time:

16045.002: [GC (Allocation Failure) 16045.002: [ParNew: 856667K->22119K(943744K), 0.0192188 secs] 1379025K->545708K(1992320K), 0.0192751 secs] [Times: user=0.07 sys=0.00, real=0.02 secs]

There are roughly 1300 partitions in this directory for a finished day that's not merged yet.

I’ll increase the heap and change the intermediatePersistPeriod and see how that goes. I would assume that the intermediatePersistPeriod change will take a day to show results.

Cheers

Thorsten

After changing the intermediatePersistPeriod to PT1h, it still seems to write a partition every minute. I have set "maxRowsInMemory": 100000, so would I also need to increase this value, since it overrides the intermediatePersistPeriod?

Cheers

Thorsten

Yeah, you will likely need to increase maxRowsInMemory if it's still
persisting every minute. Do you know how many rows you expect to
exist every hour? If not, try doing an hourly query with a count
aggregator on some of your data; that'll tell you. If it's at all
possible, setting maxRowsInMemory a bit above that number would
likely be ideal.
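(A sketch of such a check: a timeseries query with hourly granularity and a count aggregator. The dataSource name is taken from the spec above and the interval is from the earlier query logs; adjust both as needed. Note that the count aggregator counts Druid rows after rollup, which is the number that matters for maxRowsInMemory:)

{
  "queryType": "timeseries",
  "dataSource": "mydata",
  "granularity": "hour",
  "intervals": ["2015-07-20T00:00:00.000Z/2015-07-21T00:00:00.000Z"],
  "aggregations": [
    { "type": "count", "name": "rows" }
  ]
}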

--Eric

Hi Eric,

I have roughly 7 million rows per hour during peak times. I set maxRowsInMemory to 5 million yesterday; that gives me shards for every 45-60 minutes at the moment.

I'll wait for the day rollover until the older shards are persisted and then see how the performance is. If it's still slow, I'll try hour segmentGranularity.

Thanks for your help so far.

Cheers

Thorsten

Thorsten,

Generally speaking, 5 million rows per segment is a pretty good spot
to be in terms of segment size. Given that your peak hours are 7
million rows, I would recommend a maxRowsInMemory of 1 million and
hour segmentGranularity.
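(In spec terms, a sketch of that recommendation relative to the spec earlier in the thread; only the two changed values are shown, everything else elided stays as before:)

"granularitySpec": {
  "type": "uniform",
  "segmentGranularity": "HOUR",
  "queryGranularity": "HOUR"
},
...
"tuningConfig": {
  "type": "realtime",
  "maxRowsInMemory": 1000000,
  ...
}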

--Eric