Failed to batch load tutorial datasource wikipedia

I am using Druid 0.12.3, and the tutorial works fine when testing on a single machine. Following the production configuration instructions, I modified the configurations and then started Druid on 4 VMs:

  1. ZooKeeper, Coordinator, and Overlord on 1 VM

  2. Historical node on 1 VM

  3. MiddleManager on 1 VM

  4. Broker on 1 VM

I reran the batch load tutorial; the coordinator console shows SUCCESS, and so does the log. But the datasource always shows 99% available, there is no data on the historical node, and on the MiddleManager all segments are created. Could someone help?

Here is the bottom of the log file:

2018-11-19T14:26:48,702 INFO [publish-0] io.druid.indexing.common.actions.RemoteTaskActionClient - Submitting action for task[index_wikipedia_2018-11-19T14:26:37.783Z] to overlord: [SegmentInsertAction{segments=[DataSegment{size=4826703, shardSpec=NumberedShardSpec{partitionNum=0, partitions=0}, metrics=, dimensions=[channel, cityName, comment, countryIsoCode, countryName, isAnonymous, isMinor, isNew, isRobot, isUnpatrolled, metroCode, namespace, page, regionIsoCode, regionName, user, added, deleted, delta], version='2018-11-19T14:26:37.856Z', loadSpec={type=>local, path=>/app/druid-0.12.3/var/druid/segments/wikipedia/2015-09-12T00:00:00.000Z_2015-09-13T00:00:00.000Z/2018-11-19T14:26:37.856Z/0/index.zip}, interval=2015-09-12T00:00:00.000Z/2015-09-13T00:00:00.000Z, dataSource='wikipedia', binaryVersion='9'}], startMetadata=null, endMetadata=null}].

2018-11-19T14:26:48,855 INFO [publish-0] io.druid.segment.realtime.appenderator.BaseAppenderatorDriver - Published segments.

2018-11-19T14:26:48,856 INFO [task-runner-0-priority-0] io.druid.indexing.common.task.IndexTask - Published segments[wikipedia_2015-09-12T00:00:00.000Z_2015-09-13T00:00:00.000Z_2018-11-19T14:26:37.856Z]

2018-11-19T14:26:48,856 INFO [task-runner-0-priority-0] io.druid.segment.realtime.appenderator.AppenderatorImpl - Shutting down...

2018-11-19T14:26:48,858 INFO [task-runner-0-priority-0] io.druid.indexing.overlord.TaskRunnerUtils - Task [index_wikipedia_2018-11-19T14:26:37.783Z] status changed to [SUCCESS].

2018-11-19T14:26:48,860 INFO [task-runner-0-priority-0] io.druid.indexing.worker.executor.ExecutorLifecycle - Task completed with status: {

"id" : "index_wikipedia_2018-11-19T14:26:37.783Z",

"status" : "SUCCESS",

"duration" : 4885

}

2018-11-19T14:26:48,864 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.server.listener.announcer.ListenerResourceAnnouncer.stop()] on object[io.druid.query.lookup.LookupResourceListenerAnnouncer@28ee882c].

2018-11-19T14:26:48,864 INFO [main] io.druid.curator.announcement.Announcer - unannouncing [/druid/listeners/lookups/__default/http:10.240.210.146:8100]

2018-11-19T14:26:48,874 INFO [main] io.druid.server.listener.announcer.ListenerResourceAnnouncer - Unannouncing start time on [/druid/listeners/lookups/__default/http:10.240.210.146:8100]

2018-11-19T14:26:48,874 INFO [main] io.druid.java.util.common.lifecycle.Lifecycle$AnnotationBasedHandler - Invoking stop method[public void io.druid.query.lookup.LookupReferencesManager.stop()] on object[io.druid.query.lookup.LookupReferencesManager@511da44f].

2018-11-19T14:26:48,874 INFO [main] io.druid.query.lookup.LookupReferencesManager - LookupReferencesManager is stopping.

2018-11-19T14:26:48,874 INFO [LookupReferencesManager-MainThread] io.druid.query.lookup.LookupReferencesManager - Lookup Management loop exited, Lookup notices are not handled anymore.

2018-11-19T14:26:48,874 INFO [main] io.druid.query.lookup.LookupReferencesManager - LookupReferencesManager is stopped.

2018-11-19T14:26:48,874 INFO [main] io.druid.server.initialization.jetty.JettyServerModule - Stopping Jetty Server...

2018-11-19T14:26:48,878 INFO [main] org.eclipse.jetty.server.AbstractConnector - Stopped ServerConnector@756808cc{HTTP/1.1,[http/1.1]}{0.0.0.0:8100}

2018-11-19T14:26:48,879 INFO [main] org.eclipse.jetty.server.handler.ContextHandler - Stopped o.e.j.s.ServletContextHandler@6b867ee7{/,null,UNAVAILABLE}
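
(For reference: the "99% available" figure comes from the coordinator, which exposes a load-status endpoint that reports what percentage of each datasource's segments the historicals have loaded. Assuming the coordinator listens on its default port 8081, you can check it with something like:

curl http://<coordinator-host>:8081/druid/coordinator/v1/loadstatus

A datasource stuck below 100% generally means one or more historicals cannot load some segment.)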

Try restarting the historical node. Maybe something went wrong while loading the data on the historical.

Ankit, thanks for your reply.

I restarted Druid from a fresh environment (meaning I deleted the var and log folders and reran bin/init). I still see the same problem. I can see that the "var/druid/segments" folder on the MiddleManager node has segments created, but there is no error in the log. At least, I can't find anything that explains why the historical node has no data.

Just wondering: when your historical node has data, under which folder should I see it? By the way, could you share how you configured your cluster environment?

Many thanks,
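
(For reference: a historical keeps its local copy of segments in the directories configured by druid.segmentCache.locations in its runtime.properties, so that is the folder where loaded segments should appear. The single-machine tutorial config uses something along these lines, where maxSize is the cache capacity in bytes:

druid.segmentCache.locations=[{"path":"var/druid/segment-cache","maxSize":130000000000}]
)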


Since there are multiple VMs, the "local" deep storage that the tutorials use won't work, because each VM sees a different filesystem.

You could set up NFS and continue using "local" deep storage, or use some other distributed filesystem like HDFS, or a cloud store like S3. Alternatively, it might be easier to set up a common shared directory across all VMs and point the deep storage and task log storage locations somewhere in the shared directory so they're accessible to all VMs.

Thanks,

Jon
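
(As a sketch of the shared-directory option: assuming every VM mounts the same share at /mnt/druid-share (a hypothetical path), the relevant entries in conf/druid/_common/common.runtime.properties would look something like:

# "local" deep storage works cluster-wide only because /mnt/druid-share
# is the same shared mount on every VM
druid.storage.type=local
druid.storage.storageDirectory=/mnt/druid-share/segments

# task logs must also be readable from all VMs
druid.indexer.logs.type=file
druid.indexer.logs.directory=/mnt/druid-share/indexing-logs
)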

Jonathan, thank you so much for pointing that out. Makes perfect sense. I will rework my cluster deployment.

I got an S3 bucket from a non-AWS S3 implementation for deep storage. Following http://druid.io/docs/latest/tutorials/cluster.html, I have the following in conf/druid/_common/common.runtime.properties:

druid.storage.type=s3

druid.storage.bucket=my-bucket

druid.storage.baseKey=druid/segments

druid.s3.accessKey=***

druid.s3.secretKey=***

druid.indexer.logs.type=s3

druid.indexer.logs.s3Bucket=my-bucket

druid.indexer.logs.s3Prefix=druid/indexing-logs

I am still using Derby for metadata, though.

I also added jets3t.properties:

s3service.s3-endpoint=http://****.com

s3service.s3-endpoint-http-port=9020

s3service.disable-dns-buckets=true

s3service.https-only=false
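
(One thing worth double-checking: S3 deep storage and S3 task logs only take effect if the S3 extension is loaded on every node. In 0.12 that means druid-s3-extensions must appear in the extension load list in common.runtime.properties, alongside whatever else you already load, e.g.:

druid.extensions.loadList=["druid-s3-extensions"]
)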

I don't see logs added to the S3 bucket, nor did batch ingestion work, and I see this log on the console:

2018-11-21T20:46:40,163 WARN [TaskQueue-Manager] io.druid.indexing.overlord.TaskQueue - TaskRunner failed to clean up task: index_wikipedia_2018-11-21T20:38:18.459Z

io.druid.java.util.common.RE: Error in handling post to [10.240.210.146:8091] for task [index_wikipedia_2018-11-21T20:38:18.459Z]

10.240.210.146 is the IP of the coordinator node.

I can't kill the task from the coordinator UI, and I noticed that the Java process is still running on the MiddleManager node. I don't see the MiddleManager in ZooKeeper. I wonder whether this is related to the embedded Derby; somewhere I read that embedded Derby only allows one connection.
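
(On the Derby question: the embedded Derby that Druid ships with is usable only by the single process that embeds it, so it is intended for single-machine experiments; for a cluster the docs recommend MySQL or PostgreSQL for the metadata store. Assuming a MySQL server on a hypothetical host metadata-host, the switch would look something like this in common.runtime.properties, with mysql-metadata-storage also added to druid.extensions.loadList:

druid.metadata.storage.type=mysql
druid.metadata.storage.connector.connectURI=jdbc:mysql://metadata-host:3306/druid
druid.metadata.storage.connector.user=druid
druid.metadata.storage.connector.password=***
)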