Moving segments from local deep storage to S3

I use Druid to ingest real-time Kafka streams. I initially created the Druid cluster with local deep storage, and later updated the configuration to use S3 as deep storage.
Now some segments are stored on local disk while the rest are stored in S3. How can I move all of the deep storage contents to S3?
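
For reference, the deep storage part of my conf/druid/_common/common.runtime.properties after the switch looks roughly like this (the bucket name and base key below are placeholders, not my real values):

```
# Deep storage moved from local to S3; druid-s3-extensions has to be on the
# extensions load list for the S3 storage type to be available.
druid.extensions.loadList=["druid-s3-extensions", "mysql-metadata-storage"]

druid.storage.type=s3
druid.storage.bucket=my-deep-storage-bucket
druid.storage.baseKey=druid/segments
```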

I moved all the contents to S3 manually and then added a new data node, but the node fails because it cannot find the segments on local disk.

Can anyone please guide me on this?

Hi Amal,

Please take a look at the following tool. This will help update your metadata DB.

https://druid.apache.org/docs/latest/operations/insert-segment-to-db.html

Eric Graham

Solutions Engineer - Imply

cell: 303-589-4581

email: eric.graham@imply.io

www.imply.io

Thanks for the suggestion.

I couldn’t find documentation on where the executable for this tool lives. I hope it’s bundled with Druid.

This is the error that I am getting while running it:

```
java -Ddruid.metadata.storage.type=mysql \
 -Ddruid.metadata.storage.connector.connectURI=jdbc:mysql://.rds.amazonaws.com:3306/druid \
 -Ddruid.metadata.storage.connector.user=druid \
 -Ddruid.metadata.storage.connector.password= \
 -Ddruid.extensions.loadList=["mysql-metadata-storage","druid-s3-extensions"] \
 -Ddruid.storage.type=s3 \
 -Ddruid.storage.bucket=***-storage \
 -Ddruid.storage.baseKey=druid/storage/wikipedia \
 -Ddruid.storage.maxListingLength=1000 \
 -cp /opt/druid/lib/* \
 org.apache.druid.cli.Main tools insert-segment-to-db \
 --workingDir "druid/storage/wikipedia" --updateDescriptor true

Error: Could not find or load main class .opt.druid.lib.accessors-smart-1.2.jar
```

I am running Druid version apache-druid-0.14.2-incubating.

Oh, my bad. The issue was that I hadn’t put the classpath in quotes.
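
For anyone who hits the same error: without quotes the shell expands the wildcard, so only the first jar in /opt/druid/lib lands on the classpath and the next one is mistaken for the main class; with quotes the * is passed through to the JVM, which then puts every jar in the directory on the classpath. A minimal illustration:

```
# Broken: the shell expands the glob before java sees it, so the second jar
# (accessors-smart-1.2.jar here) is parsed as the main-class argument.
java -cp /opt/druid/lib/* org.apache.druid.cli.Main tools insert-segment-to-db ...

# Fixed: the quoted wildcard is handled by the JVM itself and the whole
# directory of jars ends up on the classpath.
java -cp "/opt/druid/lib/*" org.apache.druid.cli.Main tools insert-segment-to-db ...
```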

Hey Eric,

I created a staging cluster to try this out before doing it in prod. I moved the segments folder manually to S3 (see the sketch below) and then ran the insert-segment-to-db tool.
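
The manual move was essentially a sync of the local segment directory into the S3 deep storage prefix; a rough sketch, assuming the AWS CLI, with the bucket name redacted:

```
# Copy the locally stored segments into the S3 prefix used as deep storage
# (the local path matches the one that later shows up in the loadSpec).
aws s3 sync /mnt/data/var/druid/segments/wikipedia/ \
  s3://<bucket>/druid/segments/wikipedia/
```

The tool invocation itself was: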

```
java \
 -Ddruid.metadata.storage.type=mysql \
 -Ddruid.metadata.storage.connector.connectURI=jdbc:mysql://druid.******.us-east-1.rds.amazonaws.com:3306/druid \
 -Ddruid.metadata.storage.connector.user=druid \
 -Ddruid.metadata.storage.connector.password=******** \
 -Ddruid.extensions.loadList=["mysql-metadata-storage","druid-s3-extensions"] \
 -Ddruid.storage.type=s3 \
 -Ddruid.storage.bucket=********-storage \
 -Ddruid.storage.baseKey="/druid/segments/wikipedia" \
 -Ddruid.storage.maxListingLength=1000 \
 -cp "/opt/druid/lib/*" \
 org.apache.druid.cli.Main tools insert-segment-to-db \
 --workingDir "druid/segments/wikipedia" --updateDescriptor true
```

The tool ran successfully, and checking the metadata store I could see that the segments table had been updated with the new S3 paths.
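
This is roughly how I checked, assuming the default druid_segments table in MySQL (host and credentials below are placeholders):

```
# Dump the stored segment descriptors for the wikipedia datasource so the
# loadSpec can be inspected (payload is a BLOB, hence the CAST).
mysql -h <rds-host> -u druid -p druid \
  -e "SELECT id, CAST(payload AS CHAR) FROM druid_segments WHERE dataSource = 'wikipedia' AND used = 1\G"
```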

Unfortunately, the loadSpec contains a path field that still holds the old path pointing to local disk:

```
{"type":"s3_zip","path":"/mnt/data/var/druid/segments/wikipedia/2015-09-12T00:00:00.000Z_2015-09-13T00:00:00.000Z/2019-06-26T12:09:45.416Z/0/index.zip","key":"druid/segments/wikipedia/2015-09-12T00:00:00.000Z_2015-09-13T00:00:00.000Z/2019-06-26T12:09:45.416Z/0/index.zip","bucket":"****-staging-exhibitor-storage"},"dimensions":"channel,cityName,comment,countryIsoCode,countryName,isAnonymous,isMinor,isNew,isRobot,isUnpatrolled,metroCode,namespace,page,regionIsoCode,regionName,user,added,deleted,delta","metrics":"","shardSpec":{"type":"numbered","partitionNum":0,"partitions":0},"binaryVersion":9,"size":4821529,"identifier":"wikipedia_2015-09-12T00:00:00.000Z_2015-09-13T00:00:00.000Z_2019-06-26T12:09:45.416Z"}
```

The new data node fails to start the historical process, saying that it’s not able to find the segment locally.

I have uploaded the error log.

Please have a look.

log (10.6 KB)

Can you please send me your conf/druid/_common/common.runtime.properties and your coordinator and overlord logs?

Eric Graham

Solutions Engineer

Cell: +1-303-589-4581
egraham@imply.io
www.imply.io