Druid Segment File Format and Ingesting Data Without Converting the File

Hi,

Objective: find options to bulk-import data from Druid segment files into another system without reading them row by row or exporting/converting them to CSV.

Here is what I found in the documentation for this requirement:

Option #1: the insert-segment-to-db tool, but this goes from Druid to Druid; our requirement is to export the data to another database or DWH.

Option #2: the DumpSegment tool. This comes close, but given the high data volume it may not work.

Option #3: write a MapReduce job to export the data to CSV.
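
To make Option #2 concrete, here is a minimal sketch of the conversion step it would need. It assumes the dump is written as one JSON object per line (my understanding of the DumpSegment row dump, not something I have verified at volume). The function name, the `columns` parameter, and the "|" separator for multi-value dimensions are hypothetical choices of mine, not part of Druid.

```python
import csv
import io
import json


def json_lines_to_csv(lines, out, columns=None):
    """Stream newline-delimited JSON rows to CSV without loading them all."""
    writer = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the dump
        row = json.loads(line)
        if writer is None:
            # Assumption: if no column list is given, the first row
            # names every column we want in the CSV header.
            fieldnames = columns or list(row.keys())
            writer = csv.DictWriter(
                out, fieldnames=fieldnames, extrasaction="ignore", restval=""
            )
            writer.writeheader()
        # Multi-value fields arrive as JSON arrays; join them with "|"
        # (a hypothetical convention, pick whatever the target DWH expects).
        flat = {
            key: "|".join(map(str, value)) if isinstance(value, list) else value
            for key, value in row.items()
        }
        writer.writerow(flat)
```

Because it iterates over the input and writes each row immediately, it can be fed an open file handle for the dump (for example a hypothetical dump.json) and an open output file, so memory use stays flat regardless of segment size. It still reads every row, so it does not avoid the conversion cost that worries me with Option #2.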

I need input from the experts on two things:

First, I would like confirmation before we go with either Option #3 (MapReduce) or a modified Option #2 (DumpSegment).

Second, is there any available database, MPP system, or DWH that can ingest these files without converting them?

Any pointers, points of view, or ideas would help me a lot.

Thanks,

Kaleem

Hi,

Can someone please point me to the correct class name or documentation where the details of the binary format of the segment file are discussed?

I am not sure whether this question should be asked in the Druid Development group or here. Can someone please help?

Thanks,
Kaleem

Hi,

I was able to locate the file. Can someone confirm whether this is the correct class?
https://github.com/druid-io/druid/blob/master/processing/src/main/java/io/druid/segment/data/GenericIndexed.java

Thanks,

Kaleem

That’s the correct class.

Thanks,

Jon