All segments are unavailable after native batch ingestion

Team,

I am testing Druid for our use case, using the single-server micro-quickstart setup. We exported data from PostgreSQL as CSV and tried native batch ingestion; the file has 52M rows. The ingestion task finished, and I waited a long time afterwards, but no segments ever became available.

I am using Java 13 (OpenJDK 13.0.2).

My ingestion spec is:

{
  "type": "index_parallel",
  "spec": {
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": {
        "type": "local",
        "filter": "50mactions_high_cardinality.csv",
        "baseDir": "/Users/vignesh/backup/"
      },
      "inputFormat": {
        "type": "csv",
        "findColumnsFromHeader": true
      }
    },
    "tuningConfig": {
      "type": "index_parallel",
      "partitionsSpec": {
        "type": "dynamic"
      }
    },
    "dataSchema": {
      "dataSource": "actions",
      "granularitySpec": {
        "type": "uniform",
        "queryGranularity": "NONE",
        "rollup": false,
        "segmentGranularity": "DAY",
        "intervals": ["2020-01-01/2020-06-30"]
      },
      "timestampSpec": {
        "column": "time",
        "format": "millis"
      },
      "dimensionsSpec": {
        "dimensions": [
          { "type": "long", "name": "actionid" },
          { "type": "long", "name": "companyid" },
          { "type": "long", "name": "customerid" },
          { "type": "long", "name": "data11" },
          { "type": "long", "name": "data12" },
          { "type": "long", "name": "data13" },
          { "type": "long", "name": "data14" },
          { "type": "long", "name": "schemaid" }
        ]
      }
    }
  }
}
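
For reference, the spec can also be submitted and the task polled through the Overlord API (a minimal Python sketch; it assumes the router on localhost:8888, and the file name ingestion-spec.json is just an example):

import json
import time
import requests

ROUTER = "http://localhost:8888"  # the router proxies Overlord APIs in the quickstart

# Submit the ingestion spec (file name is illustrative).
with open("ingestion-spec.json") as f:
    spec = json.load(f)

resp = requests.post(ROUTER + "/druid/indexer/v1/task", json=spec)
resp.raise_for_status()
task_id = resp.json()["task"]
print("submitted", task_id)

# Poll until the task reaches a terminal state.
while True:
    status = requests.get(ROUTER + "/druid/indexer/v1/task/" + task_id + "/status").json()
    state = status["status"]["status"]
    print(task_id, state)
    if state in ("SUCCESS", "FAILED"):
        break
    time.sleep(10)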

I tried the "Reload data from interval" option in the Actions dropdown and gave the interval in ISO format: "2020-01-01T00:00:00.000Z/2020-06-30T00:00:00.000Z".

My time column is epoch milliseconds, and it is converted to a timestamp in the "Parse Time" section.

Still no change. The console sends the request http://localhost:8888/druid/coordinator/v1/datasources/actions/markUsed with the interval in the request body, but the response is 204 No Content, so I am not sure whether anything was actually updated on the server side.
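
For reference, the same markUsed call can be made outside the console, roughly like this (a minimal Python sketch; it assumes the Coordinator API is reachable through the router on localhost:8888):

import requests

ROUTER = "http://localhost:8888"
DATASOURCE = "actions"

# Ask the Coordinator to mark segments in this interval as used.
resp = requests.post(
    ROUTER + "/druid/coordinator/v1/datasources/" + DATASOURCE + "/markUsed",
    json={"interval": "2020-01-01T00:00:00.000Z/2020-06-30T00:00:00.000Z"},
)
# I get 204 No Content back, which does not tell me whether any segments matched.
print(resp.status_code)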

Whenever that request is made, the following lines appear in the coordinator-overlord log:

2020-02-13T06:59:53,630 WARN [qtp637365534-110] org.apache.druid.server.http.DataSourcesResource - datasource not found [actions]

2020-02-13T06:59:54,026 INFO [TaskQueue-StorageSync] org.apache.druid.indexing.overlord.TaskQueue - Synced 0 tasks from storage (0 tasks added, 0 tasks removed).

2020-02-13T06:59:55,297 INFO [LookupCoordinatorManager--8] org.apache.druid.server.lookup.cache.LookupCoordinatorManager - Not updating lookups because no data exists

2020-02-13T06:59:55,583 INFO [DatabaseRuleManager-Exec--0] org.apache.druid.metadata.SQLMetadataRuleManager - Polled and found 1 rule(s) for 2 datasource(s)

actions is the datasource, and it does show up in the web console UI.

Only one datasource exists, but the log says 2 datasource(s).

I thought there might be a problem with the console, so I tried the same thing from the CLI, but the result was the same.
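
In case it helps with diagnosing, the datasource can also be inspected directly (a minimal Python sketch; it assumes the router on localhost:8888 proxies the Coordinator APIs and that Druid SQL is enabled, as in the quickstart defaults):

import requests

ROUTER = "http://localhost:8888"
DATASOURCE = "actions"

# What the Coordinator currently serves for this datasource.
resp = requests.get(ROUTER + "/druid/coordinator/v1/datasources/" + DATASOURCE)
print(resp.status_code, resp.text)

# Segments recorded for the datasource in the metadata store.
seg_resp = requests.get(
    ROUTER + "/druid/coordinator/v1/metadata/datasources/" + DATASOURCE + "/segments"
)
print(seg_resp.status_code, seg_resp.text)

# The same view through Druid SQL (sys.segments).
sql = {
    "query": "SELECT segment_id, is_published, is_available "
             "FROM sys.segments WHERE datasource = 'actions'"
}
print(requests.post(ROUTER + "/druid/v2/sql", json=sql).json())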

I have attached the ingestion logs; they look fine to me.

I have also attached the coordinator-overlord logs. I found an exception there, but I am not sure whether it is the cause:

2020-02-13T07:15:36,408 ERROR [LeaderSelector[/druid/coordinator/_COORDINATOR]] org.apache.curator.framework.listen.ListenerContainer - Listener (org.apache.druid.curator.discovery.CuratorDruidLeaderSelector$1@2ca1e49a) threw an exception

java.lang.ClassFormatError: Illegal field name "org.apache.druid.server.coordinator.DruidCoordinator$this" in class org/apache/druid/server/coordinator/DruidCoordinator$CoordinatorHistoricalManagerRunnable

Ingestion report:

{
  "ingestionStatsAndErrors": {
    "taskId": "index_parallel_actions_mohicclo_2020-02-12T22:03:12.183Z",
    "payload": {
      "ingestionState": "COMPLETED",
      "unparseableEvents": {},
      "rowStats": {
        "determinePartitions": {
          "processed": 0,
          "processedWithError": 0,
          "thrownAway": 0,
          "unparseable": 0
        },
        "buildSegments": {
          "processed": 52128000,
          "processedWithError": 0,
          "thrownAway": 288000,
          "unparseable": 0
        }
      },
      "errorMsg": null
    },
    "type": "ingestionStatsAndErrors"
  }
}
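
For reference, the report above can also be fetched from the task reports API (a minimal Python sketch; it assumes the router on localhost:8888):

import requests

ROUTER = "http://localhost:8888"
TASK_ID = "index_parallel_actions_mohicclo_2020-02-12T22:03:12.183Z"

report = requests.get(ROUTER + "/druid/indexer/v1/task/" + TASK_ID + "/reports").json()
row_stats = report["ingestionStatsAndErrors"]["payload"]["rowStats"]["buildSegments"]
print(row_stats)  # processed / processedWithError / thrownAway / unparseable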

With Regards,

Vignesh R

index_parallel_actions_mohicclo_2020-02-12T22:03:12.183Z.log (447 KB)

Hi Vignesh,
Please use Java 8. Druid is not tested or certified with the Java version you are using.
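
If it helps, here is a quick way to confirm which JVM is first on the PATH (a small Python sketch, nothing Druid-specific):

import subprocess

# `java -version` writes its output to stderr; Druid currently expects Java 8.
out = subprocess.run(["java", "-version"], capture_output=True, text=True)
print(out.stderr.strip())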

Thanks and Regards,

Vaibhav

Please check:
https://druid.apache.org/docs/latest/tutorials/index.html

Thanks,
Vaibhav

Vaibhav,

Downgrading Java from 13 to 8 fixed the issue.

Thanks

Good to know. It's a known issue and has been discussed earlier as well:
https://mail.google.com/mail/u/0/#search/in%3Asent+java8/FMfcgxwDrtxzdfdPSDrcZfwDBjGRzQjq

There are some open issues being worked on to support newer Java versions:
https://github.com/apache/druid/issues/5589

But for now [Druid 0.17] it's Java 8 (8u92+).

Thanks and Regards,

Vaibhav