Hadoop Segment S3 Upload Issue in 0.9.1-rc4

I’m having an issue with the Hadoop job in Druid 0.9.1-rc4. All my segments get built and loaded into the historical just fine except for one. From the historical log:

Caused by: io.druid.segment.loading.SegmentLoadingException: IndexFile[s3://inin-dca-useast1-analytics/druid/analytics-druid-v3/v2/imetrics/2015-06-21T00:00:00.000Z_2015-06-22T00:00:00.000Z/2016-06-29T01:29:18.462Z/0/index.zip] does not exist.

aws s3 ls s3://inin-dca-useast1-analytics/druid/analytics-druid-v3/v2/imetrics/2015-06-21T00:00:00.000Z_2015-06-22T00:00:00.000Z/2016-06-29T01:29:18.462Z/0/
2016-06-29 02:26:59 1263 descriptor.json
2016-06-29 02:15:42 2407314 index.zip.0
2016-06-29 02:26:42 2407314 index.zip.2

cat descriptor.json
{"dataSource":"imetrics","interval":"2015-06-21T00:00:00.000Z/2015-06-22T00:00:00.000Z","version":"2016-06-29T01:29:18.462Z","loadSpec":{"type":"s3_zip","bucket":"inin-dca-useast1-analytics","key":"druid/analytics-druid-v3/v2/imetrics/2015-06-21T00:00:00.000Z_2015-06-22T00:00:00.000Z/2016-06-29T01:29:18.462Z/0/index.zip"},"dimensions":"orgid,cid,media,interaction_type,session_id,direction,dg,ani,callid,user,station,team,edge,dnis,wrapup_code,dialer_campaign,dialer_contact,dialer_contact_list,chat,chatroom","metrics":"events,tAbandon,tAbandon.cnt,tIvr,tIvr.cnt,tAnswered,tAnswered.cnt,tAcd,tAcd.cnt,tTalk,tTalk.cnt,tTalkCompleted,tTalkCompleted.cnt,tHeld,tHeld.cnt,tHeldCompleted,tHeldCompleted.cnt,tAcw,tAcw.cnt,tHandle,tHandle.cnt,tVoicemail,tVoicemail.cnt,tUserResponseTime,tUserResponseTime.cnt,tAgentResponseTime,tAgentResponseTime.cnt,nOffered,nOverSla,nTransferred,nDialerAttempted,nDialerConnected,nDialerAbandoned,nError,mCreatedVoicemailSize,mCreatedVoicemailDuration,mDeletedVoicemailSize,mDeletedVoicemailDuration,oMailboxVoicemailSize,oMailboxVoicemailDuration,oMailboxVoicemailCount","shardSpec":{"type":"none"},"binaryVersion":9,"size":4263099,"identifier":"imetrics_2015-06-21T00:00:00.000Z_2015-06-22T00:00:00.000Z_2016-06-29T01:29:18.462Z"}

The metadata expects index.zip, but in S3 there are only index.zip.0 and index.zip.2. I spot-checked a bunch of other segments and they all have a plain index.zip as expected. I’ve attached my Hadoop spec file. I didn’t see anything interesting in the Hadoop logs. I know I can mark the segment as unused to clear it, but any idea what could cause this?

hadoop_specfile.txt (6.94 KB)

Hey Drew,

Could you double-check the index-generator reducer logs for anything interesting? Those reducers are responsible for renaming index.zip.N to index.zip. Also, check if your version of Hadoop is affected by https://issues.apache.org/jira/browse/HADOOP-10737 (“S3n silent failure on copy, data loss on rename”).
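In case it helps with spot-checking, here is a rough Python sketch that flags segment directories where index.zip.N files are present but the final index.zip is missing, which is the symptom described above. It works on a plain list of S3 keys; how you obtain that list (aws s3 ls, an SDK call, etc.) is up to you. The function name find_unrenamed_segments and the index.zip.N pattern are my own assumptions based on the listing in this thread, not anything taken from Druid's code.

```python
import re
from collections import defaultdict

# Assumed naming of un-renamed attempt files, based on the listing in this
# thread (e.g. "index.zip.0", "index.zip.2").
ATTEMPT_RE = re.compile(r"^index\.zip\.(\d+)$")


def find_unrenamed_segments(keys):
    """Return {segment_dir: [attempt file names]} for every directory that
    contains index.zip.N files but no final index.zip.

    `keys` is an iterable of S3 object keys (hypothetical input; listing
    the bucket is not shown here).
    """
    names_by_dir = defaultdict(set)
    for key in keys:
        directory, _, name = key.rpartition("/")
        names_by_dir[directory].add(name)

    broken = {}
    for directory, names in names_by_dir.items():
        attempts = sorted(n for n in names if ATTEMPT_RE.match(n))
        if attempts and "index.zip" not in names:
            broken[directory] = attempts
    return broken


if __name__ == "__main__":
    prefix = (
        "druid/analytics-druid-v3/v2/imetrics/"
        "2015-06-21T00:00:00.000Z_2015-06-22T00:00:00.000Z/"
        "2016-06-29T01:29:18.462Z/0/"
    )
    listing = [
        prefix + "descriptor.json",
        prefix + "index.zip.0",
        prefix + "index.zip.2",
    ]
    for directory, attempts in find_unrenamed_segments(listing).items():
        print(directory, attempts)
```

Running it over a listing of the whole datasource prefix should tell you whether this is the only affected segment or whether others were left half-renamed as well.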

I’m on emr-4.3.0, Hadoop 2.7.1. The job finishes despite some timeouts forcing some reducers to retry. I’ll take a peek at some of the task tracker logs next time I run it.

AttemptID:attempt_1467158861048_0030_r_000267_0 Timed out after 600 secs

StdOutput.zip (58.7 KB)