Missing Datasource

Hi All,

I’m running a simple POC with some sample data to evaluate real-time ingestion with Tranquility Server. I can see my data in the overlord node log (below), and I can’t find any errors anywhere. However, the datasource does not show up on the coordinator, and when I query the same interval and datasource I get an empty response.

I also noticed the log line below in the overlord UI saying some files were skipped, and I’m really not sure what that implies. I would appreciate any pointers or tips on where to look.

Thank you in advance.

2016-03-31T00:17:58,047 INFO [task-runner-0] io.druid.segment.realtime.plumber.RealtimePlumber - Submitting persist runnable for dataSource[pageviews]
2016-03-31T00:17:58,047 INFO [pageviews-incremental-persist] io.druid.segment.realtime.plumber.RealtimePlumber - DataSource[pageviews], Interval[2016-03-30T00:00:00.000Z/2016-03-31T00:00:00.000Z], Metadata [null] persisting Hydrant[FireHydrant{index=io.druid.segment.incremental.OnheapIncrementalIndex@4d2bfbf, queryable=io.druid.segment.ReferenceCountingSegment@3d1bc49b, count=6}]
2016-03-31T00:17:58,047 INFO [pageviews-incremental-persist] io.druid.segment.IndexMerger - Starting persist for interval[2016-03-30T00:00:00.000Z/2016-03-30T07:00:00.001Z], rows[1]
2016-03-31T00:17:58,049 INFO [pageviews-incremental-persist] io.druid.segment.IndexMerger - outDir[/tmp/persistent/task/index_realtime_pageviews_2016-03-30T00:00:00.000Z_2_0/work/persist/pageviews/2016-03-30T00:00:00.000Z_2016-03-31T00:00:00.000Z/6/v8-tmp] completed index.drd in 1 millis.
2016-03-31T00:17:58,065 INFO [pageviews-incremental-persist] io.druid.segment.IndexMerger - outDir[/tmp/persistent/task/index_realtime_pageviews_2016-03-30T00:00:00.000Z_2_0/work/persist/pageviews/2016-03-30T00:00:00.000Z_2016-03-31T00:00:00.000Z/6/v8-tmp] completed dim conversions in 16 millis.
2016-03-31T00:17:58,081 INFO [pageviews-incremental-persist] io.druid.segment.IndexMerger - outDir[/tmp/persistent/task/index_realtime_pageviews_2016-03-30T00:00:00.000Z_2_0/work/persist/pageviews/2016-03-30T00:00:00.000Z_2016-03-31T00:00:00.000Z/6/v8-tmp] completed walk through of 1 rows in 16 millis.
2016-03-31T00:17:58,082 INFO [pageviews-incremental-persist] io.druid.segment.IndexMerger - Starting dimension[url] with cardinality[1]
2016-03-31T00:17:58,084 INFO [pageviews-incremental-persist] io.druid.segment.IndexMerger - Completed dimension[url] in 2 millis.
2016-03-31T00:17:58,084 INFO [pageviews-incremental-persist] io.druid.segment.IndexMerger - Starting dimension[user] with cardinality[1]
2016-03-31T00:17:58,087 INFO [pageviews-incremental-persist] io.druid.segment.IndexMerger - Completed dimension[user] in 3 millis.
2016-03-31T00:17:58,087 INFO [pageviews-incremental-persist] io.druid.segment.IndexMerger - outDir[/tmp/persistent/task/index_realtime_pageviews_2016-03-30T00:00:00.000Z_2_0/work/persist/pageviews/2016-03-30T00:00:00.000Z_2016-03-31T00:00:00.000Z/6/v8-tmp] completed inverted.drd in 6 millis.
2016-03-31T00:17:58,092 INFO [pageviews-incremental-persist] io.druid.segment.IndexIO$DefaultIndexIOHandler - Converting v8[/tmp/persistent/task/index_realtime_pageviews_2016-03-30T00:00:00.000Z_2_0/work/persist/pageviews/2016-03-30T00:00:00.000Z_2016-03-31T00:00:00.000Z/6/v8-tmp] to v9[/tmp/persistent/task/index_realtime_pageviews_2016-03-30T00:00:00.000Z_2_0/work/persist/pageviews/2016-03-30T00:00:00.000Z_2016-03-31T00:00:00.000Z/6]
2016-03-31T00:17:58,092 INFO [pageviews-incremental-persist] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[dim_url.drd]
2016-03-31T00:17:58,092 INFO [pageviews-incremental-persist] io.druid.segment.IndexIO$DefaultIndexIOHandler - Dimension[url] is single value, converting...
2016-03-31T00:17:58,093 INFO [pageviews-incremental-persist] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[dim_user.drd]
2016-03-31T00:17:58,093 INFO [pageviews-incremental-persist] io.druid.segment.IndexIO$DefaultIndexIOHandler - Dimension[user] is single value, converting...
2016-03-31T00:17:58,093 INFO [pageviews-incremental-persist] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[index.drd]
2016-03-31T00:17:58,093 INFO [pageviews-incremental-persist] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[inverted.drd]
2016-03-31T00:17:58,093 INFO [pageviews-incremental-persist] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[met_latencyMs_LITTLE_ENDIAN.drd]
2016-03-31T00:17:58,093 INFO [pageviews-incremental-persist] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[met_views_LITTLE_ENDIAN.drd]
2016-03-31T00:17:58,094 INFO [pageviews-incremental-persist] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[spatial.drd]
2016-03-31T00:17:58,094 INFO [pageviews-incremental-persist] io.druid.segment.IndexIO$DefaultIndexIOHandler - Processing file[time_LITTLE_ENDIAN.drd]
2016-03-31T00:17:58,094 INFO [pageviews-incremental-persist] io.druid.segment.IndexIO$DefaultIndexIOHandler - Skipped files[[index.drd, inverted.drd, spatial.drd]]

Hi,

Can you describe how you are loading the data? I am guessing your data is probably not getting ingested because it is outside of the configured windowPeriod.
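For reference, the windowPeriod lives in the tuningConfig of your Tranquility Server dataSource spec. Here is a rough sketch only; the datasource name and the PT10M values are placeholders, and the exact layout depends on your Tranquility version. Events whose timestamps are more than windowPeriod away from the current time are dropped rather than ingested:

{
  "dataSources" : [
    {
      "spec" : {
        "dataSchema" : {
          "dataSource" : "pageviews",
          "granularitySpec" : {
            "type" : "uniform",
            "segmentGranularity" : "hour",
            "queryGranularity" : "none"
          }
        },
        "tuningConfig" : {
          "type" : "realtime",
          "windowPeriod" : "PT10M",
          "intermediatePersistPeriod" : "PT10M",
          "maxRowsInMemory" : "75000"
        }
      },
      "properties" : {
        "task.partitions" : "1",
        "task.replicants" : "1"
      }
    }
  ]
}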

If you are just starting to evaluate Druid, I suggest trying http://imply.io/docs/latest/quickstart. You’ll have a much easier time getting started until Druid 0.9.0 docs are ready.

Hi Fangjin,

Thank you for the mail. Yes, you are correct, it was a windowPeriod issue. I was able to resolve it and then ingest and query the data.

I’m now setting up a six-node EC2 cluster to run some benchmarking, with memcached for caching and S3 as deep storage.

Do you have any suggestions on instance types for the historical/middle manager nodes, or on how many instances to use, to get optimal performance? I’m planning to run with a segment granularity of one hour and a windowPeriod of roughly 20 minutes for approximately the data volumes below:

event topic: 8000 msg/s, 20-second batches, ~20.3 MB per 20 s
click topic: 400 msg/s, 20-second batches, ~5.3 MB per 20 s
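As a rough back-of-envelope estimate, that works out to about 1 MB/s of raw data for the event topic (roughly 3.6 GB and ~28.8 million events per hourly segment, before rollup and compression) and about 0.27 MB/s for the click topic (roughly 1 GB and ~1.4 million events per hour).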

Thank you again.

~ Biswajit

Hi,

If you are trying to set up a simple cluster, I suggest actually checking out this doc, which has more details about what a simple distributed cluster may look like: http://druid.io/docs/0.9.0-rc3/tutorials/cluster.html
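Since you mentioned S3 deep storage and memcached, here is a rough sketch of the relevant common.runtime.properties entries; the bucket, keys, and memcached host are placeholders to replace with your own, and you will also want to turn on caching where you need it (for example druid.broker.cache.useCache / druid.broker.cache.populateCache on the broker):

# load the S3 extension
druid.extensions.loadList=["druid-s3-extensions"]

# S3 deep storage
druid.storage.type=s3
druid.storage.bucket=your-druid-bucket
druid.storage.baseKey=druid/segments
druid.s3.accessKey=YOUR_ACCESS_KEY
druid.s3.secretKey=YOUR_SECRET_KEY

# memcached cache
druid.cache.type=memcached
druid.cache.hosts=memcached-host:11211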