Question: can I name a column in Chinese?

Version: druid-0.9.2

Hi,
I have an issue related to the character set of column names.

When loading a file into Druid, I tried using English and then Chinese characters in the column names.

Both batch loads used the same .csv file.

When the column names use only ISO-8859-1 characters, everything works.

When the column names use Chinese characters, the batch ingestion task finishes with status SUCCESS, but the historical service can’t load the segment.

I also found that the size of index.zip differs: in the normal case index.zip is 400K, but in the failing case it is only 200K.
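
To see where the size difference comes from, it may help to list the entries of both index.zip files side by side. Below is a minimal sketch using only the JDK's java.util.zip; the class name ListSegmentZip is made up for illustration, and the paths to the two archives are whatever you pass on the command line. It makes no assumption about what a healthy segment should contain, it only prints entry names and uncompressed sizes.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of Druid: prints the contents of zip archives.
public class ListSegmentZip {

    // Returns one "name<TAB>uncompressedSize" line per entry in the archive.
    static List<String> listEntries(String zipPath) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipFile zip = new ZipFile(zipPath)) {
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                lines.add(entry.getName() + "\t" + entry.getSize());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Each argument is a path to an index.zip to inspect.
        for (String path : args) {
            System.out.println(path);
            for (String line : listEntries(path)) {
                System.out.println("  " + line);
            }
        }
    }
}
```

Running it as `java ListSegmentZip good/index.zip bad/index.zip` (both paths are placeholders) should show whether the smaller archive is missing entries or whether some entries are just shorter.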

The historical log prints:

io.druid.segment.loading.SegmentLoadingException: Exception loading segment[watchcloud-mixer-report_2016-12-01T00:00:00.000Z_2017-01-01T00:00:00.000Z_2017-03-18T07:35:40.616Z]
    at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:310) ~[druid-server-0.9.2.jar:0.9.2]
    at io.druid.server.coordination.ZkCoordinator.addSegment(ZkCoordinator.java:351) [druid-server-0.9.2.jar:0.9.2]
    at io.druid.server.coordination.SegmentChangeRequestLoad.go(SegmentChangeRequestLoad.java:44) [druid-server-0.9.2.jar:0.9.2]
    at io.druid.server.coordination.ZkCoordinator$1.childEvent(ZkCoordinator.java:153) [druid-server-0.9.2.jar:0.9.2]
    at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:522) [curator-recipes-2.11.0.jar:?]
    at org.apache.curator.framework.recipes.cache.PathChildrenCache$5.apply(PathChildrenCache.java:516) [curator-recipes-2.11.0.jar:?]
    at org.apache.curator.framework.listen.ListenerContainer$1.run(ListenerContainer.java:93) [curator-framework-2.11.0.jar:?]
    at com.google.common.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297) [guava-16.0.1.jar:?]
    at org.apache.curator.framework.listen.ListenerContainer.forEach(ListenerContainer.java:84) [curator-framework-2.11.0.jar:?]
    at org.apache.curator.framework.recipes.cache.PathChildrenCache.callListeners(PathChildrenCache.java:513) [curator-recipes-2.11.0.jar:?]
    at org.apache.curator.framework.recipes.cache.EventOperation.invoke(EventOperation.java:35) [curator-recipes-2.11.0.jar:?]
    at org.apache.curator.framework.recipes.cache.PathChildrenCache$9.run(PathChildrenCache.java:773) [curator-recipes-2.11.0.jar:?]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_111]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_111]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_111]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_111]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_111]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_111]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_111]
Caused by: java.lang.NullPointerException
    at io.druid.common.utils.SerializerUtils.readString(SerializerUtils.java:72) ~[druid-common-0.9.2.jar:0.9.2]
    at io.druid.segment.IndexIO$V9IndexLoader.deserializeColumn(IndexIO.java:1043) ~[druid-processing-0.9.2.jar:0.9.2]
    at io.druid.segment.IndexIO$V9IndexLoader.load(IndexIO.java:1026) ~[druid-processing-0.9.2.jar:0.9.2]
    at io.druid.segment.IndexIO.loadIndex(IndexIO.java:222) ~[druid-processing-0.9.2.jar:0.9.2]
    at io.druid.segment.loading.MMappedQueryableIndexFactory.factorize(MMappedQueryableIndexFactory.java:49) ~[druid-server-0.9.2.jar:0.9.2]
    at io.druid.segment.loading.SegmentLoaderLocalCacheManager.getSegment(SegmentLoaderLocalCacheManager.java:96) ~[druid-server-0.9.2.jar:0.9.2]
    at io.druid.server.coordination.ServerManager.loadSegment(ServerManager.java:152) ~[druid-server-0.9.2.jar:0.9.2]
    at io.druid.server.coordination.ZkCoordinator.loadSegment(ZkCoordinator.java:306) ~[druid-server-0.9.2.jar:0.9.2]

This might be related to https://github.com/druid-io/druid/pull/3118, where dimension names have to be reconcilable to file names on the file system at indexing time.
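
As a rough illustration of how a name can fail to reconcile with a file name, here is a small JDK-only sketch. It checks whether a sample Chinese column name survives encoding in a given charset. This shows the general failure mode only: whether Druid 0.9.2 actually goes through a non-UTF-8 charset here is an assumption, not something confirmed in this thread, and the class and method names are made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Druid: tests whether a name survives a charset.
public class ColumnNameCharsetCheck {

    // True if every character of name can be encoded in the given charset.
    static boolean encodable(String name, Charset charset) {
        return charset.newEncoder().canEncode(name);
    }

    // Encodes then decodes name; unmappable characters are replaced (e.g. by '?').
    static String roundTrip(String name, Charset charset) {
        return new String(name.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Sample Chinese column name, written with escapes so the source stays ASCII.
        String name = "\u5217\u540D";
        Charset[] charsets = {
            StandardCharsets.ISO_8859_1,
            StandardCharsets.UTF_8,
            Charset.defaultCharset() // what this JVM uses when no charset is given
        };
        for (Charset cs : charsets) {
            System.out.println(cs
                + " encodable=" + encodable(name, cs)
                + " roundTripEqual=" + name.equals(roundTrip(name, cs)));
        }
    }
}
```

If the default charset line reports that the name is not encodable on the machine that ran the indexing task, that would be consistent with the explanation above, though it would not prove it.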

Having a unit test for this would be helpful.