I am trying to save a spreadsheet as a CSV file. The file contains Korean characters, and in the resulting CSV those characters are converted to literal question marks (0x3F).
I am running an English version of Windows, but I (should) have the appropriate character sets installed. The default encoding for non-Unicode programs is English. I have no problems saving the files in other formats (xls or txt, for example).
This issue occurs for Japanese and Chinese characters as well.
The sample strings I used (in separate files) are
안녕하세요
你好
おはよう
None of the files are exported correctly.
Does Excel support Asian characters when saving as CSV?
UPDATE
I decided to do some testing. Instead of exporting from xls to csv, I manually created a CSV file containing the following Chinese characters: 你好. I am using Notepad++ to save the files. The purpose is to test whether Excel can actually save CSV files containing such characters properly.
First, I encoded the file as UTF-8 without a BOM, with a .csv extension. I opened the file in Excel, and it rendered the characters incorrectly, as if each byte were a separate single-byte character (it looks something like ä½ å¥½). However, when I saved the file from Excel and then opened it in Notepad++ as UTF-8, the characters were preserved.
Second, I created a new copy of the test file, this time encoded as UTF-8 with the BOM. I opened the file in Excel, and it read the file correctly (as 你好). I then re-saved the file, and this time Excel converted the characters to two question marks.
I found it interesting that while Excel reads the file with the BOM correctly, it cannot re-save it correctly, whereas when it misreads the file (as in the first test), it doesn't try to convert the characters and just writes them back out as is. It seems there is an issue when Excel tries to save Unicode characters as CSV?
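To illustrate what may be happening in the two tests, here is a minimal Python sketch. It assumes the ANSI code page on an English Windows install is Windows-1252 (`cp1252`). That is an assumption about the system, and the sketch only mimics the observed behavior; it is not a description of Excel's internals.

```python
# Sketch only: cp1252 is an ASSUMED stand-in for the "English" ANSI code page.
text = "你好"
utf8_bytes = text.encode("utf-8")  # the bytes of the BOM-less test file

# Test 1: the UTF-8 bytes are misread one byte at a time as cp1252.
# This yields the garbled characters seen in Excel.
mojibake = utf8_bytes.decode("cp1252")

# Writing that garbled text back out as cp1252 restores the original bytes,
# which would explain why the re-saved file still opens correctly as UTF-8.
round_tripped = mojibake.encode("cp1252")

# Test 2: the text is read correctly as Unicode, but cp1252 has no Chinese
# characters, so each one is replaced by "?" (0x3F) when saving.
lossy = text.encode("cp1252", errors="replace")

print(repr(mojibake))
print(round_tripped == utf8_bytes)
print(lossy)
```

If this guess is right, the question marks come from converting Unicode text to a code page that cannot represent it, not from a failure to read the file.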
Workaround Solution
I saved the document as Unicode Text, which produces a tab-separated text file, and it preserved the Asian characters. There should be tools available to convert tab-separated files to comma-separated files.
Because the file is Unicode-encoded, all of the sample strings I presented at the top (from three different character sets) appear properly.
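For the conversion step, a short Python script is one option. This is a sketch under assumptions: it assumes the Unicode Text export is UTF-16 with a BOM, and the function name `tsv_to_csv` and the file names in the usage comment are hypothetical. It writes UTF-8 with a BOM, since that is the variant Excel read correctly in my second test.

```python
import csv


def tsv_to_csv(src_path, dst_path):
    """Convert a tab-separated UTF-16 file to a comma-separated UTF-8 file.

    Assumes src_path is UTF-16 with a BOM (an assumption about the
    "Unicode Text" export). The output is written with a UTF-8 BOM.
    """
    # newline="" lets the csv module handle line endings itself.
    with open(src_path, "r", encoding="utf-16", newline="") as src, \
         open(dst_path, "w", encoding="utf-8-sig", newline="") as dst:
        reader = csv.reader(src, dialect="excel-tab")
        writer = csv.writer(dst)  # default dialect: comma-separated
        for row in reader:
            # Fields containing commas or quotes are quoted automatically.
            writer.writerow(row)


# Hypothetical usage:
# tsv_to_csv("export.txt", "export.csv")
```

Using the `csv` module rather than replacing tabs with commas directly means cells that themselves contain commas or quotes are still written as valid CSV.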