I have a 500MB+ file that was generated by saving a large Excel spreadsheet as Unicode text. I am running Windows 7.
I need to open the file with Python pandas. Until now my workflow was to convert the file from ANSI to UTF-8 with Notepad++ and then open it, but the file is now too large for Notepad++.
The data contains Hebrew, French, Swedish, Norwegian, and Danish special characters.
- pandas' read_excel is just too slow: I let it run for several minutes without seeing any output.
- iconv: apparently I cannot get the encoding right; I just get out a list of tab-separated nulls (a quick check of the actual encoding is sketched below). I have tried:

      iconv -f "CP858" -t "UTF-8" file1.txt > file2.txt
      iconv -f "windows-1252" -t "UTF-8" file1.txt > file2.txt
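Those tab-separated nulls are typical of UTF-16 text being decoded as a single-byte encoding, and UTF-16LE is exactly what Excel's "Unicode Text" export produces. A minimal check, assuming the file1.txt name used above: read the first raw bytes and look for a byte-order mark.

    # Peek at the raw bytes: b'\xff\xfe' at the start is a UTF-16LE BOM,
    # which is what Excel writes when saving a sheet as "Unicode Text".
    with open("file1.txt", "rb") as f:
        print(repr(f.read(4)))  # e.g. '\xff\xfeI\x00' indicates UTF-16LE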
Edit
iconv -f "UTF-16le" -t "UTF-8" file1.txt > file2.txt leads to very weird behaviour: a row somewhere in the middle is cut off. Everything looks fine, but only 80K rows are actually converted.
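My guess (unverified) is that iconv stops at the first byte sequence it cannot map, silently truncating the output mid-file. As a workaround, a streaming re-encode in Python never loads the whole 500MB into memory and, with errors="replace", keeps going past any bad sequence instead of stopping. A minimal sketch, assuming the same file names as above:

    import io

    # Convert UTF-16LE -> UTF-8 line by line; the file is read lazily,
    # and errors="replace" substitutes U+FFFD for undecodable bytes
    # instead of aborting partway through.
    with io.open("file1.txt", "r", encoding="utf-16-le", errors="replace") as src:
        with io.open("file2.txt", "w", encoding="utf-8") as dst:
            for line in src:
                dst.write(line)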
Edit 2
read_csv with encoding='utf-16le' reads the file properly. However, I still don't understand why iconv mangles it.
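For reference, a sketch of the working call; sep="\t" matches the tab-separated export noted above, and chunksize (optional) keeps memory bounded for a file this size:

    import pandas as pd

    # Read the tab-separated Unicode export in chunks so the 500MB file
    # is never fully materialised at once, then stitch the pieces together.
    chunks = pd.read_csv("file1.txt", sep="\t", encoding="utf-16le",
                         chunksize=100000)
    df = pd.concat(chunks, ignore_index=True)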