17

When I run:

file file1.txt 

I get this output: Little-endian UTF-16 Unicode text, with CR line terminators

Then, if I run:

file file2.txt 

I get: ASCII text

file2.txt is created with:

echo $var > "file2.txt"

I would like file2.txt to have the same encoding as file1.txt. How can I do that?

Pierre

3 Answers

31

You can use iconv to convert the encoding of the file:

iconv -f ascii -t utf16 file2.txt > another.txt

another.txt should then have the desired encoding.

You could also try this:

echo "$var" | iconv -f ascii -t utf16 > "file2.txt"
Oliver Salzburg
8

Use iconv:

echo "$var" | iconv --from-code=utf-8 --to-code=utf-16le --output=file2.txt
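To see what a conversion like this actually writes, you can dump the resulting bytes. The following is only a minimal sketch: the value of var is a hypothetical sample, and it uses the short options -f and -t with a redirect instead of the long options above.

```shell
var='hello'   # hypothetical sample value, not from the question

# Convert the text to UTF-16LE; the input is assumed to be UTF-8 (plain ASCII qualifies)
printf '%s\n' "$var" | iconv -f UTF-8 -t UTF-16LE > file2.txt

# Dump every byte as hex: each ASCII character becomes two bytes, low byte first
od -An -tx1 file2.txt
```

If the dump does not begin with ff fe, the file has no byte-order mark, which may affect how tools such as file classify it.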
0

When converting your file, make sure the result contains a byte-order mark. Even though the standard does not recommend a byte-order mark for UTF-8, without one there can be legitimate confusion between UTF-8 and ASCII.

Additionally, specifying UTF-16BE or UTF-16LE doesn't prepend a byte-order mark, so I first convert to UTF-16, which uses a platform-dependent endianness. Then I use file to determine the actual endianness and convert from that to UTF-16LE.

Finally, when you create a file using bash, its contents are written in the encoding of bash's locale (as reported by locale charmap), so that's what you need to map from.

(I uppercase all my encodings because when you list all of iconv's supported encodings with iconv -l they are all uppercase.)

BASH_ENCODING="$( locale charmap | tr '[:lower:]' '[:upper:]' )"
echo "$var" | iconv -f "$BASH_ENCODING" -t UTF-16 > UTF-16-UNKNOWN-ENDIANNESS-FILE
FILE_ENCODING="$( file --brief --mime-encoding UTF-16-UNKNOWN-ENDIANNESS-FILE )"
iconv -f "$FILE_ENCODING" -t UTF-16LE UTF-16-UNKNOWN-ENDIANNESS-FILE > file2.txt
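A possible alternative to the two-pass conversion is to write the byte-order mark yourself and then append the UTF-16LE text. This is only a sketch under some assumptions: the value of var is a hypothetical sample, the input is assumed to be UTF-8 (substitute the output of locale charmap if your locale differs), and the trailing '\r' is there only to mimic the CR line terminators reported for file1.txt.

```shell
var='hello'   # hypothetical sample value, not from the question

{
    # Write the UTF-16LE byte-order mark: U+FEFF stored low byte first (FF FE)
    printf '\377\376'
    # Convert the text itself; UTF-16LE adds no BOM of its own
    # Assumption: $var holds UTF-8 text
    printf '%s\r' "$var" | iconv -f UTF-8 -t UTF-16LE
} > file2.txt

# Inspect the result as hex bytes
od -An -tx1 file2.txt
```

This avoids the intermediate file and does not depend on which endianness the plain UTF-16 name picks on a given platform.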