Questions tagged [character-encoding]

A character encoding system consists of a code that pairs each character from a given repertoire with something else — such as a bit pattern, sequence of natural numbers, octets, or electrical pulses — in order to facilitate the transmission of data (generally numbers or text) through telecommunication networks or for data storage.

More information, including a large list of common character encodings, can be found at the Wikipedia page on the topic.
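
As a rough illustration of the "pairs each character with something else" idea in the tag description, the sketch below prints the octets that a few encodings assign to sample characters. The helper name `encoded_octets` and the sample characters are assumptions chosen for this example; only Python's built-in `str.encode` and `bytes.decode` are used.

```python
# Minimal sketch: the helper name and sample characters are illustrative,
# not taken from any of the questions listed here.
def encoded_octets(text, encoding):
    """Return the list of octets that `encoding` pairs with `text`."""
    return list(text.encode(encoding))

# One character repertoire, one encoding: code point alongside its UTF-8 octets.
for ch in ["a", "é", "ա"]:
    print(ch, "U+%04X" % ord(ch), encoded_octets(ch, "utf-8"))

# The same character maps to different octets under different encodings.
print(encoded_octets("é", "iso-8859-1"))  # a single octet
print(encoded_octets("é", "utf-8"))       # two octets

# Decoding octets with the wrong encoding produces mojibake.
print("é".encode("utf-8").decode("iso-8859-1"))
```

Many of the questions under this tag come down to the last line: text written with one encoding and read back with another.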

308 questions
157
votes
2 answers

How do I change the character encoding for a webpage in Chrome?

Google Chrome, like almost every web browser in recent memory, used to have an option to change the character encoding for the webpage being viewed by going to Menu › Tools › Encoding (or some similar location). This is extraordinarily useful in…
Meshaal
  • 1,672
128
votes
5 answers

How to set character encoding when opening a CSV file in Excel?

Is it possible to set the default encoding for Excel (any version, e.g. 2010) when opening files like csv files (like you can in Open Office Calc)? When I try to open a csv file encoded in Japanese SHIFT-JIS, it opens but with mojibake (corrupted…
107
votes
3 answers

How do I find the encoding of the current buffer in vim?

Say I am editing some file with vim (or gvim). I have no idea about the file's encoding and I want to know whether it is in UTF-8 or ISO-8859-1 or whatever? Can I somehow tell vim to show me what encoding is used?
innaM
  • 10,412
75
votes
1 answer

Converting the encoding of a text file (Mac OS X)

Possible Duplicate: How can I convert multiple files to UTF-8 encoding using *nix command line tools? Okay, now that I can detect the encoding, I know that my encoding is using charset=iso-8859-1 instead of utf. How can I convert this?
Casebash
  • 7,727
55
votes
1 answer

Saving "Bush hid the facts" in notepad

When saving the text "Bush hid the facts" in notepad under Windows XP, how come when you reopen it shows squares instead of the text? I saw it in this video if you need an example http://www.youtube.com/watch?v=9bK9-sc_uus&feature=related
Mohammed
50
votes
1 answer

How can I convert multiple files to UTF-8 encoding using *nix command line tools?

Possible Duplicate: Batch-convert files for encoding or line ending I have a bunch of text files that I'd like to convert from any given charset to UTF-8 encoding. Are there any command line tools or Perl (or language of your choice) one liners I…
jason
  • 665
44
votes
8 answers

Excel: Change default encoding (file origin) of Text Import Wizard to UTF-8 (65001 : Unicode)

I am using a variety of tools to regularly prepare data for the web. One stage requires me to open a CSV in Excel, make changes and save the file. Is there a way to force Excel to accept UTF-8 encoding, and to save its files with that encoding?
Dizzley
  • 1,041
  • 2
  • 13
  • 19
39
votes
2 answers

Finding out the default character encoding in Windows

Is there any way to find what is the default character encoding in Windows? I know that in Western Europe and the US, CP-1252 is the default, but need to check this on other Windows machines too. Alternatively, is there any list of default encodings…
36
votes
1 answer

ANSI to UTF-8 in Notepad++

I have a text encoded in ANSI: When I tried to convert it into UTF-8 (using the Notepad++ menu Encoding > UTF-8), I get some weird characters: I thought that UTF-8 was a superset of ANSI and that I subsequently wouldn't have such issues. Is there…
user3658425
  • 365
  • 1
  • 3
  • 8
36
votes
3 answers

Default PowerShell to emitting UTF-8 instead of UTF-16?

By default, PowerShell in Windows seems to be outputting UTF-16 (e.g., if I do a simple echo hello > hi.txt, then hi.txt ends up in UTF-16). I know that I can force this to my desired text encoding by instead doing echo hello | out-file -encoding…
32
votes
7 answers

Why do English characters require fewer bytes to represent than other alphabets?

When I put 'a' in a text file, it makes it 2 bytes but when I put, let's say 'ա', which is a letter from Armenian alphabet, it makes it 3 bytes. What is the difference between alphabets for a computer? Why does English take less space?
khajvah
  • 788
32
votes
1 answer

What is this '°͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌' strange character?

I saw this °͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌͌ strange character online. I also noted that: It needs 26 backspaces to delete but behaves like one character when selecting. It is drawn vertically covering many rows above. What is this character and why…
Serious
  • 1,563
29
votes
6 answers

remove <200b> character from text file

I have a huge text file containing this string/character <200b> that I want to delete. I tried with sed but it didn't work. sed 's/<200b>//g' file The character never shows when I open the file with a graphic text editor like gedit, I see it with…
27
votes
4 answers

Ubuntu Linux: Can I paste plain text by default?

(Similar to my earlier question about Windows XP and darren_n's follow-up for Mac OS X.) I regularly copy and paste text between spreadsheets, emails, browser windows, etc. I can't think of a single time when I've wanted to keep the formatting from…
25
votes
2 answers

What does STX, SOH, and GS mean in Notepad++ output?

Upon reviewing the MIME source for an email (presumably containing international characters), I see stuff like this in Notepad++. I understand that CRLF is carriage return line feed, but what about the others? What do SOH, GS, and STX mean?
Mike B
  • 2,720