65

Is there a quick and easy way to find the Unicode code point for any character? For example, I see a funny character on a web page, or a PDF file, or some other document.

What I currently do is copy the character to the clipboard, save it to a file, and look at the file with a hex viewer. Alternatively, I can open Microsoft Word, paste the character, and press Alt+X. Both of these methods are a bit cumbersome. Is there an easier way?

I use Notepad++ so if there's any way to do that with Notepad++, it would be a suitable answer (it's less cumbersome than having to open Word). Or maybe there's a way to do it with a small specialised application?

15 Answers

35

I work a lot with Unicode characters, so I have written a small Windows application specifically for this:

Unicode Character Informer (Documentation)

In addition, my text editor, Rejbrand Text Editor, has extensive Unicode character support.

34

Notepad++ has a pre-installed plug-in called Converter that has an option to convert ASCII to HEX and vice versa. This is useful for reading data files stored in HEX format, which need to be converted to ASCII first.
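
Note that the hex bytes shown by a hex viewer or a hex converter are the character's encoded form, not the code point itself. A minimal Python sketch of getting from one to the other, assuming the bytes are UTF-8 and hold exactly one character (the helper name `code_point_from_utf8_hex` is made up for illustration):

```python
def code_point_from_utf8_hex(hex_bytes: str) -> str:
    """Decode UTF-8 bytes given as hex (e.g. 'e2 99 b2') and return 'U+XXXX'.

    Hypothetical helper: assumes the hex dump holds exactly one character
    encoded as UTF-8, with no byte order mark.
    """
    text = bytes.fromhex(hex_bytes).decode("utf-8")
    if len(text) != 1:
        raise ValueError(f"expected one character, got {len(text)}")
    return f"U+{ord(text):04X}"


print(code_point_from_utf8_hex("e2 99 b2"))  # U+2672
```

If the file was saved in another encoding (UTF-16, for example), the codec name passed to `decode` would have to change accordingly.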

Leo Chapiro
28

There's a nice little website called Unicode Character Inspector (built by Tim Whitlock) that does just that. I find it way more convenient than a text editor or desktop program.

18

When I'm faced with this problem, a quick Google search usually provides a quick answer. For example, when I google "😍 unicode", I get a result like this: [image: Google search for the "smiling face with heart shaped eyes" emoji]

I like this method because:

  • It works on any computer with internet
  • You don't have to install anything
  • The keypresses required (Ctrl+C & Ctrl+T & Ctrl+V & Enter) are muscle memory actions for me, and probably for most other developers/typists.
9

On a Unix-like system*:

unicode -s "$(xsel -ob)"

You can alias this or create a script to run it.

The output looks like this:

U+2672 UNIVERSAL RECYCLING SYMBOL
UTF-8: e2 99 b2 UTF-16BE: 2672 Decimal: &#9842; Octal: \023162
♲ (♲)
Uppercase: 2672
Category: So (Symbol, Other)
Bidi: ON (Other Neutrals)

* It looks like the original poster is probably using Windows, but (a) this isn't specified, and (b) this solution might help others.
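
Where the `unicode` utility is not available, similar information can be pulled from Python's standard `unicodedata` module. This is only a rough sketch, not a reimplementation of the tool; the function name `describe` and the dictionary keys are my own choices:

```python
import unicodedata


def describe(char: str) -> dict:
    """Return a few Unicode properties for a single character."""
    return {
        "code_point": f"U+{ord(char):04X}",
        # Some code points (e.g. control characters) have no name.
        "name": unicodedata.name(char, "<unnamed>"),
        "utf8": char.encode("utf-8").hex(" "),
        "decimal": ord(char),
        "category": unicodedata.category(char),
        "bidi": unicodedata.bidirectional(char),
    }


# U+2672 is the recycling symbol used in the output above.
for key, value in describe("\u2672").items():
    print(f"{key}: {value}")
```

The names and properties come from the Unicode database bundled with the Python build, so very recent characters may be missing on older interpreters.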

wchargin
8

I find Richard Ishida's Unicode code converter (github link) very useful for finding Unicode code points, amongst other things. It also provides conversions to other code point notations, encodings, and, for instance, escape sequences.

Unicode Converter

You may also want to check out Richard Ishida's main webpage (rishida.net), as it contains (links to) a lot of valuable tools and information, especially if you're interested in internationalisation and character encoding. For instance, another very useful tool linked there is his Uniview tool (github link).

Uniview

Finally, although it is mostly relevant to Mac users, I also find macOS's Character Viewer very useful. It is accessible through the Input Menu, which can be enabled in System Preferences > Keyboard.

Although the Apple support website mainly focuses on how to insert emoji (…), the Character Viewer is actually very useful for looking up specific ('special') characters and their code points in several different encodings, as well as for finding which fonts on your system contain specific glyphs.

Character Viewer

Cheers!

arri
6

You can use PowerShell!

[char]::ConvertToUtf32((gcb), 0)

This prints the first Unicode code point of the text on the clipboard.

If you don't have to worry about characters outside the Basic Multilingual Plane (that would be represented in .NET strings as a high and low surrogate), you can use this instead:

[int](gcb)[0]

If you'd prefer it in hex, you can use a format specifier:

'0x{0:x}' -f [char]::ConvertToUtf32((gcb), 0)
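
To illustrate why the surrogate case matters, here is a small Python sketch of how a high and low surrogate combine into a single code point. The function names are hypothetical, and this shows the general UTF-16 arithmetic rather than what .NET does internally:

```python
def utf16_units(char: str) -> list:
    """Return the UTF-16 code units that encode a single character."""
    data = char.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def combine_surrogates(high: int, low: int) -> int:
    """Combine a high/low surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# U+1F60D lies outside the Basic Multilingual Plane, so it needs two units.
high, low = utf16_units("\U0001F60D")
print(hex(high), hex(low))                 # 0xd83d 0xde0d
print(hex(combine_surrogates(high, low)))  # 0x1f60d
```

Reading only the first UTF-16 unit of such a character would give 0xd83d, which is a surrogate and not a real code point.
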
Ben N
6

A note for any Emacs users: you can type C-u C-x = and it will give you a bunch of information about the character under the cursor, including the Unicode code point, its name in the Unicode database, and its categories.

             position: 146 of 147 (99%), column: 0
            character: ♲ (displayed as ♲) (codepoint 9842, #o23162, #x2672)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x2672
               script: symbol
               syntax: w    which means: word
             category: .:Base
             to input: type "C-x 8 RET 2672" or "C-x 8 RET UNIVERSAL RECYCLING SYMBOL"
          buffer code: #xE2 #x99 #xB2
            file code: #xE2 #x99 #xB2 (encoded by coding system utf-8-unix)
              display: by this font (glyph code)
    xft:-PfEd-Mensch-normal-normal-normal-*-16-*-*-*-m-0-iso10646-1 (#x985)

Character code properties: customize what to show
  name: UNIVERSAL RECYCLING SYMBOL
  general-category: So (Symbol, Other)
  decomposition: (9842) ('♲')
4

I use http://unicode.scarfboy.com, which is simple and works well.

One thing this website supports is looking up a specific entered Unicode character. If you paste the character from the clipboard and hit enter, it will identify the character.

M. Justin
4

You can also use the following site: https://unicode-table.com/en/. Just paste your character, and you'll get its Unicode code point and HTML code as well.

4

Got Vim? Just paste it in, put your cursor on it, and hit ga. I use this all the time for weird characters.

2

Here's one more answer using an idea from user202729:

Bookmark the URL javascript:alert(prompt().codePointAt(0).toString(16)) and use a browser to run it. (Works on Chrome and Firefox. Doesn't appear to work on IE but this may be due to security settings.)

Unlike the other answers, this requires no internet connection and no external utility to download, and it is not OS-specific.
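
One caveat: `codePointAt(0)` reports only the first code point, so a pasted "character" that is really a sequence (a base letter plus a combining accent, say) is only partly identified. If Python happens to be installed, a sketch like the following lists every code point in the pasted text; the function name `code_points` is just illustrative:

```python
def code_points(text: str) -> list:
    """Return the code points of every character in text as 'U+XXXX' strings."""
    return [f"U+{ord(ch):04X}" for ch in text]


# "e" followed by a combining acute accent looks like one character
# on screen but consists of two code points.
print(code_points("e\u0301"))  # ['U+0065', 'U+0301']
```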

2

I can't believe nobody has suggested this lookup gem yet, as it's what many of us really want: https://util.unicode.org/UnicodeJsps/character.jsp?a=0002

It includes ALL Unicode characters and all Unicode data on them, shows pictures of them for browsers without font support, and is constantly updated to the latest Unicode standard, as it's an official tool by the Unicode Consortium.

More links for everyone:

Jack G
1

I am going to mention http://amp-what.com/, as it is really easy to use with its quick search field and supports different notations (& codes, Unicode code points, URI-encoded character sequences).

Example image

Tefek
0

If you have Microsoft Word, paste the text there, select the character (or click to the right of it), and press Alt+X.