Searching Excel with no luck for text sequences that I knew were there was driving me insane. I was copy-pasting search terms from the Digi-Key website (digi-key.com) and searching for them in an Excel database.

I finally figured out what was wrong when I accidentally middle-clicked into a MINGW64 window:

When I double-clicked on the text "‎BRL2012T2R2M‎" and pasted it into the MINGW64 window, it revealed the secret: the text was actually \342\200\216‎BRL2012T2R2M‎\342\200\216 (photo linked)

What are these control codes, and why does Windows still pick them up even when I paste the text into Notepad and then re-copy it?

Ben
  • 11

1 Answer

As shown by bash, \342\200\216 is a sequence of three C-style octal escapes, one per byte; the same bytes can be written in hexadecimal as \xE2\x80\x8E.

The bytes E2 80 8E (hex) are the UTF-8 encoding of the Unicode code point U+200E, an invisible character called the left-to-right mark (LRM).
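To check the byte-to-code-point mapping described above, here is a minimal Python sketch. The variable names are illustrative, and the byte string is typed in from the escapes bash displayed, not captured from the real clipboard.

```python
import unicodedata

# The pasted text as raw bytes, written with the octal escapes bash showed.
raw = b"\342\200\216BRL2012T2R2M\342\200\216"

# The same bytes written with hexadecimal escapes.
assert raw == b"\xE2\x80\x8EBRL2012T2R2M\xE2\x80\x8E"

# Decoding as UTF-8 turns each 3-byte sequence into one code point.
text = raw.decode("utf-8")
first = text[0]

print(f"U+{ord(first):04X}", unicodedata.name(first))  # U+200E LEFT-TO-RIGHT MARK
print(len(text))  # 14: the 12 visible characters plus one mark at each end
```

The length check is a quick way to spot the problem: the string looks 12 characters long but actually contains 14.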

It indicates that the following text is read left-to-right even if the surrounding text is normally right-to-left (as some languages are, such as Arabic). The website author most likely adds these marks to ensure the part's name won't get corrupted when the website's interface is switched to those languages.

See this W3C article for an introduction to inline bidirectional markup.
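Because the mark is invisible, one way to make a pasted part number searchable again is to drop it before searching. The sketch below is only one possible approach: `strip_format_chars` is a hypothetical helper, and it removes every character in the Unicode "Cf" (format) category, which includes U+200E but also characters such as the zero-width joiner and soft hyphen, so it may be too aggressive for text that relies on them.

```python
import unicodedata


def strip_format_chars(s: str) -> str:
    """Return s without characters in Unicode category Cf (format).

    U+200E (left-to-right mark) belongs to this category.
    """
    return "".join(ch for ch in s if unicodedata.category(ch) != "Cf")


# Simulated clipboard contents: the part number wrapped in two marks.
pasted = "\u200eBRL2012T2R2M\u200e"
clean = strip_format_chars(pasted)

print(len(pasted), len(clean))  # 14 12
print(clean == "BRL2012T2R2M")  # True
```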

grawity
  • 501,077