
Could someone explain what happens behind the scenes with character escaping in the Linux shell? I tried the following and googled a lot, without managing to understand what is going on (or how):

root@sv01:~# echo -e "\ Hello!"
\ Hello!
root@sv01:~# echo -e "\\ Hello!"
\ Hello!
root@sv01:~# echo -e "\\\ Hello!"
\ Hello!
root@sv01:~# echo -e "\\\\ Hello!"
\ Hello!
root@sv01:~# echo -e "\\\\\ Hello!"
\\ Hello!
root@sv01:~# echo -e "\\\\\\ Hello!"
\\ Hello!
root@sv01:~# echo -e "\\\\\\\ Hello!"
\\ Hello!
root@sv01:~# echo -e "\\\\\\\\ Hello!"
\\ Hello!
root@sv01:~# echo -e "\\\\\\\\\ Hello!"
\\\ Hello!
root@sv01:~# echo -e "\n Hello!"

 Hello!
root@sv01:~# echo -e "\\n Hello!"

 Hello!
root@sv01:~# echo -e "\\\n Hello!"
\n Hello!

I am totally lost here. For example, why do three backslashes give only one backslash? I would expect the first two to be escaped into one, and the third to find nothing to escape and so remain a backslash (as in the first experiment), but what actually happens is that the third one just disappears.
Why am I getting one backslash from four (\\\\ Hello!)? I would expect each pair to give one backslash, so two backslashes in total.

And why do I need three backslashes in the last case to get \n escaped? What is happening in the background of escaping to produce that, and how is it different from the \\n case?

I appreciate any explanation of what is going on in the previous lines.

1 Answer


This is the combined effect of bash and echo -e: each of them processes backslashes in its own way, one after the other. From man 1 bash:

A non-quoted backslash (\) is the escape character. It preserves the literal value of the next character that follows, with the exception of <newline>. […]

Enclosing characters in double quotes preserves the literal value of all characters within the quotes, with the exception of $, `, \, […] The backslash retains its special meaning only when followed by one of the following characters: $, `, ", \, or <newline>.

The point is: inside double quotes, a backslash is not always special.
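To look at this first stage in isolation, here is a minimal sketch that uses printf '%s\n' in place of echo -e. That format prints its argument verbatim, so the output is exactly what the shell hands to the command after double-quote processing. The variable names are only illustrative.

```shell
# printf '%s\n' performs no escape expansion of its own, so it shows
# what the shell passes on after handling the double quotes.
one=$(printf '%s\n' "\ Hello!")      # \ before a space is not special
two=$(printf '%s\n' "\\ Hello!")     # \\ collapses to a single \
three=$(printf '%s\n' "\\\ Hello!")  # \\ collapses, the third \ is kept

printf '%s\n' "$one" "$two" "$three"
# \ Hello!
# \ Hello!
# \\ Hello!
```

The first two arguments arrive identical, which is why the first two experiments in the question print the same thing.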

There are various implementations of echo; in bash it is a builtin. The important thing here is this behavior:

If -e is in effect, the following sequences are recognized:
\\    backslash
[…]
\n    new line
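The second stage can be observed on its own by single-quoting the argument, so that the shell leaves every backslash untouched. This sketch uses printf '%b' as a stand-in for echo -e (bash documents %b as expanding escapes in the same way as echo -e); the variable names are illustrative.

```shell
# Single quotes: the shell passes the text through unchanged, so only
# the escape-expanding stage is exercised here.
a=$(printf '%b' '\\ Hello!')   # \\ becomes one backslash
b=$(printf '%b' '\ Hello!')    # \ before a space is not an escape, it stays
c=$(printf '%b' 'x\ny')        # \n becomes a newline

printf '%s\n' "$a" "$b" "$c"
# \ Hello!
# \ Hello!
# x
# y
```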

Now we can decode:

  1. echo -e "\ Hello!" – nothing special to bash, nothing special to echo; \ stays.
  2. echo -e "\\ Hello!" – the first \ tells bash to treat the second \ literally; echo gets \ Hello! and acts as above.
  3. echo -e "\\\ Hello!" – the first \ tells bash to treat the second \ literally; the third \ is followed by a space, so it is not special to bash and stays; echo gets \\ Hello! and (because of -e) it recognizes \\ as \.
  4. echo -e "\\\\ Hello!" – the first \ tells bash to treat the second \ literally; the third tells the same about the fourth; echo gets \\ Hello! and (because of -e) it recognizes \\ as \.
  5. echo -e "\\\\\ Hello!" – the first \ tells bash to treat the second \ literally; the third tells the same about the fourth; the last one is not special; echo gets \\\ Hello! and (because of -e) it recognizes the initial \\ as \, the last \ stays intact.

And so on. As you can see, up to four consecutive backslashes yield a single backslash in the result. That's why you need (at least) nine of them to get three: 9=4+4+1.
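The counting can be written down as arithmetic. The sketch below assumes the run of backslashes is followed by a character that is special to neither stage (such as the space in the examples): each stage then turns every pair into one backslash and keeps an odd leftover, which is a rounded-up halving. The function name surviving is made up for this illustration.

```shell
# Number of backslashes printed for a run of n typed inside double quotes
# and passed to echo -e, assuming the next character is not special.
surviving() {
    n=$1
    after_shell=$(( (n + 1) / 2 ))            # double quotes: \\ -> \, odd one kept
    after_echo=$(( (after_shell + 1) / 2 ))   # echo -e:       \\ -> \, odd one kept
    printf '%s\n' "$after_echo"
}

for n in 1 2 3 4 5 8 9; do
    printf '%s typed -> %s printed\n' "$n" "$(surviving "$n")"
done
# 1 typed -> 1 printed
# 2 typed -> 1 printed
# 3 typed -> 1 printed
# 4 typed -> 1 printed
# 5 typed -> 2 printed
# 8 typed -> 2 printed
# 9 typed -> 3 printed
```

These numbers match the outputs shown in the question.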

Now with \n:

  1. echo -e "\n Hello!" – there's nothing special to bash, echo gets the same string and (because of -e) it interprets \n as a newline.
  2. echo -e "\\n Hello!" – bash interprets \\ as \; echo gets \n Hello! and the result is the same as above.
  3. echo -e "\\\n Hello!" – bash interprets the initial \\ as \; echo gets \\n Hello! and (because of -e) it interprets \\ as a literal \ which needs to be printed.
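The two stages for the \n cases can be traced separately. In this sketch printf '%s' shows what the shell passes on, and printf '%b' stands in for the escape expansion of echo -e; the variable names are illustrative.

```shell
# Stage 1: what the shell passes after double-quote processing.
arg2=$(printf '%s' "\\n Hello!")    # shell passes \n Hello!
arg3=$(printf '%s' "\\\n Hello!")   # shell passes \\n Hello!

# Stage 2: escape expansion of those arguments.
out2=$(printf '%b' "$arg2")   # \n expands: a newline, then " Hello!"
out3=$(printf '%b' "$arg3")   # \\ expands to \, the n is now ordinary text

printf '[%s]\n' "$arg2" "$arg3" "$out3"
# [\n Hello!]
# [\\n Hello!]
# [\n Hello!]
```

So the third backslash is what protects the backslash in front of n from being consumed as part of a \n escape in the second stage.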

The results would be different with ' instead of " (due to different bash behavior) or without -e (different echo behavior).
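A small comparison of the quoting styles, again as a sketch with printf '%s' showing the raw argument and printf '%b' standing in for echo -e; the variable names are illustrative.

```shell
dq=$(printf '%s' "\\\\ Hello!")   # double quotes: the shell passes \\ Hello!
sq=$(printf '%s' '\\\\ Hello!')   # single quotes: the shell passes \\\\ Hello!

printf '%b\n' "$dq"   # with escape expansion:    \ Hello!
printf '%b\n' "$sq"   # with escape expansion:    \\ Hello!
printf '%s\n' "$dq"   # without escape expansion: \\ Hello!
```

With single quotes only one round of halving takes place, and without escape expansion only the shell's round does.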