0

I'm trying to build a script that makes a mirror backup of a free ESXi 6.5 host to another free ESXi 6.5 host. I'm almost there, but this issue is driving me crazy. This is part of the script; I am using Bash for the script:

#!/bin/sh
find /vmfs/volumes/datastore1/ -regex '.*\.\(vmx\|nvram\|vmsd\|vmdk\)$' ! -name *-flat.vmdk | while read line; do
    dir1=$(dirname "${line}"| sed 's/ /\\ /g')
    dir2=$(dirname "${line}"| sed 's/ /\\\\ /g')
    ssh -n root@XX.XX.XX.XX "mkdir -p $dir1"
    cmd=$(echo $line "XX.XX.XX.XX:\""$dir2"/\"")
    echo $cmd
    scp -pr $cmd
done

The output is:

  • For every VM with no spaces in its name: success.
  • For every VM with spaces in its name: (last word in VM name): No such file or directory

I tried everything to make scp take the full path, and nothing works: single quotes, double quotes, escaping the space with one, two, or three backslashes. I passed the arguments directly to scp, and I also put all the scp arguments in a variable and passed that instead.

When run outside the script, the command works flawlessly. When run inside the script, it fails and takes only the part of the name after the last space.

Giacomo1968
Costin

3 Answers

1

Your code is flawed in many aspects.

-name *-flat.vmdk is prone to globbing; what it expands to depends on files in the current working directory. * should be quoted (e.g. -name '*-flat.vmdk').
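A minimal sketch of the globbing problem, runnable locally. The scratch directory and the file name disk-flat.vmdk are made up for the demonstration; `set --` stands in for any command (such as find) that receives the pattern as an argument.

```shell
#!/bin/sh
# Hypothetical scratch directory, used only for this demonstration.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch 'disk-flat.vmdk'

# Unquoted: the shell expands the pattern before the command sees it.
set -- *-flat.vmdk
unquoted=$1

# Quoted: the pattern reaches the command literally.
set -- '*-flat.vmdk'
quoted=$1

printf 'unquoted: %s\nquoted:   %s\n' "$unquoted" "$quoted"

cd / && rm -rf "$demo_dir"
```

With one matching file in the working directory, find would be asked to exclude only that exact name; with several, it would receive extra arguments it does not expect.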

This is not the only place where your code lacks quotes. echo $line is flawed for the same reason (and unquoted variables are a problem in general).

read line should be at least IFS= read -r line. It would still fail if any path (returned by find) contained the newline character (which is a valid character in file names). For this reason find … -exec … \; is better. You can go like this:

find … -exec sh -c '…' sh {} \;

which introduces another level of quoting; or like this:

find … -exec helper_script {} \;

which makes quoting in the helper_script easier. The latter approach is advocated by another answer here, although that answer doesn't fix the other issues.
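The difference between read line and IFS= read -r line mentioned above can be checked locally. This is a small sketch; the sample path with leading spaces and a backslash is invented for the purpose.

```shell
#!/bin/sh
# A hypothetical path with leading spaces and a backslash in it.
input='  dir\name/file.vmx'

# Plain read: strips leading whitespace and treats the backslash as an escape.
plain=$(printf '%s\n' "$input" | { read line; printf '%s' "$line"; })

# IFS= read -r: keeps the line exactly as it was written.
exact=$(printf '%s\n' "$input" | { IFS= read -r line; printf '%s' "$line"; })

printf 'plain: [%s]\nexact: [%s]\n' "$plain" "$exact"
```

Neither form survives a newline inside a file name, which is why -exec remains the safer route.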

Your variables dir1 and dir2 seem to inject some cumbersome escaping to deal with spaces. You should not rely on escaping like this. Even if you managed to make it work with spaces, there are other characters you would need to escape in general. The right way is to quote properly.

There are at least three levels of quoting:

  1. in the original shell where find is invoked;
  2. in a shell spawned by -exec sh or in a shell interpreting the helper_script;
  3. in a shell spawned on the remote side by ssh … "whatever command" (similarly for paths processed by scp).

Introducing a helper_script makes the first level not interfere with the rest. The main command would be:

find /vmfs/volumes/datastore1/ -regex '.*\.\(vmx\|nvram\|vmsd\|vmdk\)$' ! -name '*-flat.vmdk' -exec /path/to/helper_script {} \;

And the helper_script:

#!/bin/sh
# no need for bash

addrs=XX.XX.XX.XX

pth="$1"
drctry="${pth%/*}"
# no need for dirname (separate executable)

ssh "root@$addrs" "mkdir -p '$drctry'"
scp -pr "$pth" "$addrs:'$drctry/'"

Now the important thing is that ssh gets mkdir -p 'whatever/the var{a,b}e/expand$t*' as a single string. This string is passed to the remote shell and interpreted there. Without the inner single quotes it could be interpreted in a way you don't want; my example exaggerates this. You could try to escape every troublesome character instead, but that would be hard, so quote.
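The effect of the second parsing pass can be reproduced without a remote host. In this sketch, sh -c is used as a local stand-in for the shell that ssh starts on the other side (an assumption made so the example runs anywhere); the path is a made-up VM directory with a space in its name.

```shell
#!/bin/sh
# Stand-in for the remote side: `sh -c STRING` parses STRING again,
# much like the shell that ssh starts on the remote host (an assumption
# made so the example runs locally without ssh).
dir='/vmfs/volumes/datastore1/My VM'

# Without inner quotes the second shell splits the path at the space.
bare=$(sh -c "printf '[%s]' $dir")

# With inner single quotes the path stays one argument.
quoted=$(sh -c "printf '[%s]' '$dir'")

printf 'bare:   %s\nquoted: %s\n' "$bare" "$quoted"
```

The split in the first case matches the symptom in the question, where only the last word of the VM name is reported as missing.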

But if the variable contains any single-quote then some substring can be unquoted on the remote side. This opens a code injection vulnerability. E.g. this path:

…/foo/'$(nasty command)'bar/baz/…

will be very dangerous when embedded in single-quotes and interpreted. You should sanitize $drctry beforehand:

drctry="$(printf '%s' "${pth%/*}" | sed "s/'/'\"'\"'/g")"

The example dangerous path will now look like this:

…/foo/'"'"'$(nasty command)'"'"'bar/baz/…

This is somewhat similar to your usage of sed, but since the single-quote character is now the only troublesome character, it should be better.
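The sanitizing step can be tested as a round trip. In this sketch, quote_for_remote is a hypothetical helper name (not part of the scripts above) that applies the same sed substitution and adds the surrounding single quotes; sh -c again stands in for the remote shell.

```shell
#!/bin/sh
# quote_for_remote is a hypothetical helper name: it wraps its argument in
# single quotes after replacing each embedded ' with '"'"'.
quote_for_remote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\"'\"'/g")"
}

# A hostile-looking path; the command substitution must never run.
path="/tmp/foo/'\$(echo INJECTED)'bar"

quoted=$(quote_for_remote "$path")

# Local stand-in for the remote shell: parse the quoted string once more.
roundtrip=$(sh -c "printf '%s' $quoted")

printf 'quoted:    %s\nroundtrip: %s\n' "$quoted" "$roundtrip"
```

If the round trip returns the original path unchanged, the embedded $(…) was treated as text and not executed.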

scp needs similar quoting in the remote path for basically the same reason. Again, proper escaping with backslashes is more hassle (if possible at all).


A slight improvement is to allow the helper script to process more than one object. This will run fewer shell processes:

find /vmfs/volumes/datastore1/ -regex '.*\.\(vmx\|nvram\|vmsd\|vmdk\)$' ! -name '*-flat.vmdk' -exec /path/to/helper_script_2 {} +

And the helper_script_2:

#!/bin/sh

addrs=XX.XX.XX.XX

for pth; do
   drctry="$(printf '%s' "${pth%/*}" | sed "s/'/'\"'\"'/g")"
   ssh "root@$addrs" "mkdir -p '$drctry'"
   scp -pr "$pth" "$addrs:'$drctry/'"
done

It's possible to build a standalone command (not referring to any helper script) with -exec sh -c '…' (or -exec sh -c "…"). Because of the outermost quotes, this would turn into a quoting and/or escaping frenzy. The following trick with command substitution and a here document is useful to avoid this:

find /vmfs/volumes/datastore1/ \
   -type f \
   -regex '.*\.\(vmx\|nvram\|vmsd\|vmdk\)$' \
 ! -name '*-flat.vmdk' \
   -exec sh -c "$(cat << 'EOF'

addrs=XX.XX.XX.XX

for pth; do
   drctry="$(printf '%s' "${pth%/*}" | sed "s/'/'\"'\"'/g")"
   ssh "root@$addrs" "mkdir -p '$drctry'" \
   && scp -pr "$pth" "$addrs:'$drctry/'"
done

EOF
   )" sh {} +

To fully understand this (and some fragments in previous snippets) in the context of variable expansion you need to know about quotes within quotes and why EOF is quoted (the linked answer cites man bash but this is more general POSIX behavior). Also note I added -type f to rule out possible directories matching the regex; and I wrote ssh … && scp …, so if the former fails (which includes when mkdir -p fails), the latter will not run.
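Why the quoted EOF matters can be seen in isolation. This is a minimal sketch with an invented variable name; it compares the same here document body with an unquoted and a quoted delimiter.

```shell
#!/bin/sh
name=outer

# Unquoted delimiter: the current shell expands $name inside the body.
expanded=$(cat << EOF
value is $name
EOF
)

# Quoted delimiter: the body is taken literally, $name is left alone.
literal=$(cat << 'EOF'
value is $name
EOF
)

printf 'expanded: %s\nliteral:  %s\n' "$expanded" "$literal"
```

In the find command above, the quoted delimiter is what keeps $pth, $addrs and $drctry intact until the inner sh runs them.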

0

Move the stuff on the right of the pipe (|) to a shell script, then do something like

find /vmfs/volumes/datastore1/ -regex '.*\.\(vmx\|nvram\|vmsd\|vmdk\)$' ! -name '*-flat.vmdk' -exec /path/to/shell/script {} \;

find replaces {} with each path it finds and passes that path to your script as a single argument, with no shell word splitting in between, so nothing needs escaping at that step. Simply access it as "$1" (in double quotes) in your script.
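A quick local check of how a path with spaces arrives at the script. In this sketch the scratch tree and file names are invented, and an inline sh -c script stands in for the shell script so the example is self-contained.

```shell
#!/bin/sh
# Hypothetical scratch tree with a space in a directory name.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/My VM"
touch "$demo_dir/My VM/My VM.vmx"

# The inline sh -c script stands in for the helper script; it reports how
# many arguments it received and what the first one is.
report=$(find "$demo_dir" -name '*.vmx' -exec sh -c 'printf "%s:%s" "$#" "$1"' sh {} \;)

printf '%s\n' "$report"
rm -rf "$demo_dir"
```

The argument count stays at one even though the path contains two spaces; any splitting that happens later comes from unquoted expansions inside the script itself.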

ivanivan
0

Witness the magic of arrays (a Bash feature, not available in plain POSIX sh):

$ line="meh bleh"
$ dir="hello\ world"
$ cmd=$(echo "$line" "$dir")
$ for i in $cmd; do echo "$i"; done
meh
bleh
hello\
world
$ for i in "$cmd"; do echo "$i"; done
meh bleh hello\ world
$ cmd=("$line" "$dir")
$ for i in "${cmd[@]}"; do echo "$i"; done
meh bleh
hello\ world
$

The problem with putting everything in a simple variable is that no one can tell what each argument is anymore.
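Where arrays are not available (for example under a plain POSIX sh, which the #!/bin/sh line in the question suggests), the positional parameters can play a similar role. This is a sketch of that alternative, not something the answer itself proposes; the variable names follow the example above.

```shell
#!/bin/sh
line='meh bleh'
dir='hello world'

# Positional parameters act as the one array every POSIX shell has.
set -- "$line" "$dir"

count=$#
second=$2
for arg in "$@"; do
    printf '[%s]\n' "$arg"
done
```

Each element keeps its spaces because "$@" expands to one word per parameter, so a call such as scp -pr "$@" would receive exactly two arguments here.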

Tom Yan