
I am currently streaming a directory over SSH after compressing it through tar:

tar cz /path/to/foo | pv | ssh HOSTNAME 'tar xmz && some-cool-command'

The issue is that pv doesn't know the total size of the stream, so it cannot show a proper progress bar. I could set it to the size of the /path/to/foo but that won't be correct as the stream is compressed.

Is there a way to work around this and get pv to show a proper progress bar?

2 Answers


Short answer: no. You cannot know how much tar (actually gzip) will compress your files until the compression has run.

You could compress the files to local disk first and then show a progress bar for the transfer only:

 tar cz /path/to/foo > /tmp/ar && pv /tmp/ar | ssh HOSTNAME 'tar xmz && some-cool-command'

but I don't think that is what you want to achieve. Alternatively, you could settle for an approximate size based on the original files, as you proposed.

Jakuje

I could set it to the size of the /path/to/foo but that won't be correct as the stream is compressed.

Then let pv see the not-yet-compressed stream.

tar cz … pipes to gzip "internally". You can build your pipeline with tar and explicit gzip instead:

tar c … | gzip -c | ssh …

Note there is no z in the options of tar now. Put pv between tar and gzip and it will see the not-yet-compressed stream:

tar c … | pv | gzip -c | ssh …

The total size of the stream seen by this pv can be estimated. I would estimate it using:

  • du --summarize --bytes /path/to/foo | cut -f 1
  • or du --summarize --block-size=1 /path/to/foo | cut -f 1 in case of tar --sparse.
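Putting the estimate and the pipeline together might look like the sketch below. The function names estimate_size and send_dir are hypothetical, the remote command is reduced to a bare tar xmz, and tar cf - is used only to make writing to stdout explicit. The du figure is only an estimate: tar adds headers and pads file data to 512-byte blocks, so the bar may reach 100% slightly before the stream ends.

```shell
#!/bin/sh
# estimate_size DIR: approximate size in bytes of the uncompressed
# tar stream, using the GNU du options from the list above.
estimate_size() {
    du --summarize --bytes "$1" | cut -f 1
}

# send_dir DIR HOST: hypothetical helper that streams DIR to HOST.
# pv sits before gzip, so it measures uncompressed bytes and the
# size passed with -s is comparable to what it counts.
send_dir() {
    src=$1
    host=$2
    size=$(estimate_size "$src")
    tar cf - "$src" | pv -s "$size" | gzip -c | ssh "$host" 'tar xmz'
}
```

It would be called as send_dir /path/to/foo HOSTNAME.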

Personally I like to have pv -cN … before and after gzip, so I can roughly calculate compression ratio on the fly, usually just out of curiosity or for fun.
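A minimal sketch of that two-meter setup follows. The names meter, pack, raw and gzipped are made up for illustration, and the fallback to cat when pv is missing is an assumption added so the sketch runs anywhere. With -c and -N, each pv prints its own labelled line, so the two byte counts can be compared while the transfer runs.

```shell
#!/bin/sh
# meter NAME: copy stdin to stdout while showing a pv meter labelled
# NAME. Falls back to plain cat if pv is not installed.
meter() {
    if command -v pv >/dev/null 2>&1; then
        pv -cN "$1"
    else
        cat
    fi
}

# pack DIR: write a gzip-compressed tar of DIR to stdout, metering
# the stream once before and once after compression.
pack() {
    tar cf - "$1" | meter raw | gzip -c | meter gzipped
}
```

Usage would be along the lines of pack /path/to/foo | ssh HOSTNAME 'tar xmz && some-cool-command'.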