How to encode VP9 video using ffmpeg and vp9_qsv?

Question

I am trying to use ffmpeg to encode VP9 video with vp9_qsv (Intel Quick Sync Video hardware support) on Windows 10. I have previously encoded VP9 video successfully with libvpx-vp9, but that runs on the CPU and is rather slow. Now that I am trying to switch to vp9_qsv, I get error messages that are not very helpful, and the documentation is lacking as well.

ffmpeg.exe -i input.mp4 -c:v vp9_qsv -c:a copy output.webm

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'input.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf58.20.100
  Duration: 00:00:46.74, start: 0.040000, bitrate: 15752 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 1920x1080, 15561 kb/s, 25.02 fps, 25 tbr, 12800 tbn, 50 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 193 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> vp9 (vp9_qsv))
  Stream #0:1 -> #0:1 (copy)
Press [q] to stop, [?] for help
[vp9_qsv @ 000001b156147b40] Selected ratecontrol mode is unsupported
[vp9_qsv @ 000001b156147b40] Low power mode is unsupported
[vp9_qsv @ 000001b156147b40] Current frame rate is unsupported
[vp9_qsv @ 000001b156147b40] Current picture structure is unsupported
[vp9_qsv @ 000001b156147b40] Current resolution is unsupported
[vp9_qsv @ 000001b156147b40] Current pixel format is unsupported
[vp9_qsv @ 000001b156147b40] some encoding parameters are not supported by the QSV runtime. Please double check the input parameters.
Error initializing output stream 0:0 -- Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
Conversion failed!

I tried specifying some of the things it is asking for, but this produces exactly the same errors.

ffmpeg.exe -i input.mp4 -c:v vp9_qsv -preset veryslow -low_power 0 -b:v 800K -c:a copy output.webm

EDIT: Based on Mokubai's suggestion I now run the following command, which fails with the error shown after it:

ffmpeg.exe -init_hw_device qsv=hw -filter_hw_device hw -f rawvideo -pix_fmt yuv420p -s:v 1920x1080 -i input.mp4 -vf hwupload=extra_hw_frames=64,format=qsv -c:v vp9_qsv -c:a copy output.webm

[AVHWDeviceContext @ 000001d30eccb640] Error setting child device handle: -6

score 3 · Accepted Answer · answered Dec 10 '20 at 21:19

Firstly, it may be worth bearing in mind that Intel have not officially enabled VP9 support in their Quick Sync encoder before Tiger Lake CPUs. There are some hints that it can be made to work on Linux on Kaby Lake (7th generation) processors and above with some mangling, but one user with contributions to the Intel media driver states:

sorry for late response, we will not enable VP9 encode on Gen9 platforms , if someone still want it, he could try PR #717 , it add VP9 encode support. but Gen9 VME design is not optimized for VP9. so we have concerns on it.

Gen9 covers a large number of platforms that predate Tiger Lake, which itself uses Gen12 (Xe) graphics.

That said the QSV examples section at the FFMPEG wiki: Hardware / QuickSync seems to suggest that you can add -init_hw_device qsv=hw -filter_hw_device hw -f rawvideo -pix_fmt yuv420p -s:v 1920x1080 before your -i input.mp4 to initialise the hardware (assuming a YUV420 source).

Then use -vf hwupload=extra_hw_frames=64,format=qsv before your -c:v to specify the format the encoder wants, though I do not have functional hardware with which to test.

At the very least the FFMPEG wiki does hold some promise but even on a Tiger Lake CPU I cannot get it to transcode any video. No matter what I try I get the following lines

[swscaler @ 000001c902edec40] deprecated pixel format used, make sure you did set range correctly
[vp9_qsv @ 000001c97dc7ef40] Selected ratecontrol mode is unsupported
[vp9_qsv @ 000001c97dc7ef40] some encoding parameters are not supported by the QSV runtime. Please double check the input parameters.
Error initializing output stream 0:0 -- Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
Conversion failed!

I can use the h264_qsv encoder happily, but switching to vp9_qsv throws the error, and I cannot find any hints on how to specify the necessary rate control mode.

My example command line that works

.\ffmpeg.exe -init_hw_device qsv=hw -i 'MVI_7664.mov' -vf hwupload=extra_hw_frames=64,format=qsv -c:v h264_qsv -strict experimental -preset veryslow -c:a copy output.mkv -y

but it is using h264 not VP9. Perhaps I need a newer FFMPEG.
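One way to check whether a given ffmpeg build exposes the encoder at all, before worrying about parameters, is to search the output of ffmpeg -encoders. What follows is a minimal sketch for a POSIX shell (Git Bash or WSL on Windows). The helper name has_encoder is made up for illustration, and finding the encoder in the list only shows that it was compiled in, not that the driver and hardware will accept it:

```shell
# Hypothetical helper: succeeds if encoder "$1" is listed by `ffmpeg -encoders`.
has_encoder() {
  ffmpeg -hide_banner -encoders 2>/dev/null | grep -qw -- "$1"
}

# Compare the encoder that works (h264_qsv) with the one that fails (vp9_qsv).
for enc in h264_qsv vp9_qsv; do
  if has_encoder "$enc"; then
    echo "$enc: compiled into this build"
  else
    echo "$enc: not listed (or ffmpeg is not on PATH)"
  fi
done
```

If vp9_qsv is missing from the list, a newer build is the first thing to try; if it is listed, the failure is more likely down to the driver, the hardware generation or the encoder options.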

Dennis Mungai · Answer 2 · 2021-05-13T09:24:04.130

Using Intel's QuickSync (on supported platforms):

This answer extends the answer above, with a few changes:

For the vp9_qsv encoder wrapper, note that low power mode is mandatory (for now). If you do not set it (via the private codec option -low_power 1), encoder initialisation fails and the MFX runtime prints a log similar to:

[vp9_qsv @ 000001b156147b40] Selected ratecontrol mode is unsupported
[vp9_qsv @ 000001b156147b40] Low power mode is unsupported
[vp9_qsv @ 000001b156147b40] Current frame rate is unsupported
[vp9_qsv @ 000001b156147b40] Current picture structure is unsupported
[vp9_qsv @ 000001b156147b40] Current resolution is unsupported
[vp9_qsv @ 000001b156147b40] Current pixel format is unsupported
[vp9_qsv @ 000001b156147b40] some encoding parameters are not supported by the QSV runtime. Please double check the input parameters.
Error initializing output stream 0:0 -- Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
Conversion failed!

This is because the QSV MFX runtime must negotiate all requirements with the device driver (iHD, on Linux) before an MFX session can register successfully. To my knowledge, this wrapper will only work on Linux at the moment. This may change in the near future.
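Before moving on to the larger 1:N examples, the smallest form of the low power requirement might look like the sketch below. It is untested and assumes a Linux machine with supported hardware and a working iHD driver. The helper name encode_vp9_qsv, the FFMPEG dry-run variable, the 2M default bitrate and the choice of libopus for audio (WebM does not carry AAC) are all illustrative assumptions, not something taken from the FFmpeg documentation:

```shell
# Set FFMPEG=echo to print the arguments instead of running the encode.
FFMPEG="${FFMPEG:-ffmpeg}"

# Hypothetical helper: transcode one file to VP9 through vp9_qsv.
# $1 = input file, $2 = output file, $3 = video bitrate (defaults to 2M, an arbitrary value)
encode_vp9_qsv() {
  "$FFMPEG" -hide_banner \
    -init_hw_device qsv=hw -filter_hw_device hw \
    -i "$1" \
    -vf 'hwupload=extra_hw_frames=64,format=qsv' \
    -c:v vp9_qsv -low_power 1 -b:v "${3:-2M}" \
    -c:a libopus \
    "$2"
}
```

Calling it as FFMPEG=echo encode_vp9_qsv input.mp4 output.webm prints the argument list without touching the hardware, which is a cheap way to review the options before a real run.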

All examples below show a case of 1:N transcoding (i.e. one input used to provide multiple outputs). A complex filter chain is also in use, and the tee muxer slaves call up the underlying segment muxers.

On Intel Ice Lake and above, you can use the vp9_qsv encoder wrapper with the following known limitations (for now), as tested on Linux:

(a). You must enable low_power mode because only the VDENC encode path is exposed by the iHD driver for now.

(b). Coding option1 and extra_data are not supported by MSDK.

(c). The IVF header will be inserted in MSDK by default, but it is not needed for FFmpeg, and remains disabled by default.

See the examples below, taking a single input and producing multiple outputs via the tee muxer slaves calling up segment muxers:

If you need to deinterlace, call up the vpp_qsv filter as shown:

    ffmpeg -nostdin -y -fflags +genpts \
    -init_hw_device vaapi=va:/dev/dri/renderD128,driver=iHD \
    -filter_hw_device va -hwaccel vaapi -hwaccel_output_format vaapi \
    -threads 4 -vsync 1 -async 1 \
    -i 'http://server:port' \
    -filter_complex "[0:v]hwmap=derive_device=qsv,format=qsv,vpp_qsv=deinterlace=2:async_depth=4,split=3[n0][n1][n2]; \
    [n0]vpp_qsv=w=1152:h=648:async_depth=4[v0]; \
    [n1]vpp_qsv=w=848:h=480:async_depth=4[v1]; \
    [n2]vpp_qsv=w=640:h=360:async_depth=4[v2]" \
    -b:v:0 2250k -maxrate:v:0 2250k -bufsize:v:0 360k -c:v:0 vp9_qsv -g:v:0 50 -r:v:0 25 -low_power:v:0 1 \
    -b:v:1 1750k -maxrate:v:1 1750k -bufsize:v:1 280k -c:v:1 vp9_qsv -g:v:1 50 -r:v:1 25 -low_power:v:1 1 \
    -b:v:2 1000k -maxrate:v:2 1000k -bufsize:v:2 160k -c:v:2 vp9_qsv -g:v:2 50 -r:v:2 25 -low_power:v:2 1 \
    -c:a aac -b:a 128k -ar 48000 -ac 2 \
    -flags -global_header -f tee -use_fifo 1 \
    -map "[v0]" -map "[v1]" -map "[v2]" -map 0:a \
    "[select=\'v:0,a\':f=segment:segment_time=5:segment_format_options=movflags=+faststart]$output_path0/output%03d.mp4| \
     [select=\'v:1,a\':f=segment:segment_time=5:segment_format_options=movflags=+faststart]$output_path1/output%03d.mp4| \
     [select=\'v:2,a\':f=segment:segment_time=5:segment_format_options=movflags=+faststart]$output_path2/output%03d.mp4"

Without deinterlacing:

    ffmpeg -nostdin -y -fflags +genpts \
    -init_hw_device vaapi=va:/dev/dri/renderD128,driver=iHD \
    -filter_hw_device va -hwaccel vaapi -hwaccel_output_format vaapi \
    -threads 4 -vsync 1 -async 1 \
    -i 'http://server:port' \
    -filter_complex "[0:v]hwmap=derive_device=qsv,format=qsv,split=3[n0][n1][n2]; \
    [n0]vpp_qsv=w=1152:h=648:async_depth=4[v0]; \
    [n1]vpp_qsv=w=848:h=480:async_depth=4[v1]; \
    [n2]vpp_qsv=w=640:h=360:async_depth=4[v2]" \
    -b:v:0 2250k -maxrate:v:0 2250k -bufsize:v:0 2250k -c:v:0 vp9_qsv -g:v:0 50 -r:v:0 25 -low_power:v:0 1 \
    -b:v:1 1750k -maxrate:v:1 1750k -bufsize:v:1 1750k -c:v:1 vp9_qsv -g:v:1 50 -r:v:1 25 -low_power:v:1 1 \
    -b:v:2 1000k -maxrate:v:2 1000k -bufsize:v:2 1000k -c:v:2 vp9_qsv -g:v:2 50 -r:v:2 25 -low_power:v:2 1 \
    -c:a aac -b:a 128k -ar 48000 -ac 2 \
    -flags -global_header -f tee -use_fifo 1 \
    -map "[v0]" -map "[v1]" -map "[v2]" -map 0:a \
    "[select=\'v:0,a\':f=segment:segment_time=5:segment_format_options=movflags=+faststart]$output_path0/output%03d.mp4| \
     [select=\'v:1,a\':f=segment:segment_time=5:segment_format_options=movflags=+faststart]$output_path1/output%03d.mp4| \
     [select=\'v:2,a\':f=segment:segment_time=5:segment_format_options=movflags=+faststart]$output_path2/output%03d.mp4"

Note that we use the vpp_qsv filter with the async_depth option set to 4. This massively improves transcode performance over using scale_qsv and deinterlace_qsv. See this commit on FFmpeg's git.

Notes:

This will only work on Linux, running the current media-driver package for VAAPI H/W acceleration, which ffmpeg picks up via -init_hw_device vaapi=va:/dev/dri/renderD128,driver=iHD -filter_hw_device va -hwaccel vaapi -hwaccel_output_format vaapi, bound to the DRI node /dev/dri/renderD128. This node is the default on single-GPU systems, but it will differ if more than one GPU is present. We use VAAPI for H/W acceleration as it is more resilient for decode acceleration. QuickSync decode is surprisingly fragile and will result in MFX errors on multiple input files.
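On a machine with more than one GPU it helps to see which render nodes exist before hard-coding renderD128. A small sketch, with list_render_nodes being a made-up helper name and the optional directory argument added only so the function can be pointed somewhere other than /dev/dri:

```shell
# Hypothetical helper: print every DRM render node under a directory.
# $1 = directory to scan (defaults to /dev/dri); prints nothing if none exist.
list_render_nodes() {
  for node in "${1:-/dev/dri}"/renderD*; do
    # The glob stays unexpanded when nothing matches, so test for existence.
    [ -e "$node" ] && echo "$node"
  done
  return 0
}

list_render_nodes
```

Each path it prints is a candidate for the device part of -init_hw_device vaapi=va:<node>,driver=iHD. If libva-utils is installed, running vainfo --display drm --device <node> against a candidate should show which driver answers on that node.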

We also derive a QSV context via the hwmap filter (hwmap=derive_device=qsv), which is chained immediately to the format=qsv filter, specifying that we want QSV H/W frames to be fed to the adjacent vpp_qsv filter in the complex filter chain.
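Stripped down to a single output, the same decode-on-VAAPI, map-to-QSV idea can be sketched as below. This is an untested reduction of the commands above, not a separately verified recipe. The helper name, the FFMPEG dry-run variable, the 1280x720 target size and the 2M bitrate are illustrative assumptions, and audio is dropped with -an to keep the sketch short:

```shell
# Set FFMPEG=echo to print the arguments instead of running the transcode.
FFMPEG="${FFMPEG:-ffmpeg}"

# Hypothetical helper: VAAPI decode -> hwmap to QSV -> vpp_qsv scale -> vp9_qsv encode.
# $1 = input file or URL, $2 = output file
transcode_vaapi_to_vp9_qsv() {
  "$FFMPEG" -hide_banner \
    -init_hw_device vaapi=va:/dev/dri/renderD128,driver=iHD \
    -filter_hw_device va -hwaccel vaapi -hwaccel_output_format vaapi \
    -i "$1" \
    -vf 'hwmap=derive_device=qsv,format=qsv,vpp_qsv=w=1280:h=720:async_depth=4' \
    -c:v vp9_qsv -low_power 1 -b:v 2M \
    -an \
    "$2"
}
```

Because there is only one output, a plain -vf chain is enough here; -filter_complex with split is only needed once the same decoded frames feed several encoders.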

Warnings:

With QuickSync, regardless of input formats and the QSV encoder wrapper in use, expect a slightly higher CPU overhead, especially on smaller processors such as the Intel Atom® x7-E3950 Processor. This overhead is greatly mitigated on more capable desktop-grade processors and the Iris Pro capable CPUs.
For VP9 H/W encoding, I'd still strongly recommend using the vp9_vaapi encoder wrapper instead, with caveats (related to the use of B-frames and rate control modes). VAAPI tends to be more stable overall than QSV.
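For comparison, a basic vp9_vaapi encode with software decode and an upload to the GPU might look like the following sketch. It is untested here and leaves the B-frame and rate control caveats aside. The helper name, the FFMPEG dry-run variable, the render node path and the 2M bitrate are assumptions for illustration:

```shell
# Set FFMPEG=echo to print the arguments instead of running the encode.
FFMPEG="${FFMPEG:-ffmpeg}"

# Hypothetical helper: software decode, upload NV12 frames, encode with vp9_vaapi.
# $1 = input file, $2 = output file
encode_vp9_vaapi() {
  "$FFMPEG" -hide_banner \
    -vaapi_device /dev/dri/renderD128 \
    -i "$1" \
    -vf 'format=nv12,hwupload' \
    -c:v vp9_vaapi -b:v 2M \
    -an \
    "$2"
}
```

The supported rate control modes and other private options can be listed with ffmpeg -h encoder=vp9_vaapi.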

References:

See the encoder options, including rate control methods supported:

ffmpeg -h encoder=vp9_qsv

On the vpp_qsv filter usage, see:

ffmpeg -h filter=vpp_qsv

Warning:

Note that the SDK requires at least 2 threads to prevent deadlock, see this code block.
