I have a video call with my guests. Here is what the feeds look like:

  • Feed 1 (recorded locally by guest, records ONLY his/her audio/video)

  • Feed 2 (recorded locally by me, records ONLY my audio/video)

I can determine the point at which both recordings started (e.g. 10 seconds in).

I want to create a combined feed where:

  • the guest is shown/heard only when the guest speaks

  • I am shown/heard when I speak

I.e. the script should decide, based on the audio waveforms, which video to show.

It should also match the audio volume so that both tracks sound equally loud to the listener.

Can ffmpeg do that?

1 Answer

Use xstack + join to place both videos side by side and give each speaker one audio channel:

ffmpeg -y -hide_banner \
-ss 10 -i test01.mkv \
-ss 10 -i test02.mkv \
-filter_complex '
[0:a:0]aresample=async=1:first_pts=0,pan=1c|c0=c0+c1,dynaudnorm=f=200:g=7[0a1];
[1:a:0]aresample=async=1:first_pts=0,pan=1c|c0=c0+c1,dynaudnorm=f=200:g=7[1a1];
[0a1][1a1]join=inputs=2[a0];
[0:v:0][1:v:0]xstack=inputs=2:layout=0_0|1024_0[v0]
' -map '[v0]' -map '[a0]' -c:v:0 h264 -c:a:0 aac output.mkv

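xstack=layout= - position of each input video in the output frame, as x_y pairs separated by |

0_0 - coordinates of the first video (top-left corner)

1024_0 - coordinates of the second video (1024 pixels to the right; for a gapless side-by-side layout this should equal the width of the first video)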
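
If the videos are not 1024 pixels wide, the layout string has to change with them. A minimal sketch of building it, assuming all videos have the same width and sit in a single row (xstack_layout is a hypothetical helper name, not part of ffmpeg):

```shell
# Hypothetical helper: build an xstack layout string for COUNT videos
# of equal WIDTH placed side by side in one row.
xstack_layout() {
  width=$1
  count=$2
  layout=""
  i=0
  while [ "$i" -lt "$count" ]; do
    # Each video sits at x = i * width, y = 0.
    cell="$((i * width))_0"
    if [ -z "$layout" ]; then
      layout="$cell"
    else
      layout="$layout|$cell"
    fi
    i=$((i + 1))
  done
  printf '%s\n' "$layout"
}

xstack_layout 1024 2   # prints 0_0|1024_0
```

The result can be pasted after layout= in the filter. xstack also accepts w0 (the width of the first input) inside the layout, so 0_0|w0_0 should give the same side-by-side placement without hard-coding the width.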

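pan=1c|c0=c0+c1 - downmixes each stereo track to mono before joining

dynaudnorm - normalizes the volume of each track, so both speakers end up at a similar loudness

join - combines the two mono tracks into one stereo track: the first input (test01.mkv) goes to the left channel, the second (test02.mkv) to the right channel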
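
This command shows both feeds at once; it does not switch between speakers. For the switching part, one possible starting point is ffmpeg's silencedetect filter, which logs silence_start and silence_end timestamps for a track. The sketch below is only an assumption about how that log could be turned into "speaking" intervals: speech_intervals is a hypothetical helper, the sample log lines are made up, and the noise/duration thresholds in the comment would need tuning.

```shell
# To produce a real log (not run here):
#   ffmpeg -nostats -hide_banner -i test01.mkv \
#     -af silencedetect=noise=-30dB:d=0.5 -f null - 2> guest.log

# Hypothetical helper: read silencedetect log lines on stdin and print
# "start end" for each stretch of speech between two silences.
# Speech after the last silence_end is not printed, because the log
# does not say where the track ends.
speech_intervals() {
  awk -v prev=0 '
    {
      for (i = 1; i < NF; i++) {
        if ($i == "silence_start:") {
          start = $(i + 1)
          if (start + 0 > prev + 0) print prev, start
        } else if ($i == "silence_end:") {
          prev = $(i + 1)
        }
      }
    }
  '
}

# Made-up sample in the format silencedetect logs:
printf '%s\n' \
  '[silencedetect @ 0x1] silence_start: 0' \
  '[silencedetect @ 0x1] silence_end: 2.5 | silence_duration: 2.5' \
  '[silencedetect @ 0x1] silence_start: 6' \
  '[silencedetect @ 0x1] silence_end: 9 | silence_duration: 3' \
  | speech_intervals   # prints 2.5 6
```

Running this on both tracks gives one interval list per speaker. Deciding which video to show at each moment, and what to do when both speak at once, is left open here.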