I have a .mp4 file generated with ffmpeg as follows.
ffmpeg -y -i video_extended.mp4 -itsoffset 00:00:04.00 -i output5-1.wav -map 0:0 -map 1:0 -c:v copy -c:a aac -ac 6 -ar 48000 -b:a 128k -async 1 mixed.mp4
Playing mixed.mp4 file with ffplay is fine and there is no impact to the sound quality. Below is the output I get from ffplay when using the command ffplay -i mixed.mp4
> Input #0, mov,mp4,m4a,3gp,3g2,mj2, from
> 'mixed_h264_aac_512k_async_qp0_all_I.mp4': Metadata:
> major_brand : isom
> minor_version : 512
> compatible_brands: isomiso2avc1mp41
> encoder : Lavf58.76.100 Duration: 00:00:16.02, start: 0.000000, bitrate: 49136 kb/s Stream #0:0[0x1](und): Video: h264 (High 4:4:4 Predictive) (avc1 / 0x31637661), yuv422p10le(progressive),
> 1920x1080, 65409 kb/s, 59.94 fps, 59.94 tbr, 11988 tbn (default)
> Metadata:
> handler_name : VideoHandler
> vendor_id : [0][0][0][0] Stream #0:1[0x2](und): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, 5.1, fltp, 71 kb/s (default)
> Metadata:
> handler_name : SoundHandler
> vendor_id : [0][0][0][0] Switch subtitle stream from #-1 to #-1 vq= 1606KB sq= 0B f=0/0
Then, I decode the mixed.mp4 file back to raw PCM using the following command.
ffmpeg -i mixed.mp4 -vn -acodec pcm_s16le -f s16le -ar 48000 -ac 6 raw_audio.pcm
However, this raw_audio.pcm contains a lot of noise and ffplay output shows the following output
[s16le @ 0x7f7490000c80] Estimating duration from bitrate, this may be inaccurate
Input #0, s16le, from 'separated_audio_s16.pcm':
Duration: 00:00:16.02, bitrate: 4607 kb/s
Stream #0:0: Audio: pcm_s16le, 48000 Hz, 6 channels, s16, 4608 kb/s
[pcm_s16le @ 0x7f749002b940] Multiple frames in a packet.
[pcm_s16le @ 0x7f749002b940] Invalid PCM packet, data has size 8 but at least a size of 12 was expected
Last message repeated 32 times
[pcm_s16le @ 0x7f749002b940] Invalid PCM packet, data has size 8 but at least a size of 12 was expected
Last message repeated 11 times
Switch subtitle stream from #-1 to #-1 vq= 0KB sq= 0B f=0/0
[pcm_s16le @ 0x7f749002b940] Invalid PCM packet, data has size 8 but at least a size of 12 was expected
Last message repeated 11 times
[pcm_s16le @ 0x7f749002b940] Invalid PCM packet, data has size 8 but at least a size of 12 was expected
Last message repeated 11 times
[pcm_s16le @ 0x7f749002b940] Invalid PCM packet, data has size 8 but at least a size of 12 was expected
Can someone please explain the issue here? Note that the ffplay command that works correctly for mixed.mp4 shows fltp as the audio format, whereas when playing the raw_audio.pcm file, it is seen as s16.
Also, I have tried the same process with files containing stereo audio and it works fine. You can reproduce the problem by downloading the .mp4 file in here.
Is this a resampling issue in ffmpeg when processing multi-channel audio, and how can I rectify this?
I’m using ffmpeg and ffplay versions 5.0.1 in a Fedora 36 system.
Thank you.