I'm trying to do the following:
- cut an input video into segments and convert these segments in the same step
- concatenate the transcoded segments back together in an output file
The steps that I'm going through are:
1) create the segments and transcode them
ffmpeg -i input.mp4 -ss 00 -t 10 -vcodec libx264 -acodec libfdk_aac -f mpegts segment0.ts
ffmpeg -i input.mp4 -ss 10 -t 10 -vcodec libx264 -acodec libfdk_aac -f mpegts segment1.ts
etc.
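The "etc." above could be spelled out as a loop. This is only a minimal sketch: the helper name `segment_commands` and its arguments (input file, total length in seconds, segment length in seconds) are hypothetical, and it merely prints the commands instead of running them, so they can be checked first.

```shell
#!/bin/sh
# Sketch: print one ffmpeg command per fixed-length segment.
# segment_commands is a hypothetical helper; the total duration is
# supplied by hand (an assumption), not read from the input file.
segment_commands() {
    input=$1
    total=$2
    seg_len=$3
    i=0
    start=0
    while [ "$start" -lt "$total" ]; do
        # Same options as the hand-written commands in step 1.
        printf 'ffmpeg -i %s -ss %d -t %d -vcodec libx264 -acodec libfdk_aac -f mpegts segment%d.ts\n' \
            "$input" "$start" "$seg_len" "$i"
        i=$((i + 1))
        start=$((start + seg_len))
    done
}

# Dry run for a 30-second input cut into 10-second segments.
segment_commands input.mp4 30 10
```

Piping the printed lines into `sh` would run them one after another.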
2) concatenate the segments using the Concat Demuxer
printf "file '%s'\n" ./*.ts > mylist.txt
ffmpeg -f concat -i mylist.txt -vcodec copy -bsf:a aac_adtstoasc output.mp4
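One thing to watch with the glob in step 2: once there are more than ten segments, `./*.ts` sorts segment10.ts before segment2.ts, so the list is no longer in playback order. A minimal sketch that writes the entries by index instead, assuming the segments are named segment0.ts, segment1.ts, and so on (the helper name `concat_list` and its count argument are hypothetical):

```shell
#!/bin/sh
# Sketch: emit concat-demuxer list entries in numeric order.
# concat_list is a hypothetical helper; the segment count is passed
# in by hand (an assumption), not detected from the directory.
concat_list() {
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        printf "file 'segment%d.ts'\n" "$i"
        i=$((i + 1))
    done
}

# Print the entries for three segments.
concat_list 3
```

For the real run, the output would be redirected, for example `concat_list 11 > mylist.txt`.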
This works fine (or at least seems to, as I can't hear or see any problems in the output file) with one test video (https://www.youtube.com/watch?v=x76VEPXYaI0).
However, with a second test video (http://trailers.divx.com/divx_prod/profiles/Helicopter_DivXHT_ASP.divx) there are audible glitches at the glue points between segments. For this file, ffmpeg already prints an error message while cutting and transcoding the segments:
[mp3 @ 0x297d220] Header missing
Error while decoding stream #0:1: Invalid data found when processing input
Does anyone know whether these audio glitches at the glue points can be avoided? Is it just this input file and its audio being odd, or is it likely a general problem with my method?
Thanks for your help.
Console output of segment/transcode command:
ffmpeg -i helicopter.divx -ss 00 -t 10 -vcodec libx264 -acodec libfdk_aac -f mpegts segment0.ts
ffmpeg version 2.2.git Copyright (c) 2000-2014 the FFmpeg developers
built on Jun 23 2014 18:58:08 with gcc 4.8 (Ubuntu 4.8.2-19ubuntu1)
configuration: --prefix=/home/tobi/ffmpeg_build --extra-cflags=-I/home/tobi/ffmpeg_build/include --extra-ldflags=-L/home/tobi/ffmpeg_build/lib --bindir=/home/tobi/bin --extra-libs=-ldl --enable-gpl --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-nonfree --enable-x11grab
libavutil 52. 89.100 / 52. 89.100
libavcodec 55. 66.100 / 55. 66.100
libavformat 55. 43.100 / 55. 43.100
libavdevice 55. 13.101 / 55. 13.101
libavfilter 4. 7.100 / 4. 7.100
libswscale 2. 6.100 / 2. 6.100
libswresample 0. 19.100 / 0. 19.100
libpostproc 52. 3.100 / 52. 3.100
Input #0, avi, from 'helicopter.divx':
Duration: 00:01:48.11, start: 0.000000, bitrate: 4192 kb/s
Stream #0:0: Video: mpeg4 (DX50 / 0x30355844), yuv420p, 720x408 [SAR 1:1 DAR 30:17], 3991 kb/s, 23.98 fps, 23.98 tbr, 23.98 tbn, 30k tbc
Metadata:
title : Video
Stream #0:1: Audio: mp3 (U[0][0][0] / 0x0055), 44100 Hz, stereo, s16p, 192 kb/s
Metadata:
title : Audio
[libx264 @ 0x2408220] using SAR=1/1
[libx264 @ 0x2408220] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2
[libx264 @ 0x2408220] profile High, level 3.0
Output #0, mpegts, to 'segment0.ts':
Metadata:
encoder : Lavf55.43.100
Stream #0:0: Video: h264 (libx264), yuv420p, 720x408 [SAR 1:1 DAR 30:17], q=-1--1, 23.98 fps, 90k tbn, 23.98 tbc
Metadata:
title : Video
encoder : Lavc55.66.100 libx264
Stream #0:1: Audio: aac (libfdk_aac), 44100 Hz, stereo, s16, 128 kb/s
Metadata:
title : Audio
encoder : Lavc55.66.100 libfdk_aac
Stream mapping:
Stream #0:0 -> #0:0 (mpeg4 (native) -> h264 (libx264))
Stream #0:1 -> #0:1 (mp3 (native) -> aac (libfdk_aac))
Press [q] to stop, [?] for help
frame= 240 fps= 12 q=28.0 Lsize= 974kB time=00:00:10.00 bitrate= 797.6kbits/s
video:709kB audio:158kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 12.433780%
[libx264 @ 0x2408220] frame I:2 Avg QP:13.73 size: 5558
[libx264 @ 0x2408220] frame P:109 Avg QP:21.15 size: 5161
[libx264 @ 0x2408220] frame B:129 Avg QP:21.30 size: 1181
[libx264 @ 0x2408220] consecutive B-frames: 17.5% 24.2% 25.0% 33.3%
[libx264 @ 0x2408220] mb I I16..4: 56.3% 37.6% 6.0%
[libx264 @ 0x2408220] mb P I16..4: 5.7% 11.7% 0.6% P16..4: 29.5% 10.9% 5.9% 0.0% 0.0% skip:35.7%
[libx264 @ 0x2408220] mb B I16..4: 1.0% 2.2% 0.1% B16..8: 19.3% 3.9% 0.8% direct: 0.8% skip:72.0% L0:39.1% L1:50.6% BI:10.3%
[libx264 @ 0x2408220] 8x8 transform intra:63.0% inter:69.7%
[libx264 @ 0x2408220] coded y,uvDC,uvAC intra: 29.6% 40.0% 6.6% inter: 10.2% 12.2% 1.8%
[libx264 @ 0x2408220] i16 v,h,dc,p: 24% 33% 10% 33%
[libx264 @ 0x2408220] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 25% 30% 32% 2% 3% 2% 3% 2% 2%
[libx264 @ 0x2408220] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 23% 20% 26% 4% 11% 5% 6% 3% 3%
[libx264 @ 0x2408220] i8c dc,h,v,p: 62% 22% 13% 3%
[libx264 @ 0x2408220] Weighted P-Frames: Y:16.5% UV:8.3%
[libx264 @ 0x2408220] ref P L0: 68.7% 13.1% 14.4% 3.7% 0.0%
[libx264 @ 0x2408220] ref B L0: 80.9% 18.2% 0.9%
[libx264 @ 0x2408220] ref B L1: 87.5% 12.5%
[libx264 @ 0x2408220] kb/s:580.21
Console output of the Concat Demuxer command (I left out the step that creates the list in the .txt file):
ffmpeg -f concat -i mylist.txt -vcodec copy -bsf:a aac_adtstoasc output.mp4
ffmpeg version 2.2.git Copyright (c) 2000-2014 the FFmpeg developers
built on Jun 23 2014 18:58:08 with gcc 4.8 (Ubuntu 4.8.2-19ubuntu1)
configuration: --prefix=/home/tobi/ffmpeg_build --extra-cflags=-I/home/tobi/ffmpeg_build/include --extra-ldflags=-L/home/tobi/ffmpeg_build/lib --bindir=/home/tobi/bin --extra-libs=-ldl --enable-gpl --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-nonfree --enable-x11grab
libavutil 52. 89.100 / 52. 89.100
libavcodec 55. 66.100 / 55. 66.100
libavformat 55. 43.100 / 55. 43.100
libavdevice 55. 13.101 / 55. 13.101
libavfilter 4. 7.100 / 4. 7.100
libswscale 2. 6.100 / 2. 6.100
libswresample 0. 19.100 / 0. 19.100
libpostproc 52. 3.100 / 52. 3.100
[concat @ 0x1ff0c20] Estimating duration from bitrate, this may be inaccurate
Input #0, concat, from 'mylist.txt':
Duration: 00:00:00.05, start: 0.000000, bitrate: 6 kb/s
Stream #0:0: Video: h264 (High) ([27][0][0][0] / 0x001B), yuv420p, 720x408 [SAR 1:1 DAR 30:17], 23.98 fps, 23.98 tbr, 90k tbn, 47.95 tbc
Stream #0:1: Audio: aac ([15][0][0][0] / 0x000F), 44100 Hz, stereo, fltp, 124 kb/s
Output #0, mp4, to 'output.mp4':
Metadata:
encoder : Lavf55.43.100
Stream #0:0: Video: h264 ([33][0][0][0] / 0x0021), yuv420p, 720x408 [SAR 1:1 DAR 30:17], q=2-31, 23.98 fps, 90k tbn, 90k tbc
Stream #0:1: Audio: aac (libfdk_aac) ([64][0][0][0] / 0x0040), 44100 Hz, stereo, s16, 128 kb/s
Metadata:
encoder : Lavc55.66.100 libfdk_aac
Stream mapping:
Stream #0:0 -> #0:0 (copy)
Stream #0:1 -> #0:1 (aac (native) -> aac (libfdk_aac))
Press [q] to stop, [?] for help
[libfdk_aac @ 0x21ee7e0] Queue input is backward in time
[mp4 @ 0x21ec7a0] Non-monotonous DTS in output stream 0:1; previous: 442367, current: 441650; changing to 442368. This may result in incorrect timestamps in the output file.
frame= 480 fps=289 q=-1.0 Lsize= 1805kB time=00:00:20.06 bitrate= 736.6kbits/s
video:1470kB audio:316kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.049592%
I tested this with another video and got audio gaps between the concatenated segments in that one too. The gaps are visible when I open the file in Audacity.