We have two videos. The first is a professionally produced video in which an artist performs a song, pausing at certain parts to let the user sing. The second is a stream recorded while the user watches the first video and sings during the parts where the artist pauses. The idea is to merge the two into a single video, side by side, with the audio mixed together. The problem is that the recorded stream contains dropped frames, and after the merge these dropped frames leave the audio of the two videos out of sync.
The command we use to merge the two videos:
ffmpeg -i "OCTAVIA - DESPUES DE TI_330X220.mp4" -i input.flv -filter_complex "[0:v]setpts=PTS-STARTPTS, pad=iw*2:ih[bg]; [1:v]setpts=PTS-STARTPTS[fg]; [bg][fg]overlay=w; amix=duration=longest" -strict -2 merged.mp4
The output we get from this operation:
ffmpeg version 2.2.2 Copyright (c) 2000-2014 the FFmpeg developers
built on May 8 2014 19:48:20 with llvm-gcc 4.2.1 (LLVM build 2336.11.00)
configuration: --prefix=/Volumes/Ramdisk/sw --enable-gpl --enable-pthreads --enable-version3 --enable-libspeex --enable-libvpx --disable-decoder=libvpx --enable-libmp3lame --enable-libtheora --enable-libvorbis --enable-libx264 --enable-avfilter --enable-libopencore_amrwb --enable-libopencore_amrnb --enable-filters --enable-libgsm --enable-libvidstab --arch=x86_64 --enable-runtime-cpudetect
libavutil 52. 66.100 / 52. 66.100
libavcodec 55. 52.102 / 55. 52.102
libavformat 55. 33.100 / 55. 33.100
libavdevice 55. 10.100 / 55. 10.100
libavfilter 4. 2.100 / 4. 2.100
libswscale 2. 5.102 / 2. 5.102
libswresample 0. 18.100 / 0. 18.100
libpostproc 52. 3.100 / 52. 3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'OCTAVIA - DESPUES DE TI_330X220.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf55.34.100
Duration: 00:01:40.62, start: 0.033333, bitrate: 379 kb/s
Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 330x220 [SAR 1:1 DAR 3:2], 246 kb/s, 24 fps, 24 tbr, 12288 tbn, 48 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(eng): Audio: aac (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
Metadata:
handler_name : SoundHandler
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
Input #1, flv, from 'input.flv':
Metadata:
creationdate : Wed May 21 18:52:06
level : 3.1
keyFrameInterval: 15
bandwith : 0
fps : 15
codec : H264Avc
profile : baseline
Duration: 00:01:40.46, start: 0.000000, bitrate: 64 kb/s
Stream #1:0: Video: h264 (Baseline), yuv420p(tv), 330x220 [SAR 1:1 DAR 3:2], 15.17 fps, 15 tbr, 1k tbn, 30 tbc
Stream #1:1: Audio: nellymoser, 22050 Hz, mono, flt
File 'merged.mp4' already exists. Overwrite ? [y/N] y
[libx264 @ 0x7ff3f5000600] using SAR=1/1
[libx264 @ 0x7ff3f5000600] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.1 Cache64
[libx264 @ 0x7ff3f5000600] profile High, level 2.1
[libx264 @ 0x7ff3f5000600] 264 - core 142 - H.264/MPEG-4 AVC codec - Copyleft 2003-2014 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=3 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=24 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'merged.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf55.33.100
Stream #0:0: Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuv420p, 660x220 [SAR 1:1 DAR 3:1], q=-1--1, 12288 tbn, 24 tbc (default)
Stream #0:1: Audio: aac ([64][0][0][0] / 0x0040), 48000 Hz, stereo, fltp, 128 kb/s (default)
Stream mapping:
Stream #0:0 (h264) -> setpts
Stream #0:1 (aac) -> amix:input0
Stream #1:0 (h264) -> setpts
Stream #1:1 (nellymoser) -> amix:input1
overlay -> Stream #0:0 (libx264)
amix -> Stream #0:1 (aac)
Press [q] to stop, [?] for help
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:02.26 bitrate= 67.3kbits/s
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:03.47 bitrate= 199.4kbits/s
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:05.24 bitrate= 241.1kbits/s
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:07.70 bitrate= 279.7kbits/s
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:10.04 bitrate= 301.8kbits/s
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:11.96 bitrate= 323.1kbits/s
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:14.33 bitrate= 343.0kbits/s
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 10 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 16 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 8 times
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 4 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 10 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 13 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 4 times
- this happens a lot more times
Last message repeated 7 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 19 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
Last message repeated 7 times
frame= 2414 fps=116 q=-1.0 Lsize= 4709kB time=00:01:40.62 bitrate= 383.4kbits/s
video:3067kB audio:1573kB subtitle:0 data:0 global headers:0kB muxing overhead 1.476866%
[libx264 @ 0x7ff3f5000600] frame I:11 Avg QP:17.45 size: 15375
[libx264 @ 0x7ff3f5000600] frame P:905 Avg QP:23.36 size: 2650
[libx264 @ 0x7ff3f5000600] frame B:1498 Avg QP:28.24 size: 383
[libx264 @ 0x7ff3f5000600] consecutive B-frames: 12.1% 7.2% 25.2% 55.5%
[libx264 @ 0x7ff3f5000600] mb I I16..4: 10.6% 49.9% 39.5%
[libx264 @ 0x7ff3f5000600] mb P I16..4: 0.9% 3.5% 1.2% P16..4: 15.9% 10.1% 6.6% 0.0% 0.0% skip:61.8%
[libx264 @ 0x7ff3f5000600] mb B I16..4: 0.1% 0.2% 0.0% B16..8: 14.8% 2.7% 0.8% direct: 0.7% skip:80.7% L0:41.5% L1:47.1% BI:11.4%
[libx264 @ 0x7ff3f5000600] 8x8 transform intra:61.0% inter:46.8%
[libx264 @ 0x7ff3f5000600] coded y,uvDC,uvAC intra: 49.2% 31.1% 19.7% inter: 6.3% 3.7% 0.7%
[libx264 @ 0x7ff3f5000600] i16 v,h,dc,p: 46% 20% 16% 18%
[libx264 @ 0x7ff3f5000600] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 28% 16% 31% 3% 4% 5% 4% 5% 4%
[libx264 @ 0x7ff3f5000600] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 31% 17% 15% 5% 6% 7% 6% 6% 7%
[libx264 @ 0x7ff3f5000600] i8c dc,h,v,p: 71% 11% 15% 3%
[libx264 @ 0x7ff3f5000600] Weighted P-Frames: Y:0.1% UV:0.1%
[libx264 @ 0x7ff3f5000600] ref P L0: 71.2% 13.7% 10.2% 4.9%
[libx264 @ 0x7ff3f5000600] ref B L0: 88.8% 9.0% 2.3%
[libx264 @ 0x7ff3f5000600] ref B L1: 95.6% 4.4%
[libx264 @ 0x7ff3f5000600] kb/s:249.78
We have tried converting the FLV to MP4 before attempting the merge, but we still get the buffer queue overflow messages and the audio is still out of sync.
We have over 100 videos that need to be processed like this, so we need a non-manual way of solving this issue.
Any pointers on what we could try to fix this would be appreciated.
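To make the "non-manual" requirement concrete, here is a minimal sketch of how the merge could be driven over many recordings. It only reproduces the filter graph from the command above and does not fix the sync problem by itself. The function names, the `*.flv` directory layout, and the `_merged.mp4` naming are hypothetical, and `-y` is added so ffmpeg overwrites existing output without prompting.

```python
from pathlib import Path

# Filter graph copied verbatim from the command in the question.
FILTER_GRAPH = (
    "[0:v]setpts=PTS-STARTPTS, pad=iw*2:ih[bg]; "
    "[1:v]setpts=PTS-STARTPTS[fg]; "
    "[bg][fg]overlay=w; "
    "amix=duration=longest"
)


def build_merge_command(artist, recording, output):
    """Return the ffmpeg argument list for one artist/recording pair."""
    return [
        "ffmpeg", "-y",            # -y: overwrite output without asking
        "-i", str(artist),
        "-i", str(recording),
        "-filter_complex", FILTER_GRAPH,
        "-strict", "-2",
        str(output),
    ]


def plan_batch(artist, recordings_dir, output_dir):
    """Build one command per *.flv file found in recordings_dir (hypothetical layout)."""
    commands = []
    for recording in sorted(Path(recordings_dir).glob("*.flv")):
        output = Path(output_dir) / (recording.stem + "_merged.mp4")
        commands.append(build_merge_command(artist, recording, output))
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)`. Keeping the filter graph in one constant means any eventual fix only has to be made in one place.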
The two files I am trying to merge can be found here: http://www.lauchenauer.info/test/input.flv http://www.lauchenauer.info/test/artist.mp4
You can see the sync problem 35 seconds into the song, where the artist hands over to the user and the user's audio should start; after merging the two files, it starts around 8-10 seconds too early.
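One way to check whether the dropped frames could account for an offset of this size is to look for gaps in the video timestamps of the recording. The sketch below is an assumption about how that could be done, not something we have verified against these files: it reads packet timestamps with ffprobe and reports intervals noticeably longer than one frame at the nominal rate (15 fps for input.flv). The helper names and the `tolerance` parameter are hypothetical.

```python
import subprocess


def video_packet_times(path):
    """Return sorted presentation timestamps (seconds) of the first video stream."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "packet=pts_time", "-of", "csv=p=0", path],
        capture_output=True, text=True, check=True,
    ).stdout
    times = []
    for line in out.splitlines():
        field = line.strip().rstrip(",")
        try:
            times.append(float(field))
        except ValueError:
            continue  # skip blank lines and "N/A" entries
    return sorted(times)


def find_gaps(times, nominal_fps, tolerance=1.5):
    """Return (start_time, gap_length) for intervals longer than tolerance frames."""
    limit = tolerance / nominal_fps
    return [(a, b - a) for a, b in zip(times, times[1:]) if b - a > limit]
```

For example, `find_gaps(video_packet_times("input.flv"), 15)` would list where the recording skips ahead. If the gap lengths before the 35-second mark add up to something close to 8-10 seconds, that would support the dropped-frame explanation; if not, the cause may lie elsewhere.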
@Lauchenauer Did you ever come up with a solution to this problem? I'm dealing with something similar where I'm trying to mix together two audio tracks but one track starts too early and is out of sync. My video also has dropped frames. – phansen – 2016-06-21T23:18:42.420
Can you provide the input files so others can attempt to duplicate the issue? – llogan – 2014-05-26T16:49:34.757