ffmpeg - complex filter putting 2 videos side by side results in audio out of sync

2

1

We have 2 videos. One is a professionally produced video by an artist that performs a song stopping in certain parts to let the user sing. The second video is a stream that is recorded while the user watches the first video and sings during the parts where the artist stopped. The idea is to then merge these 2 videos into a single video with them sitting side by side with the audio merged. The problem we have is that the video stream contains dropped frames and when merging the 2 videos these dropped frames result in the audio being out of sync between the 2 videos.

The command we use to merge the two videos:

ffmpeg -i "OCTAVIA - DESPUES DE TI_330X220.mp4" -i input.flv -filter_complex "[0:v]setpts=PTS-STARTPTS, pad=iw*2:ih[bg]; [1:v]setpts=PTS-STARTPTS[fg]; [bg][fg]overlay=w; amix=duration=longest" -strict -2 merged.mp4

The output we get from this operation:

ffmpeg version 2.2.2 Copyright (c) 2000-2014 the FFmpeg developers
  built on May  8 2014 19:48:20 with llvm-gcc 4.2.1 (LLVM build 2336.11.00)
  configuration: --prefix=/Volumes/Ramdisk/sw --enable-gpl --enable-pthreads --enable-version3 --enable-libspeex --enable-libvpx --disable-decoder=libvpx --enable-libmp3lame --enable-libtheora --enable-libvorbis --enable-libx264 --enable-avfilter --enable-libopencore_amrwb --enable-libopencore_amrnb --enable-filters --enable-libgsm --enable-libvidstab --arch=x86_64 --enable-runtime-cpudetect
  libavutil      52. 66.100 / 52. 66.100
  libavcodec     55. 52.102 / 55. 52.102
  libavformat    55. 33.100 / 55. 33.100
  libavdevice    55. 10.100 / 55. 10.100
  libavfilter     4.  2.100 /  4.  2.100
  libswscale      2.  5.102 /  2.  5.102
  libswresample   0. 18.100 /  0. 18.100
  libpostproc    52.  3.100 / 52.  3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'OCTAVIA - DESPUES DE TI_330X220.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf55.34.100
  Duration: 00:01:40.62, start: 0.033333, bitrate: 379 kb/s
    Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 330x220 [SAR 1:1 DAR 3:2], 246 kb/s, 24 fps, 24 tbr, 12288 tbn, 48 tbc (default)
    Metadata:
      handler_name    : VideoHandler
    Stream #0:1(eng): Audio: aac (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 128 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
Input #1, flv, from 'input.flv':
  Metadata:
    creationdate    : Wed May 21 18:52:06
    level           : 3.1
    keyFrameInterval: 15
    bandwith        : 0
    fps             : 15
    codec           : H264Avc
    profile         : baseline
  Duration: 00:01:40.46, start: 0.000000, bitrate: 64 kb/s
    Stream #1:0: Video: h264 (Baseline), yuv420p(tv), 330x220 [SAR 1:1 DAR 3:2], 15.17 fps, 15 tbr, 1k tbn, 30 tbc
    Stream #1:1: Audio: nellymoser, 22050 Hz, mono, flt
File 'merged.mp4' already exists. Overwrite ? [y/N] y
[libx264 @ 0x7ff3f5000600] using SAR=1/1
[libx264 @ 0x7ff3f5000600] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.1 Cache64
[libx264 @ 0x7ff3f5000600] profile High, level 2.1
[libx264 @ 0x7ff3f5000600] 264 - core 142 - H.264/MPEG-4 AVC codec - Copyleft 2003-2014 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=3 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=24 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'merged.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf55.33.100
    Stream #0:0: Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuv420p, 660x220 [SAR 1:1 DAR 3:1], q=-1--1, 12288 tbn, 24 tbc (default)
    Stream #0:1: Audio: aac ([64][0][0][0] / 0x0040), 48000 Hz, stereo, fltp, 128 kb/s (default)
Stream mapping:
  Stream #0:0 (h264) -> setpts
  Stream #0:1 (aac) -> amix:input0
  Stream #1:0 (h264) -> setpts
  Stream #1:1 (nellymoser) -> amix:input1
  overlay -> Stream #0:0 (libx264)
  amix -> Stream #0:1 (aac)
Press [q] to stop, [?] for help
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:02.26 bitrate=  67.3kbits/s    
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:03.47 bitrate= 199.4kbits/s    
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:05.24 bitrate= 241.1kbits/s    
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:07.70 bitrate= 279.7kbits/s    
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:10.04 bitrate= 301.8kbits/s    
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:11.96 bitrate= 323.1kbits/s    
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623e=00:00:14.33 bitrate= 343.0kbits/s    
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
    Last message repeated 10 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
    Last message repeated 16 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
    Last message repeated 8 times
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
    Last message repeated 4 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
    Last message repeated 10 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
    Last message repeated 13 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
    Last message repeated 4 times




- this happens a lot more times





    Last message repeated 7 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
    Last message repeated 19 times
[h264 @ 0x7ff3f403f200] AVC: nal size 21102623
[h264 @ 0x7ff3f403f200] missing picture in access unit with size 41
[Parsed_overlay_3 @ 0x7ff3f4800440] [framesync @ 0x7ff3f4800528] Buffer queue overflow, dropping.
    Last message repeated 7 times
frame= 2414 fps=116 q=-1.0 Lsize=    4709kB time=00:01:40.62 bitrate= 383.4kbits/s    
video:3067kB audio:1573kB subtitle:0 data:0 global headers:0kB muxing overhead 1.476866%
[libx264 @ 0x7ff3f5000600] frame I:11    Avg QP:17.45  size: 15375
[libx264 @ 0x7ff3f5000600] frame P:905   Avg QP:23.36  size:  2650
[libx264 @ 0x7ff3f5000600] frame B:1498  Avg QP:28.24  size:   383
[libx264 @ 0x7ff3f5000600] consecutive B-frames: 12.1%  7.2% 25.2% 55.5%
[libx264 @ 0x7ff3f5000600] mb I  I16..4: 10.6% 49.9% 39.5%
[libx264 @ 0x7ff3f5000600] mb P  I16..4:  0.9%  3.5%  1.2%  P16..4: 15.9% 10.1%  6.6%  0.0%  0.0%    skip:61.8%
[libx264 @ 0x7ff3f5000600] mb B  I16..4:  0.1%  0.2%  0.0%  B16..8: 14.8%  2.7%  0.8%  direct: 0.7%  skip:80.7%  L0:41.5% L1:47.1% BI:11.4%
[libx264 @ 0x7ff3f5000600] 8x8 transform intra:61.0% inter:46.8%
[libx264 @ 0x7ff3f5000600] coded y,uvDC,uvAC intra: 49.2% 31.1% 19.7% inter: 6.3% 3.7% 0.7%
[libx264 @ 0x7ff3f5000600] i16 v,h,dc,p: 46% 20% 16% 18%
[libx264 @ 0x7ff3f5000600] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 28% 16% 31%  3%  4%  5%  4%  5%  4%
[libx264 @ 0x7ff3f5000600] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 31% 17% 15%  5%  6%  7%  6%  6%  7%
[libx264 @ 0x7ff3f5000600] i8c dc,h,v,p: 71% 11% 15%  3%
[libx264 @ 0x7ff3f5000600] Weighted P-Frames: Y:0.1% UV:0.1%
[libx264 @ 0x7ff3f5000600] ref P L0: 71.2% 13.7% 10.2%  4.9%
[libx264 @ 0x7ff3f5000600] ref B L0: 88.8%  9.0%  2.3%
[libx264 @ 0x7ff3f5000600] ref B L1: 95.6%  4.4%
[libx264 @ 0x7ff3f5000600] kb/s:249.78

We have tried converting the flv into an mp4 before attempting the merge, but we still get the buffer overflow statements and the audio is still out of sync.

We have over a 100 videos that would need to be processed like this so we need a non-manual way of solving this issue.

Any pointers on what we could try to fix this wold be appreciated.

The two files I am trying to merge can be found here: http://www.lauchenauer.info/test/input.flv http://www.lauchenauer.info/test/artist.mp4

You can see the sync problem 35 seconds into the song where the artist passes it over to the user where the audio of the user should then start, but after the merge of the 2 files it starts around 8-10 seconds too early.

Lauchenauer

Posted 2014-05-26T03:12:24.487

Reputation: 21

@Lauchenauer Did you ever come up with a solution to this problem? I'm dealing with something similar where I'm trying to mix together two audio tracks but one track starts too early and is out of sync. My video also has dropped frames. – phansen – 2016-06-21T23:18:42.420

Can you provide the input files so others can attempt to duplicate the issue? – llogan – 2014-05-26T16:49:34.757

The two files I am trying to merge can be found here: http://www.lauchenauer.info/test/input.flv http://www.lauchenauer.info/test/artist.mp4

You can see the sync problem 35 seconds into the song where the artist passes it over to the user where the audio of the user should then start, but after the merge of the 2 files it starts around 8-10 seconds too early.

– Lauchenauer – 2014-05-27T14:31:14.147

No answers