I generate 200 video files by combining audio files generated with sox with image files. Most clips are shorter than one second, and none is longer than 6 seconds. I then concatenate these files, and there is an overall delay of about 2 seconds in the end result.
I believe this might be due to audio and video tracks being concatenated independently.
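For reference, the concatenation step is roughly the following (a simplified sketch; list.txt and the output file name stand in for what I actually use):

ffmpeg -f concat -safe 0 -i list.txt -c copy output.webm

where list.txt contains one line per clip in the form file 'file001.webm'.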
I can find out the exact duration of the video and audio tracks (streams) using ffprobe. Even in a single short file the durations already differ:
ffprobe file001.webm
Input #0, matroska,webm, from 'file001.webm':
  Metadata:
    ENCODER         : Lavf58.20.100
  Duration: 00:00:00.92, start: 0.000000, bitrate: 211 kb/s
    Stream #0:0: Video: vp8, yuv420p, 1100x140, SAR 1:1 DAR 55:7, 25 fps, 25 tbr, 1k tbn, 1k tbc (default)
    Metadata:
      ENCODER         : Lavc58.35.100 libvpx
      DURATION        : 00:00:00.923000000
    Stream #0:1: Audio: vorbis, 48000 Hz, stereo, fltp (default)
    Metadata:
      ENCODER         : Lavc58.35.100 libvorbis
      DURATION        : 00:00:00.908000000
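To compare the two stream durations for every clip in one go, something like this works (a sketch; the file glob is a placeholder, and since WebM only exposes the per-stream length via the DURATION tag, the query goes through stream_tags):

for f in file*.webm; do
  echo "== $f"
  ffprobe -v error -show_entries stream=codec_type:stream_tags=DURATION \
    -of default=noprint_wrappers=1 "$f"
done

This prints codec_type and TAG:DURATION for each stream, so the video/audio mismatch per file is easy to spot.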
How can I make it so that the video and audio tracks within one video file have exactly the same duration?
I'm using vpx/vorbis/webm (after failing to figure out the cause of issues with mpeg2ts), but I will use any format to get this done.
I can also pad the audio with silence to make the durations match.
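Something along these lines is what I have in mind: pad the audio with silence and cut it at the end of the (shorter) video stream, copying the video so only the audio is re-encoded (a sketch; file names are placeholders):

ffmpeg -i file001.webm -c:v copy -c:a libvorbis -af apad -shortest file001-padded.webm

Whether this makes the two durations exactly equal still depends on audio frame granularity, as the comment below points out.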
Would be better suited as a comment, since it's not an answer. I just added some more info to clarify that I can add padding to the audio to achieve the goal (the audio duration is not something that needs to be preserved; it's just that the montage of concatenated videos needs to stay in sync) – qubodup – 2019-05-24T18:32:23.370
A comment could not hold that much text. Also, audio "padding" won't work, because entire audio frames still need to be encoded. The only possibility would be to abuse audio "priming", but I don't know of any implementations that support that. – szatmary – 2019-05-24T18:34:48.037