I'm trying to use FFmpeg to produce an HLS playlist which contains multiple audio renditions, but I cannot get the audio & video tracks to sync together. Here is the scenario:
- Suppose I have 2 video files, each with 1 audio track
- I use FFmpeg to pan the 2 videos together to form a single video
- I also have the extracted audio track from each file (transcoded to .mp3)
I want to produce an HLS playlist where the alternative audio tracks are respectively the left & the right audio:
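For reference, the kind of master playlist I'm after would look roughly like this (group IDs, names, paths and CODECS values are just illustrative):

```
#EXTM3U
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aud",NAME="Left",DEFAULT=YES,AUTOSELECT=YES,URI="audio1/audio1.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aud",NAME="Right",DEFAULT=NO,AUTOSELECT=YES,URI="audio2/audio2.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=2000000,CODECS="avc1.4d401f,mp4a.40.2",AUDIO="aud"
video/index.m3u8
```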
The problem I'm having is that I cannot make the audio sync with the video properly. I have tried a couple of ffmpeg commands, each naive at a different level. In the best-case scenario I get a synced stream on desktop, but on mobile (where playback is handled by the device's native player), the video loses sync with the audio very quickly as soon as I switch to the other audio track.
I'm using ffmpeg 3.1.1.
Here are some example commands I have tried, starting from a relatively simple one, where I map the audio tracks to the segment muxer and the video to the hls muxer:
ffmpeg -i dual.mp4 -i audio_left.mp3 -i audio_right.mp3 \
-threads 0 -muxdelay 0 -y \
-map 0 -pix_fmt yuv420p -vsync 1 -async 1 -vcodec libx264 -r 29.97 -g 60 -refs 3 -f hls -hls_time 10 -hls_list_size 0 video/index.m3u8 \
-map 1 -acodec aac -strict experimental -async 1 -ar 44100 -ab 96k -f segment -segment_time 10 -segment_list_size 0 -segment_list_flags -cache -segment_format aac -segment_list audio1/audio1.m3u8 audio1/audio1%d.aac \
-map 2 -acodec aac -strict experimental -async 1 -ar 44100 -ab 96k -f segment -segment_time 10 -segment_list_size 0 -segment_list_flags -cache -segment_format aac -segment_list audio2/audio2.m3u8 audio2/audio2%d.aac
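A variant I have also considered, in case the problem is that raw ADTS .aac segments don't carry container timestamps: keeping each audio rendition in MPEG-TS segments by running it through the hls muxer instead of the segment muxer (sketch only, flags mirror the command above):

```
# Sketch (untested): segment one audio rendition with the hls muxer so the
# segments are MPEG-TS and keep their PTS, instead of raw ADTS .aac files.
# Repeat with audio_right.mp3 for the second rendition.
ffmpeg -i audio_left.mp3 \
  -acodec aac -strict experimental -ar 44100 -ab 96k \
  -f hls -hls_time 10 -hls_list_size 0 \
  audio1/audio1.m3u8
```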
And a more complex one, where I first output a raw mpegts container and then slice the tracks up:
ffmpeg -i dual_short.mp4 -i audio_left_short.mp3 -i audio_right_short.mp3 \
-threads 0 -muxdelay 0 -y \
-map 0:v -map 1 -map 2 -codec copy -pix_fmt yuv420p -vsync 1 -async 1 -shortest -f mpegts pipe:1 | ffmpeg-3.1.1 -i pipe:0 \
-map 0:0 -vcodec copy -r 29.97 -g 60 -refs 3 -bsf:v h264_mp4toannexb -f hls -hls_time 10 -hls_list_size 0 video/index.m3u8 \
-map 0:1 -f ssegment -segment_time 10 -segment_list_size 0 -segment_format aac -segment_list audio1/audio1.m3u8 audio1/audio1_%d.aac \
-map 0:2 -f ssegment -segment_time 10 -segment_list_size 0 -segment_format aac -segment_list audio2/audio2.m3u8 audio2/audio2_%d.aac
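For completeness: I've read that newer ffmpeg releases (4.0 and later) add a -var_stream_map option to the hls muxer that builds the audio group and master playlist in one pass. I'm on 3.1.1 so I haven't been able to try it, but as I understand it the invocation would be along these lines (sketch only):

```
# Sketch for ffmpeg >= 4.0 (not available in 3.1.1): the hls muxer
# generates the audio group and the master playlist itself.
# The output name must contain %v when -var_stream_map is used.
ffmpeg -i dual.mp4 -i audio_left.mp3 -i audio_right.mp3 \
  -map 0:v -map 1:a -map 2:a \
  -c:v libx264 -c:a aac -b:a 96k \
  -f hls -hls_time 10 -hls_list_size 0 \
  -var_stream_map "v:0,agroup:aud a:0,agroup:aud,default:yes a:1,agroup:aud" \
  -master_pl_name master.m3u8 \
  out_%v.m3u8
```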
I'm no audio/video expert, and I'm pretty sure there is something fundamentally flawed in my reasoning, so I'm asking you guys for help and guidance. In particular:
- Is what I'm trying to do here feasible at all? Put another way: given N audio tracks recorded in sync with the original video, is it possible to produce an HLS playlist where the audio always stays lip-synced?
- Are the video's FPS and the audio's bitrate the cause of the A/V sync problem? Is there even a correlation?
- Does the quality level of the video (e.g. its bitrate) have an effect on sync?
- Will the target audio format I chose (mp3 vs. aac) influence the sync?
- Should I use a single command with multiple inputs, or work on each stream separately?
As you can see, I'm quite lost. I searched extensively and watched Apple's "Effective HLS" talk from WWDC 2012, but information on how to produce working multiple-audio-rendition playlists seems scarce on the internet.
Thanks for any pointers.