Twitch has a post about this. They explain that they decided to use their own program for several reasons; one of them was that ffmpeg doesn't let you run different x264 instances in different threads, but instead devotes all specified threads to one frame in one output before moving on to the next output.
If you aren't doing real-time streaming, you have more options. The 'correct' way is probably to encode at one resolution with just the GOP size specified with -g, and then encode the other resolutions while forcing keyframes at the same places.
If you wanted to do that, you might use ffprobe to get the keyframe times and then use a shell script or an actual programming language to convert that into an ffmpeg command.
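A rough sketch of that ffprobe-then-ffmpeg step, in Python. The file names, the target height, and the helper names (probe_keyframe_times, parse_keyframe_times, build_ffmpeg_command) are made up for illustration, and the exact ffprobe output can vary between versions, so treat this as a starting point rather than a tested pipeline.

```python
import subprocess


def parse_keyframe_times(ffprobe_output):
    """Turn ffprobe's one-timestamp-per-line CSV output into a list of floats."""
    times = []
    for line in ffprobe_output.splitlines():
        field = line.strip().rstrip(",")  # some versions emit a trailing comma
        if not field:
            continue
        try:
            times.append(float(field))
        except ValueError:
            pass  # skip entries such as "N/A"
    return times


def probe_keyframe_times(path):
    """Ask ffprobe for the timestamps of keyframes in the first video stream."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-skip_frame", "nokey",            # decode keyframes only
         "-show_entries", "frame=pts_time",
         "-of", "csv=p=0", path],
        capture_output=True, text=True, check=True,
    )
    return parse_keyframe_times(result.stdout)


def build_ffmpeg_command(src, dst, keyframe_times, height):
    """Build an ffmpeg argument list that forces keyframes at the given times."""
    forced = ",".join(f"{t:.6f}" for t in keyframe_times)
    return ["ffmpeg", "-i", src, "-c:v", "libx264",
            "-vf", f"scale=-2:{height}",    # hypothetical target size
            "-force_key_frames", forced, dst]
```

You would call probe_keyframe_times on the first encode and pass the result to build_ffmpeg_command for each other resolution. The second encode may still add its own scene-cut keyframes on top of the forced ones unless scene detection is turned off.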
But for most content, there's very little difference between having one keyframe every 5 seconds and two keyframes every 5 seconds (one forced and one from scenecut). What matters is the average I-frame size relative to the size of P-frames and B-frames. Suppose you use x264 with typical settings and get a result like an average I-frame size of 46 kB, P-frames of 24 kB, and B-frames of 17 kB (with B-frames half as frequent as P-frames). Then an extra I-frame every second at 30 fps is only about a 3% increase in file size. (The only reason I think you should do anything to affect these frame sizes is if you set -qmin, as a poor way of preventing x264 from spending bitrate on easy content; this limits all frame types to the same value, I think.) The difference between H.264 and H.263 might be made up of a bunch of 3% decreases, but a single one isn't very important.
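The 3% figure can be checked with a few lines of Python. This assumes the extra I-frame replaces what would otherwise have been a P-frame, and that a second of video is otherwise made up only of P- and B-frames in a 2:1 ratio; the function name is just for illustration.

```python
def extra_keyframe_overhead(i_kb, p_kb, b_kb, fps, p_per_b=2):
    """Relative size increase from one extra I-frame per second.

    Assumes the I-frame takes the place of a P-frame and that there are
    p_per_b P-frames for every B-frame.
    """
    b_frames = fps / (p_per_b + 1)          # 10 B-frames at 30 fps
    p_frames = fps - b_frames               # 20 P-frames at 30 fps
    baseline_kb = p_frames * p_kb + b_frames * b_kb
    extra_kb = i_kb - p_kb                  # cost of swapping P for I
    return extra_kb / baseline_kb


overhead = extra_keyframe_overhead(i_kb=46, p_kb=24, b_kb=17, fps=30)
print(f"{overhead:.1%}")  # 22 kB extra on a 650 kB second
```

With the sizes above, one second is 20 × 24 + 10 × 17 = 650 kB, and the swap adds 46 − 24 = 22 kB, which is a little over 3%.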
On other types of content, frame sizes will be different. To be fair, this is about temporal complexity and not spatial complexity, so it isn't just easy content vs hard content. But generally, streaming video sites have a bitrate limit, and content with relatively large I-frames is easy content that will be encoded at high quality no matter how many extra keyframes are added. It's wasteful, but this waste will usually not be noticed. The most wasteful case is probably a video that's just a static image accompanying a song, where each keyframe is exactly the same.
One thing I'm not sure of is how forced keyframes interact with the rate limiter set with -maxrate and -bufsize. I think even YouTube has had recent problems configuring buffer settings to give consistent quality. If you're just using average bitrate settings, as some sites apparently do (you can inspect x264's options in the header (mov atom?) with a hex editor), then the buffer model isn't a problem. But if you're serving user-generated content, average bitrate encourages users to add a black screen at the end of their video, since the cheap padding frees up bits for the rest.
Ffmpeg's -g option, or any other encoder option that you use, is mapped to the encoder-specific option. So '-x264-params keyint=GOPSIZE' is equivalent to '-g GOPSIZE'.
One problem with using scene detection is if you prefer keyframes near specific numbers for whatever reason. If you specify keyframes every 5 seconds and use scene detection, and there's a scene change at 4.5, then it should be detected, but then the next keyframe will be at 9.5. If the time keeps getting stepped up like this, you could end up with keyframes at 42.5, 47.5, 52.5, etc., instead of 40, 45, 50, 55. Conversely, if there's a scene change at 5.5, then there will be a keyframe at 5 and 5.5 will be too early for another one. Ffmpeg doesn't let you specify "make a keyframe here if there's no scene change within the next 30 frames". Someone who understands C could add that option, though.
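To see the drift, here is a toy model in Python. It is not x264's actual decision logic: it simply assumes that a keyframe is placed at the first scene cut that falls at least min_interval after the previous keyframe, and otherwise exactly max_interval after the previous keyframe. The function name and the 1-second minimum are assumptions for illustration.

```python
def simulate_keyframes(duration, max_interval, scene_cuts, min_interval=1.0):
    """Toy model of keyframe placement with a maximum GOP length and scene cuts."""
    cuts = sorted(scene_cuts)
    keyframes = [0.0]
    while True:
        last = keyframes[-1]
        deadline = last + max_interval      # latest allowed next keyframe
        next_key = deadline
        for cut in cuts:
            # A cut only counts if it is not too close to the last keyframe
            # and comes before the forced keyframe would.
            if last + min_interval <= cut < deadline:
                next_key = cut
                break
        if next_key >= duration:
            return keyframes
        keyframes.append(next_key)


# A cut at 4.5 s shifts every later keyframe by half a second.
print(simulate_keyframes(20, 5, [4.5]))
# A cut at 5.5 s is too close to the keyframe at 5 s, so it is ignored.
print(simulate_keyframes(20, 5, [5.5]))
```

In this model the interval is always counted from the previous keyframe, which is why a single early scene cut moves all the following keyframes off the 5-second grid.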
For variable-frame-rate video, when you're not live-streaming like Twitch, you should be able to use scene changes without converting permanently to constant frame-rate. If you use the 'select' filter in ffmpeg and use the 'scene' constant in the expression, then the debug output (-v debug or press '+' several times while encoding) shows the scene change number. This is probably different from, and not as useful as, the number used by x264, but it could still be useful.
The procedure, then, would probably be to do a test video that's only for keyframe changes, but maybe could be used for rate control data if using 2-pass. (Not sure if the generated data is at all useful for different resolutions and settings; the macroblock-tree data won't be.) Convert it to constant-framerate video, but see this bug about stuttering output when halving framerate if you ever decide to use the fps filter for other purposes. Run it through x264 with your desired keyframe and GOP settings.
Then just use these keyframe times with the original variable frame-rate video.
If you allow completely crazy user-generated content with a 20-second gap between frames, then for the variable frame-rate encode, you could split the output, use fps filter, somehow use select filter (maybe build a really long expression that has every keyframe time)... or maybe you could use the test video as input and either decode only keyframes, if that ffmpeg option works, or use the select filter to select keyframes. Then scale it to the correct size (there's even a scale2ref filter for this) and overlay the original video on it. Then use the interleave filter to combine these destined-to-be forced keyframes with the original video. If this results in two frames that are 0.001 sec apart that the interleave filter doesn't prevent, then address this problem yourself with another select filter. Dealing with frame buffer limits for the interleave filter could be the main problem here. These could all work: use some kind of filter to buffer the denser stream (fifo filter?); refer to the input file multiple times so it's decoded more than once and frames don't have to be stored; use the 'streamselect' filter, which I have never done, at exactly the times of the keyframes; improve the interleave filter by changing its default behaviour or adding an option to output the oldest frame in a buffer instead of dropping a frame.
Just a note: Since this is a Q&A site and not really a discussion forum where posts are ordered chronologically, it's best to put all the information into one answer, so that people looking for a solution just have to read one post and not to look at who posted what, when :) I merged your answers and gave you a +1 on this, too. Since cross posting is not allowed, I'd suggest you delete your question on the Video site. People will find the answer(s) here. – slhck – 2015-05-02T10:36:15.110
I just had one more thought (actually it was raised on the FFmpeg mailing list). When you use force_key_frames, it kind of messes up the x264 bit allocation algorithm, so it may give you worse quality than simply setting a fixed keyframe interval. – slhck – 2015-05-05T19:25:19.590
Holy crap. Yet one more reason to have FFMPEG provide a codec-nonspecific way to do this, an argument that would "do the best thing for the codec in question". I tried to file a ticket for this with FFMPEG's trac, but it bounced :-( – Mark Gerolimatos – 2015-05-05T23:16:08.687
@slhck: Could you give more details please? I've looked in the mailing list archives in May 2015 but couldn't find anything. The bottom line would be to forget about "Method 3" and stick to "Method 1". – schieferstapel – 2016-10-04T18:32:39.737
@MarkGerolimatos: about -g, you say, "It neither appears to work, ... nor does it appear to be used in the code." I checked, and the input of g is stored in avctx->gop_size, and libx264 makes use of it: x4->params.i_keyint_max = avctx->gop_size;. When I probe this generated test file: ffmpeg -i a-test-file.mp4 -g 37 -t 15 gtest.mp4, I get keyframes at exactly 0,37,74,111,148,185,222,259,296,333,370. A GOP could be cut short if scene change is triggered, and for that -sc_threshold could be set, which is also picked up by x264. – Gyan – 2017-03-03T11:03:06.383
Method 2 -g $gopsize seems to work fine for me, using ffmpeg v. 4 and libx264. It sets a keyframe at least every $gopsize frames, and sometimes at a shorter interval, probably because of a scene change. I also could not find any reference of this switch being deprecated. So at least for x264, as of 2019, I will be using that, which is short, simple and seems to do exactly what I want. – mivk – 2019-06-01T21:22:00.453