How do I use ffmpeg to split a video into images and then reassemble exactly the same?

4

1

I have an MP4 which has an audio and video streams and I need to modify every frame of the video. The pipeline I'm using is:

Split the audio out of the video:

ffmpeg -i in.mp4 -vn -acodec copy out.m4a

Then split the video into an image file per frame:

ffmpeg -i in.mp4 img%04d.png

Then I do some processing on the img%04d.png files (assume null operation for now) and want to reassemble the video.

ffmpeg -i img%04d.png -i out.m4a -c:v libx264 -r 25 -pix_fmt yuv420p -c:a copy -shortest out.mp4

This basically works but my problem is I need to match the input mp4 format as closely as possible and I'm struggling to work out how to do this.

Example:

Input MP4:

Duration: 00:00:10.01, start: 0.010000, bitrate: 24589 kb/s
  Stream #0:0(eng): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 4096x2048 [SAR 1:1 DAR 2:1], 27736 kb/s, 25 fps, 25 tbr, 25k tbn, 50 tbc
  Stream #0:1(eng): Audio: aac (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 189 kb/s

Output MP4 after processing:

Duration: 00:00:10.00, start: 0.000000, bitrate: 4458 kb/s
  Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 4096x2048 [SAR 1:1 DAR 2:1], 4272 kb/s, 25 fps, 25 tbr, 12800 tbn, 50 tbc
  Stream #0:1(eng): Audio: aac (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 189 kb/s

As I've split the audio away, that matches exactly. I've used the -r and -pix_fmt options to force the frame rate and pixel formats to match.

However, the duration, start, bitrate, video stream language and tbn have all changed.

I have tried to fix the bitrate by using the arguments:

-b:v 27736k -minrate 27736k -maxrate 27736k

but I ended up with a bit rate of 41 Mb/s instead of 27 Mb/s.

I don't expect bitrate to match exactly but I need quality to be pretty much unchanged, and I need the other elements to stay the same.

Can anyone tell me whether I can somehow use an existing mp4 to control the config of one I'm generating, or what arguments I need to use to manually ensure the result is a close match.

Update 1 - updated

Tried command suggested by Mulvya:

ffmpeg -i img%04d.png -i out.m4a \
-c:v libx264 -b:v 27736k -bufsize 30000k \
-r 25 -video_track_timescale 25000 -output_ts_offset 0.01 -pix_fmt yuv420p \
-c:a copy -metadata:s:v:0 language=eng -metadata:s:a:0 language=eng -shortest out.mp4

The resultant bitrate was 24502 kb/s which is much closer and the language for Stream #0.0 was correctly set to english. tbn and start are correct, but the length is short.

I'm wondering if the problem is something to do with the initial generation. The original output produced 251 frame images which assuming one start, one end is exactly 10 seconds at 25 fps. I created this video by taking an existing video and cutting it down to 10 seconds using:

ffmpeg -i in.mp4 -ss 0 -c copy -t 10 out.mp4

and that command results in a video that is 10.01 in length. 0.01 is a lot less than the 0.04 seconds per frame of a 25 fps video.

I am using ffmpeg version N-78636-g45d3af9 from Zeranoe build site.

Update 2

Adding output from ffmpeg command

E:\ImageTest\video>c:\ffmpeg\bin\ffmpeg -thread_queue_size 512 -i img%04d.png -i out.m4a -c:v libx264 -r 25 -pix_fmt yuv420p -c:a copy -shortest out.mp4
ffmpeg version N-78636-g45d3af9 Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 5.3.0 (GCC)
  configuration: --enable-gpl --enable-version3 --disable-w32threads --enable-avisynth --enable-bzlib --enable-fontconfig --enable-frei0r --enable-gnutls --enable-iconv --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libdcadec --enable-libfreetype --enable-libgme --enable-libgsm --enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libopus --enable-librtmp --enable-libschroedinger --enable-libsoxr --enable-libspeex --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvo-amrwbenc --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs --enable-libxvid --enable-libzimg --enable-lzma --enable-decklink --enable-zlib
  libavutil      55. 18.100 / 55. 18.100
  libavcodec     57. 24.105 / 57. 24.105
  libavformat    57. 26.100 / 57. 26.100
  libavdevice    57.  0.101 / 57.  0.101
  libavfilter     6. 35.100 /  6. 35.100
  libswscale      4.  0.100 /  4.  0.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, image2, from 'img%04d.png':
  Duration: 00:00:10.04, start: 0.000000, bitrate: N/A
    Stream #0:0: Video: png, rgb24(pc), 4096x2048 [SAR 1:1 DAR 2:1], 25 tbr, 25 tbn, 25 tbc
Input #1, mov,mp4,m4a,3gp,3g2,mj2, from 'out.m4a':
  Metadata:
    major_brand     : M4A
    minor_version   : 512
    compatible_brands: isomiso2
    encoder         : Lavf55.0.100
  Duration: 00:00:10.00, start: 0.000000, bitrate: 191 kb/s
    Stream #1:0(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 189 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
File 'out.mp4' already exists. Overwrite ? [y/N] y
[libx264 @ 0000015f02f52b40] using SAR=1/1
[libx264 @ 0000015f02f52b40] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2
[libx264 @ 0000015f02f52b40] profile High, level 5.1
[libx264 @ 0000015f02f52b40] 264 - core 148 r2665 a01e339 - H.264/MPEG-4 AVC codec - Copyleft 2003-2016 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=18 lookahead_threads=3 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, mp4, to 'out.mp4':
  Metadata:
    encoder         : Lavf57.26.100
    Stream #0:0: Video: h264 (libx264) ([33][0][0][0] / 0x0021), yuv420p, 4096x2048 [SAR 1:1 DAR 2:1], q=-1--1, 25 fps, 12800 tbn, 25 tbc
    Metadata:
      encoder         : Lavc57.24.105 libx264
    Side data:
      cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: -1
    Stream #0:1(eng): Audio: aac (LC) ([64][0][0][0] / 0x0040), 48000 Hz, stereo, 189 kb/s (default)
    Metadata:
      handler_name    : SoundHandler
Stream mapping:
  Stream #0:0 -> #0:0 (png (native) -> h264 (libx264))
  Stream #1:0 -> #0:1 (copy)
Press [q] to stop, [?] for help
frame=  251 fps= 10 q=28.0 Lsize=    4746kB time=00:00:10.00 bitrate=3885.8kbits/s speed=0.416x
video:4507kB audio:231kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.155678%
[libx264 @ 0000015f02f52b40] frame I:2     Avg QP:17.80  size:304307
[libx264 @ 0000015f02f52b40] frame P:82    Avg QP:19.93  size: 44228
[libx264 @ 0000015f02f52b40] frame B:167   Avg QP:21.34  size: 12982
[libx264 @ 0000015f02f52b40] consecutive B-frames:  0.8% 15.9% 46.6% 36.7%
[libx264 @ 0000015f02f52b40] mb I  I16..4: 28.9% 55.3% 15.8%
[libx264 @ 0000015f02f52b40] mb P  I16..4:  5.8%  7.6%  0.1%  P16..4: 24.5%  2.9%  2.9%  0.0%  0.0%    skip:56.3%
[libx264 @ 0000015f02f52b40] mb B  I16..4:  1.1%  1.2%  0.0%  B16..8: 17.1%  0.6%  0.0%  direct: 1.7%  skip:78.2%  L0:48.7% L1:50.7% BI: 0.6%
[libx264 @ 0000015f02f52b40] 8x8 transform intra:55.5% inter:88.7%
[libx264 @ 0000015f02f52b40] coded y,uvDC,uvAC intra: 20.3% 28.0% 3.4% inter: 4.1% 10.1% 0.0%
[libx264 @ 0000015f02f52b40] i16 v,h,dc,p: 33% 43% 13% 12%
[libx264 @ 0000015f02f52b40] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 13% 22% 60%  1%  1%  0%  2%  0%  2%
[libx264 @ 0000015f02f52b40] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 15% 53% 11%  3%  4%  2%  6%  2%  5%
[libx264 @ 0000015f02f52b40] i8c dc,h,v,p: 57% 30% 10%  3%
[libx264 @ 0000015f02f52b40] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0000015f02f52b40] ref P L0: 62.4%  4.7% 20.4% 12.5%
[libx264 @ 0000015f02f52b40] ref B L0: 66.2% 25.9%  7.9%
[libx264 @ 0000015f02f52b40] ref B L1: 86.0% 14.0%
[libx264 @ 0000015f02f52b40] kb/s:5102.21

Jules

Posted 2016-02-23T13:35:01.493

Reputation: 155

Please include the command and the full, uncut command line output from your conversion. – slhck – 2016-02-23T13:44:33.913

Added the log as requested – Jules – 2016-02-23T15:18:08.957

Answers

3

Try

ffmpeg -i img%04d.png -i out.m4a \
-c:v libx264 -b:v 27736k -bufsize 30000k \
-r 25 -video_track_timescale 25000 -output_ts_offset 0.01 -pix_fmt yuv420p \
-c:a copy -metadata:s:v:0 language=eng -metadata:s:a:0 language=eng -shortest out.mp4

(Depending on whether the image or audio stream is shorter, the duration may not match)

Gyan

Posted 2016-02-23T13:35:01.493

Reputation: 21 016

I got the following messages: – Jules – 2016-02-23T14:48:08.290

Messages haven't come through. Paste them in the question (using code formatting). – Gyan – 2016-02-23T14:55:35.283

I got unrecognized option for 'video_track_timescale' and 'output_ts_offset'. Removing these options for now, the bit rate ended up as 14502 kb/s which is much closer and the language got set correctly. Start and duration ended up unchanged without the additional options. I am using a build I downloaded 23/02/16 from https://ffmpeg.zeranoe.com/builds/

– Jules – 2016-02-23T14:55:40.950

Recheck for syntax, please. I'm also using a Feb Zeranoe build and both options work. – Gyan – 2016-02-23T15:00:15.230

Tried again with those options and this time they worked. Don't know what mistake I made first time around. tbn now matches. Start time correct, but duration now short by 0.01. I wondered if the problem was with my extraction do I need a frame for beginning and end, could one of those be missing? – Jules – 2016-02-23T15:17:58.270