Extracting/Transcoding/Adding Video Streams Using FFmpeg

In this article, we extract audio from a video, transcode it and add it to another video.

This article is a part of the Using FFmpeg series.

Let’s start already!

Extracting a Stream from a Video

One common scenario is extracting audio from a video. First, we discover the streams available and their order:

ffmpeg -i <input file>

For example:

$ ffmpeg -i input.mp4
ffmpeg version 2.8.8-0ubuntu0.16.04.1 Copyright (c) 2000-2016 the FFmpeg developers

  ... bla bla bla ...

    Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 480x360 (... etc)

  ... bla bla bla ...

    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 71 kb/s (default)

  ... bla bla bla ...
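FFmpeg prints this report on standard error, so the stream lines can be picked out with a small filter. The following is only a sketch: list_streams is a hypothetical helper name, and the pattern assumes the "Stream #file:index" line layout shown in the output above.

```shell
# list_streams (hypothetical helper): read an FFmpeg report on stdin and
# print "<specifier> <type> <codec>" for every stream line it finds.
# Assumes lines shaped like: Stream #0:1(und): Audio: aac (LC) ...
list_streams() {
  sed -n 's/^ *Stream #\([0-9]*:[0-9]*\)[^:]*: \([A-Za-z]*\): \([A-Za-z0-9_]*\).*/\1 \2 \3/p'
}

# Typical use (the report goes to stderr, hence 2>&1):
#   ffmpeg -i input.mp4 2>&1 | list_streams
```

The first column of each printed line is the specifier that -map expects.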

Our example file contains two streams, 0:0 (input file 0, stream number 0, which is video) and 0:1 (the audio stream we are about to extract). Notice the audio codec (aac) and the bitrate (71 kb/s). To extract the audio stream:

ffmpeg -i input.mp4 -map 0:1 -c:a copy output.aac

Notice the .aac extension of the output file. FFmpeg normally deduces the desired output format from the file extension and, unless told otherwise, re-encodes the input with that format’s default encoder. So, if we just did:

ffmpeg -i input.mp4 -map 0:1 output.aac

FFmpeg would decode the input stream, encode it into aac again, and then write it to the output file. Since our codec is lossy, decoding a stream and re-encoding it will most probably produce a slightly different stream: every re-encode modifies the audio a little (or maybe a lot). So we set the audio codec to copy, which prevents FFmpeg from automatically transcoding the stream.

Now, the choice of the output file extension is important. A .aac file is closer to raw data than to a real container: FFmpeg writes it as an ADTS stream, which is just the AAC frames with a small header on each one. If you want something more player-friendly, you should probably use mp4 (or m4a, which is the same MP4 container under an audio-only extension, and makes some devices happy).
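As a rough sketch, the copy-only extraction can be wrapped in a small function that targets an .m4a file. The name extract_audio and the DRY_RUN switch are made up for this illustration; the FFmpeg options are the ones used above.

```shell
# extract_audio INPUT STREAM OUTPUT (hypothetical helper):
# copy one audio stream into OUTPUT without re-encoding it.
# With DRY_RUN=1 the command is printed instead of being run.
extract_audio() {
  set -- ffmpeg -i "$1" -map "$2" -c:a copy "$3"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Typical use:
#   extract_audio input.mp4 0:1 output.m4a
```

The dry-run mode is only there to let you inspect the command before touching any files.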

Transcoding a Stream into Another Format

If you have an ancient mp3 player that you cherish, the better aac codec is of no use to it. We will convert our stream into mp3 while extracting it:

ffmpeg -i input.mp4 -map 0:1 output.mp3

That was easy! We can be more explicit about our audio encoder:

ffmpeg -i input.mp4 -map 0:1 -c:a libmp3lame output.mp3

which could be useful with containers that support more than one audio codec.

We’ve successfully created an mp3, but if you are really doing this for compatibility, then chances are that you want a fixed bit-rate of your own choosing rather than whatever the encoder picks. To set the bit-rate:

ffmpeg -i input.mp4 -map 0:1 -c:a libmp3lame -b:a 96k output.mp3

Note that, even though the original stream had a bit-rate of 71 kb/s (maybe an average bit-rate), we used 96 kb/s. That’s because:

  • we can’t use arbitrary fixed bit-rates with mp3. Our options are: 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160, 192, 224, 256, or 320 kb/s (and not all of them are available at every sampling rate).
  • aac is generally more efficient than mp3 at the same bit-rate. That’s why I usually nudge the mp3 bit-rate up a little (but that’s totally my thing and I have no evidence whatsoever about its usefulness).
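Picking a valid rate can be sketched as a lookup. The helper name next_mp3_bitrate is hypothetical, and the list assumes the standard MPEG-1 Layer III rates, which are the ones that apply to a 44100 Hz stream like ours.

```shell
# next_mp3_bitrate KBPS (hypothetical helper): print the smallest standard
# MPEG-1 Layer III bit-rate (kb/s) that is not below KBPS, or the highest
# standard rate if KBPS is above all of them.
next_mp3_bitrate() {
  for rate in 32 40 48 56 64 80 96 112 128 160 192 224 256 320; do
    if [ "$rate" -ge "$1" ]; then
      echo "$rate"
      return 0
    fi
  done
  echo "$rate"
}

# Typical use:
#   ffmpeg -i input.mp4 -map 0:1 -c:a libmp3lame -b:a "$(next_mp3_bitrate 71)k" output.mp3
```

For the 71 kb/s source this picks 80; the article goes one step further up, to 96, as a matter of taste.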

Adding a Stream to a Video

Let’s say you have a video dubbed in your native language. But alas, the video quality is awful, and you can’t find a better-looking dubbed copy. However, better-looking videos in the original language are everywhere. If the original and dubbed videos are the same length (that is, they don’t come from different television systems like PAL and NTSC, which usually differ by about 5 minutes over a full-length movie) and happen to be perfectly synchronized, you have three options:

  • Play them side-by-side (use Open Multiple Files in VLC). Works (sometimes), but is a hassle. And you have to keep the poor quality video even though you don’t need it.
  • Extract the audio from the poor quality video using the above procedure and play it side-by-side with the high-quality video. Better, but still a hassle.
  • Create a new file with the video taken from the high-quality video and the audio from the dubbed one. That’s what we are going to do now:

ffmpeg \
   -i original.mp4 -i dubbed.mp4 \
   -map 0:0 -c:v copy \
   -map 1:1 -c:a copy \
   output.mp4

Tadaa! No transcoding at all. You can even have both of the audio tracks if you want:

ffmpeg \
   -i original.mp4 -i dubbed.mp4 \
   -map 0:0 \
   -map 1:1 \
   -map 0:1 \
   -c:v copy -c:a copy \
   output.mp4

Notice the order of the streams. To make the dubbed track the default one, I’ve placed it before the original one (this might not work if some stream metadata overrides this order). Also, notice that I’ve moved the codec settings onto a separate line. In FFmpeg:

  • you can have multiple inputs. Each one is given with the -i option, and all of them should be specified before the outputs.
  • anything found on the command line that is not an option is considered an output filename. You can have multiple outputs in a single command.
  • options apply to the next input/output only (except global options) and are reset between files. In other words, you have to place each option before the input or output it belongs to, and repeat it for every file that needs it.
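To illustrate these rules, here is a sketch of one command that writes two outputs, each with its own options. The function name split_audio, the DRY_RUN switch, and the output names audio.m4a and audio.mp3 are all made up for this example.

```shell
# split_audio INPUT (hypothetical helper): write the audio stream 0:1 twice,
# once copied as-is and once transcoded to mp3. The -map and -c:a options are
# repeated because they only apply to the output file that follows them.
# With DRY_RUN=1 the command is printed instead of being run.
split_audio() {
  set -- ffmpeg -i "$1" \
    -map 0:1 -c:a copy audio.m4a \
    -map 0:1 -c:a libmp3lame -b:a 96k audio.mp3
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Typical use:
#   split_audio input.mp4
```

Leaving out the second -map would let FFmpeg choose the streams for audio.mp3 on its own, since the first -map was reset after audio.m4a.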

So, in a nutshell, the line:

-c:v copy -c:a copy \

applies to the output file output.mp4. The options do refer to streams, but they are options of the file, not of the streams. But what if we want to transcode the dubbed track only and keep the others intact?

ffmpeg \
   -i original.mp4 -i dubbed.mp4 \
   -map 0:0 \
   -map 1:1 \
   -map 0:1 \
   -c:v copy -c:a copy -c:a:0 libmp3lame \
   output.mp4

We’ve just set the encoder of the first audio track (#0) of output.mp4 to libmp3lame. Let’s be more explicit about our preferences:

ffmpeg \
   -i original.mp4 -i dubbed.mp4 \
   -map 0:0 \
   -map 1:1 \
   -map 0:1 \
   -c:v:0 copy -c:a:0 libmp3lame -c:a:1 copy \
   output.mp4

That’s exactly the same thing. Yeah, the order of most options doesn’t really matter (unless they target the same stream, in which case the last matching one wins). We can shuffle them around:

ffmpeg \
   -i original.mp4 -i dubbed.mp4 \
   -map 0:0 -c:v:0 copy \
   -map 1:1 -c:a:0 libmp3lame \
   -map 0:1 -c:a:1 copy \
   output.mp4

and still get the same result. We can also refer to streams using their indices regardless of being video or audio:

ffmpeg \
   -i original.mp4 -i dubbed.mp4 \
   -map 0:0 -c:0 copy \
   -map 1:1 -c:1 libmp3lame \
   -map 0:1 -c:2 copy \
   output.mp4

Again, just different ways of expressing the same thing.
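The relation between the two numbering schemes can be sketched with a small helper. The name specifiers is hypothetical, and it only knows the video, audio and subtitle type letters.

```shell
# specifiers TYPE... (hypothetical helper): given the output streams in -map
# order (v = video, a = audio, s = subtitle), print each absolute specifier
# next to its per-type equivalent.
specifiers() {
  index=0 v=0 a=0 s=0
  for type in "$@"; do
    case $type in
      v) echo "-c:$index = -c:v:$v"; v=$((v + 1)) ;;
      a) echo "-c:$index = -c:a:$a"; a=$((a + 1)) ;;
      s) echo "-c:$index = -c:s:$s"; s=$((s + 1)) ;;
      *) echo "unknown stream type: $type" >&2; return 1 ;;
    esac
    index=$((index + 1))
  done
}

# Typical use, for the three mapped streams above:
#   specifiers v a a
```

The absolute index counts every stream of the output file, while the per-type index restarts at 0 for each stream type.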

This concludes our tutorial today. Final notes and summing things up:

  • You can have multiple streams of different types in the same container (video/audio/subtitle/attachment/data).
  • The allowed number and/or types of streams is limited by the container format. For example, if you want a subtitle track in most of the common formats, .mkv is a safe choice.
  • FFmpeg can extract, transcode and mix different streams with ease.

Thanks for reading! If Allah wills, I’ll be writing more about FFmpeg (we’ve only scratched the surface). So, stay tuned! (Or just read the official documentation!)
