Video Codec/Container


Without compression, raw video is usually too large to stream over a typical network, and its file sizes make long-term storage on disk drives of limited capacity impractical. That is why compression is essential, especially in video surveillance systems. Video compression removes redundant video data so that the video can be transmitted or stored efficiently. Video content is encoded and decoded with a video codec and carried inside a container format, with the aim of keeping quality loss small while the video is transferred over the network. This article covers the concept of video codecs and video containers, some common types of each, and the differences between them.
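To give a sense of scale, here is a minimal sketch of the raw data rate of uncompressed video. The resolution, color depth and frame rate are assumed example values, not figures from a specific camera.

```python
def raw_bitrate_bps(width: int, height: int, bits_per_pixel: int, fps: int) -> int:
    """Bits per second produced by an uncompressed video stream."""
    return width * height * bits_per_pixel * fps


# Assumed example: 1920x1080 pixels, 24 bits per pixel, 30 frames per second.
bitrate = raw_bitrate_bps(1920, 1080, 24, 30)
gb_per_hour = bitrate * 3600 / 8 / 1e9  # bits per hour -> bytes -> gigabytes

print(f"{bitrate / 1e6:.1f} Mbit/s")    # 1493.0 Mbit/s
print(f"{gb_per_hour:.1f} GB per hour")  # 671.8 GB per hour
```

Under these assumptions a single uncompressed camera stream would exceed a 1 Gbit/s link, which is why the rest of the article focuses on compression.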

What is a video codec?

A codec is software (or hardware) used to compress or decompress a digital media file for transmission over a data network. More precisely, a video codec is a pair of encoding and decoding algorithms that work together: the encoder uses them to reduce the size of the video file, and the decoder decompresses it when needed. Some codecs include both components and others include only one of them. Codecs are also divided into two categories: lossless and lossy. A lossless codec keeps all the information in the original stream, so the video quality is preserved. A lossy codec discards some of the original data to achieve higher compression and lower bandwidth usage, so the quality is reduced.
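The difference between lossless and lossy compression can be illustrated with a toy sketch. It uses Python's general-purpose zlib compressor rather than a real video codec, and the sample bytes and the 4-bit quantization step are assumptions chosen only for illustration.

```python
import zlib

# Hypothetical "pixel" data: a short repeating pattern of byte values.
samples = bytes([10, 10, 11, 11, 12, 200, 201, 201, 202, 10] * 50)

# Lossless: decompression restores every byte exactly.
restored = zlib.decompress(zlib.compress(samples))

# Lossy (toy): discard the 4 low bits of each sample before compressing.
# The discarded detail cannot be recovered after decompression.
quantized = bytes((value >> 4) << 4 for value in samples)
restored_lossy = zlib.decompress(zlib.compress(quantized))

largest_error = max(abs(a - b) for a, b in zip(samples, restored_lossy))

print(restored == samples)        # True
print(restored_lossy == samples)  # False
print(largest_error <= 15)        # True: the error is bounded by the dropped bits
```

Real lossy video codecs decide what to discard in far more sophisticated ways, but the trade-off is the same: detail is given up in exchange for a smaller stream.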

There are different codec standards, which use different technologies to encode and decode video depending on the intended application. Video content compressed with one standard cannot be decompressed with another, because one algorithm cannot correctly decode the output of another. Different video codec implementations are therefore normally not compatible with each other.

However, it is possible to implement many different algorithms in the same software or hardware, so that multiple formats can be handled. Different compression methods lead to different bitrate, quality and latency. Latency is the time it takes to compress, send, decompress and display the video.

How can video compression help video surveillance?

Video compression methods use a codec to reduce or eliminate redundant data in video frames without significant degradation of the final video. This makes the video file smaller, so more video can be stored on NVR hard drives, or files can be kept for longer periods of time.

Because high resolution video files are large, video compression is a valuable tool when the surveillance system has storage and bandwidth limitations. To achieve the desired image quality despite compression, the right balance between image quality and compression level has to be found.

In IP video, encoding is done by the encoder in the IP camera, and decoding is normally done on the computer or device that displays the live video.

Compressed video can be transferred over the network quickly and without significant delay, which is especially important when viewing surveillance video on a smartphone.

Different types of video codec:

Similar to a digital still camera, a network camera captures individual images and compresses them into a format. The camera captures and compresses a number of individual images per second (frames per second, fps) and then sends them as a continuous flow of images over the network to a viewing station. At a frame rate of about 16 fps and above, the viewer perceives full motion video. When each individual image is a complete compressed image, all images have the same quality, determined by the compression level defined for the network camera. Video compression is therefore performed automatically by the surveillance camera, and choosing the compression level is important for achieving the best video quality. The following are some common video compression methods.


Motion JPEG (MJPEG) is a video codec in which each video frame is compressed separately as a JPEG image. Just as JPEG is a method for compressing still images, MJPEG compresses the frames of a video and sends them as individual JPEG images. The resulting video quality is independent of the motion in the image, so quality does not decrease when the video contains a lot of movement.

Because it offers minimal latency in image processing and keeps the quality of each image intact when bandwidth is limited, MJPEG is still a usable compression format even though it is an old lossy codec.


MPEG, which stands for Moving Picture Experts Group, is one of the biggest families of video codecs and one of the most common video formats. Its algorithms compress data into a compact form that can be easily transmitted and then decompressed. Because some data is removed, MPEG is a lossy compression method, but the loss is generally invisible to the human eye. The most common types of MPEG are MPEG1 (used in the production of VCDs and for some downloadable video clips), MPEG2 (used in the production of DVDs, in some HDTV and in demanding video editing), and MPEG4. MPEG4 transmits video and images over a narrow bandwidth, meaning that it reduces the network bandwidth used by the surveillance system. MPEG4 also reduces the amount of storage needed and increases the length of time that video can be kept, which makes it beneficial for video surveillance. In addition, MPEG4 can identify and handle separate audio and video objects in the frame, which allows individual elements to be compressed more efficiently. As a result, it can mix video with text, graphics, and 2-D and 3-D animation layers.

It is also important to point out that MPEG uses less network bandwidth and storage than MJPEG, because it compresses the video as a sequence and transports only the changes within that sequence. However, quality often decreases when there is a lot of movement in the video, which is a disadvantage of MPEG.
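The idea of transporting only the changes in a sequence can be shown with a toy sketch. This is not how MPEG is actually implemented; the frame representation (flat lists of pixel values) and the function names are assumptions made for illustration.

```python
from typing import List, Tuple

Frame = List[int]              # one frame as a flat list of pixel values
Delta = List[Tuple[int, int]]  # (pixel index, new value) for each changed pixel


def encode_deltas(frames: List[Frame]) -> Tuple[Frame, List[Delta]]:
    """Keep the first frame whole, then store only the pixels that changed."""
    key_frame = list(frames[0])
    deltas: List[Delta] = []
    previous = key_frame
    for frame in frames[1:]:
        deltas.append([(i, v) for i, (p, v) in enumerate(zip(previous, frame)) if p != v])
        previous = frame
    return key_frame, deltas


def decode_deltas(key_frame: Frame, deltas: List[Delta]) -> List[Frame]:
    """Rebuild every frame by applying the stored changes in order."""
    frames = [list(key_frame)]
    for delta in deltas:
        frame = list(frames[-1])
        for index, value in delta:
            frame[index] = value
        frames.append(frame)
    return frames


def stored_entries(key_frame: Frame, deltas: List[Delta]) -> int:
    """Number of pixel entries that have to be stored or transmitted."""
    return len(key_frame) + sum(len(delta) for delta in deltas)


static_scene = [[5] * 16, [5] * 16, [5] * 15 + [9]]  # almost nothing changes
busy_scene = [[i] * 16 for i in range(3)]            # every pixel changes each frame

print(stored_entries(*encode_deltas(static_scene)))  # 17 (instead of 48)
print(stored_entries(*encode_deltas(busy_scene)))    # 48 (no saving)
```

In the static scene almost nothing has to be sent after the first frame, while in the busy scene the saving disappears. With a fixed bitrate budget, that lost saving is what shows up as reduced quality in scenes with a lot of movement.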


DivX is a popular MPEG4 based codec developed by DivX, Inc. that lets users create and play high quality videos quickly. DivX can compress a DVD movie to fit on a CD, and DivX HD can reduce an HD movie to fit on a DVD.


XviD is an open source counterpart of DivX, so videos encoded with XviD can be decoded by any MPEG4 compatible decoder. The XviD codec can compress a full length DVD quality movie to fit on a single CD while largely retaining the original image quality. It is used to compress video data in order to make transfer easier and to save space on hard disks.


H.264 is a highly efficient compression method that is especially widely used in video surveillance. This technology evaluates small groups of frames together as a series in order to eliminate content that is repeated unchanged from frame to frame. Low bandwidth usage, reduced storage requirements, higher resolution and better image quality encourage security surveillance applications to use the H.264 codec.

Why is H.264 video compression recommended in video surveillance?

To judge whether one compression format is superior to another, factors such as bandwidth consumption, storage requirement, latency and image quality should be taken into consideration.

The popular video compression standards are MJPEG, MPEG4 and H.264, and several features of H.264 make it the most popular of these in video security systems.

The H.264 video compression standard provides approximately twice the compression of the previous MPEG4 standard for the same video quality.

The bitrate is the number of bits transferred between two devices per second. The bitrate of an IP camera determines how much of the available bandwidth (the maximum amount of data that can be transferred over the network at any given time) its stream consumes. If the surveillance system needs more bandwidth than is available, video feeds will be lost. Reducing the bitrate therefore allows more video streams to share the same network. H.264 provides a low bitrate that reduces bandwidth usage: about 80% lower than MJPEG video and 30-50% lower than MPEG4. A lower bitrate is desirable for security applications that need a high frame rate, such as casinos, traffic monitoring and object counting (for example vehicles or people).
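A small sketch of what such a reduction means for a shared network link. The 80% figure comes from the text above; the MJPEG camera bitrate and the link capacity are hypothetical values chosen for the example.

```python
def reduced_bitrate_kbps(bitrate_kbps: int, reduction_percent: int) -> int:
    """Bitrate after lowering it by the given percentage (integer kbit/s)."""
    return bitrate_kbps * (100 - reduction_percent) // 100


def cameras_per_link(link_kbps: int, camera_kbps: int) -> int:
    """How many camera streams of equal bitrate fit on one link."""
    return link_kbps // camera_kbps


mjpeg_kbps = 10_000   # hypothetical MJPEG stream: 10 Mbit/s
link_kbps = 100_000   # hypothetical network link: 100 Mbit/s

h264_kbps = reduced_bitrate_kbps(mjpeg_kbps, 80)  # "80% lower than MJPEG"

print(h264_kbps)                                # 2000
print(cameras_per_link(link_kbps, mjpeg_kbps))  # 10
print(cameras_per_link(link_kbps, h264_kbps))   # 50
```

Under these assumptions the same link carries five times as many H.264 streams as MJPEG streams. A real deployment would also leave headroom for other traffic and bitrate peaks.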

In a video surveillance system, the available storage capacity determines how many days of recordings can be retained, so the amount of storage required for recording has to be considered. A low bitrate reduces the size of the stored files, so using H.264 provides a 30-80% total saving on storage space compared to conventional compression formats. The retention period of the recorded archive is therefore increased.
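The relationship between bitrate and retention period can be estimated with a short calculation. The disk size, camera count and bitrates below are hypothetical, and the sketch assumes continuous recording at a constant bitrate.

```python
def retention_days(disk_tb: float, cameras: int, bitrate_mbps: float) -> float:
    """Days of continuous recording that fit on the disk (decimal TB, constant bitrate)."""
    bytes_per_day = cameras * bitrate_mbps * 1e6 / 8 * 86_400
    return disk_tb * 1e12 / bytes_per_day


# Hypothetical system: 8 TB of storage shared by 16 cameras.
days_at_4_mbps = retention_days(disk_tb=8, cameras=16, bitrate_mbps=4)
days_at_2_mbps = retention_days(disk_tb=8, cameras=16, bitrate_mbps=2)

print(f"{days_at_4_mbps:.1f} days")  # 11.6 days
print(f"{days_at_2_mbps:.1f} days")  # 23.1 days
```

Halving the bitrate doubles the retention period, which is the storage benefit described above.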

Low latency is a requirement for video surveillance, because images should appear in real time during monitoring. H.264 provides low latency, which makes it well suited to video surveillance. Image quality is also an important factor for any video surveillance system, along with savings in bandwidth and storage space, so an efficient compression method should support high video resolution. High definition video encoding with H.264 enables the IP camera to capture details and provides high quality images, which makes it a strong choice of video codec for mission critical video surveillance.

H.264 provides techniques to create better video encoders, resulting in higher quality video streams, higher frame rate and higher resolution at lower bit rate compared with other video codecs.

Technically, H.264 introduces an advanced intra prediction scheme that is a key part of its efficiency in video surveillance. Intra prediction is used to encode I-frames (the first image in a video sequence is always an I-frame). It greatly reduces the bit size of an I-frame and maintains high quality by sequentially predicting smaller blocks of pixels within each macroblock of a frame.

Reducing data when there is a lot of motion in a video is another important factor in video compression methods. This requires techniques such as block based motion compensation, which divides a frame into a series of macroblocks. This technique has been improved in the H.264 encoder, which makes it efficient in crowded surveillance scenes where high quality is demanded. H.264 also uses a deblocking filter that smooths block edges with an adaptive strength to produce decompressed video that is almost free of visible block artifacts.
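The core of block based motion compensation is finding where a block of the current frame came from in a reference frame. The following is a minimal exhaustive block matching sketch using the sum of absolute differences (SAD) as the cost. It is an illustration only, not the H.264 encoder's actual search; the function names, the 8x8 test images and the search range are all assumptions.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale image as rows of pixel values


def sad(reference: Image, current: Image, top: int, left: int,
        dy: int, dx: int, size: int) -> int:
    """Sum of absolute differences between a block of `current` and the
    block of `reference` shifted by (dy, dx)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(current[top + y][left + x]
                         - reference[top + dy + y][left + dx + x])
    return total


def best_motion_vector(reference: Image, current: Image, top: int, left: int,
                       size: int, search: int) -> Tuple[int, int]:
    """Try every shift within +/- `search` pixels and keep the cheapest one."""
    height, width = len(reference), len(reference[0])
    best, best_cost = (0, 0), sad(reference, current, top, left, 0, 0, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ref_top, ref_left = top + dy, left + dx
            if ref_top < 0 or ref_left < 0:
                continue  # shifted block would start outside the image
            if ref_top + size > height or ref_left + size > width:
                continue  # shifted block would end outside the image
            cost = sad(reference, current, top, left, dy, dx, size)
            if cost < best_cost:
                best, best_cost = (dy, dx), cost
    return best


def image_with_square(top: int, left: int, size: int = 2, side: int = 8) -> Image:
    """Black test image with one bright square (hypothetical test data)."""
    image = [[0] * side for _ in range(side)]
    for y in range(top, top + size):
        for x in range(left, left + size):
            image[y][x] = 255
    return image


reference = image_with_square(top=2, left=2)  # square in the previous frame
current = image_with_square(top=3, left=4)    # square moved down 1, right 2

vector = best_motion_vector(reference, current, top=3, left=4, size=2, search=2)
print(vector)  # (-1, -2): the block matches the reference 1 row up, 2 columns left
```

Instead of re-sending the pixels of the moved block, an encoder can send the motion vector plus a (here zero) residual, which is why this technique saves data in scenes with motion.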

Differences between H.264 and H.265:

H.265, or HEVC (High Efficiency Video Coding), is the successor to H.264. It is a video compression standard that delivers video quality identical to H.264 at only half the bitrate, meaning that bandwidth usage is cut in half. Demand for better compression, higher image quality and bandwidth savings is driving the transition from H.264 to H.265.

What is a video container (file) format?

A container format is a type of file format that holds various types of data compressed by different codecs. A video container format holds the components of a video, such as the stream of images, the sound, and any accompanying data.

The following sections briefly describe some common video file formats.

AVI format (.avi):

AVI, which stands for Audio Video Interleaved, is a multimedia container format that can hold both audio and video data encoded with a number of different codecs. Being able to choose the codec is an advantage of the AVI format, because it makes high compression rates possible. As a disadvantage, if AVI files are compressed beyond a certain limit, video quality is lost.

MP4 format (.mp4):

The MP4 container typically uses MPEG4 or H.264 for video encoding and AAC for audio compression. It is supported on most consumer devices and is the most common container used for audio and video streams online.

MKV format (.mkv):

MKV is a container that supports almost any audio or video format, so it is one of the best containers for storing audio and video files. It also supports error recovery, meaning that corrupted files can still be played back. MKV is therefore an adaptable and efficient container that has quickly become one of the best containers currently available.

WMV format (.wmv):

WMV stands for Windows Media Video, and these files often contain Windows Media Video and Windows Media Audio streams. WMV files can support digital rights management, which prevents users from copying the content. The advantage of this format is that it can compress large video files while retaining considerably high quality.

Flash video format (.flv,.swf):

FLV is a file format used by Adobe Flash Player to store and deliver synchronized audio and video streams over the Internet. The advantage of this format is its small size, so it can be easily viewed or downloaded. There are two different video file formats known as Flash Video: FLV and F4V. The most recent FLV file formats can support H.264 video encoding and AAC audio coding.

ASF format (.asf):

The Advanced Systems Format (ASF), previously known as Advanced Streaming Format or Active Streaming Format, is designed primarily to store and play digital media streams and to transmit them over networks, so it supports data transfer over a wide range of networks and protocols. ASF files support playback from digital media servers, HTTP servers and storage devices. The format is not dependent on a particular operating system.

One of the advantages of the ASF format is that playback can begin before the whole file has been received. The user can start viewing the file once a certain number of bytes have been downloaded, and the file continues to download while it is being watched, which makes the format well suited to internet use (in the way YouTube videos play, for example).

Although ASF does not define how the video or audio should be encoded by the codec, it does define the structure of the video and audio streams. An ASF container often contains Windows Media Audio (WMA) and Windows Media Video (WMV) streams, which can be compressed using a variety of codecs. However, the best compression is achieved by the Microsoft Windows Media Audio codec. In addition to video and audio streams, an ASF file can contain other data types such as text streams, script commands, web pages and album titles (for soundtracks).

The disadvantage of the ASF format is that, because it is designed primarily for streaming, the maximum resolution is limited to 352×288.

This format has been replaced by WMA and WMV files.

Differences between video codec and video container format:

Codecs and containers are not equivalent. A video file format (container) defines how the audio and video data is stored together with any other data. It does not define how the data is compressed; that is the job of the codec, which encodes and decodes the video and audio streams in a video file. In other words, data that has been compressed with a particular codec is placed inside the container. Good container formats can handle data compressed by a variety of different codecs. Sometimes the container and the codec have the same name: for example, a Windows Media Audio (WMA) file contains data that is compressed with the Windows Media Audio codec. However, a file format such as Audio Video Interleaved (AVI) can contain data that is compressed by any of a number of different codecs, including MPEG-2, DivX and XviD.
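The relationship can be modeled as a simple data structure: the container is the wrapper, and each stream inside it records which codec compressed it. This is only a conceptual sketch; the class and field names are hypothetical and do not correspond to any real file parsing library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Stream:
    kind: str   # "video" or "audio"
    codec: str  # compression method used for this stream


@dataclass
class ContainerFile:
    container: str  # file format, for example "AVI" or "MP4"
    streams: List[Stream] = field(default_factory=list)

    def codecs(self) -> List[str]:
        """Codecs used by the streams stored in this container."""
        return [stream.codec for stream in self.streams]


# The same container format can wrap video compressed by different codecs.
avi_divx = ContainerFile("AVI", [Stream("video", "DivX")])
avi_xvid = ContainerFile("AVI", [Stream("video", "XviD")])
mp4_file = ContainerFile("MP4", [Stream("video", "H.264"), Stream("audio", "AAC")])

print(avi_divx.container == avi_xvid.container)  # True: same container
print(avi_divx.codecs() == avi_xvid.codecs())    # False: different codecs
print(mp4_file.codecs())                         # ['H.264', 'AAC']
```

This is why knowing a file's extension is not enough to know whether a player can decode it: the player must support both the container and every codec used inside it.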


Surveillance systems should be designed around the available storage capacity and bandwidth. Reducing the storage requirement and lowering the bitrate while maintaining high resolution are the key issues for these systems. Video compression methods make it possible to encode video streams for transmission over the network and decode them for viewing without significant degradation in quality. Video compression is accomplished by removing unnecessary parts of the picture and reducing the color resolution.

Among the different video codecs, Motion JPEG provides good compression and quality, but H.264 represents a valuable step forward in video compression technology because of its more accurate prediction capabilities. H.264 is now the most widespread compression method in network cameras because of its significant improvements in coding efficiency, latency, complexity and robustness. It delivers an average bitrate reduction of 50% compared to older video codec standards, meaning that higher quality is achieved at the same bitrate or, conversely, the same quality at a lower bitrate.

At the same time, the market need for better image quality, higher frame rates and higher resolutions with minimal bandwidth consumption has led to the advent of H.265 as the successor to H.264. It achieves the same quality at only half the H.264 bitrate, which saves bandwidth.

Finally, it should be pointed out that network video products that support several compression formats, especially H.264 and H.265, are ideal for achieving maximum flexibility and efficiency.

Source by Maya Avery
