High Efficiency Video Coding (HEVC), also known as H.265, is a new video compression standard, developed by the Joint Collaborative Team on Video Coding (JCT-VC). The JCT-VC brings together image and video encoding experts from around the world, producing a single standard that is approved by two standards bodies:
- ITU-T Study Group 16 – Video Coding Experts Group (VCEG) – publishes the H.265 standard as ITU-T H.265, and
- ISO/IEC JTC 1/SC 29/WG 11 Motion Picture Experts Group (MPEG) – publishes the HEVC standard as ISO/IEC 23008-2.
The initial version of the H.265/HEVC standard was ratified in January, 2013.
HEVC was developed with the goal of providing twice the compression efficiency of the previous standard, H.264/AVC. Although compression efficiency results vary depending on the type of content and the encoder settings, at typical consumer video distribution bit rates HEVC is generally able to compress video twice as efficiently as AVC. End users can take advantage of improved compression efficiency in one of two ways (or some combination of both):
- At an identical level of visual quality, HEVC enables video to be compressed to a file that is about half the size (or half the bit rate) of AVC, or
- When compressed to the same file size or bit rate as AVC, HEVC delivers significantly better visual quality.
HEVC encodes video files twice as efficiently as previous video coding standards
Most of the power of video compression standards comes from a technique known as motion compensated prediction. Blocks of pixels are encoded by making reference to another area in the same frame (intra-prediction), or in another frame (inter-prediction). Where H.264/AVC defines macroblocks up to 16×16 pixels, HEVC can describe a much larger range of block sizes, up to 64×64 pixels.
- HEVC allows predicted blocks to be coded in different block sizes than the residual error. Each top level coding unit (or CTU) is first coded as a prediction quad-tree, where at each depth the encoder decides whether to encode with merge/skip, inter, or intra coding. The residual from those predictions is then coded with a second quad-tree which can optionally have greater depth than the prediction quad-tree. For instance, this allows the residual error from a 32×32 inter coded coding unit (CU) to be represented by a mixture of 16×16, 8×8, and 4×4 transforms.
- HEVC can encode motion vectors with much greater precision, giving a better predicted block with less residual error. For intra-picture prediction there are 35 modes (33 directional modes plus DC and planar), compared with only 9 for H.264/AVC.
- HEVC includes Adaptive Motion Vector Prediction, a new method to improve inter-prediction.
- An improved deblocking filter
- Sample Adaptive Offset – an additional filter that reduces artifacts at block edges
Versions of the HEVC/H.265 Standard
Version 1: (April 13, 2013) First approved version of the HEVC/H.265 standard containing Main, Main 10, and Main Still Picture profiles.
Version 2: (October 29, 2014) Second approved version of the HEVC/H.265 standard which adds 21 range extensions profiles, two scalable extensions profiles, and one multi-view extensions profile.
Version 3: (April 29, 2015) Third approved version of the HEVC/H.265 standard which adds the 3D Main profile.
Coding Efficiency
The design of most video coding standards is primarily aimed at having the highest coding efficiency. Coding efficiency is the ability to encode video at the lowest possible bit rate while maintaining a certain level of video quality. There are two standard ways to measure the coding efficiency of a video coding standard, which are to use an objective metric, such as peak signal-to-noise ratio (PSNR), or to use subjective assessment of video quality. Subjective assessment of video quality is considered to be the most important way to measure a video coding standard since humans perceive video quality subjectively.
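As a rough illustration of the objective metric mentioned above, PSNR can be computed from the mean squared error between the original and decoded samples. The sketch below assumes flat lists of integer samples and a peak value derived from the bit depth; the function name `psnr` is only illustrative and is not taken from any encoder.

```python
import math

def psnr(original, decoded, bit_depth=8):
    """Return the PSNR in dB between two equal-length sample sequences."""
    if len(original) != len(decoded) or not original:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all samples
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals have unbounded PSNR
    peak = (1 << bit_depth) - 1  # 255 for 8-bit video
    return 10 * math.log10(peak ** 2 / mse)

print(round(psnr([100, 100], [101, 99]), 2))
```

A higher value means the decoded samples are closer to the original. The bit rate comparisons in this section hold PSNR fixed and compare the bit rates needed to reach it.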
HEVC benefits from the use of larger coding tree unit (CTU) sizes. This has been shown in PSNR tests with a HM-8.0 HEVC encoder where it was forced to use progressively smaller CTU sizes. For all test sequences, when compared to a 64×64 CTU size, it was shown that the HEVC bit rate increased by 2.2% when forced to use a 32×32 CTU size, and increased by 11.0% when forced to use a 16×16 CTU size. In the Class A test sequences, where the resolution of the video was 2560×1600, when compared to a 64×64 CTU size, it was shown that the HEVC bit rate increased by 5.7% when forced to use a 32×32 CTU size, and increased by 28.2% when forced to use a 16×16 CTU size. The tests showed that large CTU sizes increase coding efficiency while also reducing decoding time.
The HEVC Main Profile (MP) has been compared in coding efficiency to H.264/MPEG-4 AVC High Profile (HP), MPEG-4 Advanced Simple Profile (ASP), H.263 High Latency Profile (HLP), and H.262/MPEG-2 Main Profile (MP). The video encoding was done for entertainment applications and twelve different bitrates were made for the nine video test sequences with a HM-8.0 HEVC encoder being used. Of the nine video test sequences, five were at HD resolution, while four were at WVGA (800×480) resolution. The bit rate reductions for HEVC were determined based on PSNR with HEVC having a bit rate reduction of 35.4% compared to H.264/MPEG-4 AVC HP, 63.7% compared to MPEG-4 ASP, 65.1% compared to H.263 HLP, and 70.8% compared to H.262/MPEG-2 MP.
HEVC MP has also been compared to H.264/MPEG-4 AVC HP for subjective video quality. The video encoding was done for entertainment applications and four different bitrates were made for nine video test sequences with a HM-5.0 HEVC encoder being used. The subjective assessment was done at an earlier date than the PSNR comparison and so it used an earlier version of the HEVC encoder that had slightly lower performance. The bit rate reductions were determined based on subjective assessment using mean opinion score values. The overall subjective bitrate reduction for HEVC MP compared to H.264/MPEG-4 AVC HP was 49.3%.
École Polytechnique Fédérale de Lausanne (EPFL) did a study to evaluate the subjective video quality of HEVC at resolutions higher than HDTV. The study was done with three videos with resolutions of 3840×1744 at 24 fps, 3840×2048 at 30 fps, and 3840×2160 at 30 fps.
The five second video sequences showed people on a street, traffic, and a scene from the open source computer animated movie Sintel. The video sequences were encoded at five different bitrates using the HM-6.1.1 HEVC encoder and the JM-18.3 H.264/MPEG-4 AVC encoder. The subjective bit rate reductions were determined using mean opinion score values. The study compared HEVC MP with H.264/MPEG-4 AVC HP and showed that, for HEVC MP, the average bitrate reduction based on PSNR was 44.4%, while the average bitrate reduction based on subjective video quality was 66.5%.
In an HEVC performance comparison released in April 2013, the HEVC MP and Main 10 Profile (M10P) were compared to H.264/MPEG-4 AVC HP and High 10 Profile (H10P) using 3840×2160 video sequences. The video sequences were encoded using the HM-10.0 HEVC encoder and the JM-18.4 H.264/MPEG-4 AVC encoder. The average bit rate reduction based on PSNR was 45% for inter frame video.
In a video encoder comparison released in December 2013, the HM-10.0 HEVC encoder was compared to the x264 encoder (version r2334) and the VP9 encoder (version v1.2.0-3088-ga81bd12). The comparison used the Bjøntegaard-Delta bit-rate (BD-BR) measurement method, in which negative values indicate by how much the bit rate is reduced, and positive values indicate by how much the bit rate is increased, for the same PSNR. In the comparison, the HM-10.0 HEVC encoder had the highest coding efficiency and, on average, to get the same objective quality, the x264 encoder needed to increase the bit rate by 66.4%, while the VP9 encoder needed to increase the bit rate by 79.4%.
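A bit rate increase quoted for one encoder and a bit rate reduction quoted for another are not directly comparable numbers, which can be confusing when reading results like these next to the reduction figures elsewhere in this section. The small sketch below is plain arithmetic rather than the BD-BR method itself: it converts "encoder B needs X% more bit rate than encoder A" into the saving of A relative to B. The function name is hypothetical and the derived percentages are not reported by the comparison.

```python
def saving_from_overhead(overhead_percent):
    """Convert 'B needs X% more bit rate than A' into A's saving relative to B (in %)."""
    ratio = 1 + overhead_percent / 100  # bit rate of B divided by bit rate of A
    return 100 * (1 - 1 / ratio)

# Overheads quoted above for x264 and VP9 relative to HM-10.0
print(round(saving_from_overhead(66.4), 1))
print(round(saving_from_overhead(79.4), 1))
```

Under this conversion, a 100% overhead corresponds to a 50% saving, which is why an increase figure is always larger than the matching reduction figure.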
In a subjective video performance comparison released in May 2014, the JCT-VC compared the HEVC Main profile to the H.264/MPEG-4 AVC High profile.
The comparison used mean opinion score values and was conducted by the BBC and the University of the West of Scotland. The video sequences were encoded using the HM-12.1 HEVC encoder and the JM-18.5 H.264/MPEG-4 AVC encoder. The comparison used a range of resolutions and the average bit rate reduction for HEVC was 59%. The average bit rate reduction for HEVC was 52% for 480p, 56% for 720p, 62% for 1080p, and 64% for 4K UHD.
Average bit rate reduction compared to H.264/MPEG-4 AVC HP
Video coding standard | 480p | 720p | 1080p | 4K UHD |
HEVC | 52% | 56% | 62% | 64% |
In a subjective video codec comparison released in August 2014 by the EPFL, the HM-15.0 HEVC encoder was compared to the VP9 1.2.0-5183 encoder and the JM-18.8 H.264/MPEG-4 AVC encoder.
Four 4K resolution sequences were encoded at five different bit rates with the encoders set to use an intra period of one second. In the comparison, the HM-15.0 HEVC encoder had the highest coding efficiency and, on average, for the same subjective quality the bit rate could be reduced by 49.4% compared to the VP9 1.2.0-5183 encoder, and it could be reduced by 52.6% compared to the JM-18.8 H.264/MPEG-4 AVC encoder.
Features
HEVC was designed to substantially improve coding efficiency compared to H.264/MPEG-4 AVC HP, i.e. to reduce bitrate requirements by half with comparable image quality, at the expense of increased computational complexity. HEVC was designed with the goal of allowing video content to have a data compression ratio of up to 1000:1.
Depending on the application requirements, HEVC encoders can trade off computational complexity, compression rate, robustness to errors, and encoding delay time.
Two of the key features where HEVC was improved compared to H.264/MPEG-4 AVC were support for higher resolution video and improved parallel processing methods.
HEVC is targeted at next-generation HDTV displays and content capture systems which feature progressive scanned frame rates and display resolutions from QVGA (320×240) to 4320p (7680×4320), as well as improved picture quality in terms of noise level, color spaces, and dynamic range.
Video Coding Layer
The HEVC video coding layer uses the same “hybrid” approach used in all modern video standards, starting from H.261, in that it uses inter-/intra-picture prediction and 2D transform coding.
An HEVC encoder first splits a picture into block-shaped regions. The first picture, or the first picture of a random access point, is coded using intra-picture prediction.
Intra-picture prediction is when the prediction of the blocks in the picture is based only on the information in that picture.
For all other pictures, inter-picture prediction is used, in which prediction information is used from other pictures.
After the prediction methods are finished and the picture goes through the loop filters, the final picture representation is stored in the decoded picture buffer.
Pictures stored in the decoded picture buffer can be used for the prediction of other pictures.
HEVC was designed with the idea that progressive scan video would be used and no coding tools were added specifically for interlaced video.
Interlace specific coding tools, such as MBAFF and PAFF, are not supported in HEVC.
HEVC instead sends metadata that tells how the interlaced video was sent.
Interlaced video may be sent either by coding each frame as a separate picture or by coding each field as a separate picture.
For interlaced video, HEVC can change between frame coding and field coding using Sequence Adaptive Frame Field (SAFF), which allows the coding mode to be changed for each video sequence.
This allows interlaced video to be sent with HEVC without needing special interlaced decoding processes to be added to HEVC decoders.
Colour Spaces
The HEVC standard supports colour spaces such as generic film, NTSC, PAL, Rec. 601, Rec. 709, Rec. 2020, SMPTE 170M, SMPTE 240M, sRGB, sYCC, xvYCC, XYZ, and externally specified colour spaces. HEVC supports colour encoding representations such as RGB, YCbCr, and YCoCg.
Coding Tools
Coding Tree Unit
HEVC replaces 16×16 pixel macroblocks, which were used with previous standards, with coding tree units (CTUs) which can use larger block structures of up to 64×64 samples and can better sub-partition the picture into variable sized structures.
HEVC initially divides the picture into CTUs which can be 64×64, 32×32, or 16×16 with a larger pixel block size usually increasing the coding efficiency.
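To make the effect of the CTU size concrete, the following sketch counts how many CTUs cover a picture for each of the three sizes listed above. It assumes that a CTU overlapping the right or bottom picture edge still counts as one CTU. The function name `ctu_grid` is illustrative only.

```python
def ctu_grid(width, height, ctu_size=64):
    """Return (columns, rows, total) of CTUs needed to cover a picture."""
    if ctu_size not in (16, 32, 64):
        raise ValueError("CTU size must be 16, 32, or 64")
    cols = -(-width // ctu_size)   # ceiling division
    rows = -(-height // ctu_size)
    return cols, rows, cols * rows

for size in (64, 32, 16):
    print(size, ctu_grid(1920, 1080, size))
```

For a 1920×1080 picture this gives 510 CTUs at 64×64 but 8160 at 16×16, so the smaller size has sixteen times as many top-level units to signal.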
Parallel Processing Tools
Tiles allow for the picture to be divided into a grid of rectangular regions that can independently be decoded/encoded. The main purpose of tiles is to allow for parallel processing. Tiles can be independently decoded and can even allow for random access to specific regions of a picture in a video stream.
Wavefront parallel processing (WPP) is when a slice is divided into rows of CTUs in which the first row is decoded normally but each additional row requires that decisions be made in the previous row. WPP has the entropy encoder use information from the preceding row of CTUs and allows for a method of parallel processing that may allow for better compression than tiles.
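The row dependency in WPP can be sketched as a simple schedule. The example below assumes that every CTU takes one time step and that a row may start once the row above is two CTUs ahead. That two-CTU lag is how the dependency is commonly described and is not stated in the text above, so it is exposed as the `lag` parameter. The function name is hypothetical.

```python
def wpp_start_steps(rows, cols, lag=2):
    """Earliest time step at which each CTU can start (one CTU per step per row)."""
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # A CTU must wait for its left neighbour in the same row
            left_done = start[r][c - 1] + 1 if c > 0 else 0
            # ...and for the row above to be `lag` CTUs ahead (clamped at the row end)
            if r > 0:
                dep_col = min(c + lag - 1, cols - 1)
                above_done = start[r - 1][dep_col] + 1
            else:
                above_done = 0
            start[r][c] = max(left_done, above_done)
    return start

for row in wpp_start_steps(3, 4):
    print(row)
```

With three rows of four CTUs, the last CTU starts at step 7 rather than step 11 in a purely sequential decode, which is the source of the parallel speed-up.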
Tiles and WPP are allowed but are optional. If tiles are present, they must be at least 64 pixels high and 256 pixels wide with a level specific limit on the number of tiles allowed.
Slices can, for the most part, be decoded independently from each other with the main purpose of slices being the re-synchronization in case of data loss in the video stream.
Slices can be defined as self-contained in that prediction is not made across slice boundaries. When in-loop filtering is done on a picture though, information across slice boundaries may be required.
Slices are CTUs decoded in the order of the raster scan, and different coding types can be used for slices such as I types, P types, or B types.
Dependent slices can allow for data related to tiles or WPP to be accessed more quickly by the system than if the entire slice had to be decoded.
The main purpose of dependent slices is to allow for low-delay video encoding due to their lower latency.
Other Coding Tools
Entropy Coding
HEVC uses a context-adaptive binary arithmetic coding (CABAC) algorithm that is fundamentally similar to CABAC in H.264/MPEG-4 AVC. CABAC is the only entropy encoder method that is allowed in HEVC while there are two entropy encoder methods allowed by H.264/MPEG-4 AVC.
CABAC and the entropy coding of transform coefficients in HEVC were designed for higher throughput than H.264/MPEG-4 AVC, while maintaining higher compression efficiency for larger transform block sizes relative to simple extensions.
For instance, the number of context coded bins has been reduced by 8× and the CABAC bypass-mode has been improved in terms of its design to increase throughput.
Another improvement with HEVC is that the dependencies between the coded data have been changed to further increase throughput.
Context modelling in HEVC has also been improved so that CABAC can better select a context that increases efficiency when compared to H.264/MPEG-4 AVC.
Intra prediction
HEVC specifies 33 directional modes for intra prediction compared to the 8 directional modes for intra prediction specified by H.264/MPEG-4 AVC.
HEVC also specifies DC intra prediction and planar prediction modes.
The DC intra prediction mode generates a mean value by averaging reference samples and can be used for flat surfaces. The planar prediction mode in HEVC supports all block sizes defined in HEVC while the planar prediction mode in H.264/MPEG-4 AVC is limited to a block size of 16×16 pixels. The intra prediction modes use data from neighbouring prediction blocks that have been previously decoded from within the same picture.
Motion Compensation
For the interpolation of fractional luma sample positions HEVC uses separable application of one-dimensional half-sample interpolation with an 8-tap filter or quarter-sample interpolation with a 7-tap filter while, in comparison, H.264/MPEG-4 AVC uses a two-stage process that first derives values at half-sample positions using separable one-dimensional 6-tap interpolation followed by integer rounding and then applies linear interpolation between values at nearby half-sample positions to generate values at quarter-sample positions.
HEVC has improved precision due to the longer interpolation filter and the elimination of the intermediate rounding error. For 4:2:0 video, the chroma samples are interpolated with separable one-dimensional 4-tap filtering to generate eighth-sample precision, while in comparison H.264/MPEG-4 AVC uses only a 2-tap bilinear filter (also with eighth-sample precision).
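The luma filters described above can be sketched in one dimension. The tap values below are the ones commonly published for HEVC and are not given in this article, so they should be treated as an assumption. The sketch also simplifies the real process: it filters a single row, rounds once at the end, repeats edge samples, and does no clipping.

```python
# Assumed luma filter taps (each set sums to 64)
HALF_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)  # 8-tap, half-sample position
QUARTER_TAPS = (-1, 4, -10, 58, 17, -5, 1)    # 7-tap, quarter-sample position

def interpolate(samples, index, taps):
    """Interpolate the fractional position just to the right of samples[index]."""
    first = index - 3  # both filters start three samples to the left
    acc = 0
    for k, tap in enumerate(taps):
        pos = min(max(first + k, 0), len(samples) - 1)  # repeat edge samples
        acc += tap * samples[pos]
    return (acc + 32) >> 6  # divide by 64 with a single rounding step

ramp = [0, 10, 20, 30, 40, 50, 60, 70]
print(interpolate(ramp, 3, HALF_TAPS), interpolate(ramp, 3, QUARTER_TAPS))
```

On a linear ramp the half-sample result lands midway between the two neighbouring samples and the quarter-sample result lands closer to the left one, as expected for positions 3.5 and 3.25.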
As in H.264/MPEG-4 AVC, weighted prediction in HEVC can be used either with uni-prediction (in which a single prediction value is used) or bi-prediction (in which the prediction values from two prediction blocks are combined).
Motion Vector Prediction
HEVC defines a signed 16-bit range for both horizontal and vertical motion vectors (MVs). This was added to HEVC at the July 2012 HEVC meeting with the mvLX variables.
HEVC horizontal/vertical MVs have a range of −32768 to 32767 which given the quarter pixel precision used by HEVC allows for an MV range of −8192 to 8191.75 luma samples.
This compares to H.264/MPEG-4 AVC which allows for a horizontal MV range of −2048 to 2047.75 luma samples and a vertical MV range of −512 to 511.75 luma samples.
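The ranges quoted above follow directly from the number of bits and the quarter-sample precision. The sketch below reproduces the arithmetic. The 14-bit and 12-bit widths used for the H.264/MPEG-4 AVC ranges are inferred here from the quoted limits and are not stated in the text.

```python
def mv_range_in_luma_samples(bits, units_per_sample=4):
    """Range of a signed `bits`-wide motion vector stored in quarter-sample units."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return lowest / units_per_sample, highest / units_per_sample

print(mv_range_in_luma_samples(16))  # HEVC, horizontal and vertical
print(mv_range_in_luma_samples(14))  # matches the quoted AVC horizontal range
print(mv_range_in_luma_samples(12))  # matches the quoted AVC vertical range
```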
HEVC allows for two MV modes which are Advanced Motion Vector Prediction (AMVP) and merge mode. AMVP uses data from the reference picture and can also use data from adjacent prediction blocks.
The merge mode allows for the MVs to be inherited from neighbouring prediction blocks.[1] Merge mode in HEVC is similar to “skipped” and “direct” motion inference modes in H.264/MPEG-4 AVC but with two improvements.
The first improvement is that HEVC uses index information to select one of several available candidates. The second improvement is that HEVC uses information from the reference picture list and reference picture index.
Inverse Transforms
HEVC specifies four transform unit (TU) sizes of 4×4, 8×8, 16×16, and 32×32 to code the prediction residual. A CTB may be recursively partitioned into 4 or more TUs.
TUs use integer basis functions that are similar to the discrete cosine transform (DCT). In addition, 4×4 luma transform blocks that belong to an intra coded region are transformed using an integer transform that is derived from the discrete sine transform (DST).
This provides a 1% bit rate reduction but was restricted to 4×4 luma transform blocks due to marginal benefits for the other transform cases.
Chroma uses the same TU sizes as luma so there is no 2×2 transform for chroma.
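The recursive partitioning into TUs can be sketched as a quad-tree walk. In the example below, `should_split` is a hypothetical callback standing in for the encoder's decision, which a real encoder would normally make by comparing rate-distortion costs. Blocks larger than 32×32 are always split because no larger transform size exists.

```python
def split_into_tus(x, y, size, should_split, min_size=4, max_size=32):
    """Return a list of (x, y, size) transform blocks covering a square block."""
    if size > max_size or (size > min_size and should_split(x, y, size)):
        half = size // 2
        tus = []
        for dy in (0, half):
            for dx in (0, half):
                tus += split_into_tus(x + dx, y + dy, half,
                                      should_split, min_size, max_size)
        return tus
    return [(x, y, size)]

# Hypothetical decision: keep splitting only the block at the top-left corner
tus = split_into_tus(0, 0, 32, lambda x, y, size: (x, y) == (0, 0))
print(len(tus), sorted({size for _, _, size in tus}))
```

This reproduces the kind of mixture mentioned earlier for a 32×32 block: four 4×4, three 8×8, and three 16×16 transforms that together cover all 1024 samples.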
Loop Filters
HEVC specifies two loop filters that are applied sequentially, with the deblocking filter (DBF) applied first and the sample adaptive offset (SAO) filter applied afterwards.
Both loop filters are applied in the inter-picture prediction loop, i.e. the filtered image is stored in the decoded picture buffer (DPB) as a reference for inter-picture prediction.
Deblocking Filter
The DBF is similar to the one used by H.264/MPEG-4 AVC but with a simpler design and better support for parallel processing. In HEVC the DBF only applies to an 8×8 sample grid while with H.264/MPEG-4 AVC the DBF applies to a 4×4 sample grid.
DBF uses an 8×8 sample grid since it causes no noticeable degradation and significantly improves parallel processing because the DBF no longer causes cascading interactions with other operations.
Another change is that HEVC only allows for three DBF strengths of 0 to 2. HEVC also requires that the DBF first apply horizontal filtering for vertical edges to the picture and only after that does it apply vertical filtering for horizontal edges to the picture. This allows for multiple parallel threads to be used for the DBF.
Sample Adaptive Offset
The SAO filter is applied after the DBF and is designed to allow for better reconstruction of the original signal amplitudes by applying offsets stored in a lookup table in the bitstream.
Per CTB the SAO filter can be disabled or applied in one of two modes: edge offset mode or band offset mode.
The edge offset mode operates by comparing the value of a sample to two of its eight neighbours using one of four directional gradient patterns.
Based on a comparison with these two neighbours, the sample is classified into one of five categories: minimum, maximum, an edge with the sample having the lower value, an edge with the sample having the higher value, or monotonic.
For each of the first four categories, an offset is applied.
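The edge offset classification can be sketched as a comparison of a sample `c` with its two neighbours `a` and `b` along the chosen direction. The category names and the offset dictionary below are illustrative, and the sketch does not clip the result to the valid sample range.

```python
def edge_category(a, c, b):
    """Classify sample c against its two neighbours a and b along one direction."""
    if c < a and c < b:
        return "minimum"
    if c > a and c > b:
        return "maximum"
    if (c < a and c == b) or (c == a and c < b):
        return "edge_lower"    # edge with the sample having the lower value
    if (c > a and c == b) or (c == a and c > b):
        return "edge_higher"   # edge with the sample having the higher value
    return "monotonic"         # also covers flat areas; no offset is applied

def apply_edge_offset(a, c, b, offsets):
    """Add the offset for the sample's category, if one is defined."""
    return c + offsets.get(edge_category(a, c, b), 0)

print(edge_category(5, 3, 5), apply_edge_offset(5, 3, 5, {"minimum": 2}))
```

A positive offset for a local minimum and a negative one for a local maximum pull outliers toward their neighbours, which is how this mode smooths ringing around edges.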
The band offset mode applies an offset based on the amplitude of a single sample.
A sample is categorized by its amplitude into one of 32 bands (histogram bins).
Offsets are specified for four consecutive of the 32 bands because, in flat areas which are prone to banding artefacts, sample amplitudes tend to be clustered in a small range. The SAO filter was designed to increase picture quality, reduce banding artefacts, and reduce ringing artefacts.
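The band offset mode can be sketched in the same spirit. The example assumes that the 32 bands have equal width, so the band index is simply the top five bits of the sample. `start_band` and `offsets` are illustrative names for the signalled band position and the four offsets.

```python
def band_index(sample, bit_depth=8):
    """Map a sample to one of 32 equal-width bands (assumed layout)."""
    return sample >> (bit_depth - 5)

def apply_band_offset(sample, start_band, offsets, bit_depth=8):
    """Apply one of four offsets if the sample falls in the four signalled bands."""
    if len(offsets) != 4:
        raise ValueError("exactly four offsets are expected")
    k = band_index(sample, bit_depth) - start_band
    if 0 <= k < 4:
        sample += offsets[k]
    # Keep the result inside the valid sample range
    return max(0, min(sample, (1 << bit_depth) - 1))

print(band_index(66), apply_band_offset(66, 8, [1, 2, 3, 4]))
```

For 8-bit video each band spans eight sample values, so four consecutive bands cover a 32-value window, enough for the narrow amplitude range typical of flat areas.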
Range Extensions
Additional coding tool options have been added in the range extensions. This includes new definitions of profiles and levels:
Profiles supporting bit depths beyond 10 bits per sample. Profiles that support a range of bit depths can use different bit depths for luma and chroma with YCbCr colour spaces.
Profiles that support 4:0:0 (monochrome), 4:2:2 (half-horizontal chroma resolution), and 4:4:4 (full chroma resolution) chroma sampling.
Additional profiles supporting only all-intra coding and only still-picture coding for applications that do not need inter-picture (temporal) prediction.
The Still Picture profiles can use an unbounded level, level 8.5, for which no limit is imposed on the picture size. Decoders for level 8.5 are not required to decode all level 8.5 bitstreams, since some may exceed their picture size capability.
Within these new profiles are enhanced coding features that include:
High precision weighted prediction uses an increased precision for a weighted prediction that increases the coding efficiency for fading video scenes at high bit depths.
Cross-component prediction, using prediction between the chroma/luma components to improve coding efficiency. The reduction in bit rate can be up to 7% for YCbCr 4:4:4 video and up to 26% for RGB video.
RGB video has a larger reduction in bit rate due to the greater correlation between the components.
Intra smoothing disabling, allowing the neighbour region filtering process ordinarily applied in intra prediction to be disabled.
Persistent Rice adaptation, using a Rice coding parameter derivation for entropy coding that has a memory that persists across transform coefficient sub-block boundaries.
Modifications of transform skip mode processing:
Residual DPCM (RDPCM), allowing a vertical or horizontal spatial-predictive coding of residual data in transform skip and transform-quantization bypass blocks (which can be selected for use in intra blocks, inter blocks, or both).
Transform skip block size flexibility, supporting block sizes up to 32×32 (versus only 4×4 support in version 1).
Transform skip rotation, allowing the encoder to indicate a rotation of residual data for 4×4 transform skip blocks.
Transform skip context enabling, using a separate context for entropy coding the indication of which blocks are coded using transform skipping.
Extended precision processing, using an extended dynamic range for inter prediction interpolation and inverse transform.
CABAC bypass alignment, allowing the data to be aligned to a byte boundary before bypass decoding, which is supported in the High Throughput 4:4:4 16 Intra profile.
The second version of HEVC adds several supplemental enhancement information (SEI) messages which include:
Colour remapping information SEI message provides information on remapping from one colour space to a different colour space.
An example would be to preserve the artistic intent when converting wide colour gamut (WCG) video from the Rec. 2020 colour space for output on a Rec. 709 display.
The colour remapping information SEI message was proposed for future UHDTV applications. Multiple colour remapping processes can be supported for different display scenarios.
Knee function information SEI message provides information on how to convert from one dynamic range to a different dynamic range.
An example would be to compress the upper range of high dynamic range (HDR) video that has a luminance level of 800 cd/m² for output on a 100 cd/m² display.
Multiple knee function processes can be supported for different display scenarios.
Mastering display colour volume SEI message provides information on the colour primaries and dynamic range of the display that was used to author the video.
Time code SEI message provides information on the time of origin when the video was recorded.
Screen Content Coding Extensions
Additional coding tool options have been added in the June 2015 draft of the screen content coding (SCC) extensions:
Adaptive colour transform.
Adaptive motion vector resolution.
Intra block copying.
Palette mode.
SCC extensions also added support for Hybrid Log-Gamma which is an HDR standard that was jointly developed by the BBC and NHK.
Profiles
Version 1 of the HEVC standard defines three profiles: Main, Main 10, and Main Still Picture.[6] Version 2 of HEVC adds 21 range extensions profiles, two scalable extensions profiles, and one multi-view profile. HEVC also contains provisions for additional profiles. Extensions that were added to HEVC include increased bit depth, 4:2:2/4:4:4 chroma sampling, Multiview Video Coding (MVC), and Scalable Video Coding (SVC).
The HEVC range extensions, HEVC scalable extensions, and HEVC multi-view extensions were completed in July 2014.
In July 2014 a draft of the second version of HEVC was released. Screen content coding (SCC) extensions are under development for screen content video, which contains text and graphics, with an expected final draft release date of 2015.
A profile is a defined set of coding tools that can be used to create a bitstream that conforms to that profile. An encoder for a profile may choose which coding tools to use as long as it generates a conforming bitstream while a decoder for a profile must support all coding tools that can be used in that profile.
Feature support by profile (Main and Main 10 are Version 1 profiles; the remaining profiles were added in Version 2)
Feature | Main | Main 10 | Main 12 | Main 4:2:2 10 | Main 4:2:2 12 | Main 4:4:4 | Main 4:4:4 10 | Main 4:4:4 12 | Main 4:4:4 16 Intra |
Bit Depth | 8 | 8 to 10 | 8 to 12 | 8 to 10 | 8 to 12 | 8 | 8 to 10 | 8 to 12 | 8 to 16 |
Chroma sampling formats | 4:2:0 | 4:2:0 | 4:2:0 | 4:2:0/4:2:2 | 4:2:0/4:2:2 | 4:2:0/4:2:2/4:4:4 | 4:2:0/4:2:2/4:4:4 | 4:2:0/4:2:2/4:4:4 | 4:2:0/4:2:2/4:4:4 |
4:0:0 (Monochrome) | No | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
High precision weighted prediction | No | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Chroma QP offset list | No | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Cross-component prediction | No | No | No | No | No | Yes | Yes | Yes | Yes |
Intra smoothing disabling | No | No | No | No | No | Yes | Yes | Yes | Yes |
Persistent Rice adaptation | No | No | No | No | No | Yes | Yes | Yes | Yes |
RDPCM implicit/explicit | No | No | No | No | No | Yes | Yes | Yes | Yes |
Transform skip block sizes larger than 4×4 | No | No | No | No | No | Yes | Yes | Yes | Yes |
Transform skip context/rotation | No | No | No | No | No | Yes | Yes | Yes | Yes |
Extended precision processing | No | No | No | No | No | No | No | No | Yes |
Version 1 profiles
Main
The Main profile allows for a bit depth of 8 bits per sample with 4:2:0 chroma sampling, which is the most common type of video used with consumer devices.
Main 10
The Main 10 profile allows for a bit depth of 8 to 10 bits per sample with 4:2:0 chroma sampling. HEVC decoders that conform to the Main 10 profile must be capable of decoding bitstreams made with the following profiles: Main and Main 10.
A higher bit depth allows for a greater number of colours. 8 bits per sample allows for 256 shades per primary colour (a total of 16.78 million colours) while 10 bits per sample allows for 1024 shades per primary colour (a total of 1.07 billion colours). A higher bit depth allows for a smoother transition of colour which resolves the problem known as colour banding.
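The colour counts quoted above follow from the bit depth, assuming three colour components that each use the full range. A minimal sketch of the arithmetic, with an illustrative function name:

```python
def colour_counts(bits_per_sample, components=3):
    """Return (shades per component, total representable colours)."""
    shades = 1 << bits_per_sample
    return shades, shades ** components

print(colour_counts(8))
print(colour_counts(10))
```

Each extra bit doubles the shades per component, so moving from 8 to 10 bits multiplies the total number of colours by 64.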
The Main 10 profile allows for improved video quality since it can support video with a higher bit depth than what is supported by the Main profile. Additionally, in the Main 10 profile, 8-bit video can be coded with a higher bit depth of 10 bits, which allows improved coding efficiency compared to the Main profile.
Ericsson has stated that the Main 10 profile will bring the benefits of 10-bit-per-sample video to consumer TV. They also state that for higher resolutions there is no bit rate penalty for encoding video at 10 bits per sample. Imagination Technologies states that 10-bit-per-sample video will allow for larger colour spaces and is required for the Rec. 2020 colour space that will be used by UHDTV. They also state that the Rec. 2020 colour space will drive the widespread adoption of 10-bit-per-sample video.
In a PSNR based performance comparison released in April 2013, the Main 10 profile was compared to the Main profile using a set of 3840×2160 10-bit video sequences.
The 10-bit video sequences were converted to 8 bits for the Main profile and remained at 10 bits for the Main 10 profile. The reference PSNR was based on the original 10-bit video sequences.
In the performance comparison, the Main 10 profile provided a 5% bit rate reduction for interframe video coding compared to the Main profile. The performance comparison states that for the tested video sequences the Main 10 profile outperformed the Main profile.
The Main 10 profile was added at the October 2012 HEVC meeting based on proposal JCTVC-K0109, which proposed that a 10-bit profile be added to HEVC for consumer applications.
The proposal stated that this was to allow for improved video quality, to support the Rec. 2020 colour space that has become widely used in UHDTV systems, and to deliver higher dynamic range and colour fidelity while avoiding banding artefacts. A variety of companies supported the proposal, including ATEME, BBC, BSkyB, Cisco, DirecTV, Ericsson, Motorola Mobility, NGCodec, NHK, RAI, ST, SVT, Thomson Video Networks, Technicolor, and ViXS Systems.
Main Still Picture
The Main Still Picture profile allows for a single still picture to be encoded with the same constraints as the Main profile. As a subset of the Main profile, the Main Still Picture profile allows for a bit depth of 8 bits per sample with 4:2:0 chroma sampling.
An objective performance comparison was done in April 2012 in which HEVC reduced the average bit rate for images by 56% compared to JPEG. A PSNR based performance comparison for still image compression was done in May 2012 using the HEVC HM 6.0 encoder and the reference software encoders for the other standards.
For still images HEVC reduced the average bit rate by 15.8% compared to H.264/MPEG-4 AVC, 22.6% compared to JPEG 2000, 30.0% compared to JPEG XR, 31.0% compared to WebP, and 43.0% compared to JPEG.
A performance comparison for still image compression was done in January 2013 using the HEVC HM 8.0rc2 encoder, Kakadu version 6.0 for JPEG 2000, and IJG version 6b for JPEG.
The performance comparison used PSNR for the objective assessment and mean opinion score (MOS) values for the subjective assessment.
The subjective assessment used the same test methodology and images as those used by the JPEG committee when it evaluated JPEG XR. For 4:2:0 chroma sampled images the average bit rate reduction for HEVC compared to JPEG 2000 was 20.26% for PSNR and 30.96% for MOS while compared to JPEG it was 61.63% for PSNR and 43.10% for MOS.
Comparison of standards for still image compression based on equal PSNR and MOS
Still image coding standard (test method) | Average bitrate reduction compared to JPEG 2000 | Average bitrate reduction compared to JPEG |
HEVC (PSNR) | 20.26% | 61.63% |
HEVC (MOS) | 30.96% | 43.10% |
A PSNR-based HEVC performance comparison for still image compression was done in April 2013 by Nokia. HEVC showed a larger performance improvement for higher resolution images than for lower resolution images, and a larger improvement at lower bit rates than at higher bit rates. For lossy compression, reaching the same PSNR as HEVC took on average 1.4× more bits with JPEG 2000, 1.6× more bits with JPEG XR, and 2.3× more bits with JPEG.
A compression efficiency study of HEVC, JPEG, JPEG XR, and WebP was done in October 2013 by Mozilla. The study showed that HEVC was significantly better at compression than the other image formats that were tested. Four different methods for comparing image quality were used in the study: Y-SSIM, RGB-SSIM, IW-SSIM, and PSNR-HVS-M.
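The bit multipliers in the Nokia comparison can be put on the same footing as the percentage reductions quoted earlier. The sketch below assumes that "1.4× more bits" means the other codec needs 1.4 times as many bits as HEVC; that reading, and the helper name `bitrate_reduction`, are illustrative assumptions rather than anything stated by the study.

```python
def bitrate_reduction(bit_ratio: float) -> float:
    """Fraction of bits HEVC saves when the other codec needs
    `bit_ratio` times as many bits for the same PSNR (assumed reading)."""
    if bit_ratio <= 0:
        raise ValueError("bit_ratio must be positive")
    return 1.0 - 1.0 / bit_ratio


# Multipliers quoted in the text for the April 2013 comparison.
ratios = {"JPEG 2000": 1.4, "JPEG XR": 1.6, "JPEG": 2.3}

# Percentage of bits saved by HEVC, rounded to one decimal place.
reductions = {name: round(100 * bitrate_reduction(r), 1) for name, r in ratios.items()}

for name, pct in reductions.items():
    print(f"HEVC vs {name}: about {pct}% fewer bits")
```

Under that reading the three multipliers correspond to savings of roughly 28.6%, 37.5%, and 56.5%.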
Version 2 profiles
Version 2 of HEVC adds 21 range extensions profiles, two scalable extensions profiles, and one multi-view profile: Monochrome, Monochrome 12, Monochrome 16, Main 12, Main 4:2:2 10, Main 4:2:2 12, Main 4:4:4, Main 4:4:4 10, Main 4:4:4 12, Monochrome 12 Intra, Monochrome 16 Intra, Main 12 Intra, Main 4:2:2 10 Intra, Main 4:2:2 12 Intra, Main 4:4:4 Intra, Main 4:4:4 10 Intra, Main 4:4:4 12 Intra, Main 4:4:4 16 Intra, Main 4:4:4 Still Picture, Main 4:4:4 16 Still Picture, High Throughput 4:4:4 16 Intra, Scalable Main, Scalable Main 10, and Multiview Main.
All of the interframe range extensions profiles have an Intra profile.
Monochrome
The Monochrome profile allows for a bit depth of 8-bits per sample with support for 4:0:0 chroma sampling.
Monochrome 12
The Monochrome 12 profile allows for a bit depth of 8-bits to 12-bits per sample with support for 4:0:0 chroma sampling.
Monochrome 16
The Monochrome 16 profile allows for a bit depth of 8-bits to 16-bits per sample with support for 4:0:0 chroma sampling. HEVC decoders that conform to the Monochrome 16 profile must be capable of decoding bitstreams made with the following profiles: Monochrome, Monochrome 12, and Monochrome 16.
Main 12
The Main 12 profile allows for a bit depth of 8-bits to 12-bits per sample with support for 4:0:0 and 4:2:0 chroma sampling. HEVC decoders that conform to the Main 12 profile must be capable of decoding bitstreams made with the following profiles: Monochrome, Monochrome 12, Main, Main 10, and Main 12.
Main 4:2:2 10
The Main 4:2:2 10 profile allows for a bit depth of 8-bits to 10-bits per sample with support for 4:0:0, 4:2:0, and 4:2:2 chroma sampling. HEVC decoders that conform to the Main 4:2:2 10 profile must be capable of decoding bitstreams made with the following profiles: Monochrome, Main, Main 10, and Main 4:2:2 10.
Main 4:2:2 12
The Main 4:2:2 12 profile allows for a bit depth of 8-bits to 12-bits per sample with support for 4:0:0, 4:2:0, and 4:2:2 chroma sampling. HEVC decoders that conform to the Main 4:2:2 12 profile must be capable of decoding bitstreams made with the following profiles: Monochrome, Monochrome 12, Main, Main 10, Main 12, Main 4:2:2 10, and Main 4:2:2 12.
Main 4:4:4
The Main 4:4:4 profile allows for a bit depth of 8-bits per sample with support for 4:0:0, 4:2:0, 4:2:2, and 4:4:4 chroma sampling. HEVC decoders that conform to the Main 4:4:4 profile must be capable of decoding bitstreams made with the following profiles: Monochrome, Main, and Main 4:4:4.
Main 4:4:4 10
The Main 4:4:4 10 profile allows for a bit depth of 8-bits to 10-bits per sample with support for 4:0:0, 4:2:0, 4:2:2, and 4:4:4 chroma sampling. HEVC decoders that conform to the Main 4:4:4 10 profile must be capable of decoding bitstreams made with the following profiles: Monochrome, Main, Main 10, Main 4:2:2 10, Main 4:4:4, and Main 4:4:4 10.
Main 4:4:4 12
The Main 4:4:4 12 profile allows for a bit depth of 8-bits to 12-bits per sample with support for 4:0:0, 4:2:0, 4:2:2, and 4:4:4 chroma sampling. HEVC decoders that conform to the Main 4:4:4 12 profile must be capable of decoding bitstreams made with the following profiles: Monochrome, Monochrome 12, Main, Main 10, Main 12, Main 4:2:2 10, Main 4:2:2 12, Main 4:4:4, Main 4:4:4 10, and Main 4:4:4 12.
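The decoding requirements listed for these interframe profiles follow a simple pattern: a decoder can handle any profile whose maximum bit depth and chroma formats it already covers. The sketch below encodes that pattern in Python. The `Profile` class, the `PROFILES` table, and `decodable_profiles` are hypothetical names, and the subset rule is an illustration that happens to reproduce the lists above, not the normative conformance rule of the standard. The Main 10 entry (10-bit, 4:2:0) is added from general knowledge of the version 1 profiles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    max_bit_depth: int
    chroma_formats: frozenset


def _p(depth, *chroma):
    return Profile(depth, frozenset(chroma))


# Bit depth and chroma sampling as described in the text (illustrative table).
PROFILES = {
    "Main": _p(8, "4:2:0"),
    "Main 10": _p(10, "4:2:0"),
    "Monochrome": _p(8, "4:0:0"),
    "Monochrome 12": _p(12, "4:0:0"),
    "Monochrome 16": _p(16, "4:0:0"),
    "Main 12": _p(12, "4:0:0", "4:2:0"),
    "Main 4:2:2 10": _p(10, "4:0:0", "4:2:0", "4:2:2"),
    "Main 4:2:2 12": _p(12, "4:0:0", "4:2:0", "4:2:2"),
    "Main 4:4:4": _p(8, "4:0:0", "4:2:0", "4:2:2", "4:4:4"),
    "Main 4:4:4 10": _p(10, "4:0:0", "4:2:0", "4:2:2", "4:4:4"),
    "Main 4:4:4 12": _p(12, "4:0:0", "4:2:0", "4:2:2", "4:4:4"),
}


def decodable_profiles(decoder_profile):
    """Profiles whose bit depth and chroma formats are covered by the decoder.

    Assumption: capability is fully described by (max bit depth, chroma set).
    """
    d = PROFILES[decoder_profile]
    return sorted(
        name
        for name, p in PROFILES.items()
        if p.max_bit_depth <= d.max_bit_depth
        and p.chroma_formats <= d.chroma_formats
    )


print(decodable_profiles("Main 12"))
print(decodable_profiles("Main 4:4:4"))
```

This simplification does not extend to the Intra, High Throughput, or Screen-Extended profiles, whose requirements depend on more than bit depth and chroma format.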
Main 4:4:4 16 Intra
The Main 4:4:4 16 Intra profile allows for a bit depth of 8-bits to 16-bits per sample with support for 4:0:0, 4:2:0, 4:2:2, and 4:4:4 chroma sampling. HEVC decoders that conform to the Main 4:4:4 16 Intra profile must be capable of decoding bitstreams made with the following profiles: Monochrome Intra, Monochrome 12 Intra, Monochrome 16 Intra, Main Intra, Main 10 Intra, Main 12 Intra, Main 4:2:2 10 Intra, Main 4:2:2 12 Intra, Main 4:4:4 Intra, Main 4:4:4 10 Intra, and Main 4:4:4 12 Intra.
High Throughput 4:4:4 16 Intra
The High Throughput 4:4:4 16 Intra profile allows for a bit depth of 8-bits to 16-bits per sample with support for 4:0:0, 4:2:0, 4:2:2, and 4:4:4 chroma sampling. The High Throughput 4:4:4 16 Intra profile has an HbrFactor 12 times higher than other HEVC profiles allowing it to have a maximum bit rate 12 times higher than the Main 4:4:4 16 Intra profile.
The High Throughput 4:4:4 16 Intra profile is designed for high-end professional content creation and decoders for this profile are not required to support other profiles.
Main 4:4:4 Still Picture
The Main 4:4:4 Still Picture profile allows for a single still picture to be encoded with the same constraints as the Main 4:4:4 profile. As a subset of the Main 4:4:4 profile the Main 4:4:4 Still Picture profile allows for a bit depth of 8-bits per sample with support for 4:0:0, 4:2:0, 4:2:2, and 4:4:4 chroma sampling.
Main 4:4:4 16 Still Picture
The Main 4:4:4 16 Still Picture profile allows for a single still picture to be encoded with the same constraints as the Main 4:4:4 16 Intra profile. As a subset of the Main 4:4:4 16 Intra profile the Main 4:4:4 16 Still Picture profile allows for a bit depth of 8-bits to 16-bits per sample with support for 4:0:0, 4:2:0, 4:2:2, and 4:4:4 chroma sampling.
Scalable Main
The Scalable Main profile allows for a base layer that conforms to the Main profile of HEVC.
Scalable Main 10
The Scalable Main 10 profile allows for a base layer that conforms to the Main 10 profile of HEVC.
Multiview Main
The Multiview Main profile allows for a base layer that conforms to the Main profile of HEVC.
Version 3 and higher profiles
Version 3 of HEVC added one 3D profile: 3D Main. The February 2016 draft of the screen content coding extensions added seven screen content coding extensions profiles, three high throughput extensions profiles, and four scalable extensions profiles: Screen-Extended Main, Screen-Extended Main 10, Screen-Extended Main 4:4:4, Screen-Extended Main 4:4:4 10, Screen-Extended High Throughput 4:4:4, Screen-Extended High Throughput 4:4:4 10, Screen-Extended High Throughput 4:4:4 14, High Throughput 4:4:4, High Throughput 4:4:4 10, High Throughput 4:4:4 14, Scalable Monochrome, Scalable Monochrome 12, Scalable Monochrome 16, and Scalable Main 4:4:4.
3D Main
The 3D Main profile allows for a base layer that conforms to the Main profile of HEVC.
Screen-Extended Main
The Screen-Extended Main profile allows for a bit depth of 8-bits per sample with support for 4:0:0 and 4:2:0 chroma sampling. HEVC decoders that conform to the Screen-Extended Main profile must be capable of decoding bitstreams made with the following profiles: Monochrome, Main, and Screen-Extended Main.
Screen-Extended Main 10
The Screen-Extended Main 10 profile allows for a bit depth of 8-bits to 10-bits per sample with support for 4:0:0 and 4:2:0 chroma sampling. HEVC decoders that conform to the Screen-Extended Main 10 profile must be capable of decoding bitstreams made with the following profiles: Monochrome, Main, Main 10, Screen-Extended Main, and Screen-Extended Main 10.
Screen-Extended Main 4:4:4
The Screen-Extended Main 4:4:4 profile allows for a bit depth of 8-bits per sample with support for 4:0:0, 4:2:0, 4:2:2, and 4:4:4 chroma sampling. HEVC decoders that conform to the Screen-Extended Main 4:4:4 profile must be capable of decoding bitstreams made with the following profiles: Monochrome, Main, Main 4:4:4, Screen-Extended Main, and Screen-Extended Main 4:4:4.
Screen-Extended Main 4:4:4 10
The Screen-Extended Main 4:4:4 10 profile allows for a bit depth of 8-bits to 10-bits per sample with support for 4:0:0, 4:2:0, 4:2:2, and 4:4:4 chroma sampling. HEVC decoders that conform to the Screen-Extended Main 4:4:4 10 profile must be capable of decoding bitstreams made with the following profiles: Monochrome, Main, Main 10, Main 4:2:2 10, Main 4:4:4, Main 4:4:4 10, Screen-Extended Main, Screen-Extended Main 10, Screen-Extended Main 4:4:4, and Screen-Extended Main 4:4:4 10.
Screen-Extended High Throughput 4:4:4
The Screen-Extended High Throughput 4:4:4 profile allows for a bit depth of 8-bits per sample with support for 4:0:0, 4:2:0, 4:2:2, and 4:4:4 chroma sampling. The Screen-Extended High Throughput 4:4:4 profile has an HbrFactor 6 times higher than most interframe HEVC profiles allowing it to have a maximum bit rate 6 times higher than the Main 4:4:4 profile. HEVC decoders that conform to the Screen-Extended High Throughput 4:4:4 profile must be capable of decoding bitstreams made with the following profiles: Monochrome, Main, Main 4:4:4, Screen-Extended Main, Screen-Extended Main 4:4:4, Screen-Extended High Throughput 4:4:4, and High Throughput 4:4:4.
Screen-Extended High Throughput 4:4:4 10
The Screen-Extended High Throughput 4:4:4 10 profile allows for a bit depth of 8-bits to 10-bits per sample with support for 4:0:0, 4:2:0, 4:2:2, and 4:4:4 chroma sampling.
The Screen-Extended High Throughput 4:4:4 10 profile has an HbrFactor 6 times higher than most interframe HEVC profiles allowing it to have a maximum bit rate 6 times higher than the Main 4:4:4 10 profile. HEVC decoders that conform to the Screen-Extended High Throughput 4:4:4 10 profile must be capable of decoding bitstreams made with the following profiles: Monochrome, Main, Main 10, Main 4:2:2 10, Main 4:4:4, Main 4:4:4 10, Screen-Extended Main, Screen-Extended Main 10, Screen-Extended Main 4:4:4, Screen-Extended Main 4:4:4 10, Screen-Extended High Throughput 4:4:4, Screen-Extended High Throughput 4:4:4 10, High Throughput 4:4:4, and High Throughput 4:4:4 10.
Screen-Extended High Throughput 4:4:4 14
The Screen-Extended High Throughput 4:4:4 14 profile allows for a bit depth of 8-bits to 14-bits per sample with support for 4:0:0, 4:2:0, 4:2:2, and 4:4:4 chroma sampling.
The Screen-Extended High Throughput 4:4:4 14 profile has an HbrFactor 6 times higher than most interframe HEVC profiles. HEVC decoders that conform to the Screen-Extended High Throughput 4:4:4 14 profile must be capable of decoding bitstreams made with the following profiles: Monochrome, Main, Main 10, Main 4:2:2 10, Main 4:4:4, Main 4:4:4 10, Screen-Extended Main, Screen-Extended Main 10, Screen-Extended Main 4:4:4, Screen-Extended Main 4:4:4 10, Screen-Extended High Throughput 4:4:4, Screen-Extended High Throughput 4:4:4 10, Screen-Extended High Throughput 4:4:4 14, High Throughput 4:4:4, High Throughput 4:4:4 10, and High Throughput 4:4:4 14.
High Throughput 4:4:4
The High Throughput 4:4:4 profile allows for a bit depth of 8-bits per sample with support for 4:0:0, 4:2:0, 4:2:2, and 4:4:4 chroma sampling. The High Throughput 4:4:4 profile has an HbrFactor 6 times higher than most interframe HEVC profiles allowing it to have a maximum bit rate 6 times higher than the Main 4:4:4 profile. HEVC decoders that conform to the High Throughput 4:4:4 profile must be capable of decoding bitstreams made with the following profiles: High Throughput 4:4:4.
High Throughput 4:4:4 10
The High Throughput 4:4:4 10 profile allows for a bit depth of 8-bits to 10-bits per sample with support for 4:0:0, 4:2:0, 4:2:2, and 4:4:4 chroma sampling.
The High Throughput 4:4:4 10 profile has an HbrFactor 6 times higher than most interframe HEVC profiles allowing it to have a maximum bit rate 6 times higher than the Main 4:4:4 10 profile. HEVC decoders that conform to the High Throughput 4:4:4 10 profile must be capable of decoding bitstreams made with the following profiles: High Throughput 4:4:4 and High Throughput 4:4:4 10.
High Throughput 4:4:4 14
The High Throughput 4:4:4 14 profile allows for a bit depth of 8-bits to 14-bits per sample with support for 4:0:0, 4:2:0, 4:2:2, and 4:4:4 chroma sampling.
The High Throughput 4:4:4 14 profile has an HbrFactor 6 times higher than most interframe HEVC profiles. HEVC decoders that conform to the High Throughput 4:4:4 14 profile must be capable of decoding bitstreams made with the following profiles: High Throughput 4:4:4, High Throughput 4:4:4 10, and High Throughput 4:4:4 14.
Scalable Monochrome
The Scalable Monochrome profile allows for a base layer that conforms to the Monochrome profile of HEVC.
Scalable Monochrome 12
The Scalable Monochrome 12 profile allows for a base layer that conforms to the Monochrome 12 profile of HEVC.
Scalable Monochrome 16
The Scalable Monochrome 16 profile allows for a base layer that conforms to the Monochrome 16 profile of HEVC.
Scalable Main 4:4:4
The Scalable Main 4:4:4 profile allows for a base layer that conforms to the Main 4:4:4 profile of HEVC.
Tiers & Levels
The HEVC standard defines two tiers, Main and High, and thirteen levels. A level is a set of constraints for a bitstream. For levels below level 4 only the Main tier is allowed.
The Main tier is a lower tier than the High tier. The tiers were made to deal with applications that differ in terms of their maximum bit rate. The Main tier was designed for most applications while the High tier was designed for very demanding applications. A decoder that conforms to a given tier/level is required to be capable of decoding all bitstreams that are encoded for that tier/level and for all lower tiers/levels.
Tiers and levels with maximum property values
Level | Max luma sample rate (samples/s) | Max luma picture size (samples) | Max bit rate for Main and Main 10 profiles, Main tier (kbit/s) [A] | Max bit rate for Main and Main 10 profiles, High tier (kbit/s) [A] | Example picture resolution @ highest frame rate [B] (MaxDpbSize [C]) |
1 | 552,960 | 36,864 | 128 | – | 176×144@15.0 (6) |
2 | 3,686,400 | 122,880 | 1,500 | – | 352×288@30.0 (6) |
2.1 | 7,372,800 | 245,760 | 3,000 | – | 352×288@60.0 (12) 640×360@30.0 (6) |
3 | 16,588,800 | 552,960 | 6,000 | – | 640×360@67.5 (12) 720×576@37.5 (8) 960×540@30.0 (6) |
3.1 | 33,177,600 | 983,040 | 10,000 | – | 720×576@75.0 (12) 960×540@60.0 (8) 1280×720@33.7 (6) |
4 | 66,846,720 | 2,228,224 | 12,000 | 30,000 | 1,280×720@68.0 (12) 1,920×1,080@32.0 (6) 2,048×1,080@30.0 (6) |
4.1 | 133,693,440 | 2,228,224 | 20,000 | 50,000 | 1,280×720@136.0 (12) 1,920×1,080@64.0 (6) 2,048×1,080@60.0 (6) |
5 | 267,386,880 | 8,912,896 | 25,000 | 100,000 | 1,920×1,080@128.0 (16) 3,840×2,160@32.0 (6) 4,096×2,160@30.0 (6) |
5.1 | 534,773,760 | 8,912,896 | 40,000 | 160,000 | 1,920×1,080@256.0 (16) 3,840×2,160@64.0 (6) 4,096×2,160@60.0 (6) |
5.2 | 1,069,547,520 | 8,912,896 | 60,000 | 240,000 | 1,920×1,080@300.0 (16) 3,840×2,160@128.0 (6) 4,096×2,160@120.0 (6) |
6 | 1,069,547,520 | 35,651,584 | 60,000 | 240,000 | 3,840×2,160@128.0 (16) 7,680×4,320@32.0 (6) 8,192×4,320@30.0 (6) |
6.1 | 2,139,095,040 | 35,651,584 | 120,000 | 480,000 | 3,840×2,160@256.0 (16) 7,680×4,320@64.0 (6) 8,192×4,320@60.0 (6) |
6.2 | 4,278,190,080 | 35,651,584 | 240,000 | 800,000 | 3,840×2,160@300.0 (16) 7,680×4,320@128.0 (6) 8,192×4,320@120.0 (6) |
[A] The maximum bit rate of the profile is based on the combination of bit depth, chroma sampling, and the type of profile. For bit depth the maximum bit rate increases by 1.5× for 12-bit profiles and 2× for 16-bit profiles. For chroma sampling the maximum bit rate increases by 1.5× for 4:2:2 profiles and 2× for 4:4:4 profiles. For the Intra profiles the maximum bit rate increases by 2×.
[B] The maximum frame rate supported by HEVC is 300 fps.
[C] The MaxDpbSize is the maximum number of pictures in the decoded picture buffer.
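The example frame rates in the table can be derived from the two luma limits of each level. The sketch below is a minimal Python illustration. It assumes that each picture dimension is rounded up to a multiple of 64 samples before the picture size is computed; that rounding is inferred from the numbers in the table (for example 1,920×1,080 is counted as 1,920×1,088), not quoted from the standard. The function name `max_frame_rate` and the `LEVELS` dictionary are hypothetical.

```python
MAX_FPS = 300.0  # maximum frame rate supported by HEVC, per the table footnote

# (max luma sample rate in samples/s, max luma picture size in samples)
LEVELS = {
    "1": (552_960, 36_864),
    "3.1": (33_177_600, 983_040),
    "4": (66_846_720, 2_228_224),
}


def round_up(value, multiple):
    """Round value up to the next multiple (integer arithmetic)."""
    return -(-value // multiple) * multiple


def max_frame_rate(width, height, max_luma_sample_rate, max_luma_picture_size, align=64):
    """Highest frame rate a level allows for a resolution, or None if too large.

    Assumption: dimensions are padded to a multiple of `align` samples.
    """
    samples = round_up(width, align) * round_up(height, align)
    if samples > max_luma_picture_size:
        return None  # picture does not fit in this level
    return min(MAX_FPS, max_luma_sample_rate / samples)


print(max_frame_rate(1920, 1080, *LEVELS["4"]))   # level 4, 1080p
print(max_frame_rate(1280, 720, *LEVELS["3.1"]))  # level 3.1, 720p
print(max_frame_rate(352, 288, *LEVELS["1"]))     # too large for level 1
```

With these inputs the function gives 32.0 fps for 1,920×1,080 at level 4 and 33.75 fps for 1,280×720 at level 3.1, in line with the 32.0 and 33.7 shown in the table.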
Decoded Picture Buffer
Previously decoded pictures are stored in a decoded picture buffer (DPB), and are used by HEVC encoders to form predictions for subsequent pictures.
The maximum number of pictures that can be stored in the DPB, called the DPB capacity, is 6 (including the current picture) for all HEVC levels when operating at the maximum picture size supported by the level. The DPB capacity (in units of pictures) increases from 6 to 8, 12, or 16 as the picture size decreases from the maximum picture size supported by the level.
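The way the DPB capacity grows as pictures get smaller can be sketched as a threshold rule. The version below assumes breakpoints at one quarter, one half, and three quarters of the level's maximum luma picture size. Those breakpoints are recalled from the level definitions of the standard rather than stated in this article, so treat them as an assumption; the function name `max_dpb_size` is hypothetical, and the picture size is taken as plain width × height.

```python
MAX_DPB_PIC_BUF = 6  # capacity at the level's maximum picture size


def max_dpb_size(width, height, max_luma_picture_size):
    """DPB capacity in pictures for a resolution at a given level.

    Assumed thresholds: 1/4, 1/2 and 3/4 of the level's max luma picture size.
    """
    pic_size = width * height
    if pic_size > max_luma_picture_size:
        raise ValueError("picture is larger than the level allows")
    if pic_size <= max_luma_picture_size // 4:
        return min(4 * MAX_DPB_PIC_BUF, 16)         # 16 pictures
    if pic_size <= max_luma_picture_size // 2:
        return min(2 * MAX_DPB_PIC_BUF, 16)         # 12 pictures
    if pic_size <= (3 * max_luma_picture_size) // 4:
        return min((4 * MAX_DPB_PIC_BUF) // 3, 16)  # 8 pictures
    return MAX_DPB_PIC_BUF                          # 6 pictures


# Level 4 has a max luma picture size of 2,228,224 samples.
print(max_dpb_size(1920, 1080, 2_228_224))
print(max_dpb_size(1280, 720, 2_228_224))
```

For level 4 this gives 6 pictures at 1,920×1,080 and 12 pictures at 1,280×720, matching the MaxDpbSize values shown for those entries in the levels table.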
The encoder selects which specific pictures are retained in the DPB on a picture-by-picture basis, so the encoder has the flexibility to determine for itself the best way to use the DPB capacity when encoding the video content.
Containers
MPEG has published an amendment which added HEVC support to the MPEG transport stream used by ATSC, DVB, and Blu-ray Disc; MPEG decided not to update the MPEG program stream used by DVD-Video. MPEG has also added HEVC support to the ISO base media file format.[127][128] HEVC is also supported by the MPEG media transport standard.
Support for HEVC was added to Matroska starting with the release of MKVToolNix v6.8.0 after a patch from DivX was merged.
A draft document has been submitted to the Internet Engineering Task Force which describes a method to add HEVC support to the Real-time Transport Protocol.
Using HEVC’s intraframe encoding, a still-image coded format called Better Portable Graphics (BPG) has been proposed by the programmer Fabrice Bellard. It is essentially a wrapper for images coded using the HEVC Main 4:4:4 16 Still Picture profile with up to 14 bits per sample, although it uses an abbreviated header syntax and adds explicit support for Exif, ICC profiles, and XMP metadata.