
H.262/MPEG-2 Part 2

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license. Give it a read and then ask your questions in the chat. We can research this topic together.

YCbCr, Y′CbCr, or Y Pb/Cb Pr/Cr, also written as YCBCR or Y′CBCR, is a family of color spaces used as a part of the color image pipeline in video and digital photography systems. Y′ is the luma component and CB and CR are the blue-difference and red-difference chroma components. Y′ (with prime) is distinguished from Y, which is luminance; the prime indicates that light intensity is nonlinearly encoded based on gamma-corrected RGB primaries.


90-569: H.262 or MPEG-2 Part 2 (formally known as ITU-T Recommendation H.262 and ISO/IEC 13818-2 , also known as MPEG-2 Video ) is a video coding format standardised and jointly maintained by ITU-T Study Group 16 Video Coding Experts Group (VCEG) and ISO / IEC Moving Picture Experts Group (MPEG), and developed with the involvement of many companies. It is the second part of the ISO/IEC MPEG-2 standard. The ITU-T Recommendation H.262 and ISO/IEC 13818-2 documents are identical. The standard

180-427: A video codec . Some video coding formats are documented by a detailed technical specification document known as a video coding specification . Some such specifications are written and approved by standardization organizations as technical standards , and are thus known as a video coding standard . There are de facto standards and formal standards. Video content encoded using a particular video coding format

270-531: A H.264 encoder/decoder a codec shortly thereafter ("open-source our H.264 codec"). A video coding format does not dictate all algorithms used by a codec implementing the format. For example, a large part of how video compression typically works is by finding similarities between video frames (block-matching) and then achieving compression by copying previously-coded similar subimages (such as macroblocks ) and adding small differences when necessary. Finding optimal combinations of such predictors and differences

360-406: A fast DCT algorithm with C.H. Smith and S.C. Fralick in 1977, and founded Compression Labs to commercialize DCT technology. In 1979, Anil K. Jain and Jaswant R. Jain further developed motion-compensated DCT video compression. This led to Chen developing a practical video compression algorithm, called motion-compensated DCT or adaptive scene coding, in 1981. Motion-compensated DCT later became

450-520: A given video coding format from/to uncompressed video are implementations of those specifications. As an analogy, the video coding format H.264 (specification) is to the codec OpenH264 (specific implementation) what the C Programming Language (specification) is to the compiler GCC (specific implementation). Note that for each specification (e.g., H.264 ), there can be many codecs implementing that specification (e.g., x264 , OpenH264, H.264/MPEG-4 AVC products and implementations ). This distinction

540-416: A lot more computing power than editing intraframe compressed video with the same picture quality. But this compression is not very effective for any audio format. A video coding format can define optional restrictions to encoded video, called profiles and levels. It is possible to have a decoder which only supports decoding a subset of profiles and levels of a given video format, for example to make

630-426: A matrix. The decoding matrix for BT.2020-NCL is this with 14 decimal places: The smaller values in the matrix are not rounded; they are precise values. For systems with limited precision (8 or 10 bit, for example) a lower precision of the above matrix could be used, for example, retaining only 6 digits after decimal point. The CL version, YcCbcCrc, codes: The CL function can be used when preservation of luminance

720-403: A modified Rec. 601 Y′CbCr where Y′, C B and C R have the full 8-bit range of [0...255]. Below are the conversion equations expressed to six decimal digits of precision. (For ideal equations, see ITU-T T.871. ) Note that for the following formulae, the range of each input (R,G,B) is also the full 8-bit range of [0...255]. And back: The above conversion is identical to sYCC when the input
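For reference, the full-swing conversion described above (the JFIF / ITU-T T.871 usage of JPEG) can be sketched in a few lines. The coefficient values are the standard ones quoted to six decimal digits; the function names are only illustrative.

```python
# Full-range (JFIF / ITU-T T.871) Y'CbCr conversion sketch.
# r, g, b and the returned components are floats in [0, 255];
# a real implementation would round and clamp to 8-bit integers.

def rgb_to_ycbcr_full(r, g, b):
    y  =         0.299    * r + 0.587    * g + 0.114    * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5      * b
    cr = 128.0 + 0.5      * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def ycbcr_to_rgb_full(y, cb, cr):
    r = y + 1.402    * (cr - 128.0)
    g = y - 0.344136 * (cb - 128.0) - 0.714136 * (cr - 128.0)
    b = y + 1.772    * (cb - 128.0)
    return r, g, b
```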

810-556: A much more efficient form of compression for video coding. The CCITT received 14 proposals for DCT-based video compression formats, in contrast to a single proposal based on vector quantization (VQ) compression. The H.261 standard was developed based on motion-compensated DCT compression. H.261 was the first practical video coding standard, and uses patents licensed from a number of companies, including Hitachi , PictureTel , NTT , BT , and Toshiba , among others. Since H.261, motion-compensated DCT compression has been adopted by all

900-407: A number of companies, primarily Mitsubishi, Hitachi and Panasonic . The most widely used video coding format as of 2019 is H.264/MPEG-4 AVC . It was developed in 2003, and uses patents licensed from a number of organizations, primarily Panasonic, Godo Kaisha IP Bridge and LG Electronics . In contrast to the standard DCT used by its predecessors, AVC uses the integer DCT . H.264 is one of

990-499: A patent lawsuit due to submarine patents . The motivation behind many recently designed video coding formats such as Theora , VP8 , and VP9 have been to create a ( libre ) video coding standard covered only by royalty-free patents. Patent status has also been a major point of contention for the choice of which video formats the mainstream web browsers will support inside the HTML video tag. The current-generation video coding format



1080-401: A result, a similar set of subsampling techniques can be used. These coefficients are not in use and were never in use. H.273 also describes constant and non-constant luminance systems which are derived strictly from primaries and white point, so that situations like sRGB/BT.709 default primaries of JPEG that use BT.601 matrix (that is derived from BT.470-6 System M) do not happen. Prior to

1170-438: A studio range of 16–240 for U and V. (These ranges are important in video editing and production, since using the wrong range will result either in an image with "clipped" blacks and whites, or a low-contrast image.) These matrices round all factors to the closest 1/256 unit. As a result, only one 16-bit intermediate value is formed for each component, and a simple right-shift with rounding (x + 128) >> 8 can take care of
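As a sketch of that fixed-point scheme, the coefficients can be pre-scaled to 1/256 units and applied with integer multiplies, finishing each component with the rounding right-shift (x + 128) >> 8. The rounded factors below are the commonly seen approximations for full-range RGB to studio-swing BT.601 Y′CbCr; they are illustrative rather than taken from any particular implementation.

```python
# Fixed-point sketch: full-range 8-bit RGB -> studio-swing BT.601 Y'CbCr.
# Each factor is the floating-point coefficient rounded to 1/256 units,
# so one (x + 128) >> 8 per component undoes the scaling with rounding.

def rgb_to_ycbcr_fixed(r, g, b):
    y  = (( 66 * r + 129 * g +  25 * b + 128) >> 8) + 16
    cb = ((-38 * r -  74 * g + 112 * b + 128) >> 8) + 128
    cr = ((112 * r -  94 * g -  18 * b + 128) >> 8) + 128
    return y, cb, cr
```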

1260-536: A trade-off to improved computation speeds. Y′ values are conventionally shifted and scaled to the range [16, 235] (referred to as studio swing or "TV levels") rather than using the full range of [0, 255] (referred to as full swing or "PC levels"). This practice was standardized in SMPTE-125M in order to accommodate signal overshoots ("ringing") due to filtering. U and V values, which may be positive or negative, are summed with 128 to make them always positive, giving

1350-454: Is HEVC (H.265), introduced in 2013. AVC uses the integer DCT with 4x4 and 8x8 block sizes, and HEVC uses integer DCT and DST transforms with varied block sizes between 4x4 and 32x32. HEVC is heavily patented, mostly by Samsung Electronics , GE , NTT , and JVCKenwood . It is challenged by the AV1 format, intended for free license. As of 2019 , AVC is by far the most commonly used format for

1440-402: Is a content representation format of digital video content, such as in a data file or bitstream . It typically uses a standardized video compression algorithm, most commonly based on discrete cosine transform (DCT) coding and motion compensation . A specific software, firmware , or hardware implementation capable of compression or decompression in a specific video coding format is called

1530-467: Is a complete frame. MPEG-2 supports both options. Digital television requires that these pictures be digitized so that they can be processed by computer hardware. Each picture element (a pixel ) is then represented by one luma number and two chroma numbers. These describe the brightness and the color of the pixel (see YCbCr ). Thus, each digitized picture is initially represented by three rectangular arrays of numbers. Another common practice to reduce

1620-412: Is a form of lossless video used in some circumstances, such as when sending video to a display over an HDMI connection. Some high-end cameras can also capture video directly in this format. Interframe compression complicates editing of an encoded video sequence. One subclass of relatively simple video coding formats is the intra-frame video formats, such as DV, in which each frame of the video stream

1710-440: Is a separately-compressed version of a single uncompressed (raw) frame. The coding of an I-frame takes advantage of spatial redundancy and of the inability of the eye to detect certain changes in the image. Unlike P-frames and B-frames, I-frames do not depend on data in the preceding or the following frames, and so their coding is very similar to how a still photograph would be coded (roughly similar to JPEG picture coding). Briefly,

1800-458: Is an NP-hard problem, meaning that it is practically impossible to find an optimal solution. Though the video coding format must support such compression across frames in the bitstream format, by not needlessly mandating specific algorithms for finding such block-matches and other encoding steps, the codecs implementing the video coding specification have some freedom to optimize and innovate in their choice of algorithms. For example, section 0.5 of

1890-470: Is available for a fee from the ITU-T and ISO. MPEG-2 Video is very similar to MPEG-1 , but also provides support for interlaced video (an encoding technique used in analog NTSC, PAL and SECAM television systems). MPEG-2 video is not optimized for low bit-rates (e.g., less than 1 Mbit/s), but somewhat outperforms MPEG-1 at higher bit rates (e.g., 3 Mbit/s and above), although not by a large margin unless



1980-518: Is compressed independently without referring to other frames in the stream, and no attempt is made to take advantage of correlations between successive pictures over time for better compression. One example is Motion JPEG , which is simply a sequence of individually JPEG -compressed images. This approach is quick and simple, at the expense of the encoded video being much larger than a video coding format supporting Inter frame coding. Because interframe compression copies data from one frame to another, if

2070-505: Is derived as follows: Digital Y′CbCr (8 bits per sample) is derived from analog R'G'B' as follows: or simply componentwise The resultant signals range from 16 to 235 for Y′ (Cb and Cr range from 16 to 240); the values from 0 to 15 are called footroom, while the values from 236 to 255 are called headroom. The same quantisation ranges (one for Y′, another for Cb and Cr) also apply to BT.2020 and BT.709. Alternatively, digital Y′CbCr can be derived from digital R'dG'dB'd (8 bits per sample, each using
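A floating-point sketch of this derivation, assuming gamma-corrected R′, G′, B′ in [0, 1] and producing the studio-swing ranges just described (Y′ in [16, 235], Cb and Cr in [16, 240]):

```python
# BT.601 analog R'G'B' (each 0..1) -> 8-bit studio-swing Y'CbCr sketch.
KR, KB = 0.299, 0.114            # BT.601 luma coefficients
KG = 1.0 - KR - KB

def rgb_to_ycbcr_bt601(rp, gp, bp):
    yp = KR * rp + KG * gp + KB * bp          # luma in 0..1
    pb = 0.5 * (bp - yp) / (1.0 - KB)         # chroma in -0.5..+0.5
    pr = 0.5 * (rp - yp) / (1.0 - KR)
    return 16 + 219 * yp, 128 + 224 * pb, 128 + 224 * pr
```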

2160-459: Is found. Then, the macroblock is treated like an I-frame macroblock. MPEG-2 video supports a wide range of applications from mobile to high quality HD editing. For many applications, it is unrealistic and too expensive to support the entire standard. To allow such applications to support only subsets of it, the standard defines profiles and levels. A profile defines sets of features such as B-pictures, 3D video, chroma format, etc. The level limits

2250-567: Is further defined in SMPTE ST 2084 and BT.2100. BT.2100 introduces the use of ICtCp, a semi-perceptual color space derived from linear RGB with good hue linearity. It is "near-constant luminance". The SMPTE 240M standard (used on the MUSE analog HD television system) defines YCC with these coefficients: The coefficients are derived from the SMPTE 170M primaries and white point, as used in the 240M standard. JFIF usage of JPEG supports

2340-515: Is given as sRGB, except that IEC 61966-2-1:1999/Amd1:2003 only gives four decimal digits. JPEG also defines a "YCCK" format from Adobe for CMYK input. In this format, the "K" value is passed as-is, while CMY are used to derive YCbCr with the above matrix by assuming R = 1 − C, G = 1 − M, and B = 1 − Y. As

2430-457: Is normally bundled with an audio stream (encoded using an audio coding format ) inside a multimedia container format such as AVI , MP4 , FLV , RealMedia , or Matroska . As such, the user normally does not have a H.264 file, but instead has a video file , which is an MP4 container of H.264-encoded video, normally alongside AAC -encoded audio. Multimedia container formats can contain one of several different video coding formats; for example,

2520-442: Is not consistently reflected terminologically in the literature. The H.264 specification calls H.261 , H.262 , H.263 , and H.264 video coding standards and does not contain the word codec . The Alliance for Open Media clearly distinguishes between the AV1 video coding format and the accompanying codec they are developing, but calls the video coding format itself a video codec specification . The VP9 specification calls

2610-455: Is not quite the same. Next, the quantized coefficient matrix is itself compressed. Typically, one corner of the 8×8 array of coefficients contains only zeros after quantization is applied. By starting in the opposite corner of the matrix, then zigzagging through the matrix to combine the coefficients into a string, then substituting run-length codes for consecutive zeros in that string, and then applying Huffman coding to that result, one reduces
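The scan-and-code step can be illustrated as follows; the zigzag order shown is the familiar JPEG-style scan starting at the low-frequency corner, and the (run, value) pairs stand in for the actual MPEG-2 variable-length code tables, which are not reproduced here.

```python
# Zigzag-scan an 8x8 coefficient block and run-length code the zeros.
# The resulting (zero_run, value) pairs would then be entropy coded
# (e.g. with Huffman-style variable-length codes).

def zigzag_indices(n=8):
    return sorted(((i, j) for i in range(n) for j in range(n)),
                  key=lambda p: (p[0] + p[1],
                                 p[0] if (p[0] + p[1]) % 2 else -p[0]))

def run_length(block):                      # block: 8x8 nested list of ints
    pairs, run = [], 0
    for i, j in zigzag_indices():
        if block[i][j] == 0:
            run += 1
        else:
            pairs.append((run, block[i][j]))
            run = 0
    pairs.append("EOB")                     # end-of-block marker
    return pairs
```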

2700-549: Is of primary importance (see: Chroma subsampling § Gamma luminance error ), or when "there is an expectation of improved coding efficiency for delivery." The specification refers to Report ITU-R BT.2246 on this matter. BT.2246 states that CL has improved compression efficiency and luminance preservation, but NCL will be more familiar to a staff that has previously handled color mixing and other production practices in HDTV YCbCr. BT.2020 does not define PQ and thus HDR, it

2790-547: Is that, with intraframe systems, each frame uses a similar amount of data. In most interframe systems, certain frames (such as I-frames in MPEG-2 ) are not allowed to copy data from other frames, so they require much more data than other frames nearby. It is possible to build a computer-based video editor that spots problems caused when I frames are edited out while other frames need them. This has allowed newer formats like HDV to be used for editing. However, this process demands


2880-440: Is used to separate out a luma signal (Y′) that can be stored with high resolution or transmitted at high bandwidth, and two chroma components (CB and CR) that can be bandwidth-reduced, subsampled, compressed, or otherwise treated separately for improved system efficiency. One practical example would be decreasing the bandwidth or resolution allocated to "color" compared to "black and white", since humans are more sensitive to

2970-399: The 4:4:4 sampling format). This stream of data must be compressed if digital TV is to fit in the bandwidth of available TV channels and if movies are to fit on DVDs. Video compression is practical because the data in pictures is often redundant in space and time. For example, the sky can be blue across the top of a picture and that blue sky can persist for frame after frame. Also, because of

3060-460: the main and high profiles but not in baseline. A level is a restriction on parameters such as maximum resolution and data rates. YCbCr: Y′CbCr color spaces are defined by a mathematical coordinate transformation from an associated set of RGB primaries and a white point. If the underlying RGB color space is absolute, the Y′CbCr color space is an absolute color space as well; conversely, if

3150-461: The temporal dimension . DCT coding is a lossy block compression transform coding technique that was first proposed by Nasir Ahmed , who initially intended it for image compression , while he was working at Kansas State University in 1972. It was then developed into a practical image compression algorithm by Ahmed with T. Natarajan and K. R. Rao at the University of Texas in 1973, and

3240-438: The temporal dimension . In 1967, University of London researchers A.H. Robinson and C. Cherry proposed run-length encoding (RLE), a lossless compression scheme, to reduce the transmission bandwidth of analog television signals. The earliest digital video coding algorithms were either for uncompressed video or used lossless compression , both methods inefficient and impractical for digital video coding. Digital video

3330-505: the BT.709 gamut. The form of Y′CbCr that was defined for standard-definition television use in the ITU-R BT.601 (formerly CCIR 601) standard for use with digital component video is derived from the corresponding RGB space (ITU-R BT.470-6 System M primaries) as follows: KB = 0.114 and KR = 0.299. From the above constants and formulas, the following can be derived for ITU-R BT.601. Analog YPbPr from analog R'G'B'

3420-478: The DCT and the fast Fourier transform (FFT), developing inter-frame hybrid coders for them, and found that the DCT is the most efficient due to its reduced complexity, capable of compressing image data down to 0.25- bit per pixel for a videotelephone scene with image quality comparable to a typical intra-frame coder requiring 2-bit per pixel. The DCT was applied to video encoding by Wen-Hsiung Chen, who developed

3510-538: The H.264 specification says that encoding algorithms are not part of the specification. Free choice of algorithm also allows different space–time complexity trade-offs for the same video coding format, so a live feed can use a fast but space-inefficient algorithm, and a one-time DVD encoding for later mass production can trade long encoding-time for space-efficient encoding. The concept of analog video compression dates back to 1929, when R.D. Kell in Britain proposed

3600-511: The MP4 container format can contain video coding formats such as MPEG-2 Part 2 or H.264. Another example is the initial specification for the file type WebM , which specifies the container format (Matroska), but also exactly which video ( VP8 ) and audio ( Vorbis ) compression format is inside the Matroska container, even though Matroska is capable of containing VP9 video, and Opus audio support

3690-507: the RGB space is ill-defined, so is Y′CbCr. The transformation is defined in equations 32 and 33 of ITU-T H.273. Nevertheless, that rule does not apply to the P3-D65 primaries used by Netflix with the BT.2020-NCL matrix, which means that matrix was not derived from the primaries; Netflix now allows BT.2020 primaries (since 2021). The same happens with JPEG: it uses the BT.601 matrix derived from System M primaries, yet


3780-468: The agreements on its requirements. The technology was developed with contributions from a number of companies. Hyundai Electronics (now SK Hynix ) developed the first MPEG-2 SAVI (System/Audio/Video) decoder in 1995. The majority of patents that were later asserted in a patent pool to be essential for implementing the standard came from three companies: Sony (311 patents), Thomson (198 patents) and Mitsubishi Electric (119 patents). In 1996, it

3870-487: The amount of data to be processed is to subsample the two chroma planes (after low-pass filtering to avoid aliasing ). This works because the human visual system better resolves details of brightness than details in the hue and saturation of colors. The term 4:2:2 is used for video with the chroma subsampled by a ratio of 2:1 horizontally, and 4:2:0 is used for video with the chroma subsampled by 2:1 both vertically and horizontally. Video that has luma and chroma at
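A minimal sketch of 4:2:0 subsampling as described: each chroma plane is reduced so that one Cb and one Cr sample are shared by a 2×2 block of luma samples. Plain averaging stands in here for the low-pass filtering a real encoder would apply.

```python
# 4:2:0 chroma subsampling sketch: average each 2x2 block of a chroma plane.
# plane: nested list of rows with even dimensions; returns the half-size plane.

def subsample_420(plane):
    h, w = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1] +
              plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]
```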

3960-422: The bandwidth available in the 2000s. Practical video compression emerged with the development of motion-compensated DCT (MC DCT) coding, also called block motion compensation (BMC) or DCT motion compensation. This is a hybrid coding algorithm, which combines two key data compression techniques: discrete cosine transform (DCT) coding in the spatial dimension , and predictive motion compensation in

4050-617: the black-and-white information. This is called chroma subsampling. YCbCr is sometimes abbreviated to YCC. Typically the terms Y′CbCr, YCbCr, YPbPr and YUV are used interchangeably, leading to some confusion. The main difference is that YPbPr is used with analog images and YCbCr with digital images, leading to different scaling values for Umax and Vmax (in YCbCr both are 1/2) when converting to/from YUV. Y′CbCr and YCbCr differ due to

4140-477: The concept of transmitting only the portions of the scene that changed from frame-to-frame. The concept of digital video compression dates back to 1952, when Bell Labs researchers B.M. Oliver and C.W. Harrison proposed the use of differential pulse-code modulation (DPCM) in video coding. In 1959, the concept of inter-frame motion compensation was proposed by NHK researchers Y. Taki, M. Hatori and S. Tanaka, who proposed predictive inter-frame video coding in

4230-626: the constants are KB = 0.0722 and KR = 0.2126. This form of Y′CbCr is based on an RGB model that more closely fits the phosphor emission characteristics of newer CRTs and other modern display equipment. The conversion matrices for BT.709 are these: The definitions of the R', G', and B' signals also differ between BT.709 and BT.601, and differ within BT.601 depending on the type of TV system in use (625-line as in PAL and SECAM or 525-line as in NTSC), and differ further in other specifications. In different designs there are differences in

4320-408: The data in a previous I-frame or P-frame – a reference frame . To generate a P-frame, the previous reference frame is reconstructed, just as it would be in a TV receiver or DVD player. The frame being compressed is divided into 16 pixel by 16 pixel macroblocks . Then, for each of those macroblocks, the reconstructed reference frame is searched to find a 16 by 16 area that closely matches the content of
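The macroblock search can be pictured as a brute-force sum-of-absolute-differences (SAD) match over a small window around the macroblock's own position. Real encoders use far faster search strategies and also consider half-pixel offsets, which this sketch ignores.

```python
# Exhaustive block-matching sketch: find the 16x16 area of the reconstructed
# reference frame, within +/- search_range pixels, that best matches the
# macroblock of the frame being compressed (frames are nested lists of rows).

def sad(ref, cur, rx, ry, cx, cy, size=16):
    return sum(abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
               for j in range(size) for i in range(size))

def find_motion_vector(ref, cur, cx, cy, search_range=7, size=16):
    h, w = len(ref), len(ref[0])
    best = (0, 0, sad(ref, cur, cx, cy, cx, cy, size))
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            if 0 <= rx <= w - size and 0 <= ry <= h - size:
                cost = sad(ref, cur, rx, ry, cx, cy, size)
                if cost < best[2]:
                    best = (dx, dy, cost)
    return best   # (dx, dy) is the motion vector; cost is its SAD
```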

4410-414: The decoder program/hardware smaller, simpler, or faster. A profile restricts which encoding techniques are allowed. For example, the H.264 format includes the profiles baseline , main and high (and others). While P-slices (which can be predicted based on preceding slices) are supported in all profiles, B-slices (which can be predicted based on both preceding and following slices) are supported in

4500-444: the definition of the corresponding RGB space, and required to satisfy KR + KG + KB = 1. The equivalent matrix manipulation is often referred to as the "color matrix": And its inverse: Here, the prime (′) symbols mean gamma correction is being used; thus R′, G′ and B′ nominally range from 0 to 1, with 0 representing

4590-404: The definitions of the R, G, and B chromaticity coordinates, the reference white point, the supported gamut range, the exact gamma pre-compensation functions for deriving R', G' and B' from R, G, and B, and in the scaling and offsets to be applied during conversion from R'G'B' to Y′CbCr. So proper conversion of Y′CbCr from one form to the other is not just a matter of inverting one matrix and applying



4680-410: The development of fast SIMD floating-point processors, most digital implementations of RGB → Y′UV used integer math, in particular fixed-point approximations. Approximation means that the precision of the used numbers (input data, output data and constant values) is limited, and thus a precision loss of typically about the last binary digit is accepted by whoever makes use of that option in typically

4770-458: The encoder takes the difference of all corresponding pixels of the two regions, and on that macroblock difference then computes the DCT and strings of coefficient values for the four 8×8 areas in the 16×16 macroblock as described above. This "residual" is appended to the motion vector and the result sent to the receiver or stored on the DVD for each macroblock being compressed. Sometimes no suitable match

4860-478: the footroom offset 16 needs to be subtracted first from each signal, and a scale factor of 255/219 needs to be included in the equations. The inverse transform is: The inverse transform without any roundings (using values coming directly from the ITU-R BT.601 recommendation) is: This form of Y′CbCr is used primarily for older standard-definition television systems, as it uses an RGB model that fits

4950-578: The fraction (235-16)/(240-16) = 219/224 is sometimes required when doing color matrixing or processing in YCbCr space, resulting in quantization distortions when the subsequent processing is not performed using higher bit depths. The scaling that results in the use of a smaller range of digital values than what might appear to be desirable for representation of the nominal range of the input data allows for some "overshoot" and "undershoot" during processing without necessitating undesirable clipping . This " headroom " and "toeroom" can also be used for extension of

5040-418: the full range with zero representing black and 255 representing white) according to the following equations: In the formula below, the scaling factors are multiplied by 256/255. This allows for the value 256 in the denominator, which can be calculated by a single bitshift. If the R'd G'd B'd digital source includes footroom and headroom,

5130-424: The inverse cosine transform (also with perfect precision). The conversion from 8-bit integers to real-valued transform coefficients actually expands the amount of data used at this stage of the processing, but the advantage of the transformation is that the image data can then be approximated by quantizing the coefficients. Many of the transform coefficients, usually the higher frequency components, will be zero after

5220-492: The level restrictions. A few common MPEG-2 Profile/Level combinations are presented below, with particular maximum limits noted: Some applications are listed below. The following organizations have held patents for MPEG-2 video technology, as listed at MPEG LA . All of these patents are now expired in the US and most other territories. Video coding format A video coding format (or sometimes video compression format )

5310-424: The macroblock being compressed. The offset is encoded as a "motion vector". Frequently, the offset is zero, but if something in the picture is moving, the offset might be something like 23 pixels to the right and 4-and-a-half pixels up. In MPEG-1 and MPEG-2, motion vector values can either represent integer offsets or half-integer offsets. The match between the two regions will often not be perfect. To correct for this,

5400-501: The major video coding standards (including the H.26x and MPEG formats) that followed. MPEG-1 , developed by the Moving Picture Experts Group (MPEG), followed in 1991, and it was designed to compress VHS -quality video. It was succeeded in 1994 by MPEG-2 / H.262 , which was developed with patents licensed from a number of companies, primarily Sony , Thomson and Mitsubishi Electric . MPEG-2 became

5490-401: The matrix to a smaller quantity of data. It is this entropy coded data that is broadcast or that is put on DVDs. In the receiver or the player, the whole process is reversed, enabling the receiver to reconstruct, to a close approximation, the original frame. The processing of B-frames is similar to that of P-frames except that B-frames use the picture in a subsequent reference frame as well as



5580-574: the memory and processing power needed, defining maximum bit rates, frame sizes, and frame rates. An MPEG application then specifies the capabilities in terms of profile and level. For example, a DVD player may say it supports up to main profile and main level (often written as MP@ML). It means the player can play back any MPEG stream encoded as MP@ML or less. The tables below summarize the limitations of each profile and level, though there are constraints not listed here. Note that not all profile and level combinations are permissible, and scalable modes modify
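The "MP@ML or less" rule can be expressed as a small capability check. The orderings below cover only the common non-scalable profiles and are a simplification; as noted above, the real standard has further profiles, scalable modes, and combinations that are not permitted.

```python
# Illustrative MPEG-2 profile@level capability check (non-scalable profiles only).
PROFILE_ORDER = ["Simple", "Main", "High"]              # simplified hierarchy
LEVEL_ORDER   = ["Low", "Main", "High-1440", "High"]

def can_decode(decoder, stream):
    """True if a decoder rated (profile, level) can play a stream coded (profile, level)."""
    return (PROFILE_ORDER.index(stream[0]) <= PROFILE_ORDER.index(decoder[0]) and
            LEVEL_ORDER.index(stream[1]) <= LEVEL_ORDER.index(decoder[1]))

print(can_decode(("Main", "Main"), ("Main", "Main")))   # MP@ML stream on an MP@ML player: True
print(can_decode(("Main", "Main"), ("Main", "High")))   # MP@HL stream on an MP@ML player: False
```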

5670-419: the minimum intensity (e.g., for display of the color black) and 1 the maximum (e.g., for display of the color white). The resulting luma (Y) value will then have a nominal range from 0 to 1, and the chroma (PB and PR) values will have a nominal range from -0.5 to +0.5. The reverse conversion process can be readily derived by inverting the above equations. When representing the signals in digital form,

5760-512: The nominal color gamut , as specified by xvYCC . The value 235 accommodates a maximum overshoot of (255 - 235) / (235 - 16) = 9.1%, which is slightly larger than the theoretical maximum overshoot ( Gibbs' Phenomenon ) of about 8.9% of the maximum (black-to-white) step. The toeroom is smaller, allowing only 16 / 219 = 7.3% overshoot, which is less than the theoretical maximum overshoot of 8.9%. In addition, because values 0 and 255 are reserved in HDMI,

5850-457: The original frame is simply cut out (or lost in transmission), the following frames cannot be reconstructed properly. Making cuts in intraframe-compressed video while video editing is almost as easy as editing uncompressed video: one finds the beginning and ending of each frame, and simply copies bit-for-bit each frame that one wants to keep, and discards the frames one does not want. Another difference between intraframe and interframe compression

5940-469: the other. In fact, when Y′CbCr is designed ideally, the values of KB and KR are derived from the precise specification of the RGB color primary signals, so that the luma (Y′) signal corresponds as closely as possible to a gamma-adjusted measurement of luminance (typically based on the CIE 1931 measurements of the response of the human visual system to color stimuli). The ITU-R BT.2020 standard uses

6030-485: The parts of the bitstream, and other pieces of information. Aside from features for handling fields for interlaced coding, MPEG-2 Video is very similar to MPEG-1 Video (and even quite similar to the earlier H.261 standard), so the entire description below applies equally well to MPEG-1. MPEG-2 includes three basic types of coded frames: intra-coded frames ( I-frames ), predictive-coded frames ( P-frames ), and bidirectionally-predictive-coded frames ( B-frames ). An I-frame

6120-477: The phosphor emission characteristics of older CRTs. A different form of Y′CbCr is specified in the ITU-R BT.709 standard, primarily for HDTV use. The newer form is also used in some computer-display oriented applications, as sRGB (though the matrix used for sRGB form of YCbCr, sYCC , is still BT.601). In this case, the values of Kb and Kr differ, but the formulas for using them are the same. For ITU-R BT.709,

6210-596: The picture in a preceding reference frame. As a result, B-frames usually provide more compression than P-frames. B-frames are never reference frames in MPEG-2 Video. Typically, every 15th frame or so is made into an I-frame. P-frames and B-frames might follow an I-frame like this, IBBPBBPBBPBB(I), to form a Group of Pictures (GOP) ; however, the standard is flexible about this. The encoder selects which pictures are coded as I-, P-, and B-frames. P-frames provide more compression than I-frames because they take advantage of

6300-637: The primaries of most images are BT.709. Cathode-ray tube (CRT) displays are driven by red, green, and blue voltage signals, but these RGB signals are not efficient as a representation for storage and transmission, since they have a lot of redundancy . YCbCr and Y′CbCr are a practical approximation to color processing and perceptual uniformity, where the primary colors corresponding roughly to red, green and blue are processed into perceptually meaningful information. By doing this, subsequent image/video processing, transmission and storage can do operations and introduce errors in perceptually meaningful ways. Y′CbCr

6390-404: The quantization, which is basically a rounding operation. The penalty of this step is the loss of some subtle distinctions in brightness and color. The quantization may either be coarse or fine, as selected by the encoder. If the quantization is not too coarse and one applies the inverse transform to the matrix after it is quantized, one gets an image that looks very similar to the original image but

6480-443: The raw frame is divided into 8 pixel by 8 pixel blocks. The data in each block is transformed by the discrete cosine transform (DCT). The result is an 8×8 matrix of coefficients that have real number values. The transform converts spatial variations into frequency variations, but it does not change the information in the block; if the transform is computed with perfect precision, the original block can be recreated exactly by applying
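A direct (and deliberately slow) rendering of that step, followed by a uniform quantization; production encoders use fast factored DCTs and the standard quantization matrices, neither of which is reproduced in this sketch.

```python
import math

N = 8  # block size

def dct_8x8(block):
    """Naive 2-D DCT-II of an 8x8 block of pixel values; returns coefficients [v][u]."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    return [[c(u) * c(v) * sum(block[y][x]
                               * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                               * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                               for y in range(N) for x in range(N))
             for u in range(N)]
            for v in range(N)]

def quantize(coeffs, step=16):
    # Coarse vs. fine quantization: a larger step discards more detail.
    return [[int(round(value / step)) for value in row] for row in coeffs]
```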

6570-514: the recording, compression, and distribution of video content, used by 91% of video developers, followed by HEVC, which is used by 43% of developers. Consumer video is generally compressed using lossy video codecs, since that results in significantly smaller files than lossless compression. Some video coding formats are designed explicitly for either lossy or lossless compression, and some video coding formats such as Dirac and H.264 support both. Uncompressed video formats, such as Clean HDMI,

6660-450: the results are scaled and rounded, and offsets are typically added. For example, the scaling and offset applied to the Y′ component per specification (e.g. MPEG-2) results in the value of 16 for black and the value of 235 for white when using an 8-bit representation. The standard has 8-bit digitized versions of CB and CR scaled to a different range of 16 to 240. Consequently, rescaling by

6750-535: The room is actually slightly less. Since the equations defining Y′CbCr are formed in a way that rotates the entire nominal RGB color cube and scales it to fit within a (larger) YCbCr color cube, there are some points within the Y′CbCr color cube that cannot be represented in the corresponding RGB domain (at least not within the nominal RGB range). This causes some difficulty in determining how to correctly interpret and display some Y′CbCr signals. These out-of-range Y′CbCr values are used by xvYCC to encode colors outside

6840-714: the same gamma function as BT.709. It defines both a non-constant luminance (NCL) Y′CbCr and a constant luminance (CL) YcCbcCrc encoding. For both, the coefficients derived from the primaries are KR = 0.2627, KG = 0.6780, KB = 0.0593. For NCL, the definition is classical: Y′ = 0.2627 R′ + 0.6780 G′ + 0.0593 B′; Cb = (B′ − Y′) / 1.8814; Cr = (R′ − Y′) / 1.4746. The encoding conversion can, as usual, be written as
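The NCL definition quoted above translates directly into code; R′, G′, B′ are assumed to be gamma-corrected values in [0, 1], and the result is the analog (unquantized) form, before the studio-swing scaling and offsets discussed earlier.

```python
# BT.2020 non-constant-luminance Y'CbCr from gamma-corrected R'G'B' (each 0..1).
def rgb_to_ycbcr_bt2020_ncl(rp, gp, bp):
    yp = 0.2627 * rp + 0.6780 * gp + 0.0593 * bp
    cb = (bp - yp) / 1.8814          # in -0.5..+0.5
    cr = (rp - yp) / 1.4746          # in -0.5..+0.5
    return yp, cb, cr
```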

6930-519: The same resolution is called 4:4:4 . The MPEG-2 Video document considers all three sampling types, although 4:2:0 is by far the most common for consumer video, and there are no defined "profiles" of MPEG-2 for 4:4:4 video (see below for further discussion of profiles). While the discussion below in this section generally describes MPEG-2 video compression, there are many details that are not discussed, including details involving fields, chrominance formats, responses to scene changes, special codes that label

7020-532: The standard coding technique for video compression from the late 1980s onwards. The first digital video coding standard was H.120 , developed by the CCITT (now ITU-T) in 1984. H.120 was not usable in practice, as its performance was too poor. H.120 used motion-compensated DPCM coding, a lossless compression algorithm that was inefficient for video coding. During the late 1980s, a number of companies began experimenting with discrete cosine transform (DCT) coding,

7110-434: The standard video format for DVD and SD digital television . Its motion-compensated DCT algorithm was able to achieve a compression ratio of up to 100:1, enabling the development of digital media technologies such as video on demand (VOD) and high-definition television (HDTV). In 1999, it was followed by MPEG-4 / H.263 , which was a major leap forward for video compression technology. It uses patents licensed from

7200-428: The two fields are displayed alternately with the lines of one field interleaving between the lines of the previous field; this format is called interlaced video . The typical field rate is 50 (Europe/PAL) or 59.94 (US/NTSC) fields per second, corresponding to 25 (Europe/PAL) or 29.97 (North America/NTSC) whole frames per second. If the video is not interlaced, then it is called progressive scan video and each picture

7290-485: the values being gamma corrected or not. The equations below give a better picture of the common principles and general differences between these formats. Y′CbCr signals (prior to scaling and offsets to place the signals into digital form) are called YPbPr, and are created from the corresponding gamma-adjusted RGB (red, green and blue) source using three defined constants KR, KG, and KB as follows: Y′ = KR R′ + KG G′ + KB B′; PB = 0.5 (B′ − Y′) / (1 − KB); PR = 0.5 (R′ − Y′) / (1 − KR); where KR, KG, and KB are ordinarily derived from
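Since only KR and KB (with KG = 1 − KR − KB) distinguish the BT.601, BT.709 and BT.2020 NCL variants, the whole family can be captured in one parameterized sketch:

```python
# Analog Y'PbPr from gamma-corrected R'G'B' (each 0..1), parameterized by Kr and Kb.
def make_ypbpr(kr, kb):
    kg = 1.0 - kr - kb
    def convert(rp, gp, bp):
        yp = kr * rp + kg * gp + kb * bp
        pb = 0.5 * (bp - yp) / (1.0 - kb)
        pr = 0.5 * (rp - yp) / (1.0 - kr)
        return yp, pb, pr
    return convert

bt601  = make_ypbpr(0.299,  0.114)
bt709  = make_ypbpr(0.2126, 0.0722)
bt2020 = make_ypbpr(0.2627, 0.0593)
```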

7380-475: The video coding format VP9 itself a codec . As an example of conflation, Chromium's and Mozilla's pages listing their video formats support both call video coding formats, such as H.264 codecs . As another example, in Cisco's announcement of a free-as-in-beer video codec, the press release refers to the H.264 video coding format as a codec ("choice of a common video codec"), but calls Cisco's implementation of

7470-640: The video encoding standards for Blu-ray Discs ; all Blu-ray Disc players must be able to decode H.264. It is also widely used by streaming internet sources, such as videos from YouTube , Netflix , Vimeo , and the iTunes Store , web software such as the Adobe Flash Player and Microsoft Silverlight , and also various HDTV broadcasts over terrestrial ( ATSC standards , ISDB-T , DVB-T or DVB-T2 ), cable ( DVB-C ), and satellite ( DVB-S2 ). A main problem for many video coding formats has been patents , making it expensive to use or potentially risking

7560-521: The video is interlaced. All standards-conforming MPEG-2 Video decoders are also fully capable of playing back MPEG-1 Video streams. The ISO/IEC approval process was completed in November 1994. The first edition was approved in July 1995 and published by ITU-T and ISO/IEC in 1996. Didier LeGall of Bellcore chaired the development of the standard and Sakae Okubo of NTT was the ITU-T coordinator and chaired

7650-440: The way the eye works, it is possible to delete or approximate some data from video pictures with little or no noticeable degradation in image quality. A common (and old) trick to reduce the amount of data is to separate each complete "frame" of video into two "fields" upon broadcast/encoding: the "top field", which is the odd numbered horizontal lines, and the "bottom field", which is the even numbered lines. Upon reception/decoding,
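The frame-to-fields split described here is straightforward to express; in this sketch a frame is a list of rows with row 0 as the topmost line, so rows 0, 2, 4, … (the odd-numbered lines when counting from 1) form the top field.

```python
# Split a frame into its two interlaced fields and weave them back together.
def split_fields(frame):
    top    = frame[0::2]   # odd-numbered lines (1, 3, 5, ... counting from 1)
    bottom = frame[1::2]   # even-numbered lines (2, 4, 6, ...)
    return top, bottom

def weave_fields(top, bottom):
    frame = []
    for t, b in zip(top, bottom):
        frame.extend([t, b])
    return frame
```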

7740-542: Was extended by two amendments to include the registration of copyright identifiers and the 4:2:2 Profile. ITU-T published these amendments in 1996 and ISO in 1997. There are also other amendments published later by ITU-T and ISO/IEC. The most recent edition of the standard was published in 2013 and incorporates all prior amendments. An HDTV camera with 8-bit sampling generates a raw video stream of 25 × 1920 × 1080 × 3 = 155,520,000 bytes per second for 25 frame-per-second video (using
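The arithmetic in that example generalizes easily. A small sketch, assuming 8 bits per sample, comparing the raw 4:4:4 rate above with the 4:2:0 rate (which averages 1.5 bytes per pixel):

```python
# Raw (uncompressed) video data rate in bytes per second at 8 bits per sample.
def raw_rate(width, height, fps, bytes_per_pixel):
    return width * height * bytes_per_pixel * fps

print(raw_rate(1920, 1080, 25, 3))    # 4:4:4 -> 155520000, as in the text
print(raw_rate(1920, 1080, 25, 1.5))  # 4:2:0 -> 77760000.0
```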

7830-432: Was initially limited to intra-frame coding in the spatial dimension. In 1975, John A. Roese and Guner S. Robinson extended Habibi's hybrid coding algorithm to the temporal dimension, using transform coding in the spatial dimension and predictive coding in the temporal dimension, developing inter-frame motion-compensated hybrid coding. For the spatial transform coding, they experimented with different transforms, including

7920-446: Was introduced in the 1970s, initially using uncompressed pulse-code modulation (PCM), requiring high bitrates around 45–200 Mbit/s for standard-definition (SD) video, which was up to 2,000 times greater than the telecommunication bandwidth (up to 100 kbit/s) available until the 1990s. Similarly, uncompressed high-definition (HD) 1080p video requires bitrates exceeding 1 Gbit/s, significantly greater than

8010-460: Was later added to the WebM specification. A format is the layout plan for data produced or consumed by a codec . Although video coding formats such as H.264 are sometimes referred to as codecs , there is a clear conceptual difference between a specification and its implementations. Video coding formats are described in specifications, and software, firmware , or hardware to encode/decode data in

8100-426: Was published in 1974. The other key development was motion-compensated hybrid coding. In 1974, Ali Habibi at the University of Southern California introduced hybrid coding, which combines predictive coding with transform coding. He examined several transform coding techniques, including the DCT, Hadamard transform , Fourier transform , slant transform, and Karhunen-Loeve transform . However, his algorithm
