Dr. Mohieddin Moradi
mohieddinmoradi@gmail.com
1
Dream
Idea
Plan
Implementation
Section I
– ISO/IEC JTC 1/SC 29 Structure and MPEG
– ITU-T structure and VCEG (Video Coding Experts Group or Visual Coding Experts Group)
– A Generic Interframe Video Encoder
– H.261 Video Coding Standard
– MPEG-1 Video Coding Standard
– MPEG-2 Video Coding Standard
Section II
– MPEG-2 Transport and Program Streams
– H.263 Video Coding Standard
– H.263+ Video Coding Standard
– H.263++ Video Coding Standard
– Bit-rate (R) and Distortion (D) in Video Coding
2
Outline
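[Figure: ISO/IEC JTC 1/SC 29 organization chart. JTC 1 (ISO and IEC) contains SC 29, which has two advisory groups (AGM and RA) and three working groups: WG1 (JPEG, JBIG), WG11 and WG12. Subgroup labels shown: Requirements, System, Video, Audio, SNHC, Implementation Studies, Test, Liaisons, MHEG-5 Maintenance, MHEG-6.]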
Advisory Group (AG) on Management (AGM)
• To advise SC 29 and its WGs on matters of management that
affect their works.
Advisory Group (AG) on Registration Authority (RA)
WG1: Still images, JPEG and JBIG
• Joint Photographic Experts Group and
Joint Bi-level Image Group
WG11: Video, MPEG
• Moving Picture Experts Group
WG12: Multimedia, MHEG
• Multimedia and Hypermedia Experts Group
International Organization for Standardization (ISO)
Subcommittee 29
Title: “Coding of Audio, Picture, Multimedia and Hypermedia Information”
Joint Technical Committee
ISO/IEC JTC 1/SC 29 Structure and MPEG
MPEG (Moving Picture Experts Group, 1988)
Develops standards for the coded representation of digital audio, video, 3D graphics and other data
International Electrotechnical Commission (IEC)
Telecommunication Standardization
Advisory Group (TSAG)
WTSA
World Telecommunication
Standardization Assembly
SG
Workshops,
Seminars,
Symposia
…
IPRs (Intellectual
Property Rights)
WP
Questions: Develop Recommendations
SG
WP WP
Q
Focus
Group
VCEG (ITU-T SG16/Q6)
• Study Group 16
Multimedia terminals, systems and
applications
• Working Party 3
Media coding
• Question 6
Video coding
Rapporteurs (R):
Mr Gary SULLIVAN, Mr Thomas WIEGAND
SG16
WP3
4
ITU-T structure and VCEG (Video Coding Experts Group or Visual Coding Experts Group)
Administrative Entities
Q6 VCEG
5
ITU, International Telecommunication Union structure
− Founded in 1865, it is the oldest specialized agency of the United Nations system
− ITU is an International organization where governments, industries, telecom operators, service providers
and regulators work together to coordinate global telecommunication networks and services
− Help the world communicate!
− What does ITU actually do?
• Spectrum allocation and registration
• Coordinate national spectrum planning
• International telecoms/ICT standardization
• Collaborate in international tariff-setting
• Cooperate in telecommunications development assistance
• Develop measures for ensuring safety of life
• Provide policy reviews and information exchange
• Ensure and extend universal telecom access
− Plenipotentiary Conference: Key event, all ITU Member States decide on the future role of the organization
(Held every four years)
− ITU Council: The role of the Council is to consider, in the interval between Plenipotentiary Conferences,
broad telecommunication policy issues to ensure that the Union's activities, policies and strategies fully
respond to today's dynamic, rapidly changing telecommunication environment (held yearly)
− General Secretariat: Coordinates and manages the administrative and financial aspects of the Union’s activities
(provision of conference services, information services, legal advice, finance, personnel, etc.)
− ITU-R: Coordinates radio communications, radio-frequency spectrum management and wireless services.
− ITU-D: Technical assistance and deployment of telecom networks and services in developing and least developed
countries to allow the development of telecommunication.
− ITU-T: Telecommunication standardization on a world-wide basis. Ensures the efficient and on-time production of high
quality standards covering all fields of telecommunications (technical, operating and tariff issues). (The Secretariat of ITU-T
(TSB: Telecommunication Standardization Bureau) provides services to ITU-T Participants)
Telecommunication Standardization Bureau (TSB) (Place des Nations, CH-1211 Geneva 20)
− The TSB provides secretarial support for ITU-T and services for participants in ITU-T work (e.g. organization of meeting,
publication of Recommendations, website maintenance etc.).
− Disseminates information on international telecommunications and establishes agreements with many international SDOs.
Mission of ITU-T Standardization Sector of ITU
− Helping people all around the world to communicate and to equally share the advantages and opportunities of
telecommunication reducing the digital divide by studying technical, operating and tariff matters to develop
telecommunication standards (Recommendations) on a worldwide basis.
World Telecommunication Standardization Assembly (WTSA)
− WTSA sets the overall direction and structure for ITU-T, meets every four years and for the next four-year period:
• Defines the general policy for the Sector
• Establishes the study groups (SG)
• Approves SG work programmes
• Appoints SG chairmen and vice-chairmen
Telecommunication Standardization Advisory Group (TSAG)
− TSAG provides ITU-T with flexibility between WTSAs, and reviews priorities, programmes, operations, financial matters and
strategies for the Sector (meets approximately every 9 months)
• Follows up on accomplishment of the work programme
• Restructures and establishes ITU-T study groups
• Provides guidelines to the study groups
• Advises the TSB Director
• Produces the A-series Recommendations on organization and working procedures
• ISO/IEC MPEG = “Moving Picture Experts Group”
(ISO/IEC JTC 1/SC 29/WG 11 = International Organization for Standardization and International Electrotechnical
Commission, Joint Technical Committee 1, Subcommittee 29, Working Group 11)
• ITU-T VCEG = “Video Coding Experts Group”
(ITU-T SG16/Q6 = International Telecommunication Union – Telecommunication Standardization Sector (ITU-T,
a United Nations organization, formerly CCITT), Study Group 16, Working Party 3, Question 6)
• JVT = “Joint Video Team”
Collaborative team of MPEG & VCEG, responsible for developing AVC (discontinued in 2009)
• JCT-VC = “Joint Collaborative Team on Video Coding”
Team of MPEG & VCEG , responsible for developing HEVC (established January 2010)
• JVET = “Joint Video Experts Team”
Exploring potential for new technology beyond HEVC (established Oct. 2015 as Joint Video Exploration Team, renamed
Apr. 2018)
10
Video Coding Standardization Organizations
History of Video Coding Standardization (1985 ~ 2020)
[Figure: timeline of ITU-T and ISO/IEC video coding standards with milestones at 1990, 1994, 2003, 2013 and 2020. Application labels: video telephony, computer, SD, HD, 4K UHD, 8K, 360, ...]
ITU-T:
− H.120 (1984-1988)
− H.261 (1990+)
− H.263/+/++ (1995-2000+)
ISO/IEC:
− MPEG-1 (1993)
− MPEG-4 Visual (1998-2001+)
Joint ITU-T and ISO/IEC:
− H.262 / 13818-2 (MPEG-2) (1994/95-1998+)
− H.264 / 14496-10 AVC (2003-2018+), developed by the Joint Video Team (JVT)
− H.265 / 23008-2 HEVC (2013-2018+), developed by the Joint Collaborative Team on Video Coding (JCT-VC)
− H.26x / 23090-3 VVC (2020-...), to be developed by the Joint Video Experts Team (JVET)
[Figure: timeline, 1988 to 2010. ITU-T standards: H.261 (Version 1), H.261 (Version 2), H.263, H.263+, H.263++. Joint ITU-T/MPEG standards: H.262/MPEG-2, H.264/MPEG-4 AVC, H.265/HEVC. MPEG standards: MPEG-1, MPEG-4 (Version 1), MPEG-4 (Version 2).]
H.261 Video Compression Standard
13
The H series are low-delay codecs for telecom applications. The International Telecommunication Union (ITU-T)
has developed several Recommendations for video coding:
− H.120: the first digital video coding standard
− H.261 (1990): the first practical video codec specification, “Video Codec for Audiovisual Services at p × 64 kbit/s”
− H.262 (1995): Infrastructure of audiovisual services, coding of moving video
− H.263 (1996): the next conferencing solution, “Video coding for low bit rate communications”
− H.263+ (H.263 Version 2) (1998)
− H.263++ (H.263 Version 3) (2000): follow-on solutions
− H.26L: the “long-term” solution for low bit-rate video coding for communication applications (not backward
compatible with H.263+)
− H.264: H.26L was completed in May 2003 and led to H.264, known as Advanced Video Coding (AVC)
− H.265/HEVC (2013): High Efficiency Video Coding
ITU H.26x History
14
Moving Picture Experts Group (MPEG) codecs are designed for storage, broadcast and streaming applications
MPEG-1 (1992)
• Started in 1988 by Leonardo Chiariglione
• Compression standard for progressive frame-based video in SIF (360x240) formats
• Applications: VCD
MPEG-2 (1994-5)
• Compression standard for interlaced frame-based video in CCIR-601 (720x480) and high definition (1920x1088i)
formats
• Applications: DVD, SVCD, DIRECTV, GA, DVB, HDTV studio, DTV broadcast, HD, and video standards for
television and telecommunications
MPEG-4 (1999)
• Multimedia standard for object-based video from natural or synthetic sources
• Applications: Internet, cable TV, virtual studio, home LAN, etc.
• Object-oriented
• Over-ambitious?
MPEG History
MPEG-7, 2001
• Standardized descriptions of multimedia information, formally called “Multimedia Content Description
Interface”
• Metadata for audio-video streams
• Applications: Internet, video search engine, digital library
MPEG-21, 2002
• Intended for intellectual property rights protection
• Distribution, exchange and user access of multimedia data, and intellectual property management
AVC (2003), also known as MPEG-4 Part 10
• Conventional to HD
• Emphasis on compression performance and loss resilience
HEVC (2013) High Efficiency Video Coding
ITU and MPEG (ISO/IEC) have also worked together for joint codecs:
− MPEG-2 is also called H.262
− H.26L led to a codec that is now called:
• H.264 in telecom
• MPEG-4 Part 10 in broadcast
• AVC (Advanced Video Coding) in broadcast
• Joint Video Team (JVT) codec
− H.265/HEVC (2013) High Efficiency Video Coding
Joint ITU/MPEG
17
The Story of MPEG and VCEG
Milestones in Video Coding
[Figure: milestones in video coding. Successive standard generations, ending with VVC (2020), are each annotated with ≈50% bitrate saving for direct-to-home delivery and ≈30% bitrate saving for contribution.]
Video Source
Decompress
(Decode)
Compress
(Encode)
Video Display
Coded
video
ENCODER + DECODER = CODEC
22
Spatial Domain
− Elements are used “raw” in suitable combinations.
− The frequency of occurrence of such combinations is used to influence the design of the
coder so that shorter codewords are used for more frequent combinations and vice versa
(entropy coding).
Transform Domain
− Elements are mapped onto a different domain (i.e. the frequency domain).
− The resulting coefficients are quantised and entropy-coded.
Hybrid
− Combinations of the above.
Classification of Compression Techniques
Current Stage
Used since early days of video compression
standards, e.g. MPEG-1/-2/-4, H.264/AVC, HEVC and
also in most proprietary codecs (VC1, VP8 etc.)
A Generic Interframe Video Encoder
[Figure sequence: step-by-step operation of a generic interframe encoder on two frames.]
1. Input Frame 1 is transformed (DCT) and quantised (Q).
2. The quantised coefficients are entropy coded into the bitstream (010011101001…).
3. The quantised coefficients are also decoded inside the encoder to give Reconstructed Frame 1.
4. Input Frame 2 is motion estimated against Reconstructed Frame 1, and the motion vectors (MVs) are entropy coded.
5. Motion compensation (MC) of Reconstructed Frame 1 gives the prediction for Input Frame 2.
6. Residual with MC (Frames 1&2) = Input Frame 2 − Reconstructed Frame 1 with MC. If the motion prediction is successful, the energy in the residual is lower than in the original frame and can be represented with fewer bits.
7. The residual is transformed (DCT), quantised (Q) and entropy coded (010011101001…).
8. Reconstructed Frame 1 with MC + Reconstructed Residual with MC (Frames 1&2) = Reconstructed Frame 2 with MC.
− All standard codecs follow the generic interframe codec structure: DCT/DPCM/MC/VLC
− Their main differences lie in the way these elements are employed:
• Block transform length and type
• Block size for Motion estimation and its precision
• Methods of VLC
• Quantisation
• Coding of quantised transform coefficients
• Addressing of data
• Preventing error propagation
• Various types of coding each frame
Generic Standard Codec
37
− An early digital video compression standard; its principle of MC-based compression is retained in all later
video compression standards.
− The standard was designed for videophone, video conferencing and other audiovisual services over ISDN.
− The video codec supports bit rates of p×64 kbit/s, where p ranges from 1 to 30 (hence it is also known as p×64).
− The delay of the video encoder must be less than 150 ms so that the video can be used for real-time
bidirectional video conferencing.
− Problems:
• Error propagation
• In case of errors, the picture needs updating
Video Formats Supported by H.261
H.261 Standard
38
Some Image Formats
Some Picture Formats Recall
39
H.261 Standard
− The coding parameters of the compressed video signal are multiplexed and then combined with the
audio, data and end-to-end signalling for transmission.
− The transmission buffer controls the bit rate, either by changing the quantiser step size at the encoder or, in
more severe cases, by requesting reduction in frame rate to be carried out at the preprocessor.
A block diagram of an H.261 audio-visual encoder
40
H.261 Layer Structures
1. Block (8x8)
2. Macroblock (MB): a 16x16 luminance area made of four 8x8 blocks (Y0, Y1, Y2, Y3), plus one 8x8 Cb block and one 8x8 Cr block
3. Group of Blocks (GOB) = 33 MBs, arranged as a 3×11 matrix:
 1  2 ... 11
12 13 ... 22
23 24 ... 33
4. Picture layer:
• QCIF: GOB 1, GOB 3, GOB 5
• CIF: GOB 1 to GOB 12, in six rows of two (GOB 1 GOB 2, GOB 3 GOB 4, ..., GOB 11 GOB 12)
RGB to YCbCr conversion:
$$\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.169 & -0.331 & 0.500 \\ 0.500 & -0.419 & -0.081 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$$
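The colour conversion matrix on this slide can be sketched in a few lines of Python. This is only an illustration: the coefficient magnitudes are the ones printed on the slide, while the minus signs (lost in the text extraction) and the function name `rgb_to_ycbcr` are assumptions.

```python
# Rows give Y, Cb and Cr as weighted sums of R, G and B.
# Magnitudes come from the slide; the signs are assumed (standard form).
RGB_TO_YCBCR = (
    (0.299, 0.587, 0.114),     # Y
    (-0.169, -0.331, 0.500),   # Cb
    (0.500, -0.419, -0.081),   # Cr
)


def rgb_to_ycbcr(r, g, b):
    """Return (Y, Cb, Cr) for one pixel, without offset or clipping."""
    return tuple(kr * r + kg * g + kb * b for kr, kg, kb in RGB_TO_YCBCR)


if __name__ == "__main__":
    # A grey pixel has full luminance information and (almost) zero chroma.
    print(rgb_to_ycbcr(128, 128, 128))
```

Because each chroma row sums to zero, any grey input (R = G = B) maps to zero colour difference, which is a quick sanity check on the reconstructed signs.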
41
Picture layer
Group of Blocks (GOB)
Macroblocks (MB)
Blocks
(CIF=352x288, QCIF=176x144)
(GOB=176x48)
(MB=16x16)
H.261 Layer Structures
[Figure: GOBs within CIF (352x288; GOBs 1 to 12 in two columns, odd numbers on the left and even numbers on the right) and GOBs within QCIF (176x144; GOBs 1, 3 and 5); the 33 macroblocks within a GOB, numbered 1 to 11, 12 to 22 and 23 to 33; and a macroblock structure (16x16 luminance as four 8x8 blocks Y0 to Y3, plus 8x8 Cr and Cb blocks).]
Preventing error propagation
– The macroblock (MB) is the smallest coding unit of video
– The standard codecs only define how an MB is coded
– The number of luma/chroma blocks in an MB depends on the picture format
(B=8x8)
MBA   CODE             MBA   CODE
1     1                17    0000 0101 10
2     011              18    0000 0101 01
3     010              19    0000 0101 00
4     0011             20    0000 0100 11
5     0010             21    0000 0100 10
6     0001 1           22    0000 0100 011
7     0001 0           23    0000 0100 010
8     0000 111         24    0000 0100 001
9     0000 110         25    0000 0100 000
10    0000 1011        26    0000 0011 111
11    0000 1010        27    0000 0011 110
12    0000 1001        28    0000 0011 101
13    0000 1000        29    0000 0011 100
14    0000 0111        30    0000 0011 011
15    0000 0110        31    0000 0011 010
16    0000 0101 11     32    0000 0011 001
                       33    0000 0011 000
MBA stuffing           0000 0001 111
Start code             0000 0000 0000 0001
Macroblock Addressing (MBA)
MBA stuffing:
− An extra codeword in the table for bit stuffing immediately after a GOB header or a coded macroblock.
− This codeword should be discarded by decoders.
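As a rough illustration of how a decoder might read the MBA table above, the sketch below walks a bit string, matches codewords from the table and discards the stuffing codeword. The function name `decode_mba` and the use of a text string of '0'/'1' characters are assumptions made for readability, not part of the standard.

```python
# MBA codewords for addresses 1..33, copied from the table (spaces removed).
MBA_CODES = [
    "1", "011", "010", "0011", "0010", "00011", "00010", "0000111",
    "0000110", "00001011", "00001010", "00001001", "00001000",
    "00000111", "00000110", "0000010111", "0000010110", "0000010101",
    "0000010100", "0000010011", "0000010010", "00000100011",
    "00000100010", "00000100001", "00000100000", "00000011111",
    "00000011110", "00000011101", "00000011100", "00000011011",
    "00000011010", "00000011001", "00000011000",
]
MBA_STUFFING = "00000001111"
DECODE = {code: addr for addr, code in enumerate(MBA_CODES, start=1)}


def decode_mba(bits):
    """Decode a string of '0'/'1' into MBA values, dropping stuffing."""
    values, current = [], ""
    for bit in bits:
        current += bit
        if current == MBA_STUFFING:
            current = ""                 # stuffing is discarded by the decoder
        elif current in DECODE:
            values.append(DECODE[current])
            current = ""
    if current:
        raise ValueError("bit string ends inside a codeword")
    return values


if __name__ == "__main__":
    print(decode_mba("1" + "011" + MBA_STUFFING + "0000010111"))
```

Since the table is a prefix code, matching the accumulated bits against the dictionary after every bit is enough; no look-ahead is needed.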
42
43
H.261 Layer Structures, Example
44
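[Figure: H.261 encoder block diagram. Video input, intra/inter switch, DCT, quantiser (Q) and RLC+VLC in the forward path; inverse quantiser (Q⁻¹), IDCT, loop filter, picture memory and motion estimation in the prediction loop; coding control supplying the quantiser indication (qz); the motion vector is sent alongside the coded coefficients.]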
H.261 Standard
When too much data accumulates in the transmission buffer, the
rate controller increases the quantiser step size, lowering the picture quality to reduce the bit rate.
45
H.261 Standard H.261 Frame Sequence
46
H.261 Standard
COMP: a comparator for deciding inter/intra
coding mode for an MB
Th: threshold, to extend the quantisation range
T: transform coding of blocks of 8×8 pixels
Q: quantisation of DCT coefficients
P: picture memory with motion-compensated
variable delay
F: loop filter
p: flag for inter/intra
t: flag for transmitted or not
q: quantisation index for transform coefficients
qz: quantiser indication
v: motion vector information
f: switching on/off of the loop filter
− For DC coefficients in Intra mode:
− For all other coefficients:
• scale: an integer in the range [1, 31]
47
H.261 Standard
A uniform quantiser with threshold
Example: Th=q=16
Raw coefficients (upper-left corner of the block):
 83   12   21    7  –10    7
–10   35   11    5  –31
 –5   15   12
 10  –24
  5
Coefficients in zigzag scan order:
Raw coefficient   New coefficient   Quantised value   Index
 83                83                88                 5
 12                 0                 0                 0
–10                 0                 0                 0
 –5                 0                 0                 0
 35                35                40                 2
 21                21                24                 1
  7                 0                 0                 0
 11                 0                 0                 0
 15                 0                 0                 0
 10                 0                 0                 0
  5                 0                 0                 0
–24               –24               –24                –1
 12                 0                 0                 0
  5                 0                 0                 0
–10                 0                 0                 0
  7                 0                 0                 0
–31               –31               –24                –1
Events to be transmitted (run, index): (0,5) (3,2) (0,1) (5,–1) (4,–1)
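The numbers in this example can be reproduced with a short sketch. The rules used here are inferred from the example values rather than quoted from the standard, and the function names are hypothetical: coefficients with magnitude below Th are set to zero, the index is the coefficient divided by q and truncated toward zero, and the quantised value is the mid-point of the interval.

```python
def threshold_quantise(coeffs, th=16, q=16):
    """Return (new, quantised, index) lists for a quantiser with threshold.

    Assumed rules, chosen so that the slide's numbers are reproduced:
    |c| < th -> 0; index = trunc(c / q); value = sign * (|index| * q + q / 2).
    """
    new, quantised, index = [], [], []
    for c in coeffs:
        n = c if abs(c) >= th else 0             # dead zone around zero
        sign = -1 if n < 0 else 1
        i = sign * (abs(n) // q)                 # truncate toward zero
        value = sign * (abs(i) * q + q // 2) if i else 0
        new.append(n)
        quantised.append(value)
        index.append(i)
    return new, quantised, index


def run_index_events(index):
    """Turn a zigzag-ordered index list into (run of zeros, index) events."""
    events, run = [], 0
    for i in index:
        if i == 0:
            run += 1
        else:
            events.append((run, i))
            run = 0
    return events


if __name__ == "__main__":
    zigzag = [83, 12, -10, -5, 35, 21, 7, 11, 15, 10, 5, -24, 12, 5, -10, 7, -31]
    _, values, idx = threshold_quantise(zigzag)
    print(values)
    print(run_index_events(idx))
```

Trailing zeros after the last non-zero index produce no event, which matches the role of the end-of-block code in the bitstream.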
Quantization and Entropy Coding
48
49
− In interframe coding in the event of channel error, the error
propagates into the subsequent frames. If that part of the
picture is not updated, the error can persist for a long time.
− The variance of the intraframe MB is compared with the variance of the
interframe MB (the difference from the previous frame, motion
compensated or not). The smaller of the two is chosen.
• For large variances, no preference between the two
modes.
• For smaller variances, interframe is preferred.
− The reason is that, in intra mode, the DC coefficients of the
blocks have to be quantised with a quantiser without a
dead zone and with 8-bit resolutions. This increases the bit
rate compared to that of the interframe mode, and hence
interframe is preferred.
MC/NO_MC mode decision in H.261
Inter/Intra Switch
(Intraframe AC energy)
(Interframe AC energy)
50
P-frame
Motion estimation in H.261 was optional
Macro-block and Motion Vector Range
51
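BD: Block Difference
DBD: Displaced Block Difference
$$BD = \frac{1}{256} \sum_{(x,y) \in MB} \left| c[x, y] - r[x, y] \right|$$
$$DBD = \frac{1}{256} \sum_{(x,y) \in MB} \left| c[x, y] - r[x + dx,\; y + dy] \right|$$
[Figure: Motion Compensation Decision Characteristic. DBD (vertical axis) plotted against BD (horizontal axis), dividing the plane into an MC region and a No MC region. The boundary includes the line $y = x/1.1$, with breakpoints marked at 0.5, 1, 1.5, 2.7 and 3.]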
– Not all blocks are motion compensated
– The mode that generates fewer bits is preferred.
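The BD and DBD measures can be computed directly from their definitions. The sketch below is a simplified illustration: frames are plain lists of rows, `c` is taken to be the current frame and `r` the reference frame, and the decision uses only the $y = x/1.1$ line, ignoring the breakpoints of the full characteristic. All function and variable names are assumptions.

```python
def mean_abs_difference(cur, ref, x0, y0, dx=0, dy=0, size=16):
    """Mean absolute difference over one MB; (dx, dy) = (0, 0) gives BD."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[y0 + y][x0 + x] - ref[y0 + y + dy][x0 + x + dx])
    return total / (size * size)


def prefer_mc(bd, dbd):
    """Simplified decision: MC only if DBD lies below the line y = x / 1.1."""
    return dbd < bd / 1.1


if __name__ == "__main__":
    # Toy frames: the current frame is the reference shifted by (2, 1).
    ref = [[(3 * x + 5 * y) % 50 for x in range(20)] for y in range(20)]
    cur = [[ref[min(y + 1, 19)][min(x + 2, 19)] for x in range(20)]
           for y in range(20)]
    bd = mean_abs_difference(cur, ref, 0, 0)
    dbd = mean_abs_difference(cur, ref, 0, 0, dx=2, dy=1)
    print(bd, dbd, prefer_mc(bd, dbd))
```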
Macro-block
– Motion estimation of a macroblock involves finding a 16×16-sample
region in a reference frame that closely matches the current
macroblock.
– Luminance: 16x16, four 8x8 blocks
– Chrominance: two 8x8 blocks
– Motion estimation only performed for luminance component
Motion Vector Range
– [ -15, 15]
– MB: 16 x 16
[Figure: search area in the reference frame, extending 15 pixels on each side of the 16x16 MB.]
− Integer pixel ME search only
− Motion vectors are differentially & separately encoded
− VLC of up to 11 bits for MVD (Motion Vector Delta)
$$MVD_x = MV_x[n] - MV_x[n-1]$$
$$MVD_y = MV_y[n] - MV_y[n-1]$$
Example
MV = 2 2 3 5 3 1 -1
MVD = 0 1 2 -2 -2 -2 …
− Binary: 1 010 0010 0011 0011 0011 …
53
MVD VLC
… …
-2 & 30 0011
-1 011
0 1
1 010
2 & -30 0010
3 & -29 0001 0
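The differential coding of motion vectors can be checked with a small sketch that uses only the excerpt of the VLC table shown above. The function names are illustrative, and, as in the slide's example, the first vector is used only as the predictor for the second one.

```python
# Excerpt of the MVD VLC table (MVD value -> codeword). The paired values
# such as "2 & -30" share one codeword; only the small values are kept here.
MVD_VLC = {-2: "0011", -1: "011", 0: "1", 1: "010", 2: "0010", 3: "00010"}


def mv_differences(mvs):
    """MVD[n] = MV[n] - MV[n-1] for one vector component."""
    return [cur - prev for prev, cur in zip(mvs, mvs[1:])]


def encode_mvd(mvds):
    """Concatenate the codewords, separated by spaces for readability."""
    return " ".join(MVD_VLC[d] for d in mvds)


if __name__ == "__main__":
    mvd = mv_differences([2, 2, 3, 5, 3, 1, -1])
    print(mvd)
    print(encode_mvd(mvd))
```

Small differences get the shortest codewords, which is why neighbouring macroblocks with similar motion cost very few bits.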
Addressing of Motion Vectors
54
1) Motion estimation for each macroblock (MB)
MB: 16 x 16
Search range (Motion Vector Range): ±15
2) Select a compression mode
DBD = Displaced Block Difference
= 𝑓(𝑥, 𝑦, 𝑡) − 𝑓(𝑥+Δ𝑥, 𝑦 + Δ𝑦, 𝑡 − 1)
3) Process each MB to generate a header followed by a data bit stream that is consistent
with the compression mode chosen.
H.261 Motion Estimation and Compression Modes
Selection Considerations:
• Variance of the macroblock
• Block Difference (BD)
• Displaced Block Difference (DBD)
Determination Rules:
(a) If the variance of DBD is smaller than that of BD:
Inter + MC is selected (the motion vector must be transmitted)
otherwise:
the motion vector is not transmitted
(b) Small variance: Intra
Large variance: Inter (Motion vector=8)
(c) The prediction can optionally be modified by a 2-D spatial filter for each 8×8 block
(separable, with coefficients 1/4, 1/2, 1/4)
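A separable filter with coefficients 1/4, 1/2, 1/4 can be sketched as one horizontal pass followed by one vertical pass over the 8×8 block. In this illustration the samples on the block edge are left unchanged in the direction where a tap would fall outside the block, and no rounding to integers is applied; both choices, like the function names, are assumptions of the sketch.

```python
def filter_1d(samples):
    """Apply taps (1/4, 1/2, 1/4); the two end samples are left unchanged."""
    out = list(samples)
    for i in range(1, len(samples) - 1):
        out[i] = (samples[i - 1] + 2 * samples[i] + samples[i + 1]) / 4
    return out


def loop_filter(block):
    """Separable 2-D filter: filter every row, then every column."""
    rows = [filter_1d(row) for row in block]
    cols = [filter_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


if __name__ == "__main__":
    block = [[0] * 8 for _ in range(8)]
    block[4][4] = 16                      # a single bright pixel
    for row in loop_filter(block):
        print(row)
```

The impulse spreads into a 3×3 neighbourhood with weights 1, 2, 1 / 2, 4, 2 / 1, 2, 1 (divided by 16), which is the low-pass, blurring behaviour attributed to the loop filter.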
H.261 Mode Selection
56
H.261 Mode Selection
Forced Updating
− Intraframe coded MBs increase the resilience of the H.261 codec to channel errors.
− If the inter/intra MB decision never selects intra mode, some of the MBs in a frame are forced to be
intra coded.
− The specification recommends that an MB should be updated at least once every 132 frames.
− This means that for CIF pictures with 396 MBs/frame, on average 3 MBs of every frame are intraframe
coded.
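The arithmetic behind the forced-updating figure (396 MBs / 132 frames = 3 MBs per frame) can be turned into a simple cyclic refresh schedule. This is only one possible way to spread the updates; the standard does not prescribe the order, and the function name and parameters are hypothetical.

```python
def forced_update_mbs(frame_no, mbs_per_frame=396, period=132):
    """Return the MB indices (0-based) to force into intra mode in a frame."""
    per_frame = -(-mbs_per_frame // period)        # ceiling division
    start = (frame_no * per_frame) % mbs_per_frame
    return [(start + k) % mbs_per_frame for k in range(per_frame)]


if __name__ == "__main__":
    print(forced_update_mbs(0))      # first three MBs of a CIF picture
    print(forced_update_mbs(131))    # last three MBs, closing the cycle
```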
57
Decision tree for macroblock type
Types of Macroblocks
1. Inter coded: interframe coded MBs with no motion vector or with a zero motion vector.
2. MC coded: motion-compensated MB, where the MC error is significant and needs to be
DCT coded.
3. MC not coded: motion-compensated MBs where the motion-compensated error is
insignificant, so the error does not need to be DCT coded.
4. Intra coded: intraframe coded MBs.
5. Skipped (not coded, fixed):
• If all the six blocks in an MB without MC have an insignificant energy, they are not
coded. These MBs are sometimes called skipped, not coded or fixed MBs.
• These types of MBs normally occur at the static parts of the image sequence. Fixed
MBs are therefore not transmitted, and at the decoder they are copied from the
previous frame.
• Since the quantiser step sizes are determined at the beginning of each GOB or
row of GOBs, they have to be transmitted to the receiver.
• Hence, the first MBs have to be identified with a new quantiser parameter.
• Therefore, we can have some new MB types:
6. Inter coded + Q
7. MC coded + Q
8. Intra + Q
58
Addressing of Blocks
Once the type of an MB is identified and variable length coded, its position inside the GOB should also be
determined.
− The quantity of the combinations of the coded/noncoded blocks.
• Since an MB has six blocks, there are $2^6 = 64$ different states.
• Except the one with all six blocks not coded (fixed MB),
the remaining 63 are identified within 63 different patterns.
− The pattern information consists of a set of 63 Coded Block Pattern (CBP) indicating coded/noncoded
blocks within an MB.
− With a coding order of Y0, Y1, Y2, Y3, Cb and Cr, the block pattern information or pattern number is
defined as
$$\text{Pattern number} = 32\,Y_0 + 16\,Y_1 + 8\,Y_2 + 4\,Y_3 + 2\,C_b + C_r$$
where coded and noncoded blocks are assigned 1 and 0, respectively.
59
Addressing of Blocks
Examples of bit patterns indicating the coded/not-coded blocks in an MB (black, coded; white, not coded)
The pattern information is not transmitted for intra coded MBs:
− Each pattern number is variable length coded.
− If an MB is intra coded, its pattern information is not transmitted.
− This is because, in an intraframe coded MB, all blocks have significant energy and will
definitely be coded.
− In other words, there are no noncoded blocks in an intra coded MB.
Example:
CBP = 110011 (binary) = 51 (decimal): Y0, Y1, Cb and Cr are transmitted
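The pattern number is simply a 6-bit binary number with one bit per block in coding order, so it can be computed and inverted with a few bit operations. The sketch below assumes the coding order Y0, Y1, Y2, Y3, Cb, Cr given earlier; the function names are illustrative.

```python
BLOCK_ORDER = ("Y0", "Y1", "Y2", "Y3", "Cb", "Cr")   # weights 32, 16, 8, 4, 2, 1


def pattern_number(coded_blocks):
    """Pattern number for a set of coded block names."""
    number = 0
    for name in BLOCK_ORDER:
        number = (number << 1) | int(name in coded_blocks)
    return number


def coded_blocks_of(number):
    """Inverse mapping: list the coded blocks for a pattern number 1..63."""
    return [name for bit, name in enumerate(BLOCK_ORDER)
            if number & (32 >> bit)]


if __name__ == "__main__":
    print(pattern_number({"Y0", "Y1", "Cb", "Cr"}))   # binary 110011
    print(coded_blocks_of(60))                        # all four luminance blocks
```

Pattern 60 (all luminance blocks coded, no chrominance) has the shortest codeword in the CBP table, reflecting how often it occurs.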
60
CBP   CODE          CBP   CODE
60    111           35    0001 1100
4     1101          13    0001 1011
8     1100          49    0001 1010
16    1011          21    0001 1001
32    1010          41    0001 1000
12    1001 1        14    0001 0111
48    1001 0        50    0001 0110
20    1000 1        22    0001 0101
40    1000 0        42    0001 0100
28    0111 1        15    0001 0011
44    0111 0        51    0001 0010
52    0110 1        23    0001 0001
56    0110 0        43    0001 0000
1     0101 1        25    0000 1111
61    0101 0        37    0000 1110
2     0100 1        26    0000 1101
62    0100 0        38    0000 1100
24    0011 11       29    0000 1011
36    0011 10       45    0000 1010
3     0011 01       53    0000 1001
63    0011 00       57    0000 1000
5     0010 111      30    0000 0111
9     0010 110      46    0000 0110
17    0010 101      54    0000 0101
33    0010 100      58    0000 0100
6     0010 011      31    0000 0011 1
10    0010 010      47    0000 0011 0
18    0010 001      55    0000 0010 1
34    0010 000      59    0000 0010 0
7     0001 1111     27    0000 0001 1
11    0001 1110     39    0000 0001 0
19    0001 1101
VLC Table for Coded Block Pattern (CBP)
Addressing of Blocks
61
Addressing of Blocks
Relative addressing of coded MB
− The overhead information for addressing of the positions of the coded MB is minimised if they are relatively
addressed to each other.
− Numbers represent the relative addressing value of the number of fixed MBs preceding a nonfixed MB.
− The GOB start code indicates the beginning of the GOB.
− These relative addressing numbers are finally variable length coded.
62
Loop Filter
− At low bit rates the quantiser step size is normally large, which can force many DCT coefficients to zero.
− If only the DC and a few AC coefficients remain, then the reconstructed picture appears blocky.
− When the positions of blocky areas vary from one frame to another, it appears as a high-frequency noise,
commonly referred to as mosquito noise.
− The blockiness degradations at the slant edges of the image appear as staircase noise.
− Coarse quantisation of the coefficients that results in the loss of high-frequency components implies that
compression can be modelled as a low-pass filtering process.
− These artefacts are to some extent reduced by using the loop filter. The low-pass filter removes the
high-frequency and block boundary distortions.
63
Loop Filter
− Loop filtering is introduced after the
motion compensator to improve the
prediction.
− It should be noted that the loop filter
has a picture blurring effect.
− It should be activated only for blocks
with motion, otherwise, nonmoving
parts of the pictures are repeatedly
filtered in the following frames, blurring
the picture.
− The filtering should be applied for
coding rates less than 6×64 kbit/s (six
DCT blocks of an MB) and switched off
otherwise.
Coded pictures with loop filter: (a) 128 kbit/s and (b) 64 kbit/s
H.261 coded at (a) 128 kbit/s and (b) 64 kbit/s
64
H.261 Compression Modes Summary
(× = field present, - = field absent)
Mode              VLC code        Mquant   MVD   CBP   TCOEFF
Intra             0001            -        -     -     ×
Intra             0000 001        ×        -     -     ×
Inter             1               -        -     ×     ×
Inter             0000 1          ×        -     ×     ×
Inter+MC          0000 0000 1     -        ×     -     -
Inter+MC          0000 0001       -        ×     ×     ×
Inter+MC          0000 0000 01    ×        ×     ×     ×
Inter+MC+Filter   001             -        ×     -     -
Inter+MC+Filter   01              -        ×     ×     ×
Inter+MC+Filter   0000 01         ×        ×     ×     ×
TCOEFF = transform coefficients
Mquant = quantization step size for the MB
CBP = coded block pattern (6 bits)
MVD = motion vector delta
65
Bit-Stream Syntax
− The Picture layer: Picture Start Code (PSC) delineates
boundaries between pictures. TR (Temporal Reference)
provides picture time-stamp.
− The GOB layer: H.261 pictures are divided into regions
of 11×3 macroblocks, each of which is called a Group
of Blocks (GOB). (GQuant indicates the Quantizer to be
used in the GOB)
− The Macroblock layer: Each Macroblock (MB) has its
own Address indicating its position within the GOB,
Quantizer (MQuant: Quantizer for Macroblock), and six
8×8 image blocks (4 Y, 1 Cb, 1 Cr).
− The Block layer: For each 8×8 block, the bitstream starts
with DC value, followed by pairs of length of zero-run
(Run) and the subsequent non-zero value (Level) for
ACs, and finally the End of Block (EOB) code. The range
of Run is [0, 63]. Level reflects quantized values: its
range is [−127, 127] and Level ≠ 0.
66
Bit-Stream Syntax
Picture layer
GOB layer
Macroblock layer
Block layer
Data format for Picture Layer
PSC TR Ptype PEI GOB
20 bits 5 bits 6 bits 1 bit Variable
1. PSC: Picture Start Code
2. TR: Temporal Reference
3. Ptype: Picture Type
4. PEI: Picture Extra Insertion
5. GOB Layer (Variable Length Codes)
VLC: Variable Length Coding
FLC: fixed length coding
Data Format of H.261
67
PSC: Picture Start Code: 20 bits
0000 0000 0000 0001 0000
(occurs once per picture)
TR: Temporal Reference: 5 bits (0-31)
TR is formed by incrementing its value in the previously transmitted picture header by one plus the number of
non-transmitted pictures since the last transmitted one.
(Each picture unit time: 1/30 or 1/29.97 second)
Format for Picture Layer
Ptype: Information about the complete picture
Bit 1: Split screen indicator, "0" off; "1" on.
Bit 2: Document camera indicator, "0" off; "1" on.
Bit 3: Freeze Picture Release, "0" off; "1" on.
Bit 4: Source Format, "0" QCIF; "1" CIF.
Bit 5-6: Spare
PEI: Picture Extra Insertion Information (1 bit)
"0": no Pspare follows and the GOB data come next (usually PEI=0); "1": an 8-bit Pspare follows.
Pspare: Picture Spare Information (0/8/16 … bits)
• If PEI is set to "1", then 9 bits follow, consisting of 8 bits of data (Pspare) and then another PEI bit to
indicate whether a further 9 bits follow, and so on.
• Encoders must not insert Pspare until specified by the CCITT.
• Decoders must discard Pspare if PEI is set to "1", which allows future backward-compatible additions to be specified in Pspare.
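The picture-layer fields described above (PSC, TR, Ptype, then PEI/Pspare pairs) can be read with a short parser. This is a minimal sketch, assuming the header is available as a string of '0'/'1' characters; the function names, the returned dictionary keys and the modulo-32 wrap of TR (implied by its 5-bit width) are assumptions of the example.

```python
PSC = "00000000000000010000"     # 20-bit Picture Start Code


def parse_picture_header(bits):
    """Parse PSC, TR, Ptype and any Pspare bytes; return (fields, next_pos)."""
    if bits[:20] != PSC:
        raise ValueError("picture start code not found")
    pos = 20
    tr = int(bits[pos:pos + 5], 2)
    pos += 5
    ptype = bits[pos:pos + 6]
    pos += 6
    pspare = []
    while bits[pos] == "1":                  # PEI = 1: 8 bits of Pspare follow
        pspare.append(int(bits[pos + 1:pos + 9], 2))
        pos += 9
    pos += 1                                 # consume the final PEI = 0
    fields = {
        "tr": tr,
        "split_screen": ptype[0] == "1",
        "document_camera": ptype[1] == "1",
        "freeze_picture_release": ptype[2] == "1",
        "source_format": "CIF" if ptype[3] == "1" else "QCIF",
        "pspare": pspare,                    # a real decoder would discard it
    }
    return fields, pos


def next_tr(previous_tr, skipped_pictures):
    """TR of the next transmitted picture, wrapped to 5 bits."""
    return (previous_tr + 1 + skipped_pictures) % 32


if __name__ == "__main__":
    header = PSC + "00101" + "000100" + "1" + "10101010" + "0"
    print(parse_picture_header(header))
    print(next_tr(30, 2))
```

The returned position points at the first bit of the GOB layer, so the same approach could be continued for the GOB header fields.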
69
[Figure: picture layer loop structure. PSC, TR and Ptype are followed by PEI; while PEI=1 a Pspare byte and another PEI follow; when PEI=0 the GOB layer follows. The GOB layer repeats 3 times for QCIF and 12 times for CIF before the next picture.]
Picture Layer Loop Structure
70
1. GBSC: Group of Block Start Code
2. GN: Group Number
3. Gquant: GOB quantization number
4. GEI: Group Extra Insertion
5. Gspare: GOB Spare
6. MB Data: Macroblock Data (Variable Length Code)
GOB data structure
GBSC (16 bits) | GN (4 bits) | Gquant (5 bits) | GEI (1 bit) | Gspare (0/8/16 … bits) | MB Data
Format for GOB Layer
71
GBSC: Group of Block Start Code
0000 0000 0000 0001
The start code is a fixed pattern that must not occur anywhere else in the bitstream; otherwise the decoder would detect a false start code and decoding would fail.
GN: Group Number (4 bits)
0000: reserved for the PSC (should not be used)
13, 14, 15: reserved for future use
Gquant: 5 bits
A fixed length codeword which indicates the quantizer to be used in the group of blocks until overridden
by any subsequent Mquant.
GEI: Group Extra Insertion Information (1 bit)
"0": no Gspare; "1": Gspare follows.
Gspare: GOB Spare Information (0/8/16 … bits)
Same as Pspare
72
GBSC GN Gquant GEI Gspare
MB
Layer
The MB layer can repeat at most 33 times.
GOB Layer Loop Structure
73
Data structure of MB layer
1. MBA: Macroblock Address
2. Mtype: Macroblock Type
3. Mquant: Macroblock quantization level (5 bits)
4. MVD: Motion Vector Difference
5. CBP: Coded Block Pattern
6. Block Data
MBA | Mtype | Mquant (5 bits) | MVD | CBP | Block Data
Format for MB Layer
74
Macroblock
MBA: Macroblock Address
A variable length codeword indicating the position of a macroblock within a group of blocks (GOB).
[Figure: a GOB of 33 macroblocks numbered 1 to 11, 12 to 22 and 23 to 33; each macroblock has a 16x16 Y area and 8x8 Cr and Cb blocks.]
75
Mtype: Macroblock Type
Mquant: (5 bits) - fixed length
Mquant signifies the quantizer to be used for this and any following blocks in the GOB until overridden by
any subsequent Mquant:
1. Used for coding control
2. Can be adjusted to meet the required bit rate
3. Used to control image quality
MVD: Motion Vector Data (variable length)
MVD is included for all MC macroblocks. MVD is obtained by subtracting the vector of the preceding
macroblock from the vector of the current macroblock. The preceding vector is taken as zero when:
1. The MVD is for macroblock 1, 12 or 23
2. MBA does not represent a difference of 1
3. Mtype of the previous macroblock was not MC
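The differencing rule and its three exceptions can be sketched as follows. The function and parameter names are hypothetical; the modulo-32 wrap reflects the fact that the MVD table pairs values such as "2 & -30" under one codeword, and is shown here as one plausible way to pick the representative in the range -16 to 15.

```python
def mvd_predictor(prev_mv, mb_number, mba_diff, prev_was_mc):
    """Vector that is subtracted from the current vector to form the MVD."""
    if mb_number in (1, 12, 23) or mba_diff != 1 or not prev_was_mc:
        return (0, 0)          # the three exceptions: predictor is reset to zero
    return prev_mv


def wrap(d):
    """Map a difference onto the representative in -16..15 sharing its codeword."""
    return ((d + 16) % 32) - 16


def compute_mvd(cur_mv, prev_mv, mb_number, mba_diff, prev_was_mc):
    px, py = mvd_predictor(prev_mv, mb_number, mba_diff, prev_was_mc)
    return (wrap(cur_mv[0] - px), wrap(cur_mv[1] - py))
```

For example, `compute_mvd((3, -2), (1, 1), 5, 1, True)` gives `(2, -3)`, while the same vectors at macroblock 12 give `(3, -2)` because the predictor is reset at the start of a macroblock row.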
Mquant and MVD Codes
5bits
MBA Mtype Mquant MVD CBP Block Data
77
78
VLC Table for MVD
MVD CODE
-16 & 16 0000 0011 001
-15 & 17 0000 0011 011
-14 & 18 0000 0011 101
-13 & 19 0000 0011 111
-12 & 20 0000 0100 001
-11 & 21 0000 0100 011
-10 & 22 0000 0100 11
-9 & 23 0000 0101 01
-8 & 24 0000 0101 11
-7 & 25 0000 0111
-6 & 26 0000 1001
-5 & 27 0000 1011
-4 & 28 0000 111
-3 & 29 0001 1
-2 & 30 0011
-1 011
0 1
MVD CODE
1 010
2 & -30 0010
3 & -29 0001 0
4 & -28 0000 110
5 & -27 0000 1010
6 & -26 0000 1000
7 & -25 0000 0110
8 & -24 0000 0101 10
9 & -23 0000 0101 00
10 & -22 0000 0100 10
11 & -21 0000 0100 010
12 & -20 0000 0100 000
13 & -19 0000 0011 110
14 & -18 0000 0011 100
15 & -17 0000 0011 010
CBP: Coded Block Pattern (variable length)
CBP is present if indicated by Mtype.
The codeword gives a pattern number signifying those blocks in the macroblock for which at
least one transform coefficient is transmitted. The pattern number is
$$\text{Pattern number} = 32\,Y_0 + 16\,Y_1 + 8\,Y_2 + 4\,Y_3 + 2\,C_b + C_r$$
where coded and non-coded blocks are assigned 1 and 0, respectively.
5bits
MBA Mtype Mquant MVD CBP Block Data
[Figure: macroblock layout with luminance blocks Y0, Y1 (top row) and Y2, Y3 (bottom row), plus chrominance blocks Cb and Cr]
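A minimal sketch of the pattern-number formula and its inverse is given below. The function names are illustrative only; each argument is 1 if the block has at least one transmitted coefficient and 0 otherwise.

```python
def cbp_pattern_number(y0, y1, y2, y3, cb, cr):
    """Pattern number = 32*Y0 + 16*Y1 + 8*Y2 + 4*Y3 + 2*Cb + Cr."""
    return 32 * y0 + 16 * y1 + 8 * y2 + 4 * y3 + 2 * cb + cr


def cbp_flags(pattern):
    """Inverse mapping: pattern number -> (Y0, Y1, Y2, Y3, Cb, Cr) flags."""
    return tuple((pattern >> shift) & 1 for shift in (5, 4, 3, 2, 1, 0))
```

For instance, a macroblock in which only the four luminance blocks carry coefficients has pattern number 60, and `cbp_flags(60)` returns `(1, 1, 1, 1, 0, 0)`.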
79
[Figure: MB layer loop: MBA (with optional MBA stuffing), Mtype, then Mquant, MVD and CBP as indicated by Mtype, followed by the Block layer]
MB Layer Loop Structure
5bits
MBA Mtype Mquant MVD CBP Block Data
80
81
Block layer
− A macroblock comprises four luminance blocks and one of each of the two colour difference
blocks.
− Data for a block consists of codewords for transform coefficients (TCOEFF) followed by an end of block
(EOB) marker.
− The order of block transmission is Y0, Y1, Y2, Y3 (blocks 1 to 4), then Cb (block 5) and Cr (block 6).
EOB: End of Block
[Figure: Block Layer Loop Structure: TCOEFF codewords repeated, terminated by EOB]
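DCT: Discrete Cosine Transform
$$F(u,v) = \frac{C(u)\,C(v)}{4} \sum_{x=0}^{7}\sum_{y=0}^{7} f(x,y)\,\cos\frac{(2x+1)u\pi}{16}\,\cos\frac{(2y+1)v\pi}{16}$$
IDCT: Inverse Discrete Cosine Transform
$$f(x,y) = \frac{1}{4} \sum_{u=0}^{7}\sum_{v=0}^{7} C(u)\,C(v)\,F(u,v)\,\cos\frac{(2x+1)u\pi}{16}\,\cos\frac{(2y+1)v\pi}{16}$$
where $C(w) = 1/\sqrt{2}$ for $w = 0$ and $C(w) = 1$ otherwise, so that $C(u)\,C(v) = 1/2$ when $u = v = 0$.
DCT and IDCT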
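The 8×8 transform pair can be checked numerically with a direct, unoptimised implementation of the standard 8×8 DCT and IDCT definitions. This is only a sketch for verifying the definitions (real codecs use fast factorised transforms); the names `dct2`, `idct2` and `c` are chosen for this example.

```python
import math

N = 8


def c(w):
    """Normalisation factor: 1/sqrt(2) for the zero frequency, 1 otherwise."""
    return 1 / math.sqrt(2) if w == 0 else 1.0


def basis(a, k):
    """cos((2a + 1) * k * pi / 16), the 1-D DCT basis sample."""
    return math.cos((2 * a + 1) * k * math.pi / 16)


def dct2(f):
    """Forward 8x8 DCT of f[x][y], returning F[u][v]."""
    return [[c(u) * c(v) / 4 * sum(f[x][y] * basis(x, u) * basis(y, v)
                                   for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(F):
    """Inverse 8x8 DCT of F[u][v], returning f[x][y]."""
    return [[sum(c(u) * c(v) * F[u][v] * basis(x, u) * basis(y, v)
                 for u in range(N) for v in range(N)) / 4
             for y in range(N)] for x in range(N)]
```

A flat block with every sample equal to 100 transforms to a single DC coefficient of 800 (eight times the mean) with all AC coefficients zero, and `idct2(dct2(block))` recovers the input up to floating-point rounding.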
82
− For Intra blocks, the DC coefficient is linearly quantized with a step size of 8 and no dead zone.
− The DC coefficients of all Intra blocks are fixed length coded (FLC) with 8 bits.
− A nominally black block gives 0001 0000 and a nominally white one 1110 1011.
− The codes 0000 0000 and 1000 0000 are not used.
− For the Intra DC coefficient, the Reconstruction Levels (RECs) are given in the following table:
Intra DC Coefficient Inverse Quantization
FLC (Fixed Length Coding) | Reconstruction level (REC) into inverse transform
0000 0001 (1) | 8
0000 0010 (2) | 16
0000 0011 (3) | 24
… | …
0111 1111 (127) | 1016
1111 1111 (255) | 1024
1000 0001 (129) | 1032
… | …
1111 1101 (253) | 2024
1111 1110 (254) | 2032
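The table amounts to "multiply the code by 8", with code 255 standing in for the level 128 whose natural code 1000 0000 is not allowed. A small sketch, with hypothetical function names and an assumed round-to-nearest rule on the encoder side:

```python
def intra_dc_reconstruct(code):
    """Map the 8-bit intra DC code to its reconstruction level."""
    if code in (0x00, 0x80):
        raise ValueError("codes 0000 0000 and 1000 0000 are not used")
    if code == 0xFF:
        return 1024            # 1111 1111 replaces the forbidden 1000 0000
    return 8 * code


def intra_dc_code(dc):
    """Quantise a DC coefficient with step size 8 (rounding is an assumption)."""
    level = min(254, max(1, int(dc / 8 + 0.5)))
    return 0xFF if level == 128 else level
```

With these definitions the nominally black code 0001 0000 (16) reconstructs to 128 and the nominally white code 1110 1011 (235) to 1880.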
83
− For all coefficients other than the Intra DC one, the Reconstruction Levels (RECs) are in the
range -2048 to 2047 and are given by clipping the results of the following formulae:
QUANT odd:
REC = QUANT*(2*LEVEL+1); LEVEL > 0
REC = QUANT*(2*LEVEL-1); LEVEL < 0
QUANT even:
REC = QUANT*(2*LEVEL+1)-1; LEVEL > 0
REC = QUANT*(2*LEVEL-1)+1; LEVEL < 0
REC = 0; LEVEL = 0
− Note: QUANT ranges from 1 to 31 and is transmitted by either Gquant or Mquant.
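The odd/even rule and the clipping can be written compactly as below. The function name `reconstruct` is illustrative; the values it produces can be compared against the Reconstruction Levels table.

```python
def reconstruct(level, quant):
    """Inverse quantisation for every coefficient except the intra DC one."""
    if not 1 <= quant <= 31:
        raise ValueError("QUANT must be in 1..31")
    if level == 0:
        return 0
    sign = 1 if level > 0 else -1
    rec = quant * (2 * level + sign)      # QUANT*(2*LEVEL+1) or QUANT*(2*LEVEL-1)
    if quant % 2 == 0:
        rec -= sign                       # even QUANT: pull one step toward zero
    return max(-2048, min(2047, rec))     # clip to the 12-bit range
```

For example, `reconstruct(1, 2)` is 5 and `reconstruct(57, 18)` clips to 2047, matching the corresponding table entries.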
DCT Coefficient (except Intra DC) Inverse Quantization
84
QUANT
LEVEL 1 2 3 4 … 8 9 … 17 18 … 30 31
-127 -255 -509 -765 -1019…-2039 -2048 …-2048 -2048 … -2048 -2048
-126 -253 -505 -759 -1011…-2023 -2048 …-2048 -2048 … -2048 -2048
-2 -5 -9 -15 -19 … -39 -45 … -85 -89 … -149 -155
-1 -3 -5 -9 -11 … -23 -27 … -51 -53 … -89 -93
0 0 0 0 0 … 0 0 … 0 0 … 0 0
1 3 5 9 11 … 23 27 … 51 53 … 89 93
2 5 9 15 19 … 39 45 … 85 89 … 149 155
3 7 13 21 27 … 55 63 … 119 125 … 209 217
4 9 17 27 35 … 71 81 … 153 161 … 269 279
5 11 21 33 43 … 87 99 … 187 197 … 329 341
56 113 225 339 451 … 903 1017 … 1921 2033 … 2047 2047
57 115 229 345 459 … 919 1035 … 1955 2047 … 2047 2047
58 117 233 351 467 … 935 1053 … 1989 2047 … 2047 2047
59 119 237 357 475 … 951 1071 … 2023 2047 … 2047 2047
60 121 241 363 483 … 967 1089 … 2047 2047 … 2047 2047
125 251 501 753 1003 … 2007 2047 … 2047 2047 … 2047 2047
126 253 505 759 1011 … 2023 2047 … 2047 2047 … 2047 2047
127 255 509 765 1019 … 2039 2047 … 2047 2047 … 2047 2047
Reconstruction Levels (REC)
85
86
1 2 6 7 15 16 28 29
3 5 8 14 17 27 30 43
4 9 13 18 26 31 42 44
10 12 19 25 32 41 45 54
11 20 24 33 40 46 53 55
21 23 34 39 47 52 56 61
22 35 38 48 51 57 60 62
36 37 49 50 58 59 63 64
− Transform coefficient data is always present for all six blocks in a macroblock when MTYPE
indicates Intra.
− In other cases, MTYPE and CBP signal which blocks have coefficient data transmitted for them.
− The quantized transform coefficients are transmitted sequentially according to the zigzag
scan sequence shown in the matrix above.
Ordering of DCT Coefficients or Transform Coefficient (TCOEFF)
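The zigzag numbering can be generated rather than stored: coefficients are visited one anti-diagonal at a time, alternating direction. The sketch below (function names are illustrative) reproduces the 1 to 64 matrix on this slide and applies the scan to a block.

```python
def zigzag_order(n=8):
    """(row, col) positions in zigzag order for an n x n block."""
    def key(rc):
        r, col = rc
        s = r + col                       # index of the anti-diagonal
        # odd diagonals run top-right to bottom-left, even ones the other way
        return (s, r if s % 2 else -r)
    return sorted(((r, col) for r in range(n) for col in range(n)), key=key)


def zigzag_matrix(n=8):
    """Matrix whose entry is the 1-based transmission position of each coefficient."""
    m = [[0] * n for _ in range(n)]
    for position, (r, col) in enumerate(zigzag_order(n), start=1):
        m[r][col] = position
    return m


def zigzag_scan(block):
    """Flatten a square block into a 1-D list in zigzag order."""
    return [block[r][col] for r, col in zigzag_order(len(block))]
```

The first row of `zigzag_matrix()` is `[1, 2, 6, 7, 15, 16, 28, 29]`, as in the matrix on the slide.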
87
− The most commonly occurring combinations of (RUN, LEVEL) are encoded with Variable
Length Codes.
− The least commonly occurring combinations of (RUN, LEVEL) are encoded with a 20-bit word
consisting of 6 bits ESCAPE, 6 bits RUN and 8 bits LEVEL.
− There are two code tables for VLC:
• One is used for the first transmitted LEVEL in “Inter” and “Inter + MC” blocks
• One is used for all other LEVELs, except the DC in Intra blocks, which is fixed length coded with 8 bits.
DCT Coefficients Coding
88
RUN LEVEL CODE
EOB 10
0 1 1s IF FIRST COEFFICIENT
0 1 11s NOT FIRST COEFFICIENT
0 2 0100 s
0 3 0010 1s
0 4 0000 110s
0 5 0010 0110 s
0 6 0010 0001 s
0 7 0000 0010 10 s
0 8 0000 0001 1101 s
0 9 0000 0001 1000 s
0 10 0000 0001 0011 s
0 11 0000 0001 0000 s
0 12 0000 0000 1101 0s
0 13 0000 0000 1100 1s
0 14 0000 0000 1100 0s
0 15 0000 0000 1011 1s
RUN LEVEL CODE
1 1 011s
1 2 0001 10s
1 3 0010 0101 s
1 4 0000 0011 00s
1 5 0000 0001 1011 s
1 6 0000 0000 1011 0s
1 7 0000 0000 1010 1s
2 1 0101 s
2 2 0000 100s
2 3 0000 0010 11s
2 4 0000 0001 0100 s
2 5 0000 0000 1010 0s
3 1 0011 1s
3 2 0010 0100 s
3 3 0000 0001 1100 s
3 4 0000 0000 1001 1s
4 1 0011 0s
4 2 0000 0011 11s
4 3 0000 0001 0010 s
VLC Table for TCOEFF (1)
End of Block (EOB)
− EOB is part of this code set.
− Because CBP indicates those blocks with no coefficient data, EOB cannot occur as the first coefficient.
− Hence, EOB can be removed from the VLC table for the first coefficient.
RUN LEVEL CODE
5 1 0001 11s
5 2 0000 0010 01s
5 3 0000 0000 1001 0s
6 1 0001 01s
6 2 0000 0001 1110 s
7 1 0001 00s
7 2 0000 0001 0101 s
8 1 0000 111s
8 2 0000 0001 0001 s
9 1 0000 101s
9 2 0000 0000 1000 1s
10 1 0010 0111 s
10 2 0000 0000 1000 0s
11 1 0010 0011s
12 1 0010 0010 s
13 1 0010 0000 s
RUN LEVEL CODE
14 1 0000 0011 10s
15 1 0000 0011 01s
16 1 0000 0010 00s
17 1 0000 0001 1111 s
18 1 0000 0001 1010 s
19 1 0000 0001 1001 s
20 1 0000 0001 0111 s
21 1 0000 0001 0110 s
22 1 0000 0000 1111 1s
23 1 0000 0000 1111 0s
24 1 0000 0000 1110 1s
25 1 0000 0000 1110 0s
26 1 0000 0000 1101 1s
ESCAPE 0000 01
VLC Table for TCOEFF (2)
89
− The least commonly occurring combinations of (RUN, LEVEL) are encoded with a 20-bit word
consisting of 6 bits ESCAPE, 6 bits RUN and 8 bits LEVEL.
Fixed Length Coding Table for TCOEFF
RUN is a 6-bit fixed length code; LEVEL is an 8-bit fixed length code.
RUN | CODE
0 | 0000 00
1 | 0000 01
2 | 0000 10
… | …
63 | 1111 11
LEVEL | CODE
-128 | FORBIDDEN
-127 | 1000 0001
… | …
-2 | 1111 1110
-1 | 1111 1111
0 | FORBIDDEN
1 | 0000 0001
2 | 0000 0010
… | …
127 | 0111 1111
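As an illustration of the 20-bit escape format, the sketch below assembles the word from the ESCAPE prefix, a 6-bit RUN and an 8-bit two's-complement LEVEL. The function name and the string-of-bits output are assumptions made for readability.

```python
ESCAPE = "000001"   # 6-bit ESCAPE codeword from the TCOEFF VLC table


def escape_code(run, level):
    """20-bit fixed length word for a (RUN, LEVEL) pair with no VLC entry."""
    if not 0 <= run <= 63:
        raise ValueError("RUN must fit in 6 bits")
    if level == 0 or not -127 <= level <= 127:
        raise ValueError("LEVEL 0 and -128 are forbidden")
    # level & 0xFF gives the 8-bit two's-complement representation
    return ESCAPE + format(run, "06b") + format(level & 0xFF, "08b")
```

For example, `escape_code(2, -127)` yields `000001` `000010` `10000001`, in agreement with the RUN and LEVEL tables.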
In the VLC tables, the last bit "s" denotes the sign of the level:
"0" for positive, "1" for negative.
90
91
Bit-Stream Syntax, FLC and VLC Loop Structures Summary
Examples of FLC (Fixed Length Coding)
− PSC: Picture Start Code, 20 bits
− TR: Temporal Reference, 5-bit
− PTYPE: Picture Type, 6 bits
− PEI: Extra insertion information (1 bit) – set if
PSPARE to follow.
− PSPARE: Extra information (0/8/16. . .bits) – not
used, always followed by PEI.
− GBSC: GOB Start Code, 16 bits
− GN: Group Number, 4 bits, indexing 12 GOBs
− GQUANT: Group Quantization information, 5 bits
− MQUANT: MB Quantization information, 5 bits
− EOB: End-of-Block
92
Bit-Stream Syntax, FLC and VLC Loop Structures Summary
Examples of VLC (Variable Length Coding)
− MBA: MB Address, indexing MBs within a GOB,
11 bits max
− MTYPE: MB Type information
− GEI: Same function and size as PEI.
− GSPARE: Same function and size as PSPARE.
− MVD: Motion Vector Data, 11 bits max, 32 VLCs
− CBP: Coded Block Pattern, 9 bits max, 63 VLCs
− TCOEFF: Transform Coefficients
93
− The Problem: H.261 is typically used to send data over a constant bit rate channel, such as ISDN (e.g.
384kbps).
− The encoder output bit rate varies depending on amount of movement in the scene.
− Therefore, a rate control mechanism is required to map this varying bit rate onto the constant bit rate
channel.
Rate Control
94
− The encoded bitstream is buffered and the buffer is emptied at the constant bit rate of the channel
− An increase in scene activity will result in the buffer filling up
• The quantization step size in the encoder is increased which increases the compression factor and reduces
the output bit rate
− If the buffer starts to empty, then the quantization step size is reduced which reduces compression
and increases the output bit rate.
− The compression, and the quality, can vary considerably depending on the amount of motion in
the scene
• Relatively "static" scenes lead to low compression and high quality
• “Active" scenes lead to high compression and lower quality
[Figure: video sequence → encoder → buffer → channel, with rate control feeding back from the buffer to the encoder]
Rate Control
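The buffer feedback loop described above can be imitated with a toy simulation. Everything in it is an assumption made for illustration: the bits produced per frame are modelled as `complexity / quant`, and the quantiser is set in proportion to buffer fullness. It is not the rate control of any particular H.261 encoder.

```python
def simulate_rate_control(complexities, channel_bits_per_frame, buffer_size, quant=8):
    """Return a list of (quant, buffer_fullness) pairs, one per coded frame."""
    fullness = buffer_size / 2
    history = []
    for complexity in complexities:
        produced = complexity / quant                 # toy model of encoder output
        fullness += produced - channel_bits_per_frame # channel drains at constant rate
        fullness = max(0.0, min(float(buffer_size), fullness))
        # fuller buffer -> coarser quantiser (1..31), emptier buffer -> finer
        quant = max(1, min(31, round(31 * fullness / buffer_size)))
        history.append((quant, fullness))
    return history


static_scene = simulate_rate_control([20_000] * 30, 5_000, 40_000)
active_scene = simulate_rate_control([200_000] * 30, 5_000, 40_000)
```

In this model the active scene ends with a much coarser quantiser than the static one, which is the behaviour the slide describes: more activity fills the buffer, the step size rises, and quality drops.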
− Even when channel coding is used, some residual (transmission) errors may reach the source decoder.
− Residual errors may be detected at the source decoder through syntactic and semantic
inconsistencies.
− For digital video, the most basic error concealment techniques are:
• Repeating the co-located data from the previous frame
• Repeating data from the previous frame after motion compensation
− Error concealment for undetected errors may be performed through post-processing.
95
Error Concealment
96
Error Concealment and Post-Processing, Examples
Error Concealment
97
[Figure: video source → compress (encode) → coded video → decompress (decode) → video display]
ENCODER + DECODER = CODEC
What Is MPEG?
– MPEG is an encoding and compression system for digital multimedia content defined by the
Moving Picture Experts Group (MPEG).
– MPEG reduces the amount of data needed to represent video many times over, while still
retaining very high picture quality.
– MPEG can compress both audio and video.
– Similar to the reference model in H.261, software-based reference codecs for laboratory
testing have also been developed for MPEG-1 and MPEG-2. For these codecs, the reference
codec is called the Test Model (TM).
98
− Coding of moving pictures and associated audio for digital storage media (Standard ISO/IEC
11172-2 (1991))
− The MPEG-1 video coding algorithm is largely an extension of H.261, and many of the features
are common. Their bitstreams are, however, incompatible, although their encoding units are
very similar.
− MPEG-1 is the first generation of video codecs proposed by the MPEG as a standard to
provide video coding for digital storage media or DSM (other than the conventional analogue
video cassette recorders (VCRs))
− Since coding for digital storage can be regarded as a competitor to VCRs, MPEG-1 video
quality at the rate of 1–1.5 Mbit/s is expected to be comparable to VCRs.
99
MPEG-1 Standard
− Designed for up to 1.5 Mbit/sec (Although in most applications the MPEG-1 video bit rate is in
the range of 1–1.5 Mbit/s, the international standard does not limit the bit rate, and higher bit
rates might be used for other applications)
− A popular standard for video on the Internet, transmitted as .mpg files.
− Standard for the compression of moving pictures and audio.
− Layer 3 of MPEG-1 audio is the most popular standard for digital compression of audio, known as
MP3.
− Optimized & used for storing movies on CD ROM
− Supports progressive images, non-interlaced video (Interlaced sources have to be converted
to a non-interlaced format before coding.)
100
MPEG-1 Standard
Video
− Optimized for bitrates around 1.5 Mbit/s
− Originally optimized for SIF picture format, but not limited to it:
• 352x240 pixels at 30 frames/sec [ NTSC based ]
• 352x288 pixels at 25 frames/sec [ PAL based ]
− Progressive frames only - no direct provision for interlaced video applications, such as broadcast television
Audio
− Joint stereo audio coding at 192 kbit/s (layer 2)
System
− Mainly designed for error-free digital storage media
− Multiplexing of audio, video and data
Applications
− CD-I, digital multimedia, and video database (e.g. video-on-demand)
101
MPEG-1 Standard (Standard ISO/IEC 11172-2 (1991))
Source Input
− Supports only 352 * 240 resolution
− All the three main picture types, I, P and B, have the same SIF size with 4:2:0 format.
− (In SIF-625, the luminance part of each picture has 360 pixels, 288 lines and 25 Hz, and those of each chrominance
are 180 pixels, 144 lines and 25 Hz)
− Before we describe how I-frames are encoded, we should describe our input.
− 3 planes of Y, U, V
• 8 bits per pixel.
• Y range [0,255].
• U and V range [-128,127] (U and V biased by 128 to put in range [0,255])
− Planes are all of the same size.
− Pixels colocated between frames.
MPEG-1 Standard
102
103
MPEG-1 Standard
H.261 | MPEG-1
Sequential access | Random access
One basic frame rate | Flexible frame rate
QCIF and CIF images only | Flexible image size
I and P frames only | I, P and B frames
MC over 1 frame | MC over 1 or more frames
1 pixel MV accuracy | 1/2 pixel MV accuracy
1-2-1 filter in the loop | No filter
Variable threshold + uniform quantization | Quantization matrix
No GOP structure | GOP structure
GOB structure | Slice structure
− The MPEG-1 standard gives the syntax description of how audio, video and data are combined
into a single data stream. This sequence is formally termed as the ISO 11172 stream.
− It consists of a compression layer and a systems layer.
104
Systems Coding Outline
To support the combination of video
and audio elementary streams
Multiplexing of elementary
audio, video and data
− The MPEG-1 systems standard defines a packet structure for multiplexing coded audio and video
into one stream and keeping it synchronised.
− A pack consists of a pack header that gives the systems clock reference (SCR) and the bit rate of
the multiplexed stream followed by one or more packets.
− Each packet has its own header that conveys essential information about the elementary data
that it carries.
− The basic functions in systems layer are as follows:
• Synchronised presentation of decoded streams
• Construction of the multiplexed stream
• Initialisation of buffering for playback start-up
• Continuous buffer management
• Time identification
105
Systems Coding Outline
Multiplexing elementary streams
− The multiplexing of elementary stream (ES) of audio, video and data is performed at the packet
level.
− Each packet thus contains only one elementary data type.
− The systems layer syntax allows up to 32 audio, 16 video and 2 data streams to be multiplexed
together.
− If more than two data streams are needed, substreams may be defined.
106
Systems Coding Outline
107
Systems Coding Outline
ES Packetization process into MPEG-1 PS Stream (Packs)
[Figure: each packet consists of a packet header and a packet payload; each pack consists of a pack header and a pack payload]
108
Systems Coding Outline
MPEG-1 PS bitstream and its time related fields
SCR: Systems Clock Reference
STD: System Target Decoder
PTS: Presentation Time Stamp
DTS: Decoding Time Stamp
Synchronisation
− Prototypical encoder and decoder of MPEG-1,
illustrating end-to-end synchronisation
• STC: Systems Time Clock
• SCR: Systems Clock Reference
• PTS: Presentation Time Stamp
• DSM: Digital Storage Media
109
Systems Coding Outline
Synchronisation
− Multiple elementary streams are synchronised by means of Presentation Time Stamps (PTS) in the ISO
11172 bit stream (by recording time stamps during capture of raw data)
− The receivers will then make use of these PTS in each associated decoded stream to schedule their
presentations.
− Playback synchronisation is pegged onto a master time base, which may be extracted from one of the
elementary streams, DSM, channel or some external source.
− The occurrences of PTS and other information such as SCR and systems headers will also be essential for
facilitating random access of the MPEG-1 bitstream.
− This set of access codes should therefore be located near to the part of the elementary stream where
decoding can begin. In the case of video, this site will be near the head of an intraframe.
110
Systems Coding Outline
111
Structure of the Coded Bit-Stream
• Intraframe Compression
– Frames marked by (I) denote the frames that are strictly intraframe compressed.
– The purpose of these frames, called the "I pictures", is to serve as random access points
to the sequence.
I Frames
112
• P Frames use motion-compensated forward predictive compression on a block basis.
– Motion vectors and prediction errors are coded.
– Predicting blocks from closest (most recently decoded) I and P pictures are utilised.
Forward Prediction
P Frames
113
• B frames use motion-compensated bi-directional predictive compression on a block basis.
– Motion vectors and prediction errors are coded.
– Predicting blocks from closest (most recently decoded) I and P pictures are utilised.
Forward Prediction
Bi-Directional Prediction
B Frames
114
Backward Prediction
• Relative number of (I), (P), and (B) pictures can be arbitrary.
• A Group of Pictures (GOP) spans the distance from one I frame to the next I frame
1 2 3 4 5 6 7 8 9 10 11 12 1
GOP = 12
Group of Pictures
115
1 2 3 4 5 6 7 8 9 10 11 12 1
Source and Display Order
Transmission Order
116
Structure of the Coded Bit-Stream, Example
I-pictures
• They are coded without reference to the previous picture.
• They provide access points to the coded sequence for decoding (intraframe coded as for JPEG)
P-pictures
• They are predictively coded with reference to the previous I- or P-coded pictures.
• They themselves are used as a reference (anchor) for coding of the future pictures.
B-pictures
• Bidirectionally coded pictures, which may use past, future or combinations of both pictures in their
predictions.
D-pictures
• As intraframe coded, where only the DC coefficients are retained.
• Hence, the picture quality is poor and normally used for applications like fast forward.
• D-pictures are not part of the GOP; hence, they are not present in a sequence containing any other
picture types.
117
Structure of the Coded Bit-Stream
Group of pictures and Reordering
− I and P pictures are called “anchor” pictures
− A GOP is a series of one or more pictures to assist random access into the picture sequence.
− The GOP length is normally defined as the distance between I-pictures, which is represented by
parameter N in the standard codecs.
− The distance between the anchor I/P and P-pictures is represented by M.
− The encoding or transmission order of pictures differs from the display or incoming picture order.
− This reordering introduces delays amounting to several frames at the encoder (equal to the number of B-
pictures between the anchor I- and P-pictures).
− The same amount of delay is introduced at the decoder in putting the transmission/ decoding sequence
back to its original. This format inevitably limits the application of MPEG-1 for telecommunications.
− A GOP, in coding, must start with an I picture and in display order, must start with an I or B picture and
must end with an I or P picture
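The reordering from display order to transmission order can be sketched as follows, using N for the GOP length and M for the anchor spacing as defined above. The function names are illustrative, and the sketch assumes a simple regular GOP pattern.

```python
def gop_types(n, m):
    """Display-order picture types for a GOP of length n with anchor spacing m."""
    return ["I" if i == 0 else ("P" if i % m == 0 else "B") for i in range(n)]


def coding_order(types):
    """Indices of the pictures in transmission (coding) order.

    B-pictures are held back until the anchor that follows them in display
    order has been sent, since they are predicted from it.
    """
    order, pending_b = [], []
    for index, picture_type in enumerate(types):
        if picture_type == "B":
            pending_b.append(index)
        else:
            order.append(index)       # send the anchor first...
            order.extend(pending_b)   # ...then the B-pictures that precede it
            pending_b = []
    order.extend(pending_b)           # trailing B-pictures with no later anchor
    return order
```

With N = 12 and M = 3 the display order is I B B P B B P B B P B B. Appending the next GOP's I-picture and calling `coding_order` gives the indices 0, 3, 1, 2, 6, 4, 5, 9, 7, 8, 12, 10, 11, so each anchor is delayed by the M − 1 = 2 B-pictures that precede it.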
118
Structure of the Coded Bit-Stream
119
Structure of the Coded Bit-Stream
[Figure: layers of the coded bit-stream: Video Sequence, Group of Pictures, Picture, Slice, Macroblock, Block (8 × 8 pixels)]
Video Sequence
– Begins with a sequence header and ends with an end-of-sequence code.
– It includes one or more groups of pictures.
Group of Pictures (GOP)
– A Header and a series of one or more pictures intended to allow random access into the
sequence.
120
Structure of the Coded Bit-Stream
Picture
• The primary coding unit of a video sequence.
A picture consists of three rectangular matrices representing
luminance (Y) and two chrominance (Cb and Cr) values.
Slice
• Each picture is divided into groups of macroblocks, called
slices. Slices can have different sizes within a picture, and
the division can differ from picture to picture.
• The reason for defining slices is to reset the variable length
code (VLC) and so prevent channel errors from propagating
through the picture. Each slice is coded independently of the other
slices of the picture.
• Slices are important in the handling of errors. If the bit stream
contains an error, the decoder can skip to the next slice.
121
Structure of the Coded Bit-Stream
− If the coded data are corrupted, and the decoder detects it, then it can search for the new slice, and
the decoding starts from that point.
− Each slice starts with a slice start code and is followed by a code that defines its position and a code
that sets the quantisation step size.
122
Structure of the Coded Bit-Stream
− To optimise the slice structure, that is, to give a good immunity from channel errors and at the same time
to minimise the slice overhead, one might use short slices for macroblocks with significant energy (such
as intra MB) and long slices for less significant ones (e.g. macroblocks in B-pictures).
123
Structure of the Coded Bit-Stream
Short slices for macroblocks with significant energy
− The division of slices may vary from picture to picture.
− If "restricted slice structure" is applied, the slices must cover the whole pictures.
− If "restricted slice structure" is not applied, the decoder will have to decide what to do with that part of
the picture, which is not covered by a slice.
124
Structure of the Coded Bit-Stream
[Figure: General Slice Structure and Restricted Slice Structure, with the slices in each picture labelled by letters]
Macro block
• A portion of image that consists of 16x16 pixels and
comprises 4 blocks of luminance component and
1 block each of the 2 chrominance components.
• At this layer, motion compensation and prediction
are performed.
• Since a slice has a raster scan structure,
macroblocks are addressed in a raster scan order.
• The top left macroblock in a picture has address 0,
the next one on the right has address 1 and so on.
125
Structure of the Coded Bit-Stream
Macro block
• To reduce the address overhead, macroblocks
are relatively addressed by transmitting the
difference between the current macroblock and
the previously coded macroblock.
• This difference is called macroblock address
increment.
• In I-pictures, since all the macroblocks are coded,
the macroblock address increment is always 1.
• The first and last macroblocks of a slice, shall not
be skipped macroblocks.
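Relative addressing can be illustrated with a few lines of code. The sketch assumes the caller supplies the raster-scan addresses of the coded (non-skipped) macroblocks and the address of the previously coded macroblock; the function name and the default of -1 for "nothing coded yet" are choices made for this example.

```python
def address_increments(coded_addresses, previous=-1):
    """Macroblock address increments for a list of coded macroblock addresses."""
    increments = []
    for address in coded_addresses:
        if address <= previous:
            raise ValueError("addresses must be strictly increasing")
        increments.append(address - previous)   # 1 means no macroblock was skipped
        previous = address
    return increments
```

In an I-picture every macroblock is coded, so `address_increments([0, 1, 2, 3])` is `[1, 1, 1, 1]`. If macroblocks 3 and 4 are skipped in a P-picture, `address_increments([0, 1, 2, 5])` gives `[1, 1, 1, 3]`.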
126
Structure of the Coded Bit-Stream
Block and Color Sampling
127
4:2:0
Block
• A matrix of 8x8 elements.
• One of the ways rate control is achieved is by increasing the quantisation step size in blocks which would otherwise
have a higher entropy.
128
[Figure: sampling points at 13.5 MHz for 4:4:4 (Y, U and V at every point) and 4:2:2 (U and V at every second point)]
Recall, 4:4:4 & 4:2:2 Sampling
129
[Figure: sampling points at 13.5 MHz for 4:1:1 (U and V at every fourth point) and for 4:2:0 as used in JPEG/JFIF, H.261 and MPEG-1 (U and V sited between the luminance samples)]
Recall, 4:1:1 & 4:2:0 MPEG-1 Sampling
130
[Figure: sampling points at 13.5 MHz for 4:1:1 and for 4:2:0 with co-sited sampling as used in MPEG-2]
Recall, 4:1:1 & 4:2:0 MPEG-2 Sampling
131
[Figure: 4:2:0 with co-sited chrominance sampling (MPEG-2)]
4:2:0 Sampling in MPEG-1 and MPEG-2
[Figure: 4:2:0 with chrominance samples centred between luminance samples (JPEG/JFIF, H.261, MPEG-1)]
Downsize chrominance Components.
• 4:2:0 (with chrominance samples centered)
• Requires bilinear interpolation
Structure of the Coded Bit-Stream, Summary
• Sequence layer: picture dimensions, pixel
aspect ratio, picture rate, minimum buffer size,
DCT quantization matrices
• GOP layer: will have one I picture, start with I or
B picture, end with I or P picture, has closed
GOP flag, timing info, user data
• Picture layer: temporal ref number, picture
type, synchronization info, resolution, range of
motion vectors
• Slices: position of slice in picture, quantization
scale factor
• Macroblock: position, H and V motion vectors,
which blocks are coded and transmitted
[Figure: layered structure: sequence layer (GOP-1 … GOP-n), GOP layer (I B B B P B B …), picture layer (Slice-1 … Slice-N), slice layer (MB-1 … MB-n), macroblock layer (blocks 0 to 5), 8x8 block]
132
133
Headers in Structure of the Coded Bit-Stream
Seq. Header
• Width
• Height
• Frame Rate
• Buffer Control
GOP Header
• Time Code
Picture Header
• Temporal Ref
• Picture Type
• Motion Vector Parameters
Picture Data Seq. End Code
• All headers begin with a start code: 23 zero bits followed by a 1 bit and 8 bits that indicate the header type.
• The encoding process never produces 23 consecutive zero bits anywhere else in the stream.
Headers in Structure of the Coded Bit-Stream
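Because start codes are unique in the stream, a parser can locate headers with a plain byte search. The sketch below assumes byte-aligned start codes of the form 0x000001 followed by one identifying byte, and uses the MPEG-1 video start code values as I understand them (picture 0x00, slices 0x01 to 0xAF, sequence header 0xB3, sequence end 0xB7, GOP 0xB8). The function name and the returned labels are illustrative.

```python
START_CODE_PREFIX = b"\x00\x00\x01"
START_CODE_NAMES = {
    0x00: "picture",
    0xB3: "sequence_header",
    0xB7: "sequence_end",
    0xB8: "group_of_pictures",
}


def find_start_codes(data):
    """Return (byte_offset, name) for every start code found in `data`."""
    found = []
    i = data.find(START_CODE_PREFIX)
    while i != -1 and i + 3 < len(data):
        value = data[i + 3]                       # byte identifying the header type
        if 0x01 <= value <= 0xAF:
            name = "slice"
        else:
            name = START_CODE_NAMES.get(value, "other")
        found.append((i, name))
        i = data.find(START_CODE_PREFIX, i + 3)
    return found
```

A decoder that loses synchronisation can use the same search to skip forward to the next slice or picture header.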
134
135
[Figure: simplified MPEG encoder: ordered source pictures, motion estimator, MC mode decision, picture predictor and store, residual, DCT, Q, Q⁻¹, IDCT, decoded picture, lossless coder (RLC + VLC), rate control, buffer, coded video bit stream]
MPEG Video Encoding
Simplified MPEG Encoder
The main differences between this encoder and H.261
Frame reordering: at the input of the encoder, coding
of B-pictures is postponed to be carried out after
coding the anchor I- and P-pictures.
Quantisation: intraframe coded macroblocks are
subjectively weighted to emulate perceived coding
distortions.
Motion estimation: not only is the search range
extended but the search precision is increased to half a
pixel. B-pictures use bidirectional motion compensation.
No loop filter.
Frame store and predictors: to hold two anchor pictures
for prediction of B-pictures.
Rate regulator: here there is more than one type of
picture, each generating different bit rates.
136
MPEG-1 Encoder
− Within each picture, macroblocks are coded in a sequence from left to right.
− Since 4:2:0 image format is used, the six blocks of 8×8 pixels, four luminance and one of each
chrominance components are coded in turn.
− First, for a given macroblock, the coding mode is chosen. This depends on the picture type, the effectiveness of
motion-compensated prediction in that local region and the nature of the signal within the block.
− Second, depending on the coding mode, a motion-compensated prediction of the contents of the block based
on the past and/or future reference pictures is formed. This prediction is subtracted from the actual data in the
current macroblock to form an error signal.
− Third, this error signal is divided into 8×8 blocks and a DCT is performed on each block. The resulting DCT
coefficients are quantised and scanned in zigzag order to convert them into a one-dimensional string of quantised DCT
coefficients.
− Fourth, the side information for the macroblock, including the type, block pattern, motion vector and address,
is coded alongside the DCT coefficients (the DCT coefficients are run-length coded).
137
MPEG-1 Encoder
− The insensitivity of the human visual system to high-frequency distortions can be exploited for further
bandwidth compression.
− The DCT coefficients (in the range -2047 to +2047) are divided by the weighting matrix prior to quantisation.
− The weighted coefficients are then quantised by the quantisation step size; at the decoder, the reconstructed
quantised coefficients are multiplied by the weighting matrix to reconstruct the coefficients.
138
Default Intra and Inter Quantisation Weighting Matrices
DCT Coefficients Weighting Matrix
Quantisation by
Quantisation Step Size
÷
Intra Quantisation Weighting Matrix
− Experience has shown that for SIF pictures, a suitable distortion weighting matrix for the intra-DCT
coefficients is the one shown in Figure. This intra matrix is used as the default quantisation matrix for
intraframe coded macroblocks.
Inter (or Nonintra) Quantisation Weighting Matrix (A flat matrix)
− The different weightings may not be used for interframe coded macroblocks.
− This is because high-frequency interframe error does not necessarily mean high spatial frequency.
(It might be due to poor motion compensation or block boundary artefacts).
139
Default Intra and Inter Quantisation Weighting Matrices
The strategy for motion estimation in this codec is different from the H.261 in four main respects:
1. Motion estimation is an integral part of the codec.
• The motion estimation in H.261 was optional.
2. Motion search range is much larger (larger search area).
• H.261 is normally used for head-and-shoulders pictures, where the motion speed is normally very small.
• In contrast, MPEG-1 is used mainly for coding of films with much larger movements and activities.
3. Higher precision of motion compensation is used.
• Motion estimation with half-pixel precision
4. B-pictures can benefit from bidirectional motion compensation.
• When B-pictures are present, the distance between a picture and its anchor varies, so the
search range for motion estimation is expected to differ between picture types.
• For normal scenes, the maximum search range for P-pictures is usually taken as 11 pixels/3 frames, and the
forward and backward motion range for B1-pictures are 3 pixels/frame and 7 pixels/2 frames, respectively.
These values for B2-pictures become 7 and 3.
140
Motion Estimation
Motion estimation with half-pixel precision
− The normal block matching with integer pixel positions is carried out first.
− Then eight new positions, with a distance of half a pixel around the final integer pixel, are tested.
141
Motion Estimation
Motion-compensated prediction error (a) with and (b) without half-pixel precision
Coding of Pictures
change
MQUANT
no change to
MQUANT
I picture
change
MQUANT
no change to
MQUANT
coded not coded
interframe
change
MQUANT
no change to
MQUANT
intraframe
motion comp.
A
motion vector
set to 0
P picture
A
Fwd motion
compensation
A
Bwd motion
compensation
A
interpolated
compensation
B picture
Picture Type
142
A
MQUANT: MB Quantization information
In I-pictures, all the macroblocks are intra coded.
− There are two intra macroblock types:
intra-d: uses the current quantiser scale
• Variable length coded with 1
• The default type, used when the quantiser scale is not changed
• No quantiser scale is transmitted and the decoder uses the previously set value.
intra-q: defines a new value for the quantiser scale
• Variable length coded with 01
• The macroblock overhead contains an extra 5 bits to define the new quantiser scale between 1 and 31
• In I-pictures of MPEG-1, any macroblock can be of type intra-q.
143
I-pictures Coding
DC indices are coded losslessly by DPCM (DC_DIFF)
− The quantiser step size is different for different coefficients and may change from MB to MB.
− The only exception is the DC coefficients, which are treated differently. This is because the eye is
sensitive to large areas of luminance and chrominance errors; then the accuracy of each DC value
should be high and fixed.
− The quantiser step size for the DC coefficient is fixed to eight. Since in the quantisation weighting matrix,
the DC weighting element is eight, then the quantiser index for the DC coefficient is always 1,
irrespective of the quantisation index used for the remaining AC coefficients.
− Because of the strong correlation between the DC values of blocks within a picture, the DC indices are
coded losslessly by DPCM (DC_DIFF).
− Such a correlation does not exist among the AC coefficients, and hence they are coded independently.
144
I-pictures Coding
DC indices are coded losslessly by DPCM (DC_DIFF)
− The prediction for the DC coefficients of luminance blocks follows the coding order of blocks within a
macroblock and the raster scan order.
− For example, in the macroblocks of 4:2:0 format pictures shown in Figure, the DC coefficient of block Y2
is used as a prediction for the DC coefficient of block Y3.
− The DC coefficient of block Y3 is a prediction for the DC coefficient of Y0 of the next macroblock.
− For the chrominance, the prediction is the DC coefficient of the corresponding block in the previous
macroblock.
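The prediction order described above can be sketched as follows. This is a minimal illustration, not the standard's exact procedure: the function name dc_diffs, the dictionary layout of a macroblock and the initial predictor value of 128 are all assumptions made for the example.

```python
def dc_diffs(macroblocks, initial_pred=128):
    """Compute DC_DIFF values for a run of macroblocks in coding order.

    Each macroblock is a dict with 'Y' (DC indices of Y0..Y3), 'Cb' and 'Cr'.
    Luminance is predicted block to block (Y0, Y1, Y2, Y3, then Y0 of the
    next macroblock); each chrominance block is predicted from the same
    component in the previous macroblock.
    initial_pred is an assumed start value for all three predictors.
    """
    pred = {"Y": initial_pred, "Cb": initial_pred, "Cr": initial_pred}
    result = []
    for mb in macroblocks:
        diffs = {"Y": []}
        for dc in mb["Y"]:
            diffs["Y"].append(dc - pred["Y"])
            pred["Y"] = dc
        for comp in ("Cb", "Cr"):
            diffs[comp] = mb[comp] - pred[comp]
            pred[comp] = mb[comp]
        result.append(diffs)
    return result


mbs = [
    {"Y": [130, 132, 131, 135], "Cb": 120, "Cr": 125},
    {"Y": [136, 134, 134, 133], "Cb": 121, "Cr": 125},
]
for d in dc_diffs(mbs):
    print(d)
```

Note how Y0 of the second macroblock (136) is predicted from Y3 of the first (135), giving a difference of 1.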
145
I-pictures Coding
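[Figure: 4:2:0 macroblock with luminance blocks Y0, Y1 (top row) and Y2, Y3 (bottom row), and chrominance blocks Cb and Cr]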
DC term is expressed as difference from previous DC term (DC_DIFF)
Encoded as two parts:
– Size: the number of bits needed to represent the difference (i.e., about log2(|DC_DIFF|))
– Size bits that provide the value.
Size is encoded as a Huffman code.
AC terms are given as (run, value) pairs.
Encoded in one of two ways:
– Huffman code for (run, abs(value)) followed by a single bit for the sign of value.
– Special Huffman code indicating ESCAPE, followed by 6 bits for run and either 8 or 16 bits for value.
• The 6 bits for run simply encode 0 through 63
• The first 8 bits of value cover –127 to 127.
• If the first 8 bits are –128, the next 8 bits provide codes for –128 through –255
• If the first 8 bits are 0, the next 8 bits provide codes for 128 through 255.
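A minimal sketch of the two fixed-length parts described above: the Size category of a DC difference and the bits that follow an ESCAPE code. The Huffman tables themselves are not reproduced, the function names are hypothetical, and the sketch assumes two's-complement 8-bit values with –128 and 0 acting as the markers for the 16-bit form.

```python
def dc_size(dc_diff):
    """Size category: number of bits needed for |dc_diff| (0 if it is zero)."""
    return abs(dc_diff).bit_length()


def escape_bits(run, level):
    """Bits that follow the ESCAPE code: 6-bit run, then 8 or 16 bits of level."""
    if not 0 <= run <= 63:
        raise ValueError("run must be 0..63")
    if level == 0 or not -255 <= level <= 255:
        raise ValueError("level must be a non-zero value in -255..255")
    run_bits = format(run, "06b")
    if -127 <= level <= 127:
        return run_bits + format(level & 0xFF, "08b")      # 8-bit form
    if level >= 128:
        return run_bits + "00000000" + format(level, "08b")  # marker 0, then 128..255
    return run_bits + "10000000" + format(level & 0xFF, "08b")  # marker -128, then -128..-255


print(dc_size(-8), dc_size(0), dc_size(255))
print(escape_bits(3, -1))
print(escape_bits(3, 200))
```

The 16-bit form is only needed for large levels, so most escaped coefficients cost 6 + 8 bits after the ESCAPE code.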
DC and AC Terms Coding
[Figure: coding of one 8×8 block of a macroblock: DCT, then quantiser Q (sz: step size); the DC coefficient is coded by DPCM, the AC coefficients by zigzag scanning, run-length encoding and VLC, giving the JPEG-style encoded DC and AC terms]
146
Similar to those of H.261
− 8 types of macroblocks for P-frames:
• intra-d and intra-q: the same as used in I-frames
• pred-m: the macroblock is forward-predictive encoded (difference from
the previous frame) using a forward motion vector
• pred-c: the macroblock is encoded using a coded pattern; a 6-bit
coded block pattern is transmitted as a variable-length code and this
tells the decoder which of the 6 blocks in the macroblock are coded (1)
and which are not coded (0)
• pred-mc: the macroblock is forward-predictive encoded using a forward
motion vector and also a 6-bit coded pattern is included
• pred-cq: a pred-c macroblock with a new quantization scale
• pred-mcq: a forward-predictive macroblock encoded using a coded
pattern with a new quantization scale
• skipped: they have a zero motion vector and no code; the decoder
copies the corresponding macroblock from the previous frame into the
current frame
147
P-pictures Coding
− For B-pictures, the encoder has more decisions to make than in the case of P-pictures.
− These are how to divide the picture into slices; determine the best motion vectors to use; decide
whether to use forward, backward or interpolated motion compensation or to code intra; and how to
set the quantiser scale.
− The encoder first calculates the best forward motion-compensated macroblock from the previous
anchor picture for forward motion compensation.
− It then calculates the best motion-compensated macroblock from the future anchor picture, as the
backward motion compensation.
− Finally, the two motion-compensated macroblocks are averaged to produce the interpolated
macroblock. The encoder then selects the candidate with the smallest error relative to the current macroblock.
− In the event of a tie, the interpolated mode is chosen.
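The mode decision above can be sketched as follows. This is only an illustration: the error measure (sum of absolute differences), the rounding of the average and the function name choose_b_mode are assumptions, and the intra and slice decisions are left out.

```python
def choose_b_mode(current, fwd_pred, bwd_pred):
    """Pick forward, backward or interpolated prediction for one macroblock.

    All arguments are flat lists of pixel values of equal length.
    Returns (mode_name, prediction).
    """
    # Interpolated prediction: rounded average of the two predictions (assumed rounding).
    interp = [(f + b + 1) // 2 for f, b in zip(fwd_pred, bwd_pred)]

    def sad(pred):
        return sum(abs(c - p) for c, p in zip(current, pred))

    # 'interpolated' is listed first so that min() returns it on a tie.
    candidates = [("interpolated", interp), ("forward", fwd_pred), ("backward", bwd_pred)]
    return min(candidates, key=lambda cand: sad(cand[1]))


current = [10, 20, 30, 40]
print(choose_b_mode(current, [8, 18, 28, 38], [12, 22, 32, 42])[0])
print(choose_b_mode(current, [10, 20, 30, 40], [12, 22, 32, 42])[0])
```

Because Python's min() returns the first minimal element, placing the interpolated candidate first implements the tie-break rule without extra code.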
148
B-pictures Coding
149
B-pictures Coding
12 types of macroblocks for B-frames
• intra-d, intra-q: the same as used for I-frames
• pred-i: bidirectionally-predictive encoded macroblock with forward
motion vector and backward motion vector
• pred-ic: a pred-i macroblock encoded using a 6-bit coded pattern
• pred-b: backward-predictive encoded macroblock with backward
motion vector
• pred-bc: a pred-b macroblock encoded using a 6-bit coded
pattern
• pred-f: forward-predictive encoded macroblock with forward
motion vector
• pred-fc: a pred-f macroblock encoded using a 6-bit coded pattern
• pred-icq: a pred-ic macroblock with a new quantization scale
• pred-fcq: a pred-fc macroblock with a new quantization scale
• pred-bcq: a pred-bc macroblock with a new quantization scale
• skipped: the same as for P-frames.
150
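[Figure: layer hierarchy of the video bit stream: video sequences (Sequence 1, Sequence 2, ...) → GOPs (GOP 1, GOP 2, ... GOP 12, GOP 13) → pictures (I, B, B, ...) → slices (Slice 1, Slice 2, ... Slice N) → macroblocks (MB 1, MB 2, ...) → blocks (Block 1, Block 2, ... Block 6)]
Video Sequence Structure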
151
Layers of MPEG-1 Video Bit stream
152
Layers of MPEG-1 Video Bit stream
• Video Sequence Layer Header contains: the picture size (horizontal and
vertical), pel aspect ratio, picture rate, bit rate, minimum decoder buffer size,
constraint parameters flag, control for loading 64 8-bit values for intra and
nonintra quantization tables, and user data
• GOP layer header contains: the time interval from the start of the video
sequence, the closed GOP flag (decoder needs frames from previous GOP or
not?), broken link flag and user data
• Picture layer header contains: the temporal reference of the picture, picture
type (I,P,B,D), decoder buffer initial occupancy, forward motion vector
resolution and range for P- and B-frames, backward motion vector resolution
and range for B-frames and user data
• Slice layer header contains: vertical position where the slice starts and the
quantizer scale for this slice
• Macroblock layer header contains: optional stuffing bits, macroblock address
increment, macroblock type, quantizer scale, motion vector, coded block
pattern
• A block contains: 8x8 coded DCT coefficients
153
MPEG-1 Bit Stream Organization
[Figure: order of headers in the bit stream: Seq. Header, GOP Header, Picture Header, Slice Header, MB Header, Block Data]
154
[Figure: simplified MPEG decoder block diagram: coded video bit stream, lossless decoder (VLC/RLC), Q⁻¹, IDCT, motion compensation (driven by MVs and MV mode), picture frame buffer, ordered source pictures]
Simplified MPEG Decoder
155
Simplified MPEG Decoder
[Figure: decoder block diagram: coded data → variable length decoding → QFS[n] → inverse scan → QF(u, v) → inverse quantisation → F(u, v) → inverse DCT → f(x, y) → motion compensation (with frame-store memory) → d(x, y)]
− The incoming bitstream is stored in the buffer and is demultiplexed into the coding parameters such as
DCT coefficients, motion vectors, macroblock types and addresses.
− They are then variable length decoded using the locally provided tables.
− The DCT coefficients after inverse quantisation are inverse DCT transformed and added to the motion-
compensated prediction (as required) to reconstruct the pictures.
− The frame stores are updated by the decoded I- and P-pictures.
− Finally, the decoded pictures are reordered to their original scanned form.
− At the beginning of the sequence, the decoder will decode the sequence header, including the
sequence parameters.
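The final reordering step can be sketched as below, assuming the usual arrangement in which each anchor (I or P) picture is transmitted before the B-pictures that precede it in display order. The function name display_order and the (type, index) tuples are hypothetical.

```python
def display_order(coded_pictures):
    """Reorder pictures from bitstream (coding) order to display order.

    coded_pictures: list of (picture_type, label) tuples, type in {'I', 'P', 'B'}.
    B-pictures are output as soon as they are decoded; an anchor picture is
    held back until the next anchor arrives (or the sequence ends).
    """
    held_anchor = None
    output = []
    for picture in coded_pictures:
        if picture[0] == "B":
            output.append(picture)
        else:
            if held_anchor is not None:
                output.append(held_anchor)
            held_anchor = picture
    if held_anchor is not None:
        output.append(held_anchor)
    return output


coded = [("I", 0), ("P", 3), ("B", 1), ("B", 2), ("P", 6), ("B", 4), ("B", 5)]
print(display_order(coded))
```

The labels in the example are the display positions, so the output should list them in increasing order.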
156
MPEG-1 Decoder
Stepping Back a Bit in Decoder
Picture Header
Picture Data: row-major scan of encoded macroblocks
Per macroblock:
• Macroblock Address Increment (1 bit)
• Macroblock Type (1 or 2 bits)
• Q Scale (5 bits)
• Luminance Blocks, U Block, V Block
Per block:
• DC Size (2-7 bits)
• DC Bits (0-8 bits)
• First non-zero AC coeff. (variable bit length) through last non-zero AC coeff. (variable bit length)
• EOB (2 bits)
157
[Figure: Encoder → output buffer → channel → input buffer → Decoder]
If a fixed bit rate channel is used, then buffering is required.
Encoder output buffer:
• Filled at a variable rate because the encoder output bit rate is variable (depends on how much change is
going on between frames)
• Emptied at a constant rate by the channel.
• Feedback mechanism detects when buffer is at risk of overflowing or underflowing.
• This is used to adjust the degree of quantisation, and hence the quality of the images being
transmitted.
Buffering
158
− A coded bitstream contains different types of pictures, and each type ideally requires a different number of bits to
encode.
− In addition, the video sequence may vary in complexity with time, and it may be desirable to devote more coding bits
to one part of a sequence than to another.
− For constant bit rate coding, varying the number of bits allocated to each picture requires that the decoder has a buffer
to store the bits not needed to decode the immediate picture.
− The extent to which an encoder can vary the number of bits allocated to each picture depends on the size of this
buffer (i.e. decoder buffer).
− large buffer → greater variations → increasing the picture quality → increasing the decoding delay
− The delay is the time taken to fill the input buffer from empty to its current level
− An encoder needs to know the size of the decoder’s input buffer in order to determine to what extent it can vary the
distribution of coding bits among the pictures in the sequence.
159
Video Buffer Verifier (VBV)
− The decoder will display the decoded pictures at their specific rate.
− If the display clock is not locked to the channel data rate, and this is typically the case, then any
mismatch between the encoder and channel clock and the display clock will eventually cause a buffer
overflow or underflow.
Model Decoder
− The model decoder is defined to resolve three problems:
– It constrains the variability in the number of bits that may be allocated to different pictures;
– It allows a decoder to initialise its buffer when the system is started;
– It allows the decoder to maintain synchronisation while the stream is played.
160
Video Buffer Verifier (VBV)
The definition of the parameterised model decoder is known as Video Buffer Verifier (VBV).
− The parameters used by a particular encoder are defined in the bitstream.
− This really defines a model decoder that is needed if encoders are to be assured that the coded
bitstream they produce will be decodable.
• A fixed rate channel is assumed to put bits at a constant rate into the input buffer.
• At regular intervals, set by the picture rate, the picture decoder instantaneously removes all the bits pertaining to the
next picture from the input buffer (practical decoders may differ).
• If there are too few bits in the input buffer, that is, not all the bits for the next picture have been received, then the input
buffer underflows, and there is an underflow error.
• If, during the time between picture starts, the capacity of the input buffer is exceeded, then there is an overflow error.
161
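The model decoder lends itself to a small simulation. The sketch below is a simplification under stated assumptions: the buffer is checked only at picture instants, decoding starts once initial_fullness bits have arrived, and the function name check_vbv and the numeric values in the example are illustrative rather than taken from the standard.

```python
def check_vbv(picture_bits, bit_rate, picture_rate, buffer_size, initial_fullness):
    """Simulate the model decoder's input buffer.

    Returns ('ok', None), ('underflow', i) or ('overflow', i), where i is the
    index of the picture at which the problem occurs.
    """
    bits_per_interval = bit_rate / picture_rate  # channel delivery per picture period
    fullness = initial_fullness
    if fullness > buffer_size:
        return ("overflow", 0)
    for i, bits in enumerate(picture_bits):
        if bits > fullness:
            # Not all bits of the next picture have arrived yet.
            return ("underflow", i)
        fullness -= bits               # instantaneous removal of one picture
        fullness += bits_per_interval  # constant-rate filling until the next picture
        if fullness > buffer_size:
            return ("overflow", i)
    return ("ok", None)


args = dict(bit_rate=1_200_000, picture_rate=25, buffer_size=327_680, initial_fullness=200_000)
print(check_vbv([150_000, 40_000, 20_000, 20_000, 40_000, 20_000, 20_000], **args))
print(check_vbv([150_000, 120_000], **args))
print(check_vbv([1_000] * 5, **args))
```

The second call underflows because the second picture is larger than what the channel has delivered; the third overflows because very small pictures let the buffer fill up.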
Video Buffer Verifier (VBV)
− Practical decoders may differ from this model in several important ways.
− They may not remove all the bits required to decode a picture from the input buffer instantaneously.
− They may not be able to control the start of decoding very precisely as required by the buffer fullness parameters in the
picture header, and they take a finite time to decode.
− They may also be able to delay decoding for a short time to reduce the chance of underflow occurring.
− But these differences depend in degree and kind on the exact method of implementation.
− To satisfy the requirements of different implementations, the MPEG video committee chose a very simple model for the
decoder.
− Practical implementations of decoders must ensure that they can decode the bitstream constrained in this model.
− In many cases, this will be achieved by using an input buffer that is larger than the minimum required and by using a
decoding delay that is larger than the value derived from the buffer fullness parameter.
− The designer must compensate for any differences between the actual design and the model in order to guarantee
that the decoder can handle any bitstream that satisfies the model.
− Encoders monitor the status of the model to control the encoder so that overflow does not occur.
− The calculated buffer fullness is transmitted at the start of each picture so that the decoder can maintain
synchronisation.
162
Video Buffer Verifier (VBV)
− The encoder must make sure that the input buffer of the model decoder is neither overflowed nor
underflowed by the bitstream.
− Since the model decoder removes all the bits associated with a picture from its input buffer
instantaneously, it is necessary to control the total number of bits per picture.
− The encoder could control the bit rate by simply checking its output buffer content. As the buffer fills up,
the quantiser step size is raised to reduce the generated bit rate, and vice versa.
− In MPEG-1 the situation is slightly more complex, because there are three different picture types, each
generating a different bit rate.
− First, the encoder should allocate the total number of bits among the various types of picture within a
GOP, so that the perceived image quality is suitably balanced.
− The distribution will vary with the scene content and the particular distribution of I-, P- and B-pictures
within a GOP.
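The simple buffer-feedback control mentioned above (raise the quantiser step size as the output buffer fills) might look like the following. The linear mapping from buffer occupancy to quantiser scale is purely an assumption for illustration; practical encoders use more elaborate rules, and the function name is hypothetical.

```python
def quantiser_scale_from_buffer(fullness, buffer_size, q_min=1, q_max=31):
    """Map encoder output-buffer occupancy to a quantiser scale in q_min..q_max.

    An empty buffer gives the finest quantiser, a full buffer the coarsest.
    """
    occupancy = min(max(fullness / buffer_size, 0.0), 1.0)  # clamp to [0, 1]
    return q_min + round(occupancy * (q_max - q_min))


for fullness in (0, 80_000, 160_000, 320_000):
    print(fullness, quantiser_scale_from_buffer(fullness, 320_000))
```

A fuller buffer yields a larger quantiser scale, which lowers the generated bit rate and lets the buffer drain.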
163
Rate Control and Adaptive Quantisation
− Investigations have shown that for most natural scenes, each P-picture might generate as many as two
to five times the number of bits of a B-picture, and an I-picture three times those of the P-picture.
− If there is little motion and high texture, then a greater proportion of the bits should be assigned to I-
pictures.
− Similarly, if there is strong motion, then a proportion of bits assigned to P-pictures should be increased.
− In both cases, lower quality from the B-pictures is expected to permit the anchor I- and P-pictures to be
coded at their best possible quality.
− Our investigations with variable bit rate (VBR) video, where the quantiser step size is kept constant (no
rate control), show that the ratios of generated bits are 6:3:2, for I-, P- and B-pictures, respectively.
− Of course, at these ratios, because of the fixed quantiser step size, the image quality is almost constant,
not only for each picture (in fact, slightly better for B-pictures due to better motion compensation) but
throughout the image.
− Again, if we lower the expected quality for B-pictures, we can change that ratio in favour of I- and P-
pictures (it is possible to make the encoder intelligent enough to learn the best ratio).
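As a worked illustration of such ratios, the sketch below splits the bit budget of one GOP among its pictures in proportion to 6:3:2 for I-, P- and B-pictures. The GOP pattern, bit rate and picture rate are assumed example values, and gop_bit_targets is a hypothetical helper, not part of any standard.

```python
def gop_bit_targets(gop_pattern, gop_bits, weights=None):
    """Split gop_bits among the pictures of a GOP in proportion to their weights."""
    if weights is None:
        weights = {"I": 6, "P": 3, "B": 2}  # ratio quoted for fixed-quantiser coding
    total_weight = sum(weights[t] for t in gop_pattern)
    return [gop_bits * weights[t] / total_weight for t in gop_pattern]


pattern = "IBBPBBPBBPBB"             # assumed GOP: 12 pictures, anchors every 3
gop_bits = 1_200_000 * len(pattern) / 25  # assumed 1.2 Mbit/s at 25 pictures/s
targets = gop_bit_targets(pattern, gop_bits)
for picture_type, bits in zip(pattern, targets):
    print(picture_type, round(bits))
```

Changing the weights dictionary (for example lowering the B weight) shifts bits towards the anchor pictures, which is the adjustment the text describes.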
164
Rate Control and Adaptive Quantisation
165
[Figure: Video Source → Compress (Encode) → coded video → Decompress (Decode) → Video Display]
ENCODER + DECODER = CODEC
− Following the universal success of the H.261 and MPEG-1 video codecs, there was a growing need for
a video codec to address a wide variety of applications.
− Considering the similarity between H.261 and MPEG-1, ITU-T and ISO/IEC made a joint effort to
devise a generic video codec.
− Joining the study was a special group in ITU-T, Study Group 15 (SG15), who were interested in
coding of video for transmission over the future broadband integrated services digital networks
(BISDN) using asynchronous transfer mode (ATM) transport.
− The devised generic codec was finalised in 1995 and takes the name of MPEG-2/H.262, though it is
more commonly known as MPEG-2.
− It has error resilience for broadcasting and ATM networks.
− It delivers multiple programmes simultaneously without requiring them to have a common time base.
These require that the MPEG-2 transport packet length should be short and fixed.
166
MPEG-2 Standard
At the time of the development, the following applications for the generic codec were foreseen:
• BSS broadcasting satellite service (to the home)
• CATV cable TV distribution on optical networks, copper, etc.
• CDAD cable digital audio distribution
• DAB digital audio broadcasting (terrestrial and satellite)
• DTTB digital terrestrial television broadcast
• EC electronic cinema
• ENG electronic news gathering (including satellite news gathering (SNG))
• FSS fixed satellite service (e.g. to head ends)
• HTT home television theatre
• IPC interpersonal communications (videoconferencing, videophone, etc.)
• ISM interactive storage media (optical discs, etc.)
• MMM multimedia mailing
• NCA news and current affairs
• NDS networked database services (via ATM, etc.)
• RVS remote video surveillance
• SSM serial storage media (digital VTR, etc.)
167
Applications of the MPEG-2 Codec
− Part 1, Systems : synchronization and multiplexing of audio and video
− Part 2, Video
− Part 3, Audio (an extension of the MPEG 1 audio standards)
− Part 4, Testing Compliance
− Part 5, Software Simulation
− Part 6, Extensions for Digital Storage Media Command and Control (DSM-CC) (e.g. rewind, forward, etc.)
− Part 7, Advanced Audio Coding (AAC) (a second audio standard; there are even more parts)
− [Part 8 withdrawn due to lack of industry interest ]
− Part 9, Extensions for Real Time Interfaces
− Part 10, Conformance Extensions for DSM-CC
− Part 11, Intellectual Property Management and Protection
168
MPEG-2 Parts (MPEG-2 Related Standards)
Video
• 2-15 or 16-80 Mbit/s bit rate ( target bit rate: 4…9 Mbit/sec )
• TV and HDTV picture formats
• Supports interlaced material
• MPEG-2 consists of profiles and levels
• Main Profile, Main Level (MP@ML) refers to 720x480 resolution video at 30 frames/sec, at bit rates up to 15 Mbit/sec for NTSC video (typical ~4 Mbit/sec)
• Main Profile, High Level (MP@HL) refers to HDTV resolution of 1920x1152 pixels at 30 frames/sec, at a bit rate up to 80 Mbit/sec (typical ~15 Mbit/sec)
Audio
• Compatible multichannel extension of MPEG-1 audio
System
• Video, audio and data multiplexing defines two presentations:
• Program Stream for applications using near error free media
• Transport Stream for more error prone channels
Applications
• Satellite, cable, and terrestrial broadcasting, digital networks, and digital VCR
169
MPEG-2 Audio, Video, System and Application Parts
170
Comparison Between MPEG-1 and MPEG-2 MP@ML Video
Specifications | MPEG-2 MP@ML | MPEG-1
Video Format | 720x480x30 (NTSC), 720x576x25 (PAL) | 320x240x30 (NTSC), 320x288x25 (PAL)
Coded Data Speed | 4-6 Mbps for CCIR601, 15 Mbps max | 1.8 Mbps max
Coded Picture | Frame, Picture | Frame
Prediction | Inter Frame, Field | Interframe
DCT | Frame, Field | Frame
Resolution | 12 bits | 9 bits
VLC Resol. | 8, 9, 10 bits | 8 bits
Quantization | Non-linear Mapping | Linear Mapping
Pan, Scan | Yes | No
171
Feature | MPEG-1 | MPEG-2
Video format | SIF, progressive | SIF, 4:2:0, 4:2:2, 4:4:4, progressive/interlaced
Picture quality | VHS | Distribution/contribution
Bit rate | Variable (≤ 1.856 Mbps) | Variable up to 100 Mbps
Low delay mode | < 150 ms | < 150 ms (no B pictures)
Accessibility | Random access | Random access/channel hopping
Scalability | | SNR, spatial, temporal, simulcast, data partitioning
Compatibility | | Forward, backward, upward, and downward
Transmission error | Error protection | Error resilience
Editing bit stream | Yes | Yes
DCT | Noninterlaced | Field (progressive) or frame (interlaced)
Motion estimation | Noninterlaced | Field, frame, and dual-prime based. Top (16×8) block and bottom (16×8) block
Motion vectors | Motion vectors for P, B picture only | Concealment motion vectors for I pictures besides MV for P & B
Scanning of DCT coefficients | Zigzag scan | Zigzag scan, alternate scan for interlaced video
Functional Comparison Between MPEG-1 and MPEG-2 Video
− Picture resolutions vary from SIF to HDTV
− Frame and Field DCT Coding in MPEG-2
− Both Linear and Nonlinear Quantisation in MPEG-2
− All Chroma Channels Subsampling in MPEG-2
− Search range can be larger (distance between P-frames is larger than B1 and B2)
− The combination of the various picture formats and the interlaced/progressive option creates a new range of
macroblock (MB) types in the MPEG-2 standard.
• While each MB in a progressive mode has 6 blocks in the 4:2:0 format, the number of blocks in the 4:4:4 image format is 12.
− Macroblock size can be 16 x 8 pixels
• The dimensions of the unit of blocks used for motion estimation/compensation can change.
• In the interlaced pictures, since the number of lines per field is half the number of lines per frame, with equal horizontal and vertical
resolutions for motion estimation, it might be appropriate to choose blocks of 16 × 8, that is, 16 pixels over eight lines. These types of
sub-MBs have half the number of blocks of the progressive mode.
− Scalability
• The scalable modes of MPEG-2 are intended to offer interoperability among different services or to accommodate the varying
capabilities of different receivers and networks upon which a single service may operate.
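The block counts per macroblock quoted in the list above (6 for 4:2:0, 12 for 4:4:4) follow directly from the chroma subsampling. A minimal sketch, with the hypothetical helper blocks_per_macroblock, counting 8×8 blocks in a 16×16 macroblock:

```python
def blocks_per_macroblock(chroma_format):
    """Number of 8x8 blocks in one 16x16 macroblock for a given chroma format."""
    # 8x8 chrominance blocks per component (Cb or Cr) covering a 16x16 luminance area.
    chroma_blocks = {"4:2:0": 1, "4:2:2": 2, "4:4:4": 4}[chroma_format]
    luminance_blocks = 4
    return luminance_blocks + 2 * chroma_blocks


for fmt in ("4:2:0", "4:2:2", "4:4:4"):
    print(fmt, blocks_per_macroblock(fmt))
```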
172
Main difference between MPEG-2 and MPEG-1
MPEG-1 and MPEG-2 syntax differences
− All MPEG-2 decoders that comply with currently defined profiles and levels are required to decode MPEG-1 constrained
bit streams:
− MPEG-2 syntax can be made to be very close to MPEG-1, by using particular values for the various MPEG-2 syntax
elements that do not exist in MPEG-1 syntax
− The IDCT mismatch control
− The run level values in VLC
− The constraint parameter flag mechanism in MPEG-1 is replaced by the profile and level structures in MPEG-2.
− The concept of the GOP layer is slightly different.
• GOP in MPEG-2 may indicate that certain B-pictures at the beginning of an edited sequence comprise a
broken link, which occurs if the forward reference picture needed to predict the current B-pictures is removed
from the bitstream by an editing process.
• It is an optional structure for MPEG-2 but mandatory for MPEG-1.
− The slices in MPEG-2 must always start and end on the same horizontal row of MBs.
• This is to assist the implementations in which the decoding process is split into some parallel operations along
horizontal strips within the same pictures.
173
Main difference between MPEG-2 and MPEG-1
• IDCT Mismatch Control
• Macroblock stuffing
• Run-level escape syntax
• Chrominance samples horizontal position (co-located with luminance in MPEG-2, halfway between luminance samples in MPEG-1)
• Slices (in MPEG-2 slices start and end on the same horizontal row of macroblocks; in MPEG-1 it is possible to have all macroblocks of a
picture in one slice, for example)
• D-pictures (not permitted in MPEG-2; in MPEG-1 only Intra-DC-coefficient, special end_of_macroblock code)
• Full-pel Motion Vectors (in MPEG-1 full-pel motion vectors possible, in MPEG-2 always half-pel motion vectors)
• Aspect Ratio Information (MPEG-1 specifies pel aspect ratio, MPEG-2 specifies display aspect ratio and pel aspect ratio can be
calculated from this and from frame size and display size)
• Forward_f_code and backward_f_code (differences in parameter location and contents)
• Constrained_parameter_flag and maximum horizontal_size (MPEG-2 has profile and level mechanism)
• Bit_rate and vbv_delay (fixed values are reserved for variable bit rate in MPEG-1, other values are for constant bit rate; in MPEG-2
semantics for bit_rate are changed, etc.)
• VBV (in MPEG-1 VBV is only defined for constant bit rate operation; in MPEG-2 VBV is only defined for variable bit rate and constant bit
rate is assumed to be a special case of variable bit rate)
• temporal_reference (a small difference between MPEG-1 and MPEG-2)
174
Details of MPEG-2 and MPEG-1 Differences
175
[Figure: simplified MPEG encoder block diagram: ordered source pictures, motion estimator, MC mode decision (MVs, MC modes), picture predictor and store, residual, DCT, Q, Q⁻¹, IDCT, decoded picture, prediction, lossless coder (RLC+VLC), rate control, buffer, coded video bit stream]
MPEG Video Encoding
Simplified MPEG Encoder
176
Structure of the Coded Bit-Stream
[Figure: video sequence → group of pictures → picture → slice → macroblock → block (8 × 8 pixels)]
− All chroma channels subsampling! (4:4:4, 4:2:2 and 4:2:0 support)
177
All Chroma Channels Subsampling in MPEG-2
178
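Co-sited 4:2:0 Sampling in MPEG-2
[Figure: 4:2:0 sample positions. MPEG-2: co-sited sampling (chrominance samples located at luminance sample positions, the remaining positions carrying Y only). JPEG/JFIF, H.261, MPEG-1: chrominance samples centered between luminance samples.]
Downsize chrominance components.
• 4:2:0 (with chrominance samples centered)
• Requires bilinear interpolation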
179
Frame Type DCT vs. Field Type DCT
[Figure: a 16×16 luminance MB divided into four 8×8 blocks, shown for both organisations]
• Luminance MB structure in frame-organized DCT coding (for slow moving)
• Luminance MB structure in field-organized DCT coding (for fast moving)
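The difference between the two organisations can be sketched as a regrouping of the macroblock's lines before it is cut into 8×8 blocks. This is an illustrative sketch: the function name luminance_blocks is hypothetical, and it assumes the top field consists of the even-numbered lines (counting from 0).

```python
def luminance_blocks(mb, field_dct):
    """Split a 16x16 luminance macroblock (list of 16 rows) into four 8x8 blocks."""
    if len(mb) != 16 or any(len(row) != 16 for row in mb):
        raise ValueError("expected a 16x16 macroblock")
    # Field DCT: put the top-field lines (even) first, then the bottom-field
    # lines (odd). Frame DCT keeps the lines in their original order.
    rows = (mb[0::2] + mb[1::2]) if field_dct else mb
    upper, lower = rows[:8], rows[8:]
    return [
        [row[:8] for row in upper],  # block 0: left half of the first 8 rows
        [row[8:] for row in upper],  # block 1: right half of the first 8 rows
        [row[:8] for row in lower],  # block 2: left half of the last 8 rows
        [row[8:] for row in lower],  # block 3: right half of the last 8 rows
    ]


# Test macroblock whose samples hold their own line number.
mb = [[line] * 16 for line in range(16)]
frame_blocks = luminance_blocks(mb, field_dct=False)
field_blocks = luminance_blocks(mb, field_dct=True)
print([row[0] for row in frame_blocks[0]])  # lines used by block 0, frame DCT
print([row[0] for row in field_blocks[0]])  # lines used by block 0, field DCT
```

With field organisation each block contains lines from a single field only, which is why it suits fast-moving interlaced content.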
180
Frame and Field DCT Coding in MPEG-2
− Interlacing! (Motion estimation is different from MPEG-1)
− MPEG-2 can choose between the previous frame and the previous field
− The odd and even fields can be coded together as if they were a frame, or they can be coded independently
• if there is no motion then we can combine the two fields into a single image called a “frame-picture.”
Better for compression efficiency.
• if there is motion then the two fields are coded separately as if they were two pictures called “field-
pictures”.
181
Frame and Field DCT Coding in MPEG-2
Odd Field-Picture
Even Field-Picture
Frame Picture
− For interlaced pictures, since the vertical correlation in the field pictures is greatly reduced, should the
field prediction be used, an alternate scan may perform better than a zigzag scan.
182
Frame and Field DCT Coding in MPEG-2
183
Five motion compensation modes in MPEG-2
[Figure: interlaced pictures: a 16×16 macroblock and its 16×8 halves]
Five motion compensation modes in MPEG-2
More information: Standard Codecs, Dr. Ghanbari, 8.4 MPEG-2 nonscalable coding modes
− In the 16×8 motion compensation mode, a field macroblock of 16×16 pixels is split into upper-half and
lower-half 16×8 pixel blocks, and a separate field prediction is carried out for each.
− Two motion vectors are transmitted for each P-picture macroblock and two or four motion vectors for the
B-picture macroblock.
− This mode of motion compensation may be useful in field pictures that contain irregular motion.
− Here a field macroblock is split into two halves, and in the field prediction for frame pictures a frame
macroblock is split into two top and bottom field blocks.
− It should be noted that field pictures have some restrictions on I, P and B-picture coding type and motion
compensation.
− Normally, the second field picture of a frame must be of the same coding type as the first field. However, if
the first field picture of a frame is an I-picture, then the second field can be either I or P. If it is a P-picture,
the prediction macroblocks must all come from the previous I-picture, and dual prime cannot be used
184
Five motion compensation modes in MPEG-2
− In this case the target macroblock in a frame picture is split into its top-field and bottom-field pixels. (For
interlaced pictures, a target macroblock can be split into two field macroblocks.)
− Field prediction is then carried out independently for each of the 16×8 pixel target macroblocks.
− For P-pictures, two motion vectors are assigned for each 16×16 pixel target macroblock.
− The 16×8 predictions may be taken from either of the two most recently decoded anchor pictures.
− Note that the 16x8 field prediction cannot come from the same frame, as was the case in field prediction
for field pictures.
− For B-pictures, due to the forward and the backward motion, there can be two or four motion vectors for
each target macroblock.
− The 16×8 predictions may be taken from either field of the two most recently decoded anchor pictures.
185
Five motion compensation modes in MPEG-2
− Motion vectors are differentially coded with respect to the vector for the previous macroblock (i.e. the one to the left)
• PMV – Previous Motion Vector.
• MV – Motion Vector for the Current Macroblock.
− Define $\boldsymbol{\Delta} = [\Delta_x,\ \Delta_y] = 2 \times (\mathbf{MV} - \mathbf{PMV})$
• Multiply by 2 as 0.5 pel quantisation is used.
• $\Delta_x$ and $\Delta_y$ are coded separately.
Coding of Motion Vectors
186
Coding Δx and Δy
− The absolute value and sign of each component is coded separately.
− The absolute value is broken down as
$\Delta^{*} = (a - 1)\,2^{b} + c + 1$
$a$ – is called the motion_code and ranges from 0 to 16.
It is Huffman Coded
$b$ – is called the size and effectively limits the range of motion vector. It ranges from 0 to 8.
It is Fixed Length Coded (FLC) (4 bit binary value).
$c$ – is the motion_residual. It ranges from 0 to $2^{b} - 1$.
It is Fixed Length Coded (FLC). It is a $b$-bit binary number.
187
Coding Δx and Δy
A table of how the choice of Size affects the range of the difference $\Delta^{*}$ that can be coded.
• Size is set once at the start of each Picture Layer (i.e. it is the same over the entire picture).
• It is common to choose a larger Size for P-frames because the motion is bigger.
188
Coding Δx and Δy
Size is chosen based on the range of motion vectors.
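EX: Say we limit search width to 10.
• Then we could have a vector [10, 10] and a previous vector [−10, 10].
• The max $|\Delta_x|$ or $|\Delta_y|$ is $2 \times (10 + 10) = 40$.
• Therefore we need to choose $b = 2$ (with $b = 1$ the largest value that can be coded is $15 \times 2 + 1 + 1 = 32 < 40$).
• Given an MV [4.5, 3] and PMV [5, −1] then
$\boldsymbol{\Delta} = 2 \times ([4.5,\ 3] - [5,\ -1]) = [-1,\ 8]$
Then for $b = 2$,
$|\Delta_x| = 1 = (1 - 1)\,2^{2} + 0 + 1$, so $a = 1$, $b = 2$, $c = 0$
$|\Delta_y| = 8 = (2 - 1)\,2^{2} + 3 + 1$, so $a = 2$, $b = 2$, $c = 3$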
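The decomposition used in this example can be checked with a short sketch. It follows the slide's formula, magnitude = (a − 1)·2^b + c + 1; the helper names smallest_size and split_delta are hypothetical, and the range rule (largest magnitude 16·2^b) is derived from that formula with a = 16 and c = 2^b − 1.

```python
def smallest_size(max_abs_delta):
    """Smallest Size b whose largest codable magnitude, 16 * 2**b, covers max_abs_delta."""
    b = 0
    while 16 * 2 ** b < max_abs_delta:
        b += 1
    return b


def split_delta(delta, b):
    """Return (sign, a, c) such that |delta| = (a - 1) * 2**b + c + 1.

    sign is 1 for negative values, 0 otherwise; delta == 0 gives a = 0.
    """
    if delta == 0:
        return (0, 0, 0)
    sign = 1 if delta < 0 else 0
    a_minus_1, c = divmod(abs(delta) - 1, 2 ** b)
    return (sign, a_minus_1 + 1, c)


mv, pmv = [4.5, 3], [5, -1]
delta = [int(2 * (m - p)) for m, p in zip(mv, pmv)]  # half-pel units
b = smallest_size(2 * (10 + 10))                      # search width 10 -> max |delta| = 40
print(delta, b)
print([split_delta(d, b) for d in delta])
```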
189
Huffman Codes for motion_code
− s is 0 if the component is positive.
− s is 1 if the component is negative.
− Each vector is specified by a (motion_code, motion_residual)
pair.
• The Size value is specified at the start of the Picture Layer.
− If Δ∗ = 0 then we set the motion_code to 0 (codeword is 1).
There is no motion_residual.
190
Example
− If $\Delta_x = -1$ then the motion_code is 1, the sign bit is 1 and the
motion_residual is 0. Therefore, with the motion_residual written as a $b = 2$ bit number, the code
011 00
is inserted into the bitstream.
− If $\Delta_y = 8$ then the motion_code is 2, the sign bit is 0 and the
motion_residual is 3. Therefore the code
0010 11
is inserted into the bitstream.
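A minimal sketch of how one component could be turned into bits, assuming the motion_residual is written as exactly b bits, as in the definition of c. Only the three motion_code prefixes that appear in this example (0, 1 and 2) are included; the full Huffman table is not reproduced, so larger motion codes raise an error. The names MOTION_CODE_PREFIX and encode_component are hypothetical.

```python
# Partial table: only the codes used in the example (motion_code 0, 1 and 2).
MOTION_CODE_PREFIX = {0: "1", 1: "01", 2: "001"}


def encode_component(delta, b):
    """Bits for one motion vector difference component with Size b."""
    if delta == 0:
        return MOTION_CODE_PREFIX[0]      # no sign bit and no residual
    sign = "1" if delta < 0 else "0"
    a_minus_1, c = divmod(abs(delta) - 1, 2 ** b)
    motion_code = a_minus_1 + 1
    if motion_code not in MOTION_CODE_PREFIX:
        raise ValueError("motion_code outside the partial table of this sketch")
    residual = format(c, "0{}b".format(b)) if b > 0 else ""
    return MOTION_CODE_PREFIX[motion_code] + sign + residual


print(encode_component(-1, 2))
print(encode_component(8, 2))
print(encode_component(0, 2))
```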
191
192
Spatial Domain and Frequency Domain Blocks
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs






  • 2. Outline
    Section I
    – ISO/IEC JTC 1/SC 29 Structure and MPEG
    – ITU-T structure and VCEG (Video Coding Experts Group or Visual Coding Experts Group)
    – A Generic Interframe Video Encoder
    – H.261 Video Coding Standard
    – MPEG-1 Video Coding Standard
    – MPEG-2 Video Coding Standard
    Section II
    – MPEG-2 Transport and Program Streams
    – H.263 Video Coding Standard
    – H.263+ Video Coding Standard
    – H.263++ Video Coding Standard
    – Bit-rate (R) and Distortion (D) in Video Coding
  • 3. ISO/IEC JTC 1/SC 29 Structure and MPEG
    [Organization chart: ISO (International Organization for Standardization) and IEC (International Electrotechnical Commission) form JTC 1 (Joint Technical Committee 1), which contains SC 29 (Subcommittee 29) with its advisory groups (AGM, RA) and working groups (WG1, WG11, WG12)]
    SC 29 title: “Coding of Audio, Picture, Multimedia and Hypermedia Information”
    − Advisory Group (AG) on Management (AGM): advises SC 29 and its WGs on matters of management that affect their work.
    − Advisory Group (AG) on Registration Authority (RA)
    − WG1: Still images, JPEG and JBIG (Joint Photographic Experts Group and Joint Bi-level Image Experts Group)
    − WG11: Video, MPEG (Moving Picture Experts Group); subgroups: Requirements, System, Video, Audio, SNHC, Test, Implementation Studies, Liaisons
    − WG12: Multimedia, MHEG (Multimedia and Hypermedia Experts Group); subgroups: MHEG-5 Maintenance, MHEG-6
    MPEG (Moving Picture Experts Group, 1988): develops standards for the coded representation of digital audio, video, 3D graphics and other data.
  • 4. ITU-T structure and VCEG (Video Coding Experts Group or Visual Coding Experts Group)
    [Organization chart: WTSA (World Telecommunication Standardization Assembly) and TSAG (Telecommunication Standardization Advisory Group) above the Study Groups (SG); each SG contains Working Parties (WP), whose Questions (Q) develop Recommendations. Also shown: Focus Groups, IPRs (Intellectual Property Rights), administrative entities, and workshops, seminars and symposia]
    VCEG (ITU-T SG16/Q6)
    • Study Group 16: Multimedia terminals, systems and applications
    • Working Party 3: Media coding
    • Question 6: Video coding
    Rapporteurs (R): Mr Gary SULLIVAN, Mr Thomas WIEGAND
  • 5. ITU, International Telecommunication Union structure
    − Founded in 1865, it is the oldest specialized agency of the United Nations system.
    − ITU is an international organization where governments, industries, telecom operators, service providers and regulators work together to coordinate global telecommunication networks and services.
    − Help the world communicate!
    − What does ITU actually do?
      • Spectrum allocation and registration
      • Coordinate national spectrum planning
      • International telecoms/ICT standardization
      • Collaborate in international tariff-setting
      • Cooperate in telecommunications development assistance
      • Develop measures for ensuring safety of life
      • Provide policy reviews and information exchange
      • Ensure and extend universal telecom access
  • 6. ITU, International Telecommunication Union structure
    − Plenipotentiary Conference: the key event at which all ITU Member States decide on the future role of the organization (held every four years).
    − ITU Council: in the interval between Plenipotentiary Conferences, the Council considers broad telecommunication policy issues to ensure that the Union's activities, policies and strategies fully respond to today's dynamic, rapidly changing telecommunication environment (held yearly).
  • 7. ITU, International Telecommunication Union structure
    − General Secretariat: coordinates and manages the administrative and financial aspects of the Union’s activities (provision of conference services, information services, legal advice, finance, personnel, etc.).
    − ITU-R: coordinates radio communications, radio-frequency spectrum management and wireless services.
    − ITU-D: technical assistance and deployment of telecom networks and services in developing and least developed countries to allow the development of telecommunication.
    − ITU-T: telecommunication standardization on a world-wide basis. Ensures the efficient and on-time production of high-quality standards covering all fields of telecommunications (technical, operating and tariff issues). The Secretariat of ITU-T (TSB: Telecommunication Standardization Bureau) provides services to ITU-T participants.
  • 8. ITU, International Telecommunication Union structure
    Telecommunication Standardization Bureau (TSB) (Place des Nations, CH-1211 Geneva 20)
    − The TSB provides secretarial support for ITU-T and services for participants in ITU-T work (e.g. organization of meetings, publication of Recommendations, website maintenance, etc.).
    − Disseminates information on international telecommunications and establishes agreements with many international SDOs.
    Mission of ITU-T, the Standardization Sector of ITU
    − Helping people all around the world to communicate and to share equally the advantages and opportunities of telecommunication, reducing the digital divide by studying technical, operating and tariff matters to develop telecommunication standards (Recommendations) on a worldwide basis.
  • 9. ITU, International Telecommunication Union structure
    World Telecommunication Standardization Assembly (WTSA)
    − WTSA sets the overall direction and structure for ITU-T. It meets every four years and, for the next four-year period:
      • Defines the general policy for the Sector
      • Establishes the study groups (SG)
      • Approves SG work programmes
      • Appoints SG chairmen and vice-chairmen
    Telecommunication Standardization Advisory Group (TSAG)
    − TSAG provides ITU-T with flexibility between WTSAs, and reviews priorities, programmes, operations, financial matters and strategies for the Sector (meets approximately every 9 months):
      • Follows up on accomplishment of the work programme
      • Restructures and establishes ITU-T study groups
      • Provides guidelines to the study groups
      • Advises the TSB Director
      • Produces the A-series Recommendations on organization and working procedures
  • 10. Video Coding Standardization Organizations
    • ISO/IEC MPEG = “Moving Picture Experts Group” (ISO/IEC JTC 1/SC 29/WG 11 = International Organization for Standardization and International Electrotechnical Commission, Joint Technical Committee 1, Subcommittee 29, Working Group 11)
    • ITU-T VCEG = “Video Coding Experts Group” (ITU-T SG16/Q6 = International Telecommunication Union, Telecommunication Standardization Sector (ITU-T, a United Nations organization, formerly CCITT), Study Group 16, Working Party 3, Question 6)
    • JVT = “Joint Video Team”: collaborative team of MPEG and VCEG, responsible for developing AVC (discontinued in 2009)
    • JCT-VC = “Joint Collaborative Team on Video Coding”: team of MPEG and VCEG, responsible for developing HEVC (established January 2010)
    • JVET = “Joint Video Experts Team”: exploring the potential for new technology beyond HEVC (established Oct. 2015 as the Joint Video Exploration Team, renamed Apr. 2018)
  • 11. History of Video Coding Standardization (1985 ~ 2020)
    [Timeline figure with ITU-T and ISO/IEC tracks; time axis marked 1990, 1994, 2003, 2013, 2020; application labels along the axis: video telephony, computer, SD, HD, 4K UHD, 8K, 360, ...]
    − H.120 (1984-1988)
    − H.261 (1990+)
    − MPEG-1 (1993)
    − H.262 / 13818-2 (MPEG-2) (1994/95-1998+)
    − H.263/+/++ (1995-2000+)
    − MPEG-4 Visual (1998-2001+)
    − H.264 / 14496-10 AVC (2003-2018+): developed by the Joint Video Team (JVT)
    − H.265 / 23008-2 HEVC (2013-2018+): developed by the Joint Collaborative Team on Video Coding (JCT-VC)
    − H.26x / 23090-3 VVC (2020-...): to be developed by the Joint Video Experts Team (JVET)
  • 12. H.261 Video Compression Standard
    [Timeline figure, 1988 to 2010]
    − ITU-T standards: H.261 (Version 1), H.261 (Version 2), H.263, H.263+, H.263++
    − Joint ITU-T/MPEG standards: H.262/MPEG-2, H.264/MPEG-4 AVC, H.265/HEVC
    − MPEG standards: MPEG-1, MPEG-4 (Version 1), MPEG-4 (Version 2)
  • 13. ITU H.26x History
    The H series are low-delay codecs for telecom applications (the International Telecommunication Union (ITU-T) developed several Recommendations for video coding).
    − H.120: the first digital video coding standard
    − H.261 (1990): the first video codec specification, “Video Codec for Audio Visual Services at p × 64 kbps”
    − H.262 (1995): Infrastructure of audiovisual services – Coding of moving video
    − H.263 (1996): next conf. solution, video coding for low bit-rate communications
    − H.263+ (H.263V2) (1998)
    − H.263++ (H.263V3) (2000): follow-on solutions
    − H.26L: “long-term” solution for low bit-rate video coding for communication applications (not backward compatible with H.263+)
    − H.264: H.26L was completed in May 2003 and led to H.264, known as Advanced Video Coding (AVC)
    − H.265/HEVC (2013): High Efficiency Video Coding
  • 14. MPEG History
    Moving Picture Experts Group (MPEG) codecs are designed for storage/broadcast/streaming applications.
    MPEG-1 (1992)
    • Started in 1988 by Leonardo Chiariglione
    • Compression standard for progressive frame-based video in SIF (360x240) formats
    • Applications: VCD
    MPEG-2 (1994-5)
    • Compression standard for interlaced frame-based video in CCIR-601 (720x480) and high definition (1920x1088i) formats
    • Applications: DVD, SVCD, DIRECTV, GA, DVB, HDTV Studio, DTV Broadcast, HD, video standards for television and telecommunications standards
    MPEG-4 (1999)
    • Multimedia standard for object-based video from natural or synthetic sources
    • Applications: Internet, cable TV, virtual studio, home LAN, etc.
    • Object-oriented
    • Over-ambitious?
  • 15. MPEG History
    Moving Picture Experts Group (MPEG) codecs are designed for storage/broadcast/streaming applications.
    MPEG-7 (2001)
    • Standardized descriptions of multimedia information, formally called “Multimedia Content Description Interface”
    • Metadata for audio-video streams
    • Applications: Internet, video search engine, digital library
    MPEG-21 (2002)
    • Intended for intellectual property rights protection
    • Distribution, exchange and user access of multimedia data, and intellectual property management
    AVC (2003), also known as MPEG-4 Part 10
    • Conventional to HD
    • Emphasis on compression performance and loss resilience
    HEVC (2013): High Efficiency Video Coding
  • 16. Joint ITU/MPEG
    ITU and MPEG (ISO/IEC) have also worked together on joint codecs:
    − MPEG-2 is also called H.262
    − H.26L has led to a codec that is now called:
      • H.264 in telecom
      • MPEG-4 Part 10 in broadcast
      • AVC (Advanced Video Coding) in broadcast
      • Joint Video Team (JVT) codec
    − H.265/HEVC (2013): High Efficiency Video Coding
  • 17. The Story of MPEG and VCEG
  • 18. Joint ITU/MPEG
    ITU and MPEG (ISO/IEC) have also worked together on joint codecs.
    [Figure: two earlier codec generations are each labelled with a 50% bitrate saving for direct-to-home and a 30% bitrate saving for contribution; VVC (2020) is labelled with ≈50% for direct-to-home and ≈30% for contribution]
  • 19. Milestones in Video Coding
  • 20. Milestones in Video Coding
  • 22. Classification of Compression Techniques
    Spatial Domain
    − Elements are used “raw” in suitable combinations.
    − The frequency of occurrence of such combinations is used to influence the design of the coder so that shorter codewords are used for more frequent combinations and vice versa (entropy coding).
    Transform Domain
    − Elements are mapped onto a different domain (i.e. the frequency domain).
    − The resulting coefficients are quantised and entropy-coded.
    Hybrid
    − Combinations of the above.
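The entropy-coding idea above (shorter codewords for more frequent combinations) can be illustrated with a Huffman code. The following is a minimal sketch in Python, not a method taken from the slides; the function name `huffman_code_lengths` and the sample string are assumptions chosen for illustration.

```python
import heapq
from collections import Counter


def huffman_code_lengths(symbols):
    """Return {symbol: codeword length in bits} for a Huffman code."""
    freq = Counter(symbols)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth in tree}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


# Hypothetical source: 'a' is frequent, 'd' is rare.
lengths = huffman_code_lengths("aaaaaaaabbbbccd")
for symbol in sorted(lengths):
    print(symbol, lengths[symbol])
```

In this sample the most frequent symbol receives the shortest codeword and the rarest symbols receive the longest ones, which is the behaviour the slide describes.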
  • 23. A Generic Interframe Video Encoder
    Used since the early days of video compression standards, e.g. MPEG-1/-2/-4, H.264/AVC and HEVC, and also in most proprietary codecs (VC1, VP8, etc.).
    [Encoder block diagram, current stage highlighted: Input Frame 1]
  • 24. A Generic Interframe Video Encoder
    [Encoder block diagram, current stage: Input Frame 1, DCT]
  • 25. A Generic Interframe Video Encoder
    [Encoder block diagram, current stage: Input Frame 1, DCT, quantized, bitstream 010011101001…]
  • 26. A Generic Interframe Video Encoder
    [Encoder block diagram, current stage: Input Frame 1, DCT, quantized, bitstream 010011101001…, Reconstructed Frame 1]
  • 27. A Generic Interframe Video Encoder
    [Encoder block diagram, current stage: Input Frame 2, Reconstructed Frame 1]
  • 28. A Generic Interframe Video Encoder
    [Encoder block diagram, current stage: Input Frame 2, Reconstructed Frame 1, entropy-coded MVs, bitstream 010011101001…]
  • 29. A Generic Interframe Video Encoder
    [Encoder block diagram, current stage: Input Frame 2, Reconstructed Frame 1 with MC, entropy-coded MVs, bitstream 010011101001…]
  • 30. A Generic Interframe Video Encoder
    [Encoder block diagram, current stage: Input Frame 2, Reconstructed Frame 1 with MC, Residual with MC (Frames 1&2)]
    If the motion prediction is successful, the energy in the residual is lower than in the original frame and can be represented with fewer bits.
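The claim about residual energy can be checked on a toy example. The sketch below is an illustration under stated assumptions, not the encoder from the slides: the frames are synthetic (the second frame is the first shifted right by two pixels), the helper names are hypothetical, and motion estimation is a plain full search using the sum of absolute differences (SAD).

```python
def make_frames(height=12, width=12, shift=2):
    """Build a synthetic reference frame and a copy shifted right by `shift` pixels."""
    ref = [[(7 * x + 13 * y) % 50 + 100 for x in range(width)] for y in range(height)]
    cur = [[ref[y][max(x - shift, 0)] for x in range(width)] for y in range(height)]
    return ref, cur


def get_block(frame, top, left, size):
    return [row[left:left + size] for row in frame[top:top + size]]


def subtract(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def energy(block):
    return sum(v * v for row in block for v in row)


def sad(a, b):
    return sum(abs(v) for row in subtract(a, b) for v in row)


def full_search(ref, cur, top, left, size, search_range):
    """Return the (dy, dx) displacement into `ref` that minimises the SAD."""
    target = get_block(cur, top, left, size)
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > len(ref) or l + size > len(ref[0]):
                continue  # candidate block falls outside the reference frame
            cost = sad(target, get_block(ref, t, l, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best[1], best[2]


ref, cur = make_frames()
TOP, LEFT, SIZE = 4, 4, 4
target = get_block(cur, TOP, LEFT, SIZE)
dy, dx = full_search(ref, cur, TOP, LEFT, SIZE, search_range=2)
residual_no_mc = subtract(target, get_block(ref, TOP, LEFT, SIZE))
residual_mc = subtract(target, get_block(ref, TOP + dy, LEFT + dx, SIZE))
print("motion vector:", (dy, dx))
print("energy of block:", energy(target))
print("energy of residual without MC:", energy(residual_no_mc))
print("energy of residual with MC:", energy(residual_mc))
```

Because the synthetic motion is a pure translation, the motion-compensated residual vanishes here; real video leaves a non-zero residual, but the ordering of the three energies is what the slide refers to.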
  • 31. A Generic Interframe Video Encoder
    [Encoder block diagram, current stage: Residual with MC (Frames 1&2), DCT]
  • 32. A Generic Interframe Video Encoder
    [Encoder block diagram, current stage: Residual with MC (Frames 1&2), DCT, quantized, bitstream 010011101001…]
  • 33. A Generic Interframe Video Encoder
    [Encoder block diagram, current stage: Residual with MC (Frames 1&2), DCT, quantized, Reconstructed Residual with MC (Frames 1&2)]
  • 34. A Generic Interframe Video Encoder
    [Encoder block diagram, current stage: Reconstructed Residual with MC (Frames 1&2) + Reconstructed Frame 1 with MC = Reconstructed Frame 2 with MC]
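The last stages of the walkthrough (transform the residual, quantise it, reconstruct it, and add it back to the motion-compensated prediction) can be sketched in one dimension. This is a simplified illustration, not the codec from the slides: it assumes an 8-point orthonormal DCT-II, a uniform quantiser with a hypothetical step size `STEP`, and made-up sample values.

```python
import math

N = 8      # transform length (assumed)
STEP = 2   # hypothetical uniform quantiser step size


def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)


def dct(x):
    """Orthonormal DCT-II of a length-N sequence."""
    return [
        _scale(k) * sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for n in range(N))
        for k in range(N)
    ]


def idct(c):
    """Inverse of dct() (orthonormal DCT-III)."""
    return [
        sum(_scale(k) * c[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N)) for k in range(N))
        for n in range(N)
    ]


def quantise(coeffs, step):
    return [round(v / step) for v in coeffs]


def dequantise(levels, step):
    return [level * step for level in levels]


# Made-up sample values: a motion-compensated prediction and the current signal.
prediction = [100, 102, 104, 106, 108, 110, 112, 114]
current = [101, 104, 103, 109, 108, 113, 111, 118]

residual = [c - p for c, p in zip(current, prediction)]
levels = quantise(dct(residual), STEP)            # what would be entropy-coded
recon_residual = idct(dequantise(levels, STEP))   # what the decoder recovers
reconstructed = [p + r for p, r in zip(prediction, recon_residual)]

max_error = max(abs(r - c) for r, c in zip(reconstructed, current))
print("quantised levels:", levels)
print("max reconstruction error:", max_error)
```

The encoder runs the same reconstruction as the decoder so that both predict the next frame from identical data; the only loss in this sketch comes from the quantiser, which bounds the error by the step size.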
  • 36. Generic Standard Codec
    − All standard codecs follow the generic interframe codec of DCT/DPCM/MC/VLC.
    − Their main differences lie in the way these elements are employed:
      • Block transform length and type
      • Block size for motion estimation and its precision
      • Methods of VLC
      • Quantisation
      • Coding of quantised transform coefficients
      • Addressing of data
      • Preventing error propagation
      • Various types of coding for each frame
  • 37. H.261 Standard
    − An earlier digital video compression standard; its principle of MC-based compression is retained in all later video compression standards.
    − The standard was designed for videophone, video conferencing and other audiovisual services over ISDN.
    − The video codec supports bit-rates of p × 64 kbps, where p ranges from 1 to 30 (hence it is also known as p × 64).
    − The delay of the video encoder is required to be less than 150 ms so that the video can be used for real-time bidirectional video conferencing.
    − Problems:
      • Error propagation
      • In case of errors, it needs updating
    [Table: Video Formats Supported by H.261]
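The p × 64 kbps rule is simple arithmetic, sketched below. The function name is hypothetical, and the uncompressed CIF figure assumes 8-bit samples with 4:2:0 chroma subsampling (1.5 samples per pixel), which is an assumption added here for comparison rather than a value from the slide.

```python
def h261_bitrate_kbps(p):
    """Channel bit-rate in kbps for an H.261 link using p ISDN channels."""
    if not 1 <= p <= 30:
        raise ValueError("p must be in the range 1..30")
    return p * 64


# Assumed for comparison: one CIF picture, 8 bits per sample, 4:2:0 sampling.
CIF_WIDTH, CIF_HEIGHT = 352, 288
raw_bits_per_cif_frame = CIF_WIDTH * CIF_HEIGHT * 3 // 2 * 8

for p in (1, 2, 6, 30):
    print(f"p = {p:2d}: {h261_bitrate_kbps(p)} kbps")
print("raw bits per CIF frame:", raw_bits_per_cif_frame)
```

Even at p = 30 the channel carries only about one and a half uncompressed CIF pictures per second under these assumptions, which is why interframe compression is needed for real-time video.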
  • 38. Recall: Some Image Formats
    [Figure: some picture formats]
  • 39. H.261 Standard
    [Figure: a block diagram of an H.261 audio-visual encoder]
    − The coding parameters of the compressed video signal are multiplexed and then combined with the audio, data and end-to-end signalling for transmission.
    − The transmission buffer controls the bit rate, either by changing the quantiser step size at the encoder or, in more severe cases, by requesting a reduction in frame rate to be carried out at the preprocessor.
  • 40. 40 H.261 Layer Structures 1. Block (8×8) 2. Macroblock (MB): 16×16 luminance (four 8×8 blocks Y0, Y1, Y2, Y3) plus one 8×8 Cb block and one 8×8 Cr block 3. Group of Blocks (GOB) = 33 MBs, arranged as a 3×11 matrix (row 1: MBs 1 to 11, row 2: MBs 12 to 22, row 3: MBs 23 to 33) 4. Picture layer: QCIF contains GOB 1, GOB 3 and GOB 5; CIF contains GOB 1 to GOB 12. Colour conversion: $Y = 0.299R + 0.587G + 0.114B$, $C_b = -0.169R - 0.331G + 0.500B$, $C_r = 0.500R - 0.419G - 0.081B$
  • 41. 41 Picture layer Group of Blocks (GOB) Macroblocks (MB) Blocks (CIF=352x288, QCIF=176x144) (GOB=176x48) (MB=16x16) H.261 Layer Structures GOBs within CIF GOBs within QCIF 352 288 176 144 16 16 1 3 5 7 9 11 2 4 6 8 10 12 macroblocks within a GOB 1 3 5 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 Y 0 Y3 Y1 Y2 8 8 A macroblock structure CrCb Preventing error propagation – Macroblock (MB) is the smallest Coding Unit of video – In the standard codecs, we only define how a MB is coded – How many Luma/Chroma blocks in an MB, depends on picture format (B=8x8)
  • 42. MBA CODE MBA CODE 1 1 17 0000 0101 10 2 011 18 0000 0101 01 3 010 19 0000 0101 00 4 0011 20 0000 0100 11 5 0010 21 0000 0100 10 6 0001 1 22 0000 0100 011 7 0001 0 23 0000 0100 010 8 0000 111 24 0000 0100 001 9 0000 110 25 0000 0100 000 10 0000 1011 26 0000 0011 111 11 0000 1010 27 0000 0011 110 12 0000 1001 28 0000 0011 101 13 0000 1000 29 0000 0011 100 14 0000 0111 30 0000 0011 011 15 0000 0110 31 0000 0011 010 16 0000 0101 11 32 0000 0011 001 33 0000 0011 000 MBA Stuffing 0000 0001 111 Start code 0000 0000 0000 0001 Macroblock Addressing (MBA) MBA stuffing: − An extra codeword in the table for bit stuffing immediately after a GOB header or a coded macroblock. − This codeword should be discarded by decoders. 42
  • 44. 44 Intra Motion Vector Coding Control qz Picture Memory + - Inter DCT Q RLC+VLC Motion Estimation Video Input + Q -1 IDCT + Loop Filter H.261 Standard When too much data accumulates in the transmission buffer, the rate controller increases the quantiser step size, which lowers picture quality.
  • 45. 45 H.261 Standard H.261 Frame Sequence
  • 46. 46 H.261 Standard COMP: a comparator for deciding inter/intra coding mode for an MB Th: threshold, to extend the quantisation range T: transform coding blocks of 8×8 pixels Q: quantisation of DCT coefficients P: picture memory with motion-compensated variable delay F: loop filter p: flag for inter/intra t: flag for transmitted or not q: quantisation index for transform coefficients qz: quantiser indication v: motion vector information f: switching on/off of the loop filter
  • 47. − For DC coefficients in Intra mode: uniform quantisation with a fixed step size of 8 and no dead zone. − For all other coefficients: step size = 2 × scale • scale: an integer in the range [1, 31] 47 H.261 Standard A uniform quantiser with threshold
  • 48. Example: Th = q = 16. Coefficients in scan order (raw coefficient → new coefficient after thresholding → quantised value → index): 83 → 83 → 88 → 5; 12 → 0 → 0 → 0; –10 → 0 → 0 → 0; –5 → 0 → 0 → 0; 35 → 35 → 40 → 2; 21 → 21 → 24 → 1; 7 → 0 → 0 → 0; 11 → 0 → 0 → 0; 15 → 0 → 0 → 0; 10 → 0 → 0 → 0; 5 → 0 → 0 → 0; –24 → –24 → –24 → –1; 12 → 0 → 0 → 0; 5 → 0 → 0 → 0; –10 → 0 → 0 → 0; 7 → 0 → 0 → 0; –31 → –31 → –24 → –1. Events to be transmitted (run, index): (0,5) (3,2) (0,1) (5,–1) (4,–1) Quantization and Entropy Coding 48
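The Th = q = 16 example can be reproduced with a short sketch. It assumes a fixed threshold, an index of floor(|c|/q) and a reconstruction at the centre of each interval (index·q + q/2); these rules are inferred from the numbers on the slide, and the function names are illustrative only.

```python
def threshold_quantise(coeffs, q, th):
    """Return (index, reconstructed value) pairs for a uniform quantiser with threshold.

    Assumption: coefficients with magnitude below `th` are zeroed; the rest
    get index floor(|c| / q) and are reconstructed at index*q + q/2.
    """
    out = []
    for c in coeffs:
        if abs(c) < th:
            out.append((0, 0))
            continue
        idx = abs(c) // q
        rec = idx * q + q // 2
        sign = 1 if c > 0 else -1
        out.append((sign * idx, sign * rec))
    return out


def run_index_events(indices):
    """Convert a scanned index sequence into (zero run, non-zero index) events."""
    events, run = [], 0
    for i in indices:
        if i == 0:
            run += 1
        else:
            events.append((run, i))
            run = 0
    return events


# Raw coefficients in scan order, taken from the slide's example.
raw = [83, 12, -10, -5, 35, 21, 7, 11, 15, 10, 5, -24, 12, 5, -10, 7, -31]
pairs = threshold_quantise(raw, q=16, th=16)
indices = [p[0] for p in pairs]
values = [p[1] for p in pairs]
events = run_index_events(indices)
print(events)
```

Trailing zeros after the last non-zero index produce no event; in the bit stream they are covered by the end-of-block code.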
  • 49. 49 − In interframe coding, a channel error propagates into the subsequent frames. If that part of the picture is not updated, the error can persist for a long time. − The variance of the intraframe MB is compared with the variance of the interframe MB (motion compensated or not) in the previous frame, and the mode with the smaller variance is chosen. • For large variances, no preference between the two modes. • For smaller variances, interframe is preferred. − The reason is that, in intra mode, the DC coefficients of the blocks have to be quantised with a quantiser without a dead zone and with 8-bit resolution. This increases the bit rate compared to that of the interframe mode, and hence interframe is preferred. MC/NO_MC mode decision in H.261 Inter/Intra Switch (Intraframe AC energy) (Interframe AC energy)
  • 50. 50 P-frame Motion estimation in H.261 was optional Macro-block and Motion Vector Range
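  • 51. 51 Motion Compensation Decision Characteristic BD: Block Difference, $BD = \frac{1}{256}\sum_{MB} |c[x,y] - r[x,y]|$ DBD: Displaced Block Difference, $DBD = \frac{1}{256}\sum_{MB} |c[x,y] - r[x+dx, y+dy]|$ [Figure: MC / No MC decision regions, with the block difference on the horizontal axis (x) and the displaced block difference on the vertical axis (y); marked values 0.5, 1, 1.5, 2.7 and 3; boundary line y = x/1.1] – Not all blocks are motion compensated – The mode that generates fewer bits is preferred.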
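A minimal sketch of the BD/DBD comparison follows. It assumes mean absolute differences over a 16×16 macroblock and models only the sloped boundary y = x/1.1 of the decision characteristic; the break points near the origin are left out, and the synthetic frames and function names are illustrative.

```python
def block_difference(cur, ref, x0, y0, dx=0, dy=0, size=16):
    """Mean absolute difference between an MB in `cur` and a (displaced) MB in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[y0 + y][x0 + x] - ref[y0 + y + dy][x0 + x + dx])
    return total / (size * size)


def prefer_motion_compensation(bd, dbd):
    """Simplified rule: only the sloped boundary y = x / 1.1 is modelled."""
    return dbd < bd / 1.1


# Synthetic 32x32 frames: the current frame is the reference shifted by (dx, dy) = (2, 1).
ref = [[(7 * x + 13 * y) % 256 for x in range(32)] for y in range(32)]
cur = [[ref[min(y + 1, 31)][min(x + 2, 31)] for x in range(32)] for y in range(32)]

bd = block_difference(cur, ref, 8, 8)                # no displacement
dbd = block_difference(cur, ref, 8, 8, dx=2, dy=1)   # displaced by the true motion
use_mc = prefer_motion_compensation(bd, dbd)
```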
  • 52. Macro-block – Motion estimation of a macroblock involves finding a 16×16-sample region in a reference frame that closely matches the current macroblock. – Luminance: 16x16, four 8x8 blocks – Chrominance: two 8x8 blocks – Motion estimation only performed for luminance component Motion Vector Range – [ -15, 15] – MB: 16 x 16 15 15 15 15 Search Area in Reference Frame MB 52 Macro-block and Motion Vector Range 𝑪𝒓 𝑪𝒃 𝒀 𝒀 𝟎 𝒀 𝟏 𝒀 𝟐 𝒀 𝟑
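The search described above can be sketched as an exhaustive (full-search) block match over the ±15 integer-pixel range. The sum of absolute differences is used as the matching cost, which is an assumption since the slide does not name a criterion, and candidates that would fall outside the reference frame are simply skipped.

```python
import random


def sad(cur, ref, x0, y0, dx, dy, size=16):
    """Sum of absolute differences between the current MB and a displaced reference MB."""
    return sum(
        abs(cur[y0 + y][x0 + x] - ref[y0 + y + dy][x0 + x + dx])
        for y in range(size)
        for x in range(size)
    )


def full_search(cur, ref, x0, y0, search=15, size=16):
    """Return ((dx, dy), cost) of the best integer-pixel match within +/- `search`."""
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), sad(cur, ref, x0, y0, 0, 0, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= x0 + dx and x0 + dx + size <= w
                      and 0 <= y0 + dy and y0 + dy + size <= h)
            if not inside:
                continue  # candidate block would leave the reference frame
            cost = sad(cur, ref, x0, y0, dx, dy, size)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


# Luminance-only test data: random 48x48 reference, current frame displaced by (3, -2).
rng = random.Random(0)
ref = [[rng.randrange(256) for _ in range(48)] for _ in range(48)]
clamp = lambda v: min(max(v, 0), 47)
cur = [[ref[clamp(y - 2)][clamp(x + 3)] for x in range(48)] for y in range(48)]

vector, cost = full_search(cur, ref, 16, 16)
```

Only the luminance samples are searched, matching the statement that motion estimation is performed for the luminance component.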
  • 53. − Integer pixel ME search only − Motion vectors are differentially & separately encoded: MVD_x = MV_x[n] − MV_x[n−1], MVD_y = MV_y[n] − MV_y[n−1] − VLC of up to 11 bits for MVD (Motion Vector Delta) Example: MV = 2 2 3 5 3 1 -1 gives MVD = 0 1 2 -2 -2 -2 … − Binary: 1 010 0010 0011 0011 0011 … 53 MVD VLC (excerpt): -2 & 30: 0011; -1: 011; 0: 1; 1: 010; 2 & -30: 0010; 3 & -29: 0001 0 Addressing of Motion Vectors
  • 54. 54 1) Motion Estimation for each Macroblock (MB) MB: 16 x 16 Search range (Motion Vector Range): ±15 2) Select a compression mode DBD = Displaced Block Difference = 𝑓(𝑥, 𝑦, 𝑡) − 𝑓(𝑥+Δ𝑥, 𝑦 + Δ𝑦, 𝑡 − 1) 3) Process each MB to generate a header followed by a data bit stream that is consistent with the compression mode chosen. H.261 Motion Estimation and Compression Modes MVD_x = MV_x[n] − MV_x[n−1], MVD_y = MV_y[n] − MV_y[n−1]
  • 55. 55 Selection Considerations: • Variance of Macroblock • Macroblock Difference (BD) • Displaced Macroblock Difference (DBD) Determination Rules: (a) If the variance of DBD is smaller than that of BD: Inter + MC is selected (the motion vector must be transmitted); otherwise the motion vector is not transmitted (b) Small variance : Intra Large variance : Inter (Motion vector=8) (c) Prediction error can be chosen to be modified by a 2-D spatial filter for each 8×8 block. (separable coefficients with 1/4 1/2 1/4) H.261 Mode Selection
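The differential coding in this example can be checked with a few lines. The sketch handles a single vector component and carries only the VLC entries needed for the example; the dictionary name is illustrative.

```python
# Subset of the MVD VLC table, restricted to the values used in the example.
MVD_VLC = {-2: "0011", -1: "011", 0: "1", 1: "010", 2: "0010", 3: "00010"}


def mv_differences(mvs):
    """MVD[n] = MV[n] - MV[n-1] for one vector component (x or y)."""
    return [cur - prev for prev, cur in zip(mvs, mvs[1:])]


mvs = [2, 2, 3, 5, 3, 1, -1]
mvd = mv_differences(mvs)
bits = " ".join(MVD_VLC[d] for d in mvd)
print(mvd)
print(bits)
```

The x and y components are coded separately, so a real encoder would run this once per component.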
  • 56. 56 H.261 Mode Selection Forced Updating − Intraframe coded MBs increase the resilience of the H.261 codec to channel errors. − If the inter/intra decision does not select intra mode, some of the MBs in a frame are forced to be intra coded. − The specification recommends that an MB should be updated at least once every 132 frames. − This means that for CIF pictures with 396 MBs/frame, on average 3 MBs (396/132) of every frame are intraframe coded.
  • 57. 57 Decision tree for macroblock type Types of Macroblocks 1. Inter coded: interframe coded MBs with no motion vector or with a zero motion vector. 2. MC coded: motion-compensated MB, where the MC error is significant and needs to be DCT coded. 3. MC not coded: these are motion-compensated error MBs, where the motion- compensated error is insignificant. Hence, there is no need to be DCT coded. 4. Intra coded: intraframe coded MBs. 5. Skipped (not coded, fixed): • If all the six blocks in an MB without MC have an insignificant energy, they are not coded. These MBs are sometimes called skipped, not coded or fixed MBs. • These types of MBs normally occur at the static parts of the image sequence. Fixed MBs are therefore not transmitted, and at the decoder they are copied from the previous frame. • Since the quantiser step sizes are determined at the beginning of each GOB or row of GOBs, they have to be transmitted to the receiver. • Hence, the first MBs have to be identified with a new quantiser parameter. • Therefore, we can have some new MB types: 6. Inter coded + Q 7. MC coded + Q 8. Intra + Q
  • 58. 58 Addressing of Blocks Once the type of an MB is identified and variable length coded, its position inside the GOB should also be determined. − The number of combinations of coded/noncoded blocks: • Since an MB has six blocks, there are 2^6 = 64 different states. • Except the one with all six blocks not coded (fixed MB), the remaining 63 are identified by 63 different patterns. − The pattern information consists of a set of 63 Coded Block Patterns (CBP) indicating coded/noncoded blocks within an MB. − With a coding order of Y0, Y1, Y2, Y3, Cb and Cr, the block pattern information or pattern number is defined as Pattern number = 32·Y0 + 16·Y1 + 8·Y2 + 4·Y3 + 2·Cb + Cr, where the coded and noncoded blocks are assigned 1 and 0, respectively.
  • 59. 59 Addressing of Blocks Examples of bit pattern for indicating the coded/not-coded blocks in an MB (black, coded; white, not coded) Pattern number = 32·Y0 + 16·Y1 + 8·Y2 + 4·Y3 + 2·Cb + Cr − Each pattern number is variable length coded. − It should be noted that if an MB is intracoded, its pattern information is not transmitted. − This is because, in an intraframe coded MB, all blocks have significant energy and will definitely be coded. − In other words, there will not be any noncoded blocks in an intra coded MB. Example: CBP = 110011₂ = 51₁₀, meaning that Y0, Y1, Cb and Cr are transmitted.
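The pattern number is a 6-bit mask over the blocks in coding order. A small sketch, with illustrative function names, that computes it and decodes it again:

```python
ORDER = ("Y0", "Y1", "Y2", "Y3", "Cb", "Cr")
WEIGHTS = (32, 16, 8, 4, 2, 1)


def pattern_number(coded):
    """Pattern number for a set of coded block names, e.g. {"Y0", "Cb"}."""
    return sum(w for name, w in zip(ORDER, WEIGHTS) if name in coded)


def coded_blocks(pattern):
    """Inverse mapping: list the blocks flagged as coded in a pattern number."""
    return [name for name, w in zip(ORDER, WEIGHTS) if pattern & w]


print(pattern_number({"Y0", "Y1", "Cb", "Cr"}))
print(coded_blocks(60))
```

Pattern 0 (no block coded) never appears in the bit stream, which is why only 63 codewords are needed.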
  • 60. 60 CBP CODE CBP CODE 60 111 35 0001 1100 4 1101 13 0001 1011 8 1100 49 0001 1010 16 1011 21 0001 1001 32 1010 41 0001 1000 12 1001 1 14 0001 0111 48 1001 0 50 0001 0110 20 1000 1 22 0001 0101 40 1000 0 42 0001 0100 28 0111 1 15 0001 0011 44 0111 0 51 0001 0010 52 0110 1 23 0001 0001 56 0110 0 43 0001 0000 1 0101 1 25 0000 1111 61 0101 0 37 0000 1110 2 0100 1 26 0000 1101 62 0100 0 38 0000 1100 CBP CODE CBP CODE 24 0011 11 29 0000 1011 36 0011 10 45 0000 1010 3 0011 01 53 0000 1001 63 0011 00 57 0000 1000 5 0010 111 30 0000 0111 9 0010 110 46 0000 0110 17 0010 101 54 0000 0101 33 0010 100 58 0000 0100 6 0010 011 31 0000 0011 1 10 0010 010 47 0000 0011 0 18 0010 001 55 0000 0010 1 34 0010 000 59 0000 0010 0 7 0001 1111 27 0000 0001 1 11 0001 1110 39 0000 0001 0 19 0001 1101 VLC Table for Coded Block Pattern (CBP) Addressing of Blocks
  • 61. 61 Addressing of Blocks Relative addressing of coded MB − The overhead information for addressing of the positions of the coded MB is minimised if they are relatively addressed to each other. − Numbers represent the relative addressing value of the number of fixed MBs preceding a nonfixed MB. − The GOB start code indicates the beginning of the GOB. − These relative addressing numbers are finally variable length coded.
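Relative addressing can be sketched as a running difference of absolute MB positions inside a GOB. The sketch assumes the first coded MB in a GOB is addressed by its absolute position and each later one by the difference from the previous coded MB, so two adjacent coded MBs yield the value 1; the function name is illustrative.

```python
def relative_addresses(coded_positions):
    """Map ascending absolute MB positions (1..33) to relative MBA values."""
    prev = 0
    out = []
    for pos in coded_positions:
        out.append(pos - prev)
        prev = pos
    return out


# MBs 1, 2, 5, 12 and 33 are coded; all other MBs in the GOB are fixed (skipped).
print(relative_addresses([1, 2, 5, 12, 33]))
```

Small differences dominate in practice, which is why the shortest MBA codeword is assigned to the value 1.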
  • 62. 62 Loop Filter − At low bit rates the quantiser step size is normally large, which can force many DCT coefficients to zero. − If only the DC and a few AC coefficients remain, then the reconstructed picture appears blocky. − When the positions of blocky areas vary from one frame to another, the effect appears as high-frequency noise, commonly referred to as mosquito noise. − The blockiness degradations at the slanted edges of the image appear as staircase noise. − Coarse quantisation of the coefficients, which results in the loss of high-frequency components, implies that compression can be modelled as a low-pass filtering process. − These artefacts are to some extent reduced by using the loop filter. The low-pass filter removes the high-frequency and block boundary distortions.
  • 63. 63 Loop Filter − Loop filtering is introduced after the motion compensator to improve the prediction. − It should be noted that the loop filter has a picture blurring effect. − It should be activated only for blocks with motion, otherwise, nonmoving parts of the pictures are repeatedly filtered in the following frames, blurring the picture. − The filtering should be applied for coding rates less than 6×64 kbit/s (six DCT blocks of an MB) and switched off otherwise. Coded pictures with loop filter: (a) 128 kbit/s and (b) 64 kbit/s H.261 coded at (a) 128 kbit/s and (b) 64 kbit/s
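The loop filter is a separable 2-D filter with taps 1/4, 1/2, 1/4 applied within each 8×8 block. The sketch below assumes that samples on the block edge are passed through unchanged in the direction that would reach outside the block (taps 0, 1, 0) and that the result is rounded to the nearest integer; function names are illustrative.

```python
def filter_1d(v):
    """1-2-1 filter scaled by 4; edge samples use taps (0, 1, 0), i.e. 4 * v[i]."""
    n = len(v)
    out = []
    for i in range(n):
        if i == 0 or i == n - 1:
            out.append(4 * v[i])
        else:
            out.append(v[i - 1] + 2 * v[i] + v[i + 1])
    return out


def loop_filter(block):
    """Apply the separable filter to one 8x8 block of integer samples."""
    rows = [filter_1d(r) for r in block]                              # gain 4
    cols = [filter_1d([rows[y][x] for y in range(8)]) for x in range(8)]  # gain 16
    # Undo the gain of 16 with rounding to the nearest integer.
    return [[(cols[x][y] + 8) >> 4 for x in range(8)] for y in range(8)]


flat = [[100] * 8 for _ in range(8)]
impulse = [[0] * 8 for _ in range(8)]
impulse[3][3] = 160
smoothed = loop_filter(impulse)
```

A flat block passes through unchanged, while an isolated sample is spread over its 3×3 neighbourhood, which is the blurring effect mentioned above.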
  • 64. 64 Mode VLC codes Mquant MVD CBP T COFF Intra 0001 × Intra 0000 001 × × Inter 1 × × Inter 00001 × × × Inter+MC 0000 00001 × Inter+MC 0000 0001 × × × Inter+MC 0000 000001 × × × × Inter+MC+Filter 001 × Inter+MC+Filter 01 × × × Inter+MC+Filter 000001 × × × × H.261 Compression Modes Summary T COFF =Transformed Coefficient Mquant=Quantization step size for MB CBP=Coded block pattern (6 bits) 𝑪𝒓 𝑪𝒃 𝒀 𝒀 𝟎 𝒀 𝟏 𝒀 𝟐 𝒀 𝟑 MVD=Motion Vector Delta
  • 65. 65 Bit-Stream Syntax − The Picture layer: Picture Start Code (PSC) delineates boundaries between pictures. TR (Temporal Reference) provides picture time-stamp. − The GOB layer: H.261 pictures are divided into regions of 11×3 macroblocks, each of which is called a Group of Blocks (GOB). (GQuant indicates the Quantizer to be used in the GOB) − The Macroblock layer: Each Macroblock (MB) has its own Address indicating its position within the GOB, Quantizer (MQuant: Quantizer for Macroblock), and six 8×8 image blocks (4 Y, 1 Cb, 1 Cr). − The Block layer: For each 8×8 block, the bitstream starts with the DC value, followed by pairs of length of zero-run (Run) and the subsequent non-zero value (Level) for ACs, and finally the End of Block (EOB) code. The range of Run is [0, 63]. Level reflects quantized values: its range is [−127, 127] and Level ≠ 0.
  • 66. 66 Bit-Stream Syntax Picture layer GOB layer Macroblock layer Block layer
  • 67. Data format for Picture Layer PSC TR Ptype PEI GOB 20 bits 5 bits 6 bits 1 bit Variable 1. PSC: Picture Start Code 2. TR: Temporal Reference 3. Ptype: Picture Type 4. PEI: Picture Extra Insertion 5. GOB Layer (Variable Length Codes) VLC: Variable Length Coding FLC: Fixed Length Coding Data Format of H.261 67
  • 68. PSC: Picture Start Code: 20 bits 0000 0000 0000 0001 0000 (occurs once per picture) TR: Temporal Reference: 5 bits (0-31) It is formed by incrementing its value in the previously transmitted picture header by one plus the number of non-transmitted pictures since the last transmitted one. (Each picture unit time: 1/30 or 1/29.97 second) Format for Picture Layer PSC TR Ptype PEI GOB 20 bits 5 bits 6 bits 1 bit Variable 68
  • 69. Ptype: Information about the complete picture Bit 1: Split screen indicator, "0" off; "1" on. Bit 2: Document camera indicator, "0" off; "1" on. Bit 3: Freeze Picture Release, "0" off; "1" on. Bit 4: Source Format, "0" QCIF; "1" CIF. Bits 5-6: Spare PEI: Picture Extra Insertion Information (1 bit): "0" no Pspare follows and the GOB layer comes next (usually PEI=0); "1" an 8-bit Pspare follows. Pspare: Picture Spare Information (0/8/16 … bits) • If PEI is set to "1", then 9 bits follow consisting of 8 bits of data (Pspare) and then another PEI bit to indicate if further 9 bits follow and so on. • Encoders must not insert Pspare until specified by the CCITT. • Decoders must be designed to discard Pspare if PEI is set to "1", which allows future backward-compatible additions in Pspare. Format for Picture Layer PSC TR Ptype PEI GOB 20 bits 5 bits 6 bits 1 bit Variable 69
  • 70. PSC TR Ptype PEI Pspare PEI=0 PEI=1 For 3 (12) GOBs, it will go 3 (12) times Next Picture GOB Picture Layer Loop Structure PSC TR Ptype PEI GOB 20 bits 5 bits 6 bits 1 bit Variable 70
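The picture-layer loop (PSC, TR, Ptype, then zero or more PEI/Pspare pairs) can be sketched as a small parser over a string of '0'/'1' characters. The string representation and the returned dictionary are illustrative choices, not part of the standard.

```python
PSC = "0000" "0000" "0000" "0001" "0000"  # 20-bit picture start code


def parse_picture_header(bits):
    """Parse PSC, TR, Ptype and the PEI/Pspare loop from a bit string."""
    pos = 0

    def take(n):
        nonlocal pos
        chunk = bits[pos:pos + n]
        if len(chunk) < n:
            raise ValueError("truncated picture header")
        pos += n
        return chunk

    if take(20) != PSC:
        raise ValueError("picture start code not found")
    tr = int(take(5), 2)      # temporal reference, 0..31
    ptype = take(6)           # bit 4 selects the source format
    pspare = []
    while take(1) == "1":     # PEI = 1: an 8-bit Pspare follows, then another PEI
        pspare.append(int(take(8), 2))
    return {"tr": tr, "cif": ptype[3] == "1", "pspare": pspare, "bits_used": pos}


header = PSC + "00101" + "000100" + "1" + "10101010" + "0"
info = parse_picture_header(header)
print(info)
```

After the final PEI = 0 the GOB layer starts, repeated 3 times for QCIF or 12 times for CIF.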
  • 71. 1. GBSC: Group of Block Start Code 2. GN: Group Number 3. Gquant: GOB quantization number 4. GEI: Group Extra Insertion 5. Gspare: GOB Spare 6. MB Data: Macroblock Data (Variable Length Code) GOB Data structure 16 bits 4bits 5bits 1bit 0/8/16..bits GBSC GN Gquant GEI Gspare MB Data Format for GOB Layer 71
  • 72. GBSC: Group of Block Start Code 0000 0000 0000 0001 It is a fixed pattern that must not be emulated elsewhere in the bit stream; otherwise the decoder would falsely detect a start code. GN: Group Number 4 bits 0000 Reserved for PSC (should not be used) 13, 14, 15 Reserved for future use Gquant: 5 bits A fixed length codeword which indicates the quantizer to be used in the group of blocks until overridden by any subsequent Mquant. GEI: Group Extra Insertion Information (1 bit): "0" no Gspare; "1" Gspare follows. Gspare: GOB Spare Information (0/8/16 … bits) Same as Pspare Format for GOB Layer 16 bits 4bits 5bits 1bit 0/8/16..bits GBSC GN Gquant GEI Gspare MB Data 72
  • 73. GBSC GN Gquant GEI Gspare MB Layer Could run for at most 33 times! GOB Layer Loop Structure 16 bits 4bits 5bits 1bit 0/8/16..bits GBSC GN Gquant GEI Gspare MB Data 73
  • 74. Data structure of MB layer 1. MBA: Macroblock Address 2. Mtype: Macroblock Type 3. Mquant: Macroblock quantization level 4. MVD: Motion Vector Difference 5. CBP: Coded block pattern 6. Block Data 5bits MBA Mtype Mquant CBPMVD Block Data Format for MB Layer 74
  • 75. Macroblock MBA: Macroblock Address A variable length codeword indicating the position of a macroblock within a group of blocks (GOB). GOB 16 Y Cr Cb 16 8 8 88 Format for MB Layer 5bits MBA Mtype Mquant CBPMVD Block Data 1 2 12 13 23 24 11 22 33 75
  • 76. MBA CODE MBA CODE 1 1 17 0000 0101 10 2 011 18 0000 0101 01 3 010 19 0000 0101 00 4 0011 20 0000 0100 11 5 0010 21 0000 0100 10 6 0001 1 22 0000 0100 011 7 0001 0 23 0000 0100 010 8 0000 111 24 0000 0100 001 9 0000 110 25 0000 0100 000 10 0000 1011 26 0000 0011 111 11 0000 1010 27 0000 0011 110 12 0000 1001 28 0000 0011 101 13 0000 1000 29 0000 0011 100 14 0000 0111 30 0000 0011 011 15 0000 0110 31 0000 0011 010 16 0000 0101 11 32 0000 0011 001 33 0000 0011 000 MBA Stuffing 0000 0001 111 Start code 0000 0000 0000 0001 Macroblock Addressing (MBA) MBA stuffing: − An extra codeword in the table for bit stuffing immediately after a GOB header or a coded macroblock. − This codeword should be discarded by decoders. 76
  • 77. Mtype: Macroblock Type Mquant: (5 bits) - fixed length Mquant signifies the quantizer to be used for this and any following blocks in the GOB until overridden by any subsequent Mquant: 1. Used for coding control 2. Can be adjusted to meet the bit rate required 3. Used to control image quality MVD: Motion Vector Data: (Variable Length) MVD is included for all MC macroblocks. MVD is obtained from the macroblock vector by subtracting the vector of the preceding macroblock, except in the following cases, where the vector of the preceding macroblock is regarded as zero: 1. MVD for macroblocks #1, 12, 23 2. MBA does not represent a difference of 1 3. Mtype of the previous macroblock was not MC Mquant and MVD Codes 5bits MBA Mtype Mquant CBPMVD Block Data 77
  • 78. 78 VLC Table for MVD MVD CODE -16 & 16 0000 0011 001 -15 & 17 0000 0011 011 -14 & 18 0000 0011 101 -13 & 19 0000 0011 111 -12 & 20 0000 0100 001 -11 & 21 0000 0100 011 -10 & 22 0000 0100 11 -9 & 23 0000 0101 01 -8 & 24 0000 0101 11 -7 & 25 0000 0111 -6 & 26 0000 1001 -5 & 27 0000 1011 -4 & 28 0000 111 -3 & 29 0001 1 -2 & 30 0011 -1 011 0 1 MVD CODE 1 010 2 & -30 0010 3 & -29 0001 0 4 & -28 0000 110 5 & -27 0000 1010 6 & -26 0000 1000 7 & -25 0000 0110 8 & -24 0000 0101 10 9 & -23 0000 0101 00 10 & -22 0000 0100 10 11 & -21 0000 0100 010 12 & -20 0000 0100 000 13 & -19 0000 0011 110 14 & -18 0000 0011 100 15 & -17 0000 0011 010
  • 79. CBP: Coded Block Pattern (Variable length) CBP is present if indicated by Mtype. The codeword gives a pattern number signifying those blocks in the macroblock for which at least one transform coefficient is transmitted. The pattern number is Pattern number = 32·Y0 + 16·Y1 + 8·Y2 + 4·Y3 + 2·Cb + Cr, where the coded and noncoded blocks are assigned 1 and 0, respectively. 5bits MBA Mtype Mquant CBPMVD Block Data 79
  • 80. MBA Mtype Mquant MVD MVD CBP CBP Block Layer MBA STUFFING 5 6 3/4 1/2 MB Layer MB Layer Loop Structure 5bits MBA Mtype Mquant CBPMVD Block Data 80
  • 81. 81 − A macroblock comprises four luminance blocks and one of each of the two colour difference blocks. − Data for a block consists of codewords for transform coefficients (TCOEFF) followed by an end of block (EOB) marker. − The order of block transmission is 1, 2, 3, 4 (Y0 to Y3), then 5 and 6 (the two colour difference blocks). Block layer EOB: End of Block TCOEFF EOB Block Layer Loop Structure
  • 82. DCT: Discrete Cosine Transform $F(u,v) = \frac{C(u)C(v)}{4} \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$ IDCT: Inverse Discrete Cosine Transform $f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)C(v) F(u,v) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$ where $C(u), C(v) = \frac{1}{\sqrt{2}}$ for $u, v = 0$ and $C(u), C(v) = 1$ otherwise DCT and IDCT 82
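The transform pair can be written directly from its definition. This is a straightforward O(N⁴) sketch for checking values, not the fast factorisation a real codec would use; function names are illustrative.

```python
import math

N = 8


def c(k):
    """Normalisation factor: 1/sqrt(2) for the zero-frequency term, 1 otherwise."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def basis(a, k):
    return math.cos((2 * a + 1) * k * math.pi / 16)


def dct2(f):
    """Forward 8x8 DCT of a block f[x][y]."""
    return [[c(u) * c(v) / 4 * sum(f[x][y] * basis(x, u) * basis(y, v)
                                   for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(F):
    """Inverse 8x8 DCT of a coefficient block F[u][v]."""
    return [[sum(c(u) * c(v) * F[u][v] * basis(x, u) * basis(y, v)
                 for u in range(N) for v in range(N)) / 4
             for y in range(N)] for x in range(N)]


flat = [[128] * N for _ in range(N)]
F_flat = dct2(flat)

ramp = [[8 * x + y for y in range(N)] for x in range(N)]
restored = idct2(dct2(ramp))
max_error = max(abs(restored[x][y] - ramp[x][y]) for x in range(N) for y in range(N))
```

A flat block of value 128 produces a DC coefficient of 1024 and no AC energy, which shows that the DC term is eight times the block mean.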
  • 83. − For Intra blocks the DC coefficient is linearly quantized with a step size of 8 and no dead zone. − The DC coefficients of all Intra blocks are fixed length coded (FLC) with 8 bits. − A nominally black block will give 0001 0000 and a nominally white one 1110 1011. − The codes 0000 0000 and 1000 0000 are not used. − For the Intra DC coefficient, the Reconstruction Levels (RECs) are as in the following table: Intra DC Coefficient Inverse Quantization FLC → Reconstruction level (REC) into inverse transform: 0000 0001 (1) → 8; 0000 0010 (2) → 16; 0000 0011 (3) → 24; …; 0111 1111 (127) → 1016; 1111 1111 (255) → 1024; 1000 0001 (129) → 1032; …; 1111 1101 (253) → 2024; 1111 1110 (254) → 2032 FLC (Fixed Length Coding) 83
  • 84. − For all coefficients other than the Intra DC one, the Reconstruction Levels (RECs) are in the range -2048 to 2047 and are given by clipping the results of the following formulae: − Note: QUANT ranges from 1 to 31 and is transmitted by either Gquant or Mquant. QUANT odd: REC = QUANT*(2*LEVEL+1) for LEVEL > 0; REC = QUANT*(2*LEVEL−1) for LEVEL < 0. QUANT even: REC = QUANT*(2*LEVEL+1)−1 for LEVEL > 0; REC = QUANT*(2*LEVEL−1)+1 for LEVEL < 0. REC = 0 for LEVEL = 0. DCT Coefficient (except Intra DC) Inverse Quantization 84
  • 85. QUANT LEVEL 1 2 3 4 … 8 9 … 17 18 … 30 31 -127 -255 -509 -765 -1019…-2039 -2048 …-2048 -2048 … -2048 -2048 -126 -253 -505 -759 -1011…-2023 -2048 …-2048 -2048 … -2048 -2048 -2 -5 -9 -15 -19 … -39 -45 … -85 -89 … -149 -155 -1 -3 -5 -9 -11 … -23 -27 … -51 -53 … -89 -93 0 0 0 0 0 … 0 0 … 0 0 … 0 0 1 3 5 9 11 … 23 27 … 51 53 … 89 93 2 5 9 15 19 39 45 … 85 89 … 149 155 3 7 13 21 27 … 55 63 … 119 125 … 209 217 4 9 17 27 35 … 71 81 … 153 161 … 269 279 5 11 21 33 43 … 87 99 … 187 197 … 329 341 56 113 225 339 451 … 903 1017 … 1921 2033 … 2047 2047 57 115 229 345 459 … 919 1035 … 1955 2047 … 2047 2047 58 117 233 351 467 … 935 1053 … 1989 2047 … 2047 2047 59 119 237 357 475 … 951 1071 … 2023 2047 … 2047 2047 60 121 241 363 483 … 967 1089 … 2047 2047 … 2047 2047 125 251 501 753 1003 … 2007 2047 … 2047 2047 … 2047 2047 126 253 505 759 1011 … 2023 2047 … 2047 2047 … 2047 2047 127 255 509 765 1019 … 2039 2047 … 2047 2047 … 2047 2047 Reconstruction Levels (REC) 85
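The table entries follow from the odd/even QUANT reconstruction rules plus clipping to [−2048, 2047]. A minimal sketch, with an illustrative function name, that reproduces individual entries:

```python
def reconstruct(level, quant):
    """Reconstruction level for a non-intra-DC coefficient (quant in 1..31)."""
    if level == 0:
        return 0
    if quant % 2 == 1:   # odd QUANT
        rec = quant * (2 * level + 1) if level > 0 else quant * (2 * level - 1)
    else:                # even QUANT: move one step towards zero
        rec = quant * (2 * level + 1) - 1 if level > 0 else quant * (2 * level - 1) + 1
    return max(-2048, min(2047, rec))


# Spot checks against the table: (LEVEL, QUANT) pairs.
for level, quant in [(1, 1), (1, 2), (-1, 4), (56, 18), (-127, 9), (127, 8)]:
    print(level, quant, reconstruct(level, quant))
```

With these rules every non-zero reconstruction value before clipping is odd, which can be verified from the table rows.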
  • 86. 86 1 2 6 7 15 16 28 29 3 5 8 14 17 27 30 43 4 9 13 18 26 31 42 44 10 12 19 25 32 41 45 54 11 20 24 33 40 46 53 55 21 23 34 39 47 52 56 61 22 35 38 48 51 57 60 62 36 37 49 50 58 59 63 64 − Transform coefficient data is always present for all six blocks in a macroblock when MTYPE indicates Intra. − In other cases MTYPE and CBP signal which blocks have coefficient data transmitted for them. − The quantized transform coefficients are sequentially transmitted according to the zigzag scan sequence shown above. Ordering of DCT Coefficients or Transform Coefficient (TCOEFF)
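The scan table can be generated rather than stored. The sketch below orders positions by anti-diagonal and alternates the direction of travel on each diagonal; function names are illustrative.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zigzag scan order."""
    def key(rc):
        r, col = rc
        s = r + col
        # Odd diagonals run top-right to bottom-left, even ones the other way.
        return (s, r if s % 2 else col)
    return sorted(((r, col) for r in range(n) for col in range(n)), key=key)


def zigzag_rank_table(n=8):
    """Table giving the 1-based transmission order of each coefficient position."""
    table = [[0] * n for _ in range(n)]
    for rank, (r, col) in enumerate(zigzag_order(n), start=1):
        table[r][col] = rank
    return table


for row in zigzag_rank_table():
    print(" ".join(f"{v:2d}" for v in row))
```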
  • 87. 87 • The most commonly occurring combinations of (RUN, LEVEL) are encoded with Variable Length Codes. • The least commonly occurring combinations of (RUN, LEVEL) are encoded with a 20 bit word consisting of 6 bits ESCAPE, 6 bits RUN and 8 bits LEVEL. − There are two code tables for VLC: • One being used for the first transmitted LEVEL in “Inter” and “Inter + MC” blocks • One being used for all other LEVELs except DC in Intra blocks, which is fixed length coded with 8 bits. DCT Coefficients Coding
  • 88. 88 RUN LEVEL CODE EOB 10 0 1 1s IF FIRST COEFFICIENT 0 1 11s NOT FIRST COEFFICIENT 0 2 0100 s 0 3 0010 1s 0 4 0000 110s 0 5 0010 0110 s 0 6 0010 0001 s 0 7 0000 0010 10s 0 8 0000 0001 1101 s 0 9 0000 0001 1000 s 0 10 0000 0001 0011 s 0 11 0000 0001 0000 s 0 12 0000 0000 1101 0s 0 13 0000 0000 1100 1s 0 14 0000 0000 1100 0s 0 15 0000 0000 1011 1s RUN LEVEL CODE 1 1 011s 1 2 0001 10s 1 3 0010 0101 s 1 4 0000 0011 00s 1 5 0000 0001 1011 s 1 6 0000 0000 1011 0s 1 7 0000 0000 1010 1s 2 1 0101 s 2 2 0000 100s 2 3 0000 0010 11s 2 4 0000 0001 0100 s 2 5 0000 0000 1010 0s 3 1 0011 1s 3 2 0010 0100 s 3 3 0000 0001 1100 s 3 4 0000 0000 1001 1s 4 1 0011 0s 4 2 0000 0011 11s 4 3 0000 0001 0010 s VLC Table for TCOEFF (1) End of Block (EOB) − It is in this set. − Because CBP indicates those blocks with no coefficient data, the EOB cannot occur as the first coefficient. − Hence, the EOB can be removed from the VLC table for the first coefficient
  • 89. RUN LEVEL CODE 5 1 0001 11s 5 2 0000 0010 01s 5 3 0000 0000 1001 0s 6 1 0001 01s 6 2 0000 0001 1110 s 7 1 0001 00s 7 2 0000 0001 0101 s 8 1 0000 111s 8 2 0000 0001 0001 s 9 1 0000 101s 9 2 0000 0000 1000 1s 10 1 0010 0111 s 10 2 0000 0000 1000 0s 11 1 0010 0011 s 12 1 0010 0010 s 13 1 0010 0000 s RUN LEVEL CODE 14 1 0000 0011 10s 15 1 0000 0011 01s 16 1 0000 0010 00s 17 1 0000 0001 1111 s 18 1 0000 0001 1010 s 19 1 0000 0001 1001 s 20 1 0000 0001 0111 s 21 1 0000 0001 0110 s 22 1 0000 0000 1111 1s 23 1 0000 0000 1111 0s 24 1 0000 0000 1110 1s 25 1 0000 0000 1110 0s 26 1 0000 0000 1101 1s ESCAPE 0000 01 VLC Table for TCOEFF (2) 89
  • 90. • The least commonly occurring combinations of (RUN, LEVEL) are encoded with a 20 bit word consisting of 6 bits ESCAPE, 6 bits RUN and 8 bits LEVEL. Fixed Length Coding Table for TCOEFF RUN (6-bit fixed length code): 0 → 0000 00; 1 → 0000 01; 2 → 0000 10; …; 63 → 1111 11. LEVEL (8-bit fixed length code): -128 → FORBIDDEN; -127 → 1000 0001; …; -2 → 1111 1110; -1 → 1111 1111; 0 → FORBIDDEN; 1 → 0000 0001; 2 → 0000 0010; …; 127 → 0111 1111. In the VLC tables, the last bit "s" denotes the sign of the level: "0" for positive, "1" for negative 90
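The 20-bit escape word can be assembled as ESCAPE + RUN + LEVEL, with LEVEL written as an 8-bit two's-complement value (which matches the table: −127 → 1000 0001, −1 → 1111 1111). Function names are illustrative.

```python
ESCAPE = "000001"


def escape_code(run, level):
    """Build the 20-bit escape word for a (RUN, LEVEL) pair without its own VLC."""
    if not 0 <= run <= 63:
        raise ValueError("RUN must be in 0..63")
    if level == 0 or not -127 <= level <= 127:
        raise ValueError("LEVEL must be non-zero and in -127..127")
    return ESCAPE + format(run, "06b") + format(level & 0xFF, "08b")


def decode_escape(word):
    """Inverse of escape_code: return (run, level) from a 20-bit word."""
    if len(word) != 20 or not word.startswith(ESCAPE):
        raise ValueError("not an escape word")
    run = int(word[6:12], 2)
    raw = int(word[12:], 2)
    level = raw - 256 if raw >= 128 else raw
    return run, level


word = escape_code(2, -127)
print(word, decode_escape(word))
```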
  • 91. 91 Bit-Stream Syntax, FLC and VLC Loop Structures Summary
  • 92. Examples of FLC (Fixed Length Coding) − PSC: Picture Start Code, 20 bits − TR: Temporal Reference, 5 bits − PTYPE: Picture Type, 6 bits − PEI: Extra insertion information (1 bit) – set if PSPARE to follow. − PSPARE: Extra information (0/8/16. . .bits) – not used, always followed by PEI. − GBSC: GOB Start Code, 16 bits − GN: Group Number, 4 bits, indexing 12 GOBs − GQUANT: Group Quantization information, 5 bits − MQUANT: MB Quantization information, 5 bits − EOB: End-of-Block 92 Bit-Stream Syntax, FLC and VLC Loop Structures Summary Examples of VLC (Variable Length Coding) − MBA: MB Address, indexing MBs within a GOB, 11 bits max − MTYPE: MB Type information − GEI: Same function and size as PEI. − GSPARE: Same function and size as PSPARE. − MVD: Motion Vector Data, 11 bits max, 32 VLCs − CBP: Coded Block Pattern, 9 bits max, 63 VLCs − TCOEFF: Transform Coefficients
  • 93. 93 − The Problem: H.261 is typically used to send data over a constant bit rate channel, such as ISDN (e.g. 384kbps). − The encoder output bit rate varies depending on amount of movement in the scene. − Therefore, a rate control mechanism is required to map this varying bit rate onto the constant bit rate channel. Rate Control
  • 94. 94 − The encoded bitstream is buffered and the buffer is emptied at the constant bit rate of the channel − An increase in scene activity will result in the buffer filling up • The quantization step size in the encoder is increased which increases the compression factor and reduces the output bit rate − If the buffer starts to empty, then the quantization step size is reduced which reduces compression and increases the output bit rate. − The compression, and the quality, can vary considerably depending on the amount of motion in the scene • Relatively "static" scenes lead to low compression and high quality • “Active" scenes lead to high compression and lower quality Encoder Rate Ctrl Channel Buffer Video Sequence Rate Control
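The feedback loop above can be illustrated with a toy simulation. Everything numeric here is an assumption made for illustration: the slides describe the loop but not a control law, so the linear buffer-to-quantiser mapping and the "bits = complexity / quantiser" model are hypothetical.

```python
def quant_from_buffer(fullness, capacity):
    """Hypothetical control law: map buffer fullness linearly onto QUANT 1..31."""
    q = 1 + (30 * fullness) // capacity
    return max(1, min(31, q))


def simulate(complexities, channel_bits_per_frame, capacity):
    """Return (quant, produced bits, buffer fullness) for each frame."""
    fullness = 0
    history = []
    for cpx in complexities:
        q = quant_from_buffer(fullness, capacity)
        produced = cpx // q  # hypothetical model: a coarser quantiser gives fewer bits
        fullness = max(0, fullness + produced - channel_bits_per_frame)
        fullness = min(fullness, capacity)
        history.append((q, produced, fullness))
    return history


# Three "static" frames followed by five "active" ones on a 384 kbit/s channel at 30 frames/s.
frames = [20000] * 3 + [200000] * 5
history = simulate(frames, channel_bits_per_frame=12800, capacity=64000)
for q, produced, fullness in history:
    print(q, produced, fullness)
```

When the active frames arrive the buffer fills and the quantiser step size rises, trading picture quality for a bit rate that the constant-rate channel can carry.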
  • 95. − Even when channel coding is used, some residual (transmission) errors may reach the source decoder. − Residual errors may be detected at the source decoder due to syntactical and semantic inconsistencies. − For digital video, the most basic error concealment techniques are: − Repeating the co-located data from the previous frame − Repeating data from the previous frame after motion compensation − Error concealment for non-detected errors may be performed through post-processing. 95 Error Concealment
  • 96. 96 Error Concealment and Post-Processing, Examples Error Concealment
  • 98. What Is MPEG? – MPEG is an encoding and compression system for digital multimedia content defined by the Moving Picture Experts Group (MPEG). – MPEG reduces the amount of data needed to represent video many times over, while still retaining very high picture quality. – MPEG can compress both audio and video. – Similar to the reference model in H.261, software-based reference codecs for laboratory testing have also been developed for MPEG-1 and MPEG-2. For these codecs, the reference codec is called the Test Model (TM). 98
  • 99. − Coding of moving pictures and associated audio for digital storage media (Standard ISO/IEC 11172-2 (1991)) − The MPEG-1 video coding algorithm is largely an extension of H.261, and many of the features are common. Their bitstreams are, however, incompatible, although their encoding units are very similar. − MPEG-1 is the first generation of video codecs proposed by the MPEG as a standard to provide video coding for digital storage media or DSM (other than the conventional analogue video cassette recorders (VCRs)) − Since coding for digital storage can be regarded as a competitor to VCRs, MPEG-1 video quality at the rate of 1–1.5 Mbit/s is expected to be comparable to VCRs. 99 MPEG-1 Standard
  • 100. − Designed for up to 1.5 Mbit/s (although in most applications the MPEG-1 video bit rate is in the range of 1–1.5 Mbit/s, the international standard does not limit the bit rate, and higher bit rates might be used for other applications) − A popular standard for video on the Internet, transmitted as .mpg files. − Standard for the compression of moving pictures and audio. − Layer 3 of MPEG-1 audio is the most popular standard for digital compression of audio, known as MP3. − Optimized and used for storing movies on CD-ROM − Supports progressive images, non-interlaced video (interlaced sources have to be converted to a non-interlaced format before coding.) 100 MPEG-1 Standard
  • 101. Video − Optimized for bitrates around 1.5 Mbit/s − Originally optimized for SIF picture format, but not limited to it: • 352x240 pixels at 30 frames/sec [ NTSC based ] • 352x288 pixels at 25 frames/sec [ PAL based ] − Progressive frames only - no direct provision for interlaced video applications, such as broadcast television Audio − Joint stereo audio coding at 192 kbit/s (layer 2) System − Mainly designed for error-free digital storage media − Multiplexing of audio, video and data Applications − CD-I, digital multimedia, and video database (e.g. video-on-demand) 101 MPEG-1 Standard (Standard ISO/IEC 11172-2 (1991))
  • 102. Source Input − Supports only 352 * 240 resolution − All the three main picture types, I, P and B, have the same SIF size with 4:2:0 format. − (In SIF-625, the luminance part of each picture has 360 pixels, 288 lines and 25 Hz, and those of each chrominance are 180 pixels, 144 lines and 25 Hz) − Before we describe how I-frames are encoded, we should describe our input. − 3 planes of Y, U, V • 8 bits per pixel. • Y range [0,255]. • U and V range [-128,127] (U and V biased by 128 to put in range [0,255]) − Planes are all of the same size. − Pixels colocated between frames. MPEG-1 Standard 102
  • 103. 103 MPEG-1 Standard Comparison (H.261 | MPEG-1): Sequential access | Random access; One basic frame rate | Flexible frame rate; QCIF and CIF images only | Flexible image size; I and P frames only | I, P, and B frames; MC over 1 frame | MC over 1 or more frames; 1 pixel MV accuracy | 1/2 pixel MV accuracy; 121 filter in the loop | No filter; Variable threshold + uniform quantization | Quantization matrix; No GOP structure | GOP structure; GOB structure | Slice structure
  • 104. − The MPEG-1 standard gives the syntax description of how audio, video and data are combined into a single data stream. This sequence is formally termed as the ISO 11172 stream. − It consists of a compression layer and a systems layer. 104 Systems Coding Outline To support the combination of video and audio elementary streams Multiplexing of elementary audio, video and data
  • 105. − The MPEG-1 systems standard defines a packet structure for multiplexing coded audio and video into one stream and keeping it synchronised. − A pack consists of a pack header that gives the systems clock reference (SCR) and the bit rate of the multiplexed stream followed by one or more packets. − Each packet has its own header that conveys essential information about the elementary data that it carries. − The basic functions in systems layer are as follows: • Synchronised presentation of decoded streams • Construction of the multiplexed stream • Initialisation of buffering for playback start-up • Continuous buffer management • Time identification 105 Systems Coding Outline
  • 106. Multiplexing elementary streams − The multiplexing of elementary stream (ES) of audio, video and data is performed at the packet level. − Each packet thus contains only one elementary data type. − The systems layer syntax allows up to 32 audio, 16 video and 2 data streams to be multiplexed together. − If more than two data streams are needed, substreams may be defined. 106 Systems Coding Outline
  • 107. 107 Systems Coding Outline ES Packetization process into MPEG-1 PS Stream (Packs) Packet Header Packet Payload Pack Header Pack Payload
  • 108. 108 Systems Coding Outline MPEG-1 PS bitstream and its time related fields SCR: Systems Clock Reference STD: System Target Decoder PTS: Presentation Time Stamp DTS: Decoding Time Stamp
  • 109. Synchronisation − Prototypical encoder and decoder of MPEG-1, illustrating end-to-end synchronisation • STC: Systems Time Clock • SCR: Systems Clock Reference • PTS: Presentation Time Stamp • DSM: Digital Storage Media 109 Systems Coding Outline
  • 110. Synchronisation − Multiple elementary streams are synchronised by means of Presentation Time Stamps (PTS) in the ISO 11172 bit stream (by recording time stamps during capture of raw data) − The receivers will then make use of these PTS in each associated decoded stream to schedule their presentations. − Playback synchronisation is pegged onto a master time base, which may be extracted from one of the elementary streams, DSM, channel or some external source. − The occurrences of PTS and other information such as SCR and systems headers will also be essential for facilitating random access of the MPEG-1 bitstream. − This set of access codes should therefore be located near to the part of the elementary stream where decoding can begin. In the case of video, this site will be near the head of an intraframe. 110 Systems Coding Outline
  • 111. 111 Structure of the Coded Bit-Stream
  • 112. • Intraframe Compression – Frames marked by (I) denote the frames that are strictly intraframe compressed. – The purpose of these frames, called the "I pictures", is to serve as random access points to the sequence. I Frames 112
  • 113. • P Frames use motion-compensated forward predictive compression on a block basis. – Motion vectors and prediction errors are coded. – Predicting blocks from closest (most recently decoded) I and P pictures are utilised. Forward Prediction P Frames 113
  • 114. • B frames use motion-compensated bi-directional predictive compression on a block basis. – Motion vectors and prediction errors are coded. – Predicting blocks are taken from the closest past and future I or P pictures. Forward Prediction Bi-Directional Prediction B Frames 114 Backward Prediction
  • 115. • Relative number of (I), (P), and (B) pictures can be arbitrary. • Group of Pictures (GOP) is the Distance from one I frame to the next I frame 1 2 3 4 5 6 7 8 9 10 11 12 1 GOP = 12 Group of Pictures 115
  • 116. 1 2 3 4 5 6 7 8 9 10 11 12 1 Source and Display Order Transmission Order 116 Structure of the Coded Bit-Stream, Example
  • 117. I-pictures • They are coded without reference to the previous picture. • They provide access points to the coded sequence for decoding (intraframe coded as for JPEG) P-pictures • They are predictively coded with reference to the previous I- or P-coded pictures. • They themselves are used as a reference (anchor) for coding of the future pictures. B-pictures • Bidirectionally coded pictures, which may use past, future or combinations of both pictures in their predictions. D-pictures • As intraframe coded, where only the DC coefficients are retained. • Hence, the picture quality is poor and normally used for applications like fast forward. • D-pictures are not part of the GOP; hence, they are not present in a sequence containing any other picture types. 117 Structure of the Coded Bit-Stream
  • 118. Group of pictures and Reordering − I and P pictures are called “anchor” pictures − A GOP is a series of one or more pictures to assist random access into the picture sequence. − The GOP length is normally defined as the distance between I-pictures, which is represented by parameter N in the standard codecs. − The distance between the anchor I/P and P-pictures is represented by M. − The encoding or transmission order of pictures differs from the display or incoming picture order. − This reordering introduces delays amounting to several frames at the encoder (equal to the number of B- pictures between the anchor I- and P-pictures). − The same amount of delay is introduced at the decoder in putting the transmission/ decoding sequence back to its original. This format inevitably limits the application of MPEG-1 for telecommunications. − A GOP, in coding, must start with an I picture and in display order, must start with an I or B picture and must end with an I or P picture 118 Structure of the Coded Bit-Stream
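The reordering described on this slide can be sketched in a few lines. This is a minimal illustration, not encoder code: the function name `display_to_coding_order` and the use of one-letter picture types are assumptions made for the example. Each B-picture is held back until the anchor (I or P) that follows it in display order has been emitted.

```python
def display_to_coding_order(types):
    """Reorder pictures from display order to coding (transmission) order.

    `types` is a sequence of 'I', 'P' or 'B' in display order.
    Returns a list of (display_index, type) pairs in coding order.
    """
    coded = []
    pending_b = []  # B-pictures waiting for their future anchor
    for index, picture_type in enumerate(types):
        if picture_type == "B":
            pending_b.append((index, picture_type))
        else:
            # An anchor is sent first, then the B-pictures that precede it.
            coded.append((index, picture_type))
            coded.extend(pending_b)
            pending_b = []
    coded.extend(pending_b)  # trailing B-pictures with no later anchor here
    return coded


order = display_to_coding_order("IBBPBBP")
print([i for i, _ in order])
```

With M = 3 (two B-pictures between anchors), every anchor jumps ahead of the two B-pictures before it, which is the source of the reordering delay mentioned above.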
  • 119. 119 Structure of the Coded Bit-Stream Video Sequence ... ... Group of Pictures Picture Slice Macroblock 8 pixels 8 pixels Block
  • 120. Video Sequence – Begins with a sequence header and ends with an end-of-sequence code. – It includes one or more groups of pictures. Group of Pictures (GOP) – A Header and a series of one or more pictures intended to allow random access into the sequence. 120 Structure of the Coded Bit-Stream Video Sequence ... ... Group of Pictures Picture Slice Macroblock 8 pixels 8 pixels Block
  • 121. Picture • The primary coding unit of a video sequence. A picture consists of three rectangular matrices representing luminance (Y) and two chrominance (Cb and Cr) values. Slice • Each picture is divided into groups of macroblocks, called slices. Slices can have different sizes within a picture, and the division may differ from picture to picture. • The reason for defining a slice is to reset the variable length code (VLC) so that channel errors do not propagate through the picture. Each slice is coded independently from the other slices of the picture. • Slices are important in the handling of errors. If the bit stream contains an error, the decoder can skip to the next slice. 121 Structure of the Coded Bit-Stream Video Sequence ... ... Group of Pictures Picture Slice Macroblock 8 pixels 8 pixels Block
  • 122. − If the coded data are corrupted, and the decoder detects it, then it can search for the new slice, and the decoding starts from that point. − Each slice starts with a slice start code and is followed by a code that defines its position and a code that sets the quantisation step size. 122 Structure of the Coded Bit-Stream
  • 123. − To optimise the slice structure, that is, to give a good immunity from channel errors and at the same time to minimise the slice overhead, one might use short slices for macroblocks with significant energy (such as intra MB) and long slices for less significant ones (e.g. macroblocks in B-pictures). 123 Structure of the Coded Bit-Stream Short slices for macroblocks with significant energy
  • 124. − The division of slices may vary from picture to picture. − If "restricted slice structure" is applied, the slices must cover the whole pictures. − If "restricted slice structure" is not applied, the decoder will have to decide what to do with that part of the picture, which is not covered by a slice. 124 Structure of the Coded Bit-Stream Restricted Slice StructureGeneral Slice Structure A B C G E D F H I A B C GE D F H I J K OM L N A I A C G E D F H B I
  • 125. Macro block • A portion of image that consists of 16x16 pixels and comprises 4 blocks of luminance component and 1 block each of the 2 chrominance components. • At this layer, motion compensation and prediction are performed. • Since a slice has a raster scan structure, macroblocks are addressed in a raster scan order. • The top left macroblock in a picture has address 0, the next one on the right has address 1 and so on. 125 Structure of the Coded Bit-Stream Video Sequence ... ... Group of Pictures Picture Slice Macroblock 8 pixels 8 pixels Block
  • 126. Macro block • To reduce the address overhead, macroblocks are relatively addressed by transmitting the difference between the current macroblock and the previously coded macroblock. • This difference is called macroblock address increment. • In I-pictures, since all the macroblocks are coded, the macroblock address increment is always 1. • The first and last macroblocks of a slice, shall not be skipped macroblocks. 126 Structure of the Coded Bit-Stream Video Sequence ... ... Group of Pictures Picture Slice Macroblock 8 pixels 8 pixels Block
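Relative addressing can be illustrated as follows. This is a simplified sketch: the name `mb_address_increments` is hypothetical, and the starting value of `previous` (here -1, so that macroblock 0 gets an increment of 1) is an assumption for the example rather than the exact slice-start rule of the standard.

```python
def mb_address_increments(coded_addresses, previous=-1):
    """Turn absolute raster-scan macroblock addresses into address increments.

    `coded_addresses` lists the addresses of the macroblocks that are
    actually coded; skipped macroblocks simply do not appear in it.
    """
    increments = []
    for address in coded_addresses:
        increments.append(address - previous)
        previous = address
    return increments


# Macroblocks 3 and 4 are skipped, so the increment before address 5 is 3.
print(mb_address_increments([0, 1, 2, 5, 6]))
```

In an I-picture every macroblock is coded, so the list of addresses is contiguous and every increment is 1.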
  • 127. Block and Color Sampling 127 4:2:0 Block • A matrix of 8x8 elements. • One of the ways rate control is achieved is by increasing the quantisation step size in blocks which would otherwise have a higher entropy.
  • 128. 128 YUV Y Only YUV YUV YUV Sampling Points 13.5 MHz 4:2:2 4:4:4 Recall, 4:4:4 & 4:2:2 Sampling
  • 129. 129 YUV Y Only Y Only Y Only 4:2:0 YUV Sampling Points 13.5 MHz 4:1:1 Y V Y Y U Y JPEG/JFIF H.261 MPEG-1 Recall, 4:1:1 & 4:2:0 MPEG-1 Sampling
  • 130. 130 YUV Y Only Y Only Y Only 4:2:0 YUV Sampling Points 13.5 MHz 4:1:1 YV Y Only YU Y Only Co-sited Sampling MPEG-2 Recall, 4:1:1 & 4:2:0 MPEG-2 Sampling
  • 131. 131 4:2:0 YV Y Only YU Y Only Co-sited Sampling MPEG-2 4:2:0 Sampling in MPEG-1 and MPEG-2 4:2:0 Y V Y Y U Y JPEG/JFIF H.261 MPEG-1 Downsize chrominance Components. • 4:2:0 (with chrominance samples centered) • Requires bilinear interpolation
  • 132. Structure of the Coded Bit-Stream, Summary • Sequence layer: picture dimensions, pixel aspect ratio, picture rate, minimum buffer size, DCT quantization matrices • GOP layer: will have one I picture, start with I or B picture, end with I or P picture, has closed GOP flag, timing info, user data • Picture layer: temporal ref number, picture type, synchronization info, resolution, range of motion vectors • Slices: position of slice in picture, quantization scale factor • Macroblock: position, H and V motion vectors, which blocks are coded and transmitted GOP-1 GOP-2 GOP-n I B B B P B B.. Slice-1 Slice-2 … Slice-N MB-1 MB-2 MB-n 0 1 2 3 4 5 Sequence layer GOP layer Picture layer Slice layer Macroblock layer 8x8 block 132
  • 133. 133 Headers in Structure of the Coded Bit-Stream
  • 134. Seq. Header • Width • Height • Frame Rate • Buffer Control GOP Header • Time Code Picture Header • Temporal Ref • Picture Type • Motion Vector Parameters Picture Data Seq. End Code • All headers begin with a start code: 23 zero bits followed by a one bit, then 8 bits that indicate the header type. • The encoding process never produces 23 consecutive zeroes anywhere else in the bitstream. Headers in Structure of the Coded Bit-Stream 134
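Because start codes are byte aligned and cannot be emulated by coded data, a decoder can locate headers with a plain byte search. The sketch below assumes the byte pattern 0x00 0x00 0x01 (23 zero bits and a one bit) followed by one identifying byte; the function name `find_start_codes` and the sample byte string are made up for illustration.

```python
START_CODE_PREFIX = b"\x00\x00\x01"  # 23 zero bits followed by a single one bit


def find_start_codes(data):
    """Return (byte_offset, code_value) for every start code found in `data`."""
    found = []
    pos = data.find(START_CODE_PREFIX)
    # The byte after the prefix identifies the header type.
    while pos != -1 and pos + 3 < len(data):
        found.append((pos, data[pos + 3]))
        pos = data.find(START_CODE_PREFIX, pos + 3)
    return found


# Hypothetical fragment: sequence header (0xB3), GOP header (0xB8), picture (0x00).
stream = (b"\x00\x00\x01\xb3" + b"\x12\x34" +
          b"\x00\x00\x01\xb8" + b"\xff" +
          b"\x00\x00\x01\x00")
for offset, code in find_start_codes(stream):
    print(offset, hex(code))
```

The same search is what lets a decoder resynchronise at the next slice after an error.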
  • 135. 135 MPEG Video Encoding: Simplified MPEG Encoder [Figure: block diagram with blocks Motion Estimator, MC Mode Decision, Picture Predictor & Store, DCT, Q, Q-1, IDCT, Lossless Coder (RLC+VLC), Buffer and Rate Control; signals Ordered Source Pictures, Prediction, Residual, Decoded Picture, MVs, MC Modes and Coded Video Bit Stream]
  • 136. The main differences between this encoder and H.261 Frame reordering: at the input of the encoder, coding of B-pictures is postponed to be carried out after coding the anchor I- and P-pictures. Quantisation: intraframe coded macroblocks are subjectively weighted to emulate perceived coding distortions. Motion estimation: not only is the search range extended but the search precision is increased to half a pixel. B-pictures use bidirectional motion compensation. No loop filter. Frame store and predictors: to hold two anchor pictures for prediction of B-pictures. Rate regulator: here there is more than one type of picture, each generating different bit rates. 136 MPEG-1 Encoder
  • 137. − Within each picture, macroblocks are coded in a sequence from left to right. − Since 4:2:0 image format is used, the six blocks of 8×8 pixels, four luminance and one of each chrominance component, are coded in turn. − First, for a given macroblock, the coding mode is chosen. This depends on the picture type, the effectiveness of motion-compensated prediction in that local region and the nature of the signal within the block. − Second, depending on the coding mode, a motion-compensated prediction of the contents of the block based on the past and/or future reference pictures is formed. This prediction is subtracted from the actual data in the current macroblock to form an error signal. − Third, this error signal is divided into 8×8 blocks and a DCT is performed on each block. The resulting DCT coefficients are quantised and scanned in zigzag order to convert them into a one-dimensional string of quantised DCT coefficients. − Fourth, the side information for the macroblock, including the type, block pattern, motion vector and address, is coded alongside the DCT coefficients (the DCT coefficients are run length coded). 137 MPEG-1 Encoder
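The third and fourth steps (zigzag scan, then run-length coding) can be sketched as below. The helper names `zigzag_order` and `run_level_pairs` are assumptions for this example, the DC coefficient is returned separately as for intra blocks, and the final variable-length coding of the pairs is left out.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag scan order."""
    positions = [(r, c) for r in range(n) for c in range(n)]
    # Walk the anti-diagonals; the direction alternates from one to the next.
    return sorted(positions,
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))


def run_level_pairs(block):
    """Scan a quantised block in zigzag order and run-length code the AC part.

    Returns (dc, pairs) where each pair is (number of zeros skipped, level).
    Trailing zeros produce no pair; an end-of-block code would follow.
    """
    scan = [block[r][c] for r, c in zigzag_order(len(block))]
    pairs = []
    run = 0
    for level in scan[1:]:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return scan[0], pairs


block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[2][0], block[1][1] = 12, 5, -3, 2
print(run_level_pairs(block))
```

The zigzag scan places the low-frequency coefficients first, so the many zero-valued high-frequency coefficients collapse into a single end-of-block code.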
  • 138. − The insensitivity of the human visual system to high-frequency distortions can be exploited for further bandwidth compression. − The DCT coefficients, prior to quantisation (-2047 to +2047), are divided by the weighting matrix. − Weighted coefficients are then quantised by the quantisation step size, and at the decoder, the reconstructed quantised coefficients are multiplied by the weighting matrix to reconstruct the coefficients. 138 Default Intra and Inter Quantisation Weighting Matrices DCT Coefficients Weighting Matrix Quantisation by Quantisation Step Size ÷
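A rough sketch of weighted quantisation follows. The forward quantiser is not fixed by the standard, so `quantise_block` is only one plausible choice; the scaling by 16 and by 2 × quantiser scale mirrors the MPEG-1 reconstruction rule, and both function names are assumptions. The special treatment of the intra DC coefficient and the decoder's mismatch control are left out.

```python
def quantise_block(coeffs, weights, q_scale):
    """Divide each coefficient by its weight and the step size (round to nearest)."""
    levels = []
    for c_row, w_row in zip(coeffs, weights):
        row = []
        for c, w in zip(c_row, w_row):
            step = 2 * q_scale * w          # effective step size, times 16
            magnitude = (16 * abs(c) + step // 2) // step
            row.append(magnitude if c >= 0 else -magnitude)
        levels.append(row)
    return levels


def reconstruct_block(levels, weights, q_scale):
    """Decoder side: multiply the indices back by step size and weight."""
    out = []
    for l_row, w_row in zip(levels, weights):
        row = []
        for level, w in zip(l_row, w_row):
            magnitude = (2 * abs(level) * q_scale * w) // 16
            row.append(magnitude if level >= 0 else -magnitude)
        out.append(row)
    return out


flat = [[16, 16], [16, 16]]      # flat weighting, as for non-intra blocks
coeffs = [[100, -100], [3, 0]]   # toy 2x2 "block" to keep the example short
levels = quantise_block(coeffs, flat, q_scale=4)
print(levels, reconstruct_block(levels, flat, q_scale=4))
```

A larger weight at a given position gives a coarser step there, which is how the intra matrix spends fewer bits on high spatial frequencies.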
  • 139. Intra Quantisation Weighting Matrix − Experience has shown that for SIF pictures, a suitable distortion weighting matrix for the intra-DCT coefficients is the one shown in Figure. This intra matrix is used as the default quantisation matrix for intraframe coded macroblocks. Inter (or Nonintra) Quantisation Weighting Matrix (A flat matrix) − The different weightings may not be used for interframe coded macroblocks. − This is because high-frequency interframe error does not necessarily mean high spatial frequency. (It might be due to poor motion compensation or block boundary artefacts). 139 Default Intra and Inter Quantisation Weighting Matrices
  • 140. The strategy for motion estimation in this codec differs from that of H.261 in four main respects: 1. Motion estimation is an integral part of the codec. • The motion estimation in H.261 was optional. 2. Motion search range is much larger (larger search area). • H.261 is normally used for head-and-shoulders pictures, where the motion speed is normally very small. • In contrast, MPEG-1 is used mainly for coding of films with much larger movements and activities. 3. Higher precision of motion compensation is used. • Motion estimation with half-pixel precision 4. B-pictures can benefit from bidirectional motion compensation. • When B-pictures are present, the distance between a picture and its anchor varies, so the search range for motion estimation is expected to differ between picture types. • For normal scenes, the maximum search range for P-pictures is usually taken as 11 pixels/3 frames, and the forward and backward motion ranges for B1-pictures are 3 pixels/frame and 7 pixels/2 frames, respectively. These values for B2-pictures become 7 and 3. 140 Motion Estimation
  • 141. Motion estimation with half-pixel precision − The normal block matching with integer pixel positions is carried out first. − Then eight new positions, with a distance of half a pixel around the final integer pixel, are tested. 141 Motion Estimation Motion-compensated prediction error (a) with and (b) without half-pixel precision
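The two-stage search can be sketched as follows, starting from an integer-pixel match that is assumed to be already known. The function names, the use of the sum of absolute differences (SAD) as the matching cost, and the requirement that the block stays at least one pixel inside the reference picture are assumptions for this illustration. Half-pixel samples are formed by bilinear averaging of the neighbouring integer pixels.

```python
def sample_half_pel(ref, y2, x2):
    """Reference value at half-pel coordinates (y2 / 2, x2 / 2), with rounding."""
    y0, x0 = y2 // 2, x2 // 2
    y1, x1 = y0 + (y2 % 2), x0 + (x2 % 2)
    return (ref[y0][x0] + ref[y0][x1] + ref[y1][x0] + ref[y1][x1] + 2) // 4


def sad(cur, ref, top2, left2):
    """Sum of absolute differences for a block placed at half-pel (top2, left2)."""
    n = len(cur)
    return sum(abs(cur[i][j] - sample_half_pel(ref, top2 + 2 * i, left2 + 2 * j))
               for i in range(n) for j in range(n))


def refine_half_pel(cur, ref, top, left):
    """Test the eight half-pel positions around the integer match (top, left).

    Returns (best_cost, dy, dx) with dy, dx in half-pixel units.
    """
    best = (sad(cur, ref, 2 * top, 2 * left), 0, 0)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if (dy, dx) == (0, 0):
                continue
            cost = sad(cur, ref, 2 * top + dy, 2 * left + dx)
            if cost < best[0]:
                best = (cost, dy, dx)
    return best


# Reference is a smooth ramp; the current block is shifted by half a pixel in x.
ref = [[100 * y + 10 * x for x in range(8)] for y in range(8)]
cur = [[100 * (2 + i) + 10 * (2 + j) + 5 for j in range(2)] for i in range(2)]
print(refine_half_pel(cur, ref, top=2, left=2))
```

Only nine positions are evaluated in the second stage, so the extra precision costs little compared with the integer-pixel search.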
  • 142. Coding of Pictures change MQUANT no change to MQUANT I picture change MQUANT no change to MQUANT coded not coded interframe change MQUANT no change to MQUANT intraframe motion comp. A motion vector set to 0 P picture A Fwd motion compensation A Bwd motion compensation A interpolated compensation B picture Picture Type 142 A MQUANT: MB Quantization information
  • 143. In I-pictures, all the macroblocks are intra coded. − There are two intra macroblock types: intra-d: uses the current quantiser scale • Variable length coded with 1 • The default type when the quantiser scale is not changed • No quantiser scale is transmitted and the decoder uses the previously set value. intra-q: defines a new value for the quantiser scale • Variable length coded with 01 • The macroblock overhead contains an extra 5 bits to define the new quantiser scale between 1 and 31 • In I-pictures of MPEG-1, an intra-q can be any of the macroblocks. 143 I-pictures Coding
  • 144. DC indices are coded losslessly by DPCM (DC_DIFF) − The quantiser step size is different for different coefficients and may change from MB to MB. − The only exception is the DC coefficients, which are treated differently. This is because the eye is sensitive to large areas of luminance and chrominance errors; then the accuracy of each DC value should be high and fixed. − The quantiser step size for the DC coefficient is fixed to eight. Since in the quantisation weighting matrix, the DC weighting element is eight, then the quantiser index for the DC coefficient is always 1, irrespective of the quantisation index used for the remaining AC coefficients. − Because of the strong correlation between the DC values of blocks within a picture, the DC indices are coded losslessly by DPCM (DC_DIFF). − Such a correlation does not exist among the AC coefficients, and hence they are coded independently. 144 I-pictures Coding
  • 145. DC indices are coded losslessly by DPCM (DC_DIFF) − The prediction for the DC coefficients of luminance blocks follows the coding order of blocks within a macroblock and the raster scan order. − For example, in the macroblocks of 4:2:0 format pictures shown in Figure, the DC coefficient of block Y2 is used as a prediction for the DC coefficient of block Y3. − The DC coefficient of block Y3 is a prediction for the DC coefficient of Y0 of the next macroblock. − For the chrominance, the prediction is the DC coefficient of the corresponding block in the previous macroblock. 145 I-pictures Coding 𝑪𝒓 𝑪𝒃 𝒀 𝒀 𝟎 𝒀 𝟏 𝒀 𝟐 𝒀 𝟑
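The prediction chain can be sketched as below. The dictionary layout for a macroblock, the function name `dc_differences` and the initial predictor value of 128 are assumptions for this example; one predictor is kept for luminance and one for each chrominance component.

```python
def dc_differences(macroblocks, reset=128):
    """Compute DC_DIFF for each block of a run of intra macroblocks.

    Each macroblock is a dict: {"Y": [y0, y1, y2, y3], "Cb": cb, "Cr": cr},
    holding quantised DC indices. `reset` is the assumed starting predictor.
    """
    predictor = {"Y": reset, "Cb": reset, "Cr": reset}
    result = []
    for mb in macroblocks:
        diffs = {"Y": []}
        # Luminance: Y0 -> Y1 -> Y2 -> Y3, then on to Y0 of the next macroblock.
        for dc in mb["Y"]:
            diffs["Y"].append(dc - predictor["Y"])
            predictor["Y"] = dc
        # Chrominance: predicted from the same component of the previous macroblock.
        for comp in ("Cb", "Cr"):
            diffs[comp] = mb[comp] - predictor[comp]
            predictor[comp] = mb[comp]
        result.append(diffs)
    return result


mbs = [{"Y": [130, 132, 131, 131], "Cb": 120, "Cr": 140},
       {"Y": [129, 129, 128, 127], "Cb": 121, "Cr": 139}]
for d in dc_differences(mbs):
    print(d)
```

Because neighbouring blocks have similar average brightness, the differences stay small and cost few bits.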
  • 146. DC term is expressed as difference from previous DC term (DC_DIFF) Encoded as two parts: – Size: the number of bits needed to represent the magnitude of DC_DIFF (i.e., ⌈log2(|DC_DIFF| + 1)⌉) – Size number of bits that provide the value. Size is encoded as a Huffman code. AC terms are given as (run,value) pairs. Encoded in one of two ways: – Huffman code for (run, abs(value)) followed by single bit for sign of value. – Special Huffman code indicating ESCAPE, followed by 6 bits for run and either 8 or 16 bits for value. • 6 bits for run simply encode 0 through 63 • First 8 bits of value cover –127 to 127. • If first 8 bits are –128, next 8 bits provide codes for –128 through –255 • If first 8 bits are 0, next 8 bits provide codes for 128 through 255. DC and AC Terms Coding Macroblock Block to be encoded 8 8 8 8 DCT Q sz DPCM DC AC ZigZag Scanning Runlength Encoding VLC sz: Step Size JPEG encoded DC JPEG encoded AC 146
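The two-part DC code can be sketched as follows. The function name `dc_size_and_bits` is an assumption, and the Huffman code for the size itself is not reproduced; only the size category and the value bits that follow it are computed. Negative differences are mapped so that their value bits start with 0, as in JPEG.

```python
def dc_size_and_bits(diff):
    """Split DC_DIFF into (size, value_bits).

    size is the number of bits needed for |diff|; value_bits is a string of
    exactly `size` bits. A zero difference has size 0 and no value bits.
    """
    size = abs(diff).bit_length()
    if size == 0:
        return 0, ""
    # Positive values are sent as-is; negative values are offset by 2^size - 1.
    value = diff if diff > 0 else diff + (1 << size) - 1
    return size, format(value, "0{}b".format(size))


for d in (0, 1, -1, 5, -5, 255):
    print(d, dc_size_and_bits(d))
```

Small differences, which are the most common, therefore need only a short size code and a bit or two of value.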
  • 147. Similar to those of H.261 − 8 types of macroblocks for P-frames: • intra-d and intra-q: the same as used in I-frames • pred-m: the macroblock is forward-predictive encoded (difference from the previous frame) using a forward motion vector • pred-c: the macroblock is encoded using a coded pattern; a 6-bit coded block pattern is transmitted as a variable-length code and this tells the decoder which of the 6 blocks in the macroblock are coded (1) and which are not coded (0) • pred-mc: the macroblock is forward-predictive encoded using a forward motion vector and also a 6-bit coded pattern is included • pred-cq: a pred-c macroblock with a new quantization scale • pred-mcq: a forward-predictive macroblock encoded using a coded pattern with a new quantization scale • skipped: they have a zero motion vector and no code; the decoder copies the corresponding macroblock from the previous frame into the current frame 147 P-pictures Coding
  • 148. − The encoder has more decisions to make than in the case of P-pictures. − These are how to divide the picture into slices; determine the best motion vectors to use; decide whether to use forward, backward or interpolated motion compensation or to code intra; and how to set the quantiser scale. − The encoder first calculates the best forward motion-compensated macroblock from the previous anchor picture for forward motion compensation. − It then calculates the best motion-compensated macroblock from the future anchor picture, as the backward motion compensation. − Finally, the two motion-compensated macroblocks are averaged to produce the interpolated macroblock. The encoder then selects the one with the smallest error relative to the current macroblock. − In the event of a tie, the interpolated mode is chosen. 148 B-pictures Coding
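The mode decision among the three motion-compensated candidates can be sketched as below. The function name `choose_b_mode` and the use of the sum of absolute differences as the error measure are assumptions; the intra alternative and the quantiser decision are not modelled. The interpolated candidate is listed first so that it wins ties, as stated above.

```python
def choose_b_mode(cur, fwd, bwd):
    """Pick forward, backward or interpolated prediction for one macroblock.

    cur, fwd and bwd are equally sized 2-D lists: the current macroblock and
    its best forward and backward motion-compensated predictions.
    """
    interp = [[(f + b + 1) // 2 for f, b in zip(f_row, b_row)]
              for f_row, b_row in zip(fwd, bwd)]

    def sad(pred):
        return sum(abs(c - p)
                   for c_row, p_row in zip(cur, pred)
                   for c, p in zip(c_row, p_row))

    candidates = [("interpolated", interp), ("forward", fwd), ("backward", bwd)]
    # min() keeps the first of equal candidates, so ties go to "interpolated".
    return min(candidates, key=lambda item: sad(item[1]))[0]


cur = [[10, 20], [30, 40]]
print(choose_b_mode(cur, [[8, 18], [28, 38]], [[12, 22], [32, 42]]))
```

In this toy case the forward prediction is slightly too dark and the backward one slightly too bright, so their average matches the current macroblock exactly.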
  • 149. 149 B-pictures Coding 12 types of macroblocks for B-frames • intra-d, intra-q: the same as used for I-frames • pred-i: bidirectionally-predictive encoded macroblock with forward motion vector and backward motion vector • pred-ic: a pred-i macroblock encoded using a 6-bit coded pattern • pred-b: backward-predictive encoded macroblock with backward motion vector • pred-bc: a pred-b macroblock encoded using a 6-bit coded pattern • pred-f: forward-predictive encoded macroblock with forward motion vector • pred-fc: a pred-f macroblock encoded using a 6-bit coded pattern • pred-icq: a pred-ic macroblock with a new quantization scale • pred-fcq: a pred-fc macroblock with a new quantization scale • pred-bcq: a pred-bc macroblock with a new quantization scale • skipped: the same as for P-frames.
  • 150. 150 Video Sequence l Sequence 2 Picture (I) Picture (B) Picture (B) Picture (I) Slice 1 Slice 2 Slice N…... MB 1 MB 2 MB6…... Block 1 Block 2 Block 6…... GOP l GOP2 GOP 12 GOP 13 Video Sequence Structure
  • 151. 151 Layers of MPEG-1 Video Bit stream
  • 152. 152 Layers of MPEG-1 Video Bit stream • Video Sequence Layer Header contains: the picture size (horizontal and vertical), pel aspect ratio, picture rate, bit rate, minimum decoder buffer size, constraint parameters flag, control for loading 64 8-bit values for intra and nonintra quantization tables and user data • GOP layer header contains: the time interval from the start of the video sequence, the closed GOP flag (does the decoder need frames from the previous GOP or not?), broken link flag and user data • Picture layer header contains: the temporal reference of the picture, picture type (I,P,B,D), decoder buffer initial occupancy, forward motion vector resolution and range for P- and B-frames, backward motion vector resolution and range for B-frames and user data • Slice layer header contains: vertical position where the slice starts and the quantizer scale for this slice • Macroblock layer header contains: optional stuffing bits, macroblock address increment, macroblock type, quantizer scale, motion vector, coded block pattern • A block contains: 8x8 coded DCT coefficients
  • 153. 153 MPEG-1 Bit Stream Organization Seq. Header Block Data MB Header Slice Header Picture Header GOP Header
  • 154. 154 Simplified MPEG Decoder [Figure: block diagram with blocks Lossless Decoder (VLC/RLC), Q-1, IDCT, Motion Compensation and Picture Frame Buffer; signals Coded Video Bit Stream, MVs, MV Mode and Ordered Source Pictures]
  • 156. − The incoming bitstream is stored in the buffer and is demultiplexed into the coding parameters such as DCT coefficients, motion vectors, macroblock types and addresses. − They are then variable length decoded using the locally provided tables. − The DCT coefficients after inverse quantisation are inverse DCT transformed and added to the motion- compensated prediction (as required) to reconstruct the pictures. − The frame stores are updated by the decoded I- and P-pictures. − Finally, the decoded pictures are reordered to their original scanned form. − At the beginning of the sequence, the decoder will decode the sequence header, including the sequence parameters. 156 MPEG-1 Decoder
  • 157. Picture Header Picture Data Row Major Scan of Encoded Macroblocks Macroblock Address Increment (1-bit) Macroblock Type (1 or 2 bits) Q Scale (5 bits) Luminance Blocks U Block V Block Stepping Back a Bit in Decoder DC Size (2-7 bits) DC Bits (0-8 bits) First Non-zero AC Coeff. (variable bit length) Last Non-zero AC Coeff. (variable bit length) EOB (2 bits) 157
  • 158. Encoder Output buffer Decoder Input buffer Filled at a variable rate because the encoder output bit rate is variable (depends on how much change is going on between frames) If a fixed bit rate channel is used, then buffering is required. Emptied at a constant rate by the channel. • Feedback mechanism detects when buffer is at risk of over-flowing or under-flowing. • This is used to adjust the degree of quantisation – and hence the quality of the images being transmitted. Buffering 158
  • 159. − A coded bitstream contains different types of pictures, and each type ideally requires a different number of bits to encode. − In addition, the video sequence may vary in complexity with time, and it may be desirable to devote more coding bits to one part of a sequence than to another. − For constant bit rate coding, varying the number of bits allocated to each picture requires that the decoder has a buffer to store the bits not needed to decode the immediate picture. − The extent to which an encoder can vary the number of bits allocated to each picture depends on the size of this buffer (i.e. decoder buffer). − large buffer → greater variations → increasing the picture quality → increasing the decoding delay − The delay is the time taken to fill the input buffer from empty to its current level − An encoder needs to know the size of the decoder’s input buffer in order to determine to what extent it can vary the distribution of coding bits among the pictures in the sequence. 159 Video Buffer Verifier (VBV)
  • 160. − The decoder will display the decoded pictures at their specific rate. − If the display clock is not locked to the channel data rate, and this is typically the case, then any mismatch between the encoder and channel clock and the display clock will eventually cause a buffer overflow or underflow. Model Decoder − The model decoder is defined to resolve three problems: – It constrains the variability in the number of bits that may be allocated to different pictures; – It allows a decoder to initialise its buffer when the system is started; – It allows the decoder to maintain synchronisation while the stream is played. 160 Video Buffer Verifier (VBV)
  • 161. The definition of the parameterised model decoder is known as Video Buffer Verifier (VBV). − The parameters used by a particular encoder are defined in the bitstream. − This really defines a model decoder that is needed if encoders are to be assured that the coded bitstream they produce will be decodable. • A fixed rate channel is assumed to put bits at a constant rate into the buffer. • At regular intervals, set by the picture rate, the picture decoder instantaneously removes all the bits pertaining to the next picture from the input buffer (practical decoders may differ). • If there are too few bits in the input buffer, that is, not all the bits for the next picture have been received, then the input buffer underflows, and there is an underflow error. • If during the time between the picture starts, the capacity of the input buffer is exceeded, then there is an overflow error. 161 Video Buffer Verifier (VBV)
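The model decoder can be simulated in a few lines. This is a simplified sketch: the function name `vbv_check`, its parameters and the numbers in the example are illustrative assumptions, and the channel input is added once per picture interval rather than continuously.

```python
def vbv_check(picture_bits, bit_rate, picture_rate, buffer_size, initial_fullness):
    """Simulate the model decoder's input buffer for a list of coded picture sizes.

    Returns ("ok", None), ("underflow", n) or ("overflow", n), where n is the
    index of the picture at which the violation occurs.
    """
    fullness = initial_fullness
    bits_per_interval = bit_rate / picture_rate  # channel input per picture period
    for n, bits in enumerate(picture_bits):
        if bits > fullness:
            return ("underflow", n)   # the picture has not fully arrived yet
        fullness -= bits              # decoder removes the picture instantaneously
        fullness += bits_per_interval
        if fullness > buffer_size:
            return ("overflow", n)    # the channel delivered more than fits
    return ("ok", None)


# 1.2 Mbit/s at 25 pictures/s: the channel delivers 48 000 bits per picture period.
print(vbv_check([120000, 30000, 30000, 60000], 1_200_000, 25, 300_000, 150_000))
```

An encoder can run the same bookkeeping while coding and adjust its bit allocation before either limit is reached.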
  • 162. − Practical decoders may differ from this model in several important ways. − They may not remove all the bits required to decode a picture from the input buffer instantaneously. − They may not be able to control the start of decoding very precisely as required by the buffer fullness parameters in the picture header, and they take a finite time to decode. − They may also be able to delay decoding for a short time to reduce the chance of underflow occurring. − But these differences depend in degree and kind on the exact method of implementation. − To satisfy the requirements of different implementations, the MPEG video committee chose a very simple model for the decoder. − Practical implementations of decoders must ensure that they can decode the bitstream constrained in this model. − In many cases, this will be achieved by using an input buffer that is larger than the minimum required and by using a decoding delay that is larger than the value derived from the buffer fullness parameter. − The designer must compensate for any differences between the actual design and the model in order to guarantee that the decoder can handle any bitstream that satisfies the model. − Encoders monitor the status of the model to control the encoder so that overflow does not occur. − The calculated buffer fullness is transmitted at the start of each picture so that the decoder can maintain synchronisation. 162 Video Buffer Verifier (VBV)
  • 163. − The encoder must make sure that the input buffer of the model decoder is neither overflowed nor underflowed by the bitstream. − Since the model decoder removes all the bits associated with a picture from its input buffer instantaneously, it is necessary to control the total number of bits per picture. − The encoder could control the bit rate by simply checking its output buffer content. As the buffer fills up, the quantiser step size is raised to reduce the generated bit rate, and vice versa. − This situation in MPEG-1, because of the existence of three different picture types, where each generates a different bit rate, is slightly more complex. − First, the encoder should allocate the total number of bits among the various types of picture within a GOP, so that the perceived image quality is suitably balanced. − The distribution will vary with the scene content and the particular distribution of I-, P- and B-pictures within a GOP. 163 Rate Control and Adaptive Quantisation
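The basic feedback idea (a fuller buffer leads to a coarser quantiser) can be sketched as follows. The linear mapping and the function name `quantiser_scale` are assumptions chosen for simplicity, not the rate control of any particular encoder; the range 1 to 31 is that of the 5-bit quantiser scale.

```python
def quantiser_scale(buffer_fullness, buffer_size, q_min=1, q_max=31):
    """Map encoder output-buffer fullness to a quantiser scale in [q_min, q_max]."""
    ratio = buffer_fullness / buffer_size
    ratio = min(max(ratio, 0.0), 1.0)  # clamp in case of transient excursions
    return q_min + round(ratio * (q_max - q_min))


for fullness in (0, 75_000, 150_000, 300_000):
    print(fullness, quantiser_scale(fullness, 300_000))
```

With several picture types, such a rule would typically be applied on top of a per-picture bit budget, so that I-, P- and B-pictures are each steered towards their own target.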
  • 164. − Investigations have shown that for most natural scenes, each P-picture might generate as many as two to five times the number of bits of a B-picture, and an I-picture three times those of the P-picture. − If there is little motion and high texture, then a greater proportion of the bits should be assigned to I- pictures. − Similarly, if there is strong motion, then a proportion of bits assigned to P-pictures should be increased. − In both cases, lower quality from the B-pictures is expected to permit the anchor I- and P-pictures to be coded at their best possible quality. − Our investigations with variable bit rate (VBR) video, where the quantiser step size is kept constant (no rate control), show that the ratios of generated bits are 6:3:2, for I-, P- and B-pictures, respectively. − Of course, at these ratios, because of the fixed quantiser step size, the image quality is almost constant, not only for each picture (in fact, slightly better for B-pictures due to better motion compensation) but throughout the image. − Again, if we lower the expected quality for B-pictures, we can change that ratio in favour of I- and P- pictures (it is possible to make the encoder intelligent enough to learn the best ratio). 164 Rate Control and Adaptive Quantisation
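As a worked example of the ratios above, the sketch below splits a GOP's bit budget among its pictures in proportion to 6:3:2. The function name `allocate_gop_bits`, the GOP pattern and the budget are assumptions for illustration; a practical encoder would adapt the weights to the scene content.

```python
def allocate_gop_bits(gop_bits, pattern, weights=None):
    """Share `gop_bits` among the pictures of a GOP in proportion to type weights."""
    if weights is None:
        weights = {"I": 6, "P": 3, "B": 2}  # ratios quoted for constant step size
    total = sum(weights[t] for t in pattern)
    return [gop_bits * weights[t] / total for t in pattern]


# N = 12, M = 3 at 1.2 Mbit/s and 25 pictures/s: 12 * 48 000 = 576 000 bits per GOP.
pattern = "IBBPBBPBBPBB"
targets = allocate_gop_bits(576_000, pattern)
for picture_type, bits in zip(pattern, targets):
    print(picture_type, round(bits))
```

Here one I-picture, three P-pictures and eight B-pictures share 31 weight units, so the I-picture receives 6/31 of the GOP budget.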
  • 166. − Following the universal success of the H.261 and MPEG-1 video codecs, there was a growing need for a video codec to address a wide variety of applications. − Considering the similarity between H.261 and MPEG-1, ITU-T and ISO/IEC made a joint effort to devise a generic video codec. − Joining the study was a special group in ITU-T, Study Group 15 (SG15), who were interested in coding of video for transmission over the future broadband integrated services digital networks (BISDN) using asynchronous transfer mode (ATM) transport. − The devised generic codec was finalised in 1995 and takes the name of MPEG-2/H.262, though it is more commonly known as MPEG-2. − It has error resilience for broadcasting and ATM networks. − It delivers multiple programmes simultaneously without requiring them to have a common time base. These require that the MPEG-2 transport packet length should be short and fixed. 166 MPEG-2 Standard
  • 167. At the time of the development, the following applications for the generic codec were foreseen: • BSS broadcasting satellite service (to the home) • CATV cable TV distribution on optical networks, copper, etc. • CDAD cable digital audio distribution • DAB digital audio broadcasting (terrestrial and satellite) • DTTB digital terrestrial television broadcast • EC electronic cinema • ENG electronic news gathering (including satellite news gathering (SNG)) • FSS fixed satellite service (e.g. to head ends) • HTT home television theatre • IPC interpersonal communications (videoconferencing, videophone, etc.) • ISM interactive storage media (optical discs, etc.) • MMM multimedia mailing • NCA news and current affairs • NDS networked database services (via ATM, etc.) • RVS remote video surveillance • SSM serial storage media (digital VTR, etc.) 167 Application of MPEG-2 Coded
  • 168. − Part 1, Systems: synchronization and multiplexing of audio and video − Part 2, Video − Part 3, Audio (an extension of the MPEG-1 audio standards) − Part 4, Testing Compliance − Part 5, Software Simulation − Part 6, Extensions for Digital Storage Media Command and Control (DSM-CC) (e.g. rewind, forward, etc.) − Part 7, Advanced Audio Coding (AAC) (a second audio standard; there are even more parts) − [Part 8 withdrawn due to lack of industry interest] − Part 9, Extensions for Real Time Interfaces − Part 10, Conformance Extensions for DSM-CC − Part 11, Intellectual Property Management and Protection 168 MPEG-2 Parts (MPEG-2 Related Standards)
  • 169. Video • 2-15 or 16-80 Mbit/s bit rate (target bit rate: 4…9 Mbit/s) • TV and HDTV picture formats • Supports interlaced material • MPEG-2 consists of profiles and levels • Main Profile, Main Level (MP@ML) refers to 720x480 resolution video at 30 frames/s, at bit rates up to 15 Mbit/s for NTSC video (typical ~4 Mbit/s) • Main Profile, High Level (MP@HL) refers to HDTV resolution of 1920x1152 pixels at 30 frames/s, at a bit rate up to 80 Mbit/s (typical ~15 Mbit/s) Audio • Compatible multichannel extension of MPEG-1 audio System • Video, audio and data multiplexing defines two presentations: • Program Stream for applications using near error-free media • Transport Stream for more error-prone channels Applications • Satellite, cable, and terrestrial broadcasting, digital networks, and digital VCR 169 MPEG-2 Audio, Video, System and Application Parts
  • 170. 170 Comparison Between MPEG-1 and MPEG-2 MP@ML Video Specifications (each row: MPEG-2 MP@ML | MPEG-1) − Video format: 720x480x30 (NTSC), 720x576x25 (PAL) | 320x240x30 (NTSC), 320x288x25 (PAL) − Coded data speed: 4-6 Mbps for CCIR 601, 15 Mbps max | 1.8 Mbps max − Coded picture: Frame, Picture | Frame − Prediction: Inter frame, field | Interframe − DCT: Frame, field | Frame − Resolution: 12 bits | 9 bits − VLC resol.: 8, 9, 10 bits | 8 bits − Quantization: Non-linear mapping | Linear mapping − Pan, scan: Yes | No
  • 171. 171 Functional Comparison Between MPEG-1 and MPEG-2 Video (each row: MPEG-1 | MPEG-2) − Video format: SIF progressive | SIF, 4:2:0, 4:2:2, 4:4:4 progressive/interlaced − Picture quality: VHS | Distribution/contribution − Bit rate: Variable (≤ 1.856 Mbps) | Variable up to 100 Mbps − Low delay mode: < 150 ms | < 150 ms (no B pictures) − Accessibility: Random access | Random access/channel hopping − Scalability: (no entry) | SNR, spatial, temporal, simulcast, data partitioning − Compatibility: (no entry) | Forward, backward, upward, and downward − Transmission error: Error protection | Error resilience − Editing bit stream: Yes | Yes − DCT: Noninterlaced | Field (progressive) or frame (interlaced) − Motion estimation: Noninterlaced | Field, frame, and dual-prime based; top (16×8) block and bottom (16×8) block − Motion vectors: Motion vectors for P, B pictures only | Concealment motion vectors for I pictures besides MV for P & B − Scanning of DCT coefficients: Zigzag scan | Zigzag scan, alternate scan for interlaced video
  • 172. − Picture resolutions vary from SIF to HDTV − Frame and Field DCT Coding in MPEG-2 − Both Linear and Nonlinear Quantisation in MPEG-2 − All Chroma Channels Subsampling in MPEG-2 − Search range can be larger (distance between P-frames is larger than B1 and B2) − Combining the various picture formats with the interlaced/progressive option creates a new range of macroblock (MB) types in the MPEG-2 standard. • While each MB in a progressive mode has 6 blocks in the 4:2:0 format, the number of blocks in the 4:4:4 image format is 12. − Macroblock size can be 16 × 8 pixels • The dimensions of the unit of blocks used for motion estimation/compensation can change. • In interlaced pictures, since the number of lines per field is half the number of lines per frame, with equal horizontal and vertical resolutions for motion estimation, it might be appropriate to choose blocks of 16 × 8, that is, 16 pixels over eight lines. These types of sub-MBs have half the number of blocks of the progressive mode. − Scalability • The scalable modes of MPEG-2 are intended to offer interoperability among different services or to accommodate the varying capabilities of different receivers and networks upon which a single service may operate. 172 Main difference between MPEG-2 and MPEG-1
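The block counts quoted on this slide (6 blocks per MB in 4:2:0, 12 in 4:4:4) follow from how many 8×8 chrominance blocks each format keeps per 16×16 macroblock. A minimal sketch; the function name is hypothetical, and the 4:2:2 figure of 8 comes from the same arithmetic, not from the slide.

```python
def blocks_per_macroblock(chroma_format):
    """Number of 8x8 blocks in one 16x16 macroblock for a chroma format."""
    luma_blocks = 4  # a 16x16 luminance area holds four 8x8 blocks
    # 8x8 blocks per chrominance component (Cb or Cr) inside one macroblock:
    # 4:2:0 halves both dimensions, 4:2:2 halves only the width, 4:4:4 neither.
    chroma_blocks_per_component = {"4:2:0": 1, "4:2:2": 2, "4:4:4": 4}
    return luma_blocks + 2 * chroma_blocks_per_component[chroma_format]


counts = {fmt: blocks_per_macroblock(fmt) for fmt in ("4:2:0", "4:2:2", "4:4:4")}
```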
  • 173. MPEG-1 and MPEG-2 syntax differences − All MPEG-2 decoders that comply with currently defined profiles and levels are required to decode MPEG-1 constrained bit streams. − MPEG-2 syntax can be made to be very close to MPEG-1 by using particular values for the various MPEG-2 syntax elements that do not exist in MPEG-1 syntax. − The IDCT mismatch control − The run-level values in VLC − The constrained parameter flag mechanism in MPEG-1 is replaced by the profile and level structures in MPEG-2. − The concept of the GOP layer is slightly different. • GOP in MPEG-2 may indicate that certain B-pictures at the beginning of an edited sequence comprise a broken link, which occurs if the forward reference picture needed to predict the current B-pictures is removed from the bitstream by an editing process. • It is an optional structure for MPEG-2 but mandatory for MPEG-1. − The slices in MPEG-2 must always start and end on the same horizontal row of MBs. • This is to assist the implementations in which the decoding process is split into some parallel operations along horizontal strips within the same pictures. 173 Main difference between MPEG-2 and MPEG-1
  • 174. • IDCT Mismatch Control • Macroblock stuffing • Run-level escape syntax • Chrominance samples horizontal position (co-located with luminance in MPEG-2, halfway between luminance samples in MPEG-1) • Slices (in MPEG-2 slices start and end on the same horizontal row of macroblocks; in MPEG-1 it is possible, for example, to have all macroblocks of a picture in one slice) • D-pictures (not permitted in MPEG-2; in MPEG-1 only Intra-DC-coefficient, special end_of_macroblock code) • Full-pel Motion Vectors (in MPEG-1 full-pel motion vectors are possible; in MPEG-2 motion vectors are always half-pel) • Aspect Ratio Information (MPEG-1 specifies pel aspect ratio; MPEG-2 specifies display aspect ratio, and the pel aspect ratio can be calculated from this and from frame size and display size) • Forward_f_code and backward_f_code (differences in parameter location and contents) • Constrained_parameter_flag and maximum horizontal_size (MPEG-2 has the profile and level mechanism) • Bit_rate and vbv_delay (fixed values are reserved for variable bit rate in MPEG-1, other values are for constant bit rate; in MPEG-2 the semantics for bit_rate are changed, etc.) • VBV (in MPEG-1 VBV is only defined for constant bit rate operation; in MPEG-2 VBV is only defined for variable bit rate, and constant bit rate is assumed to be a special case of variable bit rate) • temporal_reference (a small difference between MPEG-1 and MPEG-2) 174 Details of MPEG-2 and MPEG-1 Differences
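The aspect-ratio item notes that the pel aspect ratio can be derived from the display aspect ratio and the frame size. A minimal sketch of that arithmetic, assuming the whole coded frame fills the display area and defining the pel aspect ratio as pel width over pel height; the function name and the sample frame sizes are illustrative.

```python
from fractions import Fraction


def pel_aspect_ratio(display_aspect_ratio, width, height):
    """Width-to-height ratio of one pel.

    Assumes the full coded frame (width x height pels) is mapped onto a
    display whose width-to-height ratio is display_aspect_ratio.
    """
    return Fraction(display_aspect_ratio) * Fraction(height, width)


par_625 = pel_aspect_ratio(Fraction(4, 3), 720, 576)   # 25 Hz frame size
par_525 = pel_aspect_ratio(Fraction(4, 3), 720, 480)   # 30 Hz frame size
```

Under these assumptions a 4:3 display gives slightly wide pels for 720×576 and slightly narrow pels for 720×480.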
  • 175. 175 Simplified MPEG Encoder (MPEG Video Encoding) [Figure: encoder block diagram. Blocks: Ordered Source Pictures, Motion Estimator, MC Mode Decision, Picture Predictor & Store, DCT, Q, Q⁻¹, IDCT, Lossless Coder (RLC+VLC), Buffer, Rate Control. Signals: MV, MC Modes, Prediction, Residual, Decoded Picture, Coded Video Bit Stream.]
  • 176. 176 Structure of the Coded Bit-Stream [Figure: layer hierarchy: Video Sequence, Group of Pictures, Picture, Slice, Macroblock, Block (8 × 8 pixels).]
  • 177. − All chroma channels subsampling! (4:4:4, 4:2:2 and 4:2:0 support) 177 All Chroma Channels Subsampling in MPEG-2
  • 178. 178 Co-sited 4:2:0 Sampling in MPEG-2 [Figure: two 4:2:0 sample grids marking luminance-only (Y) positions and positions that also carry U and V. MPEG-2: co-sited sampling. JPEG/JFIF, H.261, MPEG-1: chrominance samples centered.] − Downsizing the chrominance components: • 4:2:0 (with chrominance samples centered) • Requires bilinear interpolation
  • 179. 179 Frame Type DCT vs. Field Type DCT [Figure: a 16×16 luminance MB divided into four 8×8 blocks in two ways: luminance MB structure in frame-organized DCT coding (for slow moving), and luminance MB in field-organized DCT coding (for fast moving).]
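The two luminance block organizations named on this slide can be sketched in a few lines. This is an illustration only: the function name is hypothetical, and it assumes lines are numbered from 0 with even lines belonging to the top field.

```python
def split_luma_macroblock(mb, field_dct):
    """Return the four 8x8 luminance blocks of a 16x16 macroblock.

    mb: list of 16 rows, each a list of 16 samples.
    field_dct=False: frame organization, the blocks are the four quadrants.
    field_dct=True: field organization, the two upper blocks take the even
    lines (assumed top field) and the two lower blocks take the odd lines.
    """
    if field_dct:
        rows = mb[0::2] + mb[1::2]   # top-field lines first, then bottom-field
    else:
        rows = mb
    upper, lower = rows[:8], rows[8:]
    return [
        [r[:8] for r in upper], [r[8:] for r in upper],
        [r[:8] for r in lower], [r[8:] for r in lower],
    ]


# Every sample stores its own line index, so a block reveals which lines it uses.
mb = [[y] * 16 for y in range(16)]
frame_blocks = split_luma_macroblock(mb, field_dct=False)
field_blocks = split_luma_macroblock(mb, field_dct=True)
```

With fast motion the two fields of a frame differ, so grouping lines by field keeps each 8×8 block vertically smoother before the DCT.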
  • 180. 180 Frame and Field DCT Coding in MPEG-2
  • 181. − Interlacing! (Motion estimation is different from MPEG-1) − MPEG-2 can choose between the previous frame and the previous field − The odd and even fields can be coded together as if they were a frame, or they can be coded independently • If there is no motion, then we can combine the two fields into a single image called a “frame-picture”. This is better for compression efficiency. • If there is motion, then the two fields are coded separately as if they were two pictures called “field-pictures”. 181 Frame and Field DCT Coding in MPEG-2 Odd Field-Picture Even Field-Picture Frame Picture
  • 182. − For interlaced pictures, since the vertical correlation in the field pictures is greatly reduced, should the field prediction be used, an alternate scan may perform better than a zigzag scan. 182 Frame and Field DCT Coding in MPEG-2
  • 183. 183 Five motion compensation modes in MPEG-2 [Figure: motion compensation block dimensions (16 and 8 pixels) for interlaced pictures.] More information: Standard Codecs, Dr. Ghanbari, Section 8.4, MPEG-2 nonscalable coding modes
  • 184. − In the 16×8 motion compensation mode, a 16×16 pixel field macroblock is split into upper-half and lower-half 16×8 pixel blocks, and a separate field prediction is carried out for each. − Two motion vectors are transmitted for each P-picture macroblock and two or four motion vectors for each B-picture macroblock. − This mode of motion compensation may be useful in field pictures that contain irregular motion. − Note the difference from field prediction for frame pictures: here a field macroblock is split into two halves, whereas in field prediction for frame pictures a frame macroblock is split into top-field and bottom-field blocks. − It should be noted that field pictures have some restrictions on I-, P- and B-picture coding type and motion compensation. − Normally, the second field picture of a frame must be of the same coding type as the first field. However, if the first field picture of a frame is an I-picture, then the second field can be either I or P. If it is a P-picture, the prediction macroblocks must all come from the previous I-picture, and dual prime cannot be used. 184 Five motion compensation modes in MPEG-2
  • 185. − In this case (field prediction for frame pictures), the target macroblock in a frame picture is split into its top-field and bottom-field pixels, giving two 16×8 field macroblocks. − Field prediction is then carried out independently for each of the 16×8 pixel target macroblocks. − For P-pictures, two motion vectors are assigned for each 16×16 pixel target macroblock. − The 16×8 predictions may be taken from either of the two most recently decoded anchor pictures. − Note that the 16×8 field prediction cannot come from the same frame, as was the case in field prediction for field pictures. − For B-pictures, due to the forward and the backward motion, there can be two or four motion vectors for each target macroblock. − The 16×8 predictions may be taken from either field of the two most recently decoded anchor pictures. 185 Five motion compensation modes in MPEG-2
  • 186. − Motion vectors are differentially coded with respect to the vector for the previous macroblock (i.e. the one to the left) • PMV – Previous Motion Vector. • MV – Motion Vector for the Current Macroblock. − Define Δ = [Δx, Δy] = 2 × (MV − PMV) • Multiply by 2 because 0.5 pel quantisation is used. • Δx and Δy are coded separately. Coding of Motion Vectors 186
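The differential on this slide can be sketched as a small function. The name is hypothetical; vectors are given in pels with half-pel precision, so doubling the difference always yields integers.

```python
def mv_delta(mv, pmv):
    """Differential motion vector in half-pel units: delta = 2 * (MV - PMV)."""
    dx = 2 * (mv[0] - pmv[0])
    dy = 2 * (mv[1] - pmv[1])
    # Half-pel vectors give integer deltas; reject anything finer.
    if dx != int(dx) or dy != int(dy):
        raise ValueError("motion vectors must be multiples of 0.5 pel")
    return int(dx), int(dy)


# MV = [4.5, 3] predicted from PMV = [5, -1]
delta = mv_delta((4.5, 3), (5, -1))
```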
  • 187. Coding Δx and Δy − The absolute value and sign of each component are coded separately. − The absolute value is broken down as Δ∗ = (a − 1) × 2^b + c + 1 • a is called the motion_code and ranges from 0 to 16. It is Huffman coded. • b is called the size and effectively limits the range of the motion vector. It ranges from 0 to 8. It is Fixed Length Coded (FLC) (4-bit binary value). • c is the motion_residual. It ranges from 0 to 2^b − 1. It is Fixed Length Coded (FLC) as a b-bit binary number. 187
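The decomposition of the absolute value into motion_code a and motion_residual c for a given size b amounts to one integer division. A minimal sketch with hypothetical function names, following the ranges stated on this slide:

```python
def split_magnitude(delta_abs, b):
    """Decompose an absolute difference into (motion_code a, motion_residual c).

    Inverts delta_abs = (a - 1) * 2**b + c + 1 with 0 <= c < 2**b.
    A zero difference maps to motion_code 0 with no residual.
    """
    if delta_abs == 0:
        return 0, None
    a, c = divmod(delta_abs - 1, 2 ** b)
    a += 1
    if a > 16:
        raise ValueError("difference too large for this size b")
    return a, c


def join_magnitude(a, b, c):
    """Rebuild the absolute difference from a, b and c."""
    return 0 if a == 0 else (a - 1) * 2 ** b + c + 1
```

For example, `split_magnitude(8, 2)` gives a = 2 and c = 3, and `join_magnitude(2, 2, 3)` recovers 8.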
  • 188. Coding Δx and Δy [Table: how the choice of size affects the range of the difference Δ∗ that can be coded.] • Size is set once at the start of each Picture Layer (i.e. it is the same over the entire picture). • It is common to choose a larger size for P-frames because the motion is bigger. 188
  • 189. Coding Δx and Δy Size is chosen based on the range of motion vectors. Example: say we limit the search width to 10. • Then we could have a vector [10, 10] and a previous vector [−10, 10]. • The max |Δx| or |Δy| is 2 × (10 + 10) = 40. • Therefore we need to choose b = 2 (with a ≤ 16, b = 2 covers magnitudes up to 16 × 2² = 64, whereas b = 1 covers only 32). • Given an MV [4.5, 3] and PMV [5, −1], Δ = 2 × ([4.5, 3] − [5, −1]) = [−1, 8]. Then for b = 2: |Δx| = 1 = (1 − 1) × 2² + 0 + 1, so a = 1, b = 2, c = 0; |Δy| = 8 = (2 − 1) × 2² + 3 + 1, so a = 2, b = 2, c = 3. 189
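The choice of size in this example can be sketched as a search for the smallest b whose code range covers the worst-case difference. This follows the simplified model of these slides (motion_code up to 16, so the largest codable magnitude is 16 × 2^b); the function name is hypothetical.

```python
def smallest_size(search_range_pels):
    """Smallest size b (0..8) that can code the worst-case difference.

    Worst case: MV and PMV sit at opposite ends of the search range, so the
    difference in half-pel units is 2 * (range + range). With motion_code
    a <= 16 and residual c <= 2**b - 1, the largest codable magnitude is
    (16 - 1) * 2**b + (2**b - 1) + 1 = 16 * 2**b.
    """
    max_delta = 2 * (search_range_pels + search_range_pels)
    for b in range(9):
        if 16 * 2 ** b >= max_delta:
            return b
    raise ValueError("search range too large for any size b")


b = smallest_size(10)   # search width of 10 pels, as in the example
```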
  • 190. Huffman Codes for motion_code − s is 0 if the component is positive. − s is 1 if the component is negative. − Each vector is specified by a (motion_code, motion_residual) pair. • The Size value is specified at the start of the Picture Layer. − If Δ∗ = 0 then we set the motion_code to 0 (codeword is 1). There is no motion_residual. 190
  • 191. Example (b = 2) − If Δx = −1, then the motion_code is 1, the sign bit is 1 and the motion_residual is 0 (two bits, 00). Therefore the code 011 00 is inserted into the bitstream. − If Δy = 8, then the motion_code is 2, the sign bit is 0 and the motion_residual is 3 (two bits, 11). Therefore the code 0010 11 is inserted into the bitstream. 191
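The bit strings for one differential component can be assembled as prefix, sign bit, then a b-bit residual. This is a sketch under stated assumptions: the table below holds only the three motion_code prefixes quoted on these slides (1, 01s, 001s), not the full Huffman table, and the function name is hypothetical.

```python
# Partial motion_code table: only the entries quoted on the slides.
MOTION_CODE_PREFIX = {0: "1", 1: "01", 2: "001"}


def encode_component(delta, b):
    """Bit string for one differential component (delta_x or delta_y)."""
    if delta == 0:
        return MOTION_CODE_PREFIX[0]           # no sign bit, no residual
    a, c = divmod(abs(delta) - 1, 2 ** b)      # motion_code - 1, motion_residual
    a += 1
    if a not in MOTION_CODE_PREFIX:
        raise NotImplementedError("motion_code outside the partial table")
    sign = "1" if delta < 0 else "0"
    # The residual is written as exactly b bits (nothing when b is 0).
    residual = format(c, "0{}b".format(b)) if b > 0 else ""
    return MOTION_CODE_PREFIX[a] + sign + residual


bits_x = encode_component(-1, 2)
bits_y = encode_component(8, 2)
```

Because the residual field is a b-bit number, with b = 2 it always occupies two bits, even when its value is 0.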
  • 192. 192 Spatial Domain and Frequency Domain Blocks