Dr. Mohieddin Moradi
mohieddinmoradi@gmail.com
1
Dream
Idea
Plan
Implementation
Section I
− Video Compression History
− A Generic Interframe Video Encoder
− The Principle of Compression
− Differential Pulse-Code Modulation (DPCM)
− Transform Coding
− Quantization of DCT Coefficients
− Entropy Coding
Section II
− Still Image Coding
− Prediction in Video Coding (Temporal and Spatial Prediction)
− A Generic Video Encoder/Decoder
− Some Motion Estimation Approaches
2
Outline
3
Video Source
Decompress
(Decode)
Compress
(Encode)
Video Display
Coded
video
ENCODER + DECODER = CODEC
Why Compression?
SD-SDI 270 Mbps
HD-SDI 1.5Gbps, 3Gbps
4K-UHD 12Gbps
8K-UHD 48Gbps
4
5
Video Source
Decompress
(Decode)
Compress
(Encode)
Video Display
Coded
video
ENCODER + DECODER = CODEC
Codec
6
Coded
video
Coded
audio
Video format
.264, .265, VP9…
Container format
MP4, MOV, WebM, MXF…
Audio format
.aac, .ogg, .mp3…
Codec and Container Format
− A container or wrapper format is a metafile format whose specification describes how
different elements of data and metadata coexist in a computer file.
− Wrappers serve two purposes mainly:
• To gather programme material and related information
• To identify those pieces of information
Container
7
Codec and Container (Wrapper) Format
(Figure: how codecs, wrapper formats and physical media relate.)
CODEC: AVC-Intra Class 100; DNxHD, ProRes; AVC LongG
Wrapper/Format: MXF OPAtom; MXF OP1b, QuickTime, DVI
Media: P2, AVCHD; HDCAM, Mini DV, SR; LTO, HDD, Blu-ray Disc; P2 Card, SD Card
8
Ex: MXF File Structure of AVC-LongG OP-1b and OP-1a
Goal of Standards
− Ensuring Interoperability
− Enabling communication between devices made by different manufacturers
− Promoting a technology or industry
− Reducing costs
9
The Scope of Video Standardization
Decoder
Bitstream
Encoder
Goal of Standards
− Ensuring Interoperability
− Enabling communication between devices made by different manufacturers
− Promoting a technology or industry
− Reducing costs
− Not the encoder, Not the decoder
− Just the bitstream syntax and the decoding process (e.g. use IDCT, but not how to implement
the IDCT)
10
The Scope of Video Standardization
Decoder
Bitstream
Scope of Standardization
Encoder
(Decoding Processes)
Only Specifications of the Bitstream, Syntax, and Decoding Processes are standardized:
• Enables improved encoding & decoding strategies to be employed in a standard-compatible manner
• Provides no guarantees of quality
• Permits optimization beyond the obvious
• Permits complexity reduction for implementability
Pre-Processing
Source
Destination
Post-Processing & Error
Recovery
Scope of Standard
Encoding
Decoding
11
CODEC (enCODer/DECoder)
Standard defines this
The Scope of Video Standardization
12
− This allows future encoders of better performance to remain compatible with existing decoders.
− Also allows for commercially secret encoders to be compatible with standard decoders
Today’s Ho-Hum Encoder
Tomorrow’s Nifty Encoder
Very Secret Encoder
Today’s Decoder
Today’s decoder
still works!
The Scope of Video Standardization
• The international standard does not specify the design of the video encoders and decoders.
• It only specifies the syntax and semantics of the bitstream and signal processing at the encoder/decoder interface.
• Therefore, options are left open to the video codec manufacturers to trade-off cost, speed, picture quality and coding
efficiency.
(Figure: ISO/IEC JTC 1 / SC 29 organization.)
− Advisory Groups: AG on Management (AGM), AG on Registration Authority (RA)
− Working Groups: WG1 (JPEG, JBIG), WG11 (MPEG, with subgroups for Requirements, Systems, Video, Audio, SNHC, Test, Implementation Studies and Liaisons), WG12 (MHEG-5 maintenance, MHEG-6)
Advisory Group (AG) on Management (AGM)
• To advise SC 29 and its WGs on matters of management that
affect their work.
Advisory Group (AG) on Registration Authority (RA)
WG1: Still images, JPEG and JBIG
• Joint Photographic Experts Group and
Joint Bi-level Image Experts Group
WG11: Video, MPEG
• Moving Picture Experts Group
WG12: Multimedia, MHEG
• Multimedia Hypermedia Experts Group
International Organization for Standardization
Subcommittee 29
Title: “Coding of Audio, Picture, Multimedia and Hypermedia Information”
Joint Technical Committee
ISO/IEC JTC 1/SC 29 Structure and MPEG
MPEG (Moving Picture Experts Group, 1988 )
To develop standards for coded representation of
digital audio, video, 3D Graphics and other data
International
Electrotechnical
Committee
13
Telecommunication Standardization Advisory Group (TSAG)
WTSA: World Telecommunication Standardization Assembly
(Figure: ITU-T structure: WTSA → TSAG and Study Groups (SG) → Working Parties (WP) → Questions (Q), which develop Recommendations; plus Focus Groups, IPRs (Intellectual Property Rights), and workshops, seminars, symposia.)
VCEG (ITU-T SG16/Q6)
• Study Group 16: Multimedia terminals, systems and applications
• Working Party 3: Media coding
• Question 6: Video coding
Rapporteurs (R): Mr Gary SULLIVAN, Mr Thomas WIEGAND
ITU-T structure and VCEG (Video Coding Experts Group or Visual Coding Experts Group)
15
ITU, International Telecommunication Union structure
− Founded in 1865, it is the oldest specialized agency of the United Nations system
− ITU is an International organization where governments, industries, telecom operators, service providers
and regulators work together to coordinate global telecommunication networks and services
− Help the world communicate!
What does ITU actually do?
• Spectrum allocation and registration
• Coordinate national spectrum planning
• International telecoms/ICT standardization
• Collaborate in international tariff-setting
• Cooperate in telecommunications development assistance
• Develop measures for ensuring safety of life
• Provide policy reviews and information exchange
• Ensure and extend universal telecom access
16
ITU, International Telecommunication Union structure
− Plenipotentiary Conference: Key event, all ITU Member States decide on the future role of the organization
(Held every four years)
− ITU Council: The role of the Council is to consider, in the interval between Plenipotentiary Conferences,
broad telecommunication policy issues to ensure that the Union's activities, policies and strategies fully
respond to today's dynamic, rapidly changing telecommunication environment (held yearly)
17
ITU, International Telecommunication Union structure
− General Secretariat: Coordinates and manages the administrative and financial aspects of the Union’s activities
(provision of conference services, information services, legal advice, finance, personnel, etc.)
− ITU-R: Coordinates radio communications, radio-frequency spectrum management and wireless services.
− ITU-D: Technical assistance and deployment of telecom networks and services in developing and least developed
countries to allow the development of telecommunication.
− ITU-T: Telecommunication standardization on a world-wide basis. Ensures the efficient and on-time production of high
quality standards covering all fields of telecommunications (technical, operating and tariff issues). (The Secretariat of ITU-T
(TSB: Telecommunication Standardization Bureau) provides services to ITU-T Participants)
18
ITU, International Telecommunication Union structure
Telecommunication Standardization Bureau (TSB) (Place des Nations, CH-1211 Geneva 20)
− The TSB provides secretarial support for ITU-T and services for participants in ITU-T work (e.g. organization of meeting,
publication of Recommendations, website maintenance etc.).
− Disseminates information on international telecommunications and establishes agreements with many international SDOs.
Mission of ITU-T Standardization Sector of ITU
− Helping people all around the world to communicate and to equally share the advantages and opportunities of
telecommunication, reducing the digital divide, by studying technical, operating and tariff matters to develop
telecommunication standards (Recommendations) on a worldwide basis.
19
ITU, International Telecommunication Union structure
World Telecommunication Standardization Assembly (WTSA)
− WTSA sets the overall direction and structure for ITU-T, meets every four years and for the next four-year period:
• Defines the general policy for the Sector
• Establishes the study groups (SG)
• Approves SG work programmes
• Appoints SG chairmen and vice-chairmen
Telecommunication Standardization Advisory Group (TSAG)
− TSAG provides ITU-T with flexibility between WTSAs, and reviews priorities, programmes, operations, financial matters and
strategies for the Sector (meets approximately every 9 months)
• Follows up on accomplishment of the work programme
• Restructures and establishes ITU-T study groups
• Provides guidelines to the study groups
• Advises the TSB Director
• Produces the A-series Recommendations on organization and working procedures
• ISO/IEC MPEG = “Moving Picture Experts Group”
(ISO/IEC JTC 1/SC 29/WG 11 = International Organization for Standardization and International Electrotechnical
Commission, Joint Technical Committee 1, Subcommittee 29, Working Group 11)
• ITU-T VCEG = “Video Coding Experts Group”
(ITU-T SG16/Q6 = International Telecommunications Union – Telecommunications Standardization Sector (ITU-T,
a United Nations Organization, formerly CCITT), Study Group 16, Working Party 3, Question 6)
• JVT = “Joint Video Team”
Collaborative team of MPEG & VCEG, responsible for developing AVC (discontinued in 2009)
• JCT-VC = “Joint Collaborative Team on Video Coding”
Team of MPEG & VCEG , responsible for developing HEVC (established January 2010)
• JVET = “Joint Video Experts Team”
Exploring potential for new technology beyond HEVC (established Oct. 2015 as Joint Video Exploration Team, renamed
Apr. 2018)
20
Video Coding Standardization Organizations
21
(Timeline: video coding standards and their target formats, 1985–2020.)
− ITU-T: H.120 (1984-1988); H.261 (1990+), video telephony; H.263/+/++ (1995-2000+)
− ISO/IEC: MPEG-1 (1993), computer video; MPEG-4 Visual (1998-2001+)
− Joint ITU-T / ISO/IEC:
• H.262 / 13818-2 (MPEG-2) (1994/95-1998+), SD and HD
• H.264 / 14496-10 AVC (2003-2018+), HD and 4K UHD, developed by the Joint Video Team (JVT)
• H.265 / 23008-2 HEVC (2013-2018+), 4K UHD, developed by the Joint Collaborative Team on Video Coding (JCT-VC)
• H.26x / 23090-3 VVC (2020-...), 8K, 360°, ..., to be developed by the Joint Video Experts Team (JVET)
History of Video Coding Standardization (1985 ~ 2020)
22
(Timeline, 1988–2010.)
− ITU-T standards: H.261 (Version 1, 1990; Version 2), H.263, H.263+, H.263++
− Joint ITU-T/MPEG standards: H.262/MPEG-2, H.264/MPEG-4 AVC, H.265/HEVC
− MPEG standards: MPEG-1, MPEG-4 (Version 1), MPEG-4 (Version 2)
H.261 Video Compression Standard
23
H series are low delay codecs for telecom applications (International Telecommunication Union (ITU-T)
developed several recommendations for video coding)
− H.261 (1990): the first video codec specification, “Video Codec for Audio Visual Services at p × 64 kbps”
− H.262 (1995): Infrastructure of audiovisual services: Coding of moving video
− H.263 (1996): the next videoconferencing solution, “Video coding for low bit rate communications”
− H.263+ (H.263V2) (1998)
− H.263++ (H.263V3) (2000), follow-on solutions
− H.26L: “long-term” solution for low bit-rate video coding for communication applications (not backward
compatible with H.263+)
− H.26L was completed in May 2003 and led to H.264, known as Advanced Video Coding (AVC)
− H.265/HEVC (2013): High Efficiency Video Coding
ITU H.26x History
24
Motion Picture Experts Group (MPEG) codecs are designed for storage/broadcast/streaming applications
MPEG-1 (1992)
• Started in 1988 by Leonardo Chiariglione
• Compression standard for progressive frame-based video in SIF (352x240) format
• Applications: VCD
MPEG-2 (1994-5)
• Compression standard for interlaced frame-based video in CCIR-601 (720x480) and high definition (1920x1088i)
formats
• Applications: DVD, SVCD, DIRECTV, GA, DVB, HDTV studio, DTV broadcast, HD; video standards for
television and telecommunications
MPEG-4 (1999)
• Multimedia standard for object-based video from natural or synthetic source
• Applications: Internet, cable TV, virtual studio, home LAN, etc.
• Object-oriented
• Over-ambitious?
MPEG History
25
Motion Picture Experts Group (MPEG) codecs are designed for storage/broadcast/streaming applications
MPEG-7, 2001
• Standardized descriptions of multimedia information, formally called “Multimedia Content Description
Interface”
• Metadata for audio-video streams
• Applications: Internet, video search engine, digital library
MPEG-21, 2002
• Intellectual property rights protection purposes
• Distribution, exchange, user access of multimedia data and intellectual property management
AVC (2003), also known as MPEG-4 version 10
• Conventional to HD
• Emphasis on compression performance and loss resilience
HEVC (2013) High Efficiency Video Coding
MPEG History
26
ITU and MPEG (ISO/IEC) have also worked together for joint codecs:
− MPEG-2 is also called H.262
− H.26L led to a codec that is now called:
• H.264 in telecom
• MPEG-4 (version 10) in broadcast
• AVC (Advanced Video Coding) in broadcast
• Joint Video Team (JVT) Codec
− H.265/HEVC (2013) High Efficiency Video Coding
Joint ITU/MPEG
27
The Story of MPEG and VCEG
28
ITU and MPEG (ISO/IEC) have also worked together for joint codecs:
Joint ITU/MPEG
(Figure: bitrate savings at each generation.)
50% bitrate saving – Direct-to-home
30% bitrate saving – Contribution
VVC (2020): ≈50% bitrate saving – Direct-to-home; ≈30% bitrate saving – Contribution
29
1920×1080, 4:2:2, 10 bit:
Codec Brand | Codec Name | Bitrate (Mbps) | Wrapper Type
AVID | DNxHD 365x | 367 (50p) | MXF
AVID | DNxHD 185x | 184 (50i) | MXF
AVID | DNxHR HQX | 174 (50i) / 345 (50p), (12 bit) | MXF
APPLE | ProRes 422 Proxy | 38 (50i) / 76 (50p) | MOV
APPLE | ProRes 422 LT | 85 (50i) / 170 (50p) | MOV
APPLE | ProRes 422 | 122 (50i) / 245 (50p) | MOV
APPLE | ProRes 422 HQ | 184 (50i) / 367 (50p) | MOV
SONY | XAVC Intra Class 100 | 112 (50i) / 223 (50p) [MXF] | MXF/MP4
SONY | XAVC Intra Class 200 | 227 (50i) / 454 (50p) | MXF
SONY | XAVC Long GOP 50 | 50 (50i, 50p) [MXF], max bit rate = 80 Mb/s | MXF/MP4
SONY | XAVC Long GOP 35 | 35 (50i, 50p) [MXF], max bit rate = 80 Mb/s | MXF/MP4
SONY | XAVC Long GOP 25 | 25 (50i) [MXF], max bit rate = 80 Mb/s | MXF/MP4
PANASONIC | AVC-Intra 200 | 226 (50i) / 452 (50p) | MXF
PANASONIC | AVC-Intra 100 | 111 (50i) / 222 (50p) | MXF
PANASONIC | AVC-LongG 50 | 50 (50i) | MXF
PANASONIC | AVC-LongG 25 | 25 (50i) / 50 (50p) | MXF
Some Famous Codecs for HD
30
Video Source
Decompress
(Decode)
Compress
(Encode)
Video Display
Coded
video
ENCODER + DECODER = CODEC
31
Spatial Domain
− Elements are used “raw” in suitable combinations.
− The frequency of occurrence of such combinations is used to influence the design of the
coder so that shorter codewords are used for more frequent combinations and vice versa
(entropy coding).
Transform Domain
− Elements are mapped onto a different domain (i.e. the frequency domain).
− The resulting coefficients are quantised and entropy-coded.
Hybrid
− Combinations of the above.
Classification of Compression Techniques
Current Stage
Used since early days of video compression
standards, e.g. MPEG-1/-2/-4, H.264/AVC, HEVC and
also in most proprietary codecs (VC1, VP8 etc.)
Input Frame 1
32
A Generic Interframe Video Encoder
Input Frame 1 → DCT
33
A Generic Interframe Video Encoder
Input Frame 1 → DCT → Quantized → 010011101001…
34
A Generic Interframe Video Encoder
Input Frame 1 → DCT → Quantized → 010011101001… → Reconstructed Frame 1
35
A Generic Interframe Video Encoder
Input Frame 2
36
Reconstructed
Frame 1
A Generic Interframe Video Encoder
010011101001…
Entropy Coded MVs
37
Reconstructed
Frame 1
Input Frame 2
A Generic Interframe Video Encoder
010011101001…
Entropy Coded MVs
38
Reconstructed Frame 1 with MC
Input Frame 2
A Generic Interframe Video Encoder
Input Frame 2 → Residual with MC (Frames 1&2)
39
Reconstructed Frame 1 with MC
A Generic Interframe Video Encoder
If the motion prediction is successful, the energy
in the residual is lower than in the original frame
and can be represented with fewer bits.
Residual with MC (Frames 1&2) → DCT
40
A Generic Interframe Video Encoder
Residual with MC (Frames 1&2) → DCT → Quantized → 010011101001…
41
A Generic Interframe Video Encoder
Residual with MC (Frames 1&2) → DCT → Quantized → Reconstructed Residual with MC (Frames 1&2)
42
A Generic Interframe Video Encoder
43
Reconstructed Frame 1 with MC + Reconstructed Residual with MC (Frames 1&2) = Reconstructed Frame 2
A Generic Interframe Video Encoder
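The effect shown in this figure sequence can be reproduced in a few lines. Below is a minimal sketch (illustrative helper names, plain full-search block matching over a toy pair of frames): as the slides above note, when motion prediction succeeds, the energy of the motion-compensated residual is far lower than that of the plain frame difference.

```python
import numpy as np

def full_search(ref, cur, by, bx, B=16, R=8):
    """Full-search block matching: the motion vector (dy, dx) minimising
    the sum of absolute differences (SAD) for the BxB block at (by, bx)."""
    H, W = ref.shape
    block = cur[by:by + B, bx:bx + B].astype(int)
    best_sad, best_mv = None, (0, 0)
    for dy in range(-R, R + 1):
        for dx in range(-R, R + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + B > H or x + B > W:
                continue  # candidate block would fall outside the frame
            sad = np.abs(ref[y:y + B, x:x + B].astype(int) - block).sum()
            if best_sad is None or sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv

def mc_residual(ref, cur, B=16, R=8):
    """Predict each block of `cur` from its best match in `ref`, subtract."""
    res = np.zeros(cur.shape, dtype=int)
    for by in range(0, cur.shape[0], B):
        for bx in range(0, cur.shape[1], B):
            dy, dx = full_search(ref, cur, by, bx, B, R)
            pred = ref[by + dy:by + dy + B, bx + dx:bx + dx + B].astype(int)
            res[by:by + B, bx:bx + B] = cur[by:by + B, bx:bx + B] - pred
    return res

rng = np.random.default_rng(0)
f1 = rng.integers(0, 256, (64, 64))   # "frame 1"
f2 = np.roll(f1, 3, axis=1)           # "frame 2": frame 1 shifted by 3 pixels
print("residual energy, no MC  :", ((f2 - f1) ** 2).sum())
print("residual energy, with MC:", (mc_residual(f1, f2) ** 2).sum())  # far lower
```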
44
Video Source
Decompress
(Decode)
Compress
(Encode)
Video Display
Coded
video
ENCODER + DECODER = CODEC
− Spatial Redundancy Reduction (pixels inside a picture are similar)
− Temporal Redundancy Reduction (Similarity between the frames)
− Statistical Redundancy Reduction (more frequent symbols are assigned short code words and
less frequent ones longer words)
The Principle of Compression
45
46
− It arises when parts of a picture are often replicated within a single frame of video (with minor
changes).
Spatial Redundancy in Still Images
This area
is all blue
This area is half
blue and half green
Sky Blue
Sky Blue
Sky Blue
Sky Blue
Sky Blue
Sky Blue
Sky Blue
Sky Blue
− Take advantage of similarity between successive frames
− It arises when successive frames of video display images of the same scene.
47
Temporal Redundancy in Moving Images
This picture is the
same as the previous
one except for this
area
All signals & data have some redundancy and some entropy.
– Data is compressed by keeping entropy and throwing away redundancy if possible!
– Redundancy is the useless stuff.
– Redundancy can be thrown away
– More redundancy in simple signals & data
• Black & burst, colour bars, flat scenery, talking heads, quiet music, 1kHz sine test tone, bitmap
images, database files, text files.
– Entropy is the useful stuff.
– Entropy is a term often used for ‘activity’ or ‘chaos’.
– More entropy in complex signals & data
• Multiburst and pathological test signals, football match, white noise, executables (computer files that
can be executed), DLL files.
48
The Principle of Compression
(Figure: data/bandwidth vs. signal complexity. At 2:1 compression, simple signals fit within the channel, but for complex signals some entropy is lost.)
Redundancy & Entropy
A high compression ratio can lead to loss of entropy
(Figure: at 4:1 compression, even more entropy is lost for complex signals.)
50
Redundancy & Entropy
A high compression ratio can lead to loss of entropy
Spatial Redundancy Reduction
51
Spatial Redundancy Reduction
Transform coding
Discrete Sine
Transform (DST)
Discrete Wavelet
Transform (DWT)
Hadamard Transform (HT)
Discrete Cosine
Transform (DCT)
Differential Pulse Code Modulation
(DPCM)
52
Video Source
Decompress
(Decode)
Compress
(Encode)
Video Display
Coded
video
ENCODER + DECODER = CODEC
PCM was invented by the British engineer Alec Reeves in 1937 in France.
− Pulse code modulation (PCM) is produced by analog-to-digital conversion process.
− As in the case of other pulse modulation techniques, the rate at which samples are taken and
encoded must conform to the Nyquist sampling rate.
− The sampling rate must be greater than twice the highest frequency in the analog signal:
$f_s > 2 f_{max}$
Pulse Code Modulation (PCM)
53
Encoding in PCM
54
Allowed quantization levels:
1.52 → 1.5
1.08 → 1.1
0.92 → 0.9
0.56 → 0.6
0.28 → 0.3
0.27 → 0.3
0.11 → 0.1
Pulse Code Modulation
55
Regeneration (re-amplification, retiming, reshaping)
Regeneration
56
Advantages of PCM
• Robustness to noise and interference
• Efficient regeneration
• Efficient SNR and bandwidth trade-off
• Uniform format
• Ease add and drop
• Secure
DS0
• A basic digital signaling rate of 64 kbit/s.
• To carry a typical phone call, the audio sound is digitized at an 8 kHz sample rate using 8-bit
pulse-code modulation.
Advantages of PCM
57
− Encode information in terms of signal transitions; a transition is used to designate symbol 0.
− Symbol 0 → transition (0→1 or 1→0)
Differential Encoding
58
− Usually PCM has a sampling rate higher than the Nyquist rate, so the encoded signal contains
redundant information. DPCM can efficiently remove this redundancy.
− Prediction error of $m[n]$: $e[n] = m[n] - \hat{m}[n]$
− Quantized value of $m[n]$: $m_q[n] = e_q[n] + \hat{m}[n]$
− Quantization error of $e[n]$: $q[n] \triangleq e[n] - e_q[n]$
− We can show that:
$m[n] - m_q[n] = (\hat{m}[n] + e[n]) - (e_q[n] + \hat{m}[n]) = e[n] - e_q[n] = q[n]$
Differential Pulse-Code Modulation (DPCM)
59
$m[n] - m_q[n] = q[n]$
(DPCM block diagram: input $m[n+1]$; prediction $\hat{m}[n+1]$ formed from the quantized output $m_q[n]$; quantized prediction error $e_q[n+1]$.)
$m[n] - m_q[n] = e[n] - e_q[n] = q[n]$ means that:
− The pointwise coding error in the input sequence
is exactly $q[n]$, the quantization error in $e[n]$
− With a reasonable predictor the mean square
value of the differential signal e(n) is much
smaller than that of m(n)
− For the same mean square quantization error, e[n]
requires fewer quantization bits than m[n]
⇒ The number of bits required for transmission
has been reduced while the quantization
error is kept the same.
Differential Pulse-Code Modulation (DPCM)
60
− An important aspect of DPCM is that the prediction
is based on the output (the quantized samples)
rather than the input (the unquantized samples).
− This results in the predictor being in the “feedback
loop” around the quantizer, so that the quantizer
error at a given step is fed back to the quantizer
input at the next step.
− This has a “stabilizing effect” that prevents DC drift
and accumulation of error in the reconstructed
signal $m_q[n]$.
Differential Pulse-Code Modulation (DPCM)
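A minimal sketch of this feedback-loop arrangement (first-order prediction from the previous quantized sample and a uniform step size are illustrative assumptions): because the predictor sees the decoder's reconstruction, the coding error $m[n] - m_q[n]$ stays bounded by one quantization error and no drift accumulates.

```python
import numpy as np

def dpcm(m, step=4.0):
    """First-order DPCM: the prediction is the previous *quantized* output,
    i.e. the quantizer sits inside the prediction feedback loop."""
    m_hat = 0.0                         # initial prediction
    m_q = np.empty(len(m))
    for n, sample in enumerate(m):
        e = sample - m_hat              # prediction error e[n]
        e_q = step * round(e / step)    # quantized error e_q[n]
        m_q[n] = m_hat + e_q            # reconstruction m_q[n] = e_q + m_hat
        m_hat = m_q[n]                  # predictor uses the decoded output
    return m_q

t = np.linspace(0, 1, 200)
m = 100 * np.sin(2 * np.pi * 3 * t)
m_q = dpcm(m)
# m[n] - m_q[n] = q[n]: bounded by step/2, and no drift accumulates
# because encoder and decoder share the same (quantized) prediction.
print("max |m - m_q| =", np.abs(m - m_q).max())   # <= 2.0 for step = 4
```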
61
The output SNR of the DPCM system is
$$(\mathrm{SNR})_o = \frac{\sigma_M^2}{\sigma_Q^2}$$
where $\sigma_M^2$ and $\sigma_Q^2$ are the variances of $m[n]$ and $q[n]$ (both zero mean).
$$(\mathrm{SNR})_o = \frac{\sigma_M^2}{\sigma_E^2} \cdot \frac{\sigma_E^2}{\sigma_Q^2} = G_p\,(\mathrm{SNR})_Q$$
where $\sigma_E^2$ is the variance of the prediction errors
and $(\mathrm{SNR})_Q = \sigma_E^2 / \sigma_Q^2$ is the signal-to-quantization-noise ratio.
Processing gain: $G_p = \sigma_M^2 / \sigma_E^2$
Design a prediction filter to maximize $G_p$ (minimize $\sigma_E^2$).
Processing Gain
62
63
Predictive Coding (from previous symbol)
Predictive Coding (generalised)
− Prediction is based on combination of previous symbols
− Prediction template needs to be “causal” i.e. template should
contain only “previous” elements w.r.t the direction of scanning
(shown with arrows).
− This is important for coding applications as the decoder will need
to have decoded the template elements first to perform the
prediction of the current element.
64
Predictive Coding (from previous symbol)
Predictive Coding (previous symbol)
− Previous symbol used as a prediction of current symbol
− Prediction error coded in a memoryless fashion
− Prediction error alphabet and codebook have nearly twice the size
− i.e. symbol alphabet {1, 2, 3, 4} → prediction alphabet {-3, -2, -1, 0, 1, 2, 3}
− A good predictor will minimise the error (most occurrences will be zero)
− If the frame is processed in raster order, then pixels
A, B and C in the current and previous rows are
available in both the encoder and the decoder
since these should already have been decoded
before X.
− The decoder forms the same prediction and adds
the decoded residual to reconstruct the pixel.
65
Predictive Image Coding
Pixel X to be encoded
P(X) is a prediction of X using A,B and C
Residual R(X) = X − P(X)
R(X) is encoded and transmitted
1
•Encoder forms a prediction for X based on
some combination of previously coded pixels
2
•Then subtracts this prediction from X
3
•Then encodes the residual (the result of the
subtraction)
Example
− Encoder prediction P(X) = (2A + B + C)/4
− Residual R(X) = X − P(X) is encoded and transmitted.
− Decoder decodes R(X) and forms the same prediction: P(X) = (2A + B + C)/4
− Reconstructed pixel X = R(X) + P(X)
66
Predictive Image Coding
Spatial prediction (DPCM)
1
•Encoder forms a prediction for X based on
some combination of previously coded
pixels
2
•Then subtracts this prediction from X
3
•Then encodes the residual (the result of the
subtraction)
By Encoder
By Decoder
− If the encoding process is lossy, i.e. if the residual is quantized ($R'(X)$):
• Then the decoded pixels $A'$, $B'$ and $C'$ may not be identical to the original A, B and C due to losses
during encoding, and so the above process could lead to a cumulative mismatch or ‘drift’ between
the encoder and decoder.
− Hence the encoder uses the decoded pixels $A'$, $B'$ and $C'$ to form the prediction, i.e. $P(X) = (2A' + B' + C')/4$ in
the above example.
− The compression efficiency of this approach depends on the accuracy of the prediction P(X).
67
Predictive Image Coding
To avoid this, the encoder should itself decode the residual $R'(X)$ and reconstruct each pixel.
In this way, both encoder and decoder use the same prediction P(X) and drift is avoided.
$R(X) = X - P(X)$ → Quantizer → $R'(X)$
− If the prediction is successful, the energy in the residual is lower than in the original frame and the residual
can be represented with fewer bits (Motion compensation is an example of predictive coding).
− Spatial Prediction involves predicting an image sample or region from previously-transmitted samples in
the same image or frame and is sometimes described as ‘Differential Pulse Code Modulation’ (DPCM).
68
Predictive Image Coding
Spatial Prediction in a Frame=DPCM
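A sketch of the whole loop above (the smooth test image and step size are illustrative): the encoder predicts with $P(X) = (2A' + B' + C')/4$ from reconstructed neighbours, quantizes the residual, and reconstructs exactly as the decoder would, so there is no drift.

```python
import numpy as np

def codec(img, step=8):
    """Spatial DPCM with P(X) = (2A + B + C)/4, where A = left, B = above,
    C = above-left, all taken from *reconstructed* pixels (no drift)."""
    H, W = img.shape
    rec = np.zeros((H, W))             # reconstruction (identical at decoder)
    idx = np.zeros((H, W), dtype=int)  # transmitted quantization indices
    for y in range(H):
        for x in range(W):
            A = rec[y, x - 1] if x > 0 else 128
            B = rec[y - 1, x] if y > 0 else 128
            C = rec[y - 1, x - 1] if x > 0 and y > 0 else 128
            P = (2 * A + B + C) / 4
            R = img[y, x] - P                  # residual R(X) = X - P(X)
            idx[y, x] = round(R / step)        # quantized index (transmitted)
            rec[y, x] = P + idx[y, x] * step   # R'(X) + P(X), as at decoder
    return idx, rec

# Smooth ramp image: the predictor works well, so most indices are zero.
img = np.fromfunction(lambda y, x: 100 + 0.5 * x + 0.3 * y, (32, 32))
idx, rec = codec(img)
print("max |img - rec| :", np.abs(img - rec).max())   # <= step/2
print("fraction of zero indices:", (idx == 0).mean())
```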
69
Video Source
Decompress
(Decode)
Compress
(Encode)
Video Display
Coded
video
ENCODER + DECODER = CODEC
Fourier Series Recall
70
$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos\frac{2\pi n x}{T} + b_n \sin\frac{2\pi n x}{T} \right)$$
$$a_n = \frac{2}{T} \int_{-T/2}^{T/2} f(x) \cos\left(\frac{2\pi n x}{T}\right) dx$$
$$b_n = \frac{2}{T} \int_{-T/2}^{T/2} f(x) \sin\left(\frac{2\pi n x}{T}\right) dx$$
$$a_0 = \frac{2}{T} \int_{-T/2}^{T/2} f(x)\, dx$$
71
Fourier Series Recall
72
Fourier Series Recall
73
Fourier Series Recall
74
Fourier Series Recall
75
(Figure: a square wave sw(t), amplitude ±1, t from 0 to 10, and its Fourier-series approximations with an increasing number of terms.)
Ideally need infinite terms.
Fourier Series Recall
How can transform coding lead to data compression?
− Although each pixel 𝑥1 or 𝑥2 may take any value uniformly
between 0 (black) and its maximum value 255 (white), since
there is a high correlation (similarity) between them, then it is
most likely that their joint occurrences lie mainly on a 45-degree
line.
− The joint occurrences on the new coordinates have a uniform
distribution along the 𝒚 𝟏 axis, but are highly peaked around
zero on the 𝒚 𝟐 axis.
− The 𝒚 𝟏 is called the average or DC value of 𝒙 𝟏 and 𝒙 𝟐
− The 𝒚 𝟐 represents residual differences of 𝒙 𝟏 and 𝒙 𝟐
− The normalization factor of $1/\sqrt{2}$ makes sure that the signal
energy is not changed by the transformation (Parseval’s theorem).
76
Transform Coding
Joint occurrences of a
pair of pixels in one frame
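A small numerical sketch of this two-pixel transform (the correlation model for the pixel pairs is an assumption): the 45-degree rotation with the $1/\sqrt{2}$ factor preserves total energy while packing almost all the variance into $y_1$.

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.uniform(0, 255, 10_000)                       # pixel 1
x2 = np.clip(x1 + rng.normal(0, 10, 10_000), 0, 255)   # correlated neighbour

y1 = (x1 + x2) / np.sqrt(2)    # "DC": average along the 45-degree line
y2 = (x1 - x2) / np.sqrt(2)    # residual difference, peaked around zero

print("energy before:", (x1**2 + x2**2).sum())
print("energy after :", (y1**2 + y2**2).sum())   # equal (Parseval)
print("variances    :", x1.var(), x2.var(), "->", y1.var(), y2.var())
```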
− Transform domain coding is mainly used to remove the spatial redundancies in images by
mapping the pixels into a transform domain prior to data reduction.
− The strength of transform coding in achieving data compression is that the image energy of
most natural scenes is mainly concentrated in the low-frequency region, and hence into a few
transform coefficients.
− These coefficients can then be quantized with the aim of discarding insignificant coefficients,
without significantly affecting the reconstructed image quality.
77
Transform Coding
78
Transform Coding
Through transformation, a group of correlated pixels is
converted into a group of uncorrelated coefficients.
− Only one coefficient becomes important, and the
rest carry non-significant energy.
− The larger the number of pixels transformed together,
the better the compression efficiency.
− If pixel intensity variations match the
transformation basis vectors, then only one
coefficient (apart from DC) becomes significant
(unitary/orthonormality).
79
Transform Coding
The choice of transform depends on a number of criteria:
1. Data in the transform domain should be decorrelated, i.e. separated into components
within minimal inter-dependence, and compact, i.e. most of the energy in the transformed
data should be concentrated into a small number of values.
2. The transform should be reversible.
3. The transform should be computationally tractable, e.g. low memory requirement,
achievable using limited-precision arithmetic, low number of arithmetic operations, etc.
80
The Choice of Transform Coding
− A group of U pixels in each line are
1-D transformed.
− This is repeated for V lines.
− A group of V coefficients in the
vertical directions are transformed.
− This is repeated for U columns.
− The final output is UV 2-D transform
coefficients.
− Transform coefficients are quantized
for compression.
− Compressed coefficients are inverse
transformed to reconstruct the
image.
81
What Is a Two Dimensional Transform?
One-dimensional transformation
in the Horizontal direction
One-dimensional transformation
in the Vertical direction
U
V
Normally U=V
2D Coeff.
1D Coeff.
− No reduction in data, just replacement (Replaces the original pixel samples with coefficients).
− Coefficients describe how the samples are changing.
− Helps to separate entropy from redundancy.
− DCT always performed on a block of samples.
Discrete Cosine Transform
82
Discrete Cosine Transform
Smallest DCT block is a 2x2 block.
• Top left coefficient is the DC coefficient → Describes the average of the 4 samples.
• Top right coefficient is the horizontal coefficient → Describes how the 4 samples are changing horizontally.
• Bottom left coefficient is the vertical coefficient → Describes how the 4 samples are changing vertically.
• Bottom right coefficient is the diagonal coefficient → Describes how the 4 samples are changing diagonally.
83
Original
pixel
samples
Original
pixel
samples
DCT Inverse
DCT
DC
Horizontal
coefficient
Diagonal
coefficient
Vertical
coefficient
(Figure: example 2×2 pixel blocks, e.g. [255 255; 0 0] and [255 0; 255 0], and their DCT coefficients.)
84
Discrete Cosine Transform
DC coefficient of a 2×2 block $= \frac{1}{2}\sum_{i=1}^{4} P_i$
W to B → 127.5, B to W → -127.5
(Figure: 2×2 pixel blocks with horizontal, vertical and diagonal transitions, and their coefficient blocks with values ±127.5.)
85
Discrete Cosine Transform
– DCT always performed on a block of samples.
86
Discrete Cosine Transform
87
Detail in a Block vs. DCT Coefficients Transmitted
Discrete Cosine Transform
Most compression systems use an 8x8 DCT block.
• The top left coefficient is the DC coefficient.
• Top row are horizontal coefficients → Low frequency changes to the left, high to the right.
• Left column are vertical coefficients → Low frequency changes at the top, high at the bottom.
• The other coefficients for different angle/frequencies → Low frequency to the top left, & high to the bottom right.
Discrete Cosine Transform
88
Pixel Domain Frequency Domain
(Figure: a spatial 8×8 block of pixel values f(m,n), m, n = 0…7, containing only the values 109 and 55.)
f(m,n): spatial 8×8 pixel values
Discrete Cosine Transform
89
NINT[·], NINT = Nearest INteger Truncation
(Figure: the corresponding 8×8 block of DCT transform values F(u,v), u, v = 0…7, after NINT rounding; the DC coefficient is 602 and coefficient magnitudes fall off toward high frequencies.)
F(u,v): frequency-domain 8×8 transform values
Discrete Cosine Transform
90
91
DCT
Big number
somewhere
here
Discrete Cosine Transform
DCT
Big number
somewhere
here
92
Discrete Cosine Transform
DCT
Big number
somewhere
here
93
Discrete Cosine Transform
94
Discrete Cosine Transform
95
Discrete Cosine Transform
96
Discrete Cosine Transform
DCT
Big number
somewhere
here
DCT
Big number
somewhere
here
97
Discrete Cosine Transform
− The Forward DCT (FDCT) of an N × N sample block is given by
− The Inverse DCT (IDCT) is given by
− A is an N × N transform matrix. The elements of A are
− FDCT and IDCT may be written in summation form:
98
Discrete Cosine Transform
− Ex: The transform matrix A for a 4 × 4 DCT is:
99
Discrete Cosine Transform
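The matrix A appears as an image in the deck; the sketch below builds it from the standard DCT-II definition ($a_{ij} = c_i \cos\frac{(2j+1)i\pi}{2N}$ with $c_0 = \sqrt{1/N}$ and $c_i = \sqrt{2/N}$ otherwise) and checks the FDCT/IDCT pair $Y = AXA^T$, $X = A^TYA$.

```python
import numpy as np

def dct_matrix(N):
    """N x N DCT transform matrix A (standard DCT-II definition)."""
    A = np.zeros((N, N))
    for i in range(N):
        c = np.sqrt(1.0 / N) if i == 0 else np.sqrt(2.0 / N)
        for j in range(N):
            A[i, j] = c * np.cos((2 * j + 1) * i * np.pi / (2 * N))
    return A

A = dct_matrix(4)
print(np.round(A, 3))                       # the 4x4 transform matrix
print(np.allclose(A @ A.T, np.eye(4)))      # True: A is orthonormal

X = np.arange(16).reshape(4, 4).astype(float)   # a sample 4x4 block
Y = A @ X @ A.T                                 # forward DCT: Y = A X A^T
X_rec = A.T @ Y @ A                             # inverse DCT: X = A^T Y A
print(np.allclose(X, X_rec))                    # True: fully reversible
```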
− The output of a 2-dimensional FDCT is a set of N × N
coefficients representing the image block data in the DCT
domain which can be considered as ‘weights’ of a set of
standard basis patterns.
− The basis patterns for the 4 × 4 DCT are shown.
− The basis patterns are composed of combinations of
horizontal and vertical cosine functions.
− Any image block may be reconstructed by combining all
N × N basis patterns, with each basis multiplied by the
appropriate weighting factor (coefficient).
100
Discrete Cosine Transform
101
u = 0 u = 1 u = 2 u = 3
v = 0
v = 1
v = 2
v = 3
Discrete Cosine Transform (4×4 basis patterns)
102
Discrete Cosine Transform
$N = 8$ and $i, j = 0, \ldots, 7$
In real codecs: $-2048 \leq D(i,j) \leq +2047$
103
Discrete Cosine Transform
$N = 8$ and $i, j = 0, \ldots, 7$
Discrete Cosine Transform (DCT):
− Basis vectors: $\cos\frac{k\pi(2n+1)}{2N}$, with $k, n = 0, \ldots, N-1$
− For orthonormality, transform coefficients
are divided by $\sqrt{N}$
− Both transforms are orthonormal, but the DCT
has smoothly varying basis vectors that
match natural images better.
104
DCT and Hadamard 8x8 Matrices
8×8 DCT matrix (x = π/16; rows carry $\sqrt{2}$ scale factors for orthonormality):
1      1       1       1       1       1       1       1
cos x  cos 3x  sin 3x  sin x   -sin x  -sin 3x -cos 3x -cos x
cos 2x sin 2x  -sin 2x -cos 2x -cos 2x -sin 2x sin 2x  cos 2x
cos 3x -sin x  -cos x  -sin 3x sin 3x  cos x   sin x   -cos 3x
1      -1      -1      1       1       -1      -1      1
sin 3x -cos x  sin x   cos 3x  -cos 3x -sin x  cos x   -sin 3x
sin 2x -cos 2x cos 2x  -sin 2x -sin 2x cos 2x  -cos 2x sin 2x
sin x  -sin 3x cos 3x  -cos x  cos x   -cos 3x sin 3x  -sin x
1 1 1 1 1 1 1 1
1 1 1 1 -1 -1 -1 -1
1 1 -1 -1 -1 -1 1 1
1 1 -1 -1 1 1 -1 -1
1 -1 -1 1 1 -1 -1 1
1 -1 -1 1 -1 1 1 -1
1 -1 1 -1 -1 1 -1 1
1 -1 1 -1 1 -1 1 -1
$$H_n = \begin{bmatrix} H_{n-1} & H_{n-1} \\ H_{n-1} & -H_{n-1} \end{bmatrix}, \qquad H_0 = 1$$
Hadamard Transform
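A sketch of this recursion (note the slide lists the 8×8 rows in sequency order, whereas the recursion produces the natural order):

```python
import numpy as np

def hadamard(n):
    """2^n x 2^n Hadamard matrix from H_n = [[H, H], [H, -H]], H_0 = 1."""
    H = np.array([[1]])
    for _ in range(n):
        H = np.block([[H, H], [H, -H]])
    return H

H8 = hadamard(3)
print(H8)
print(np.allclose(H8 @ H8.T, 8 * np.eye(8)))  # rows orthogonal, norm sqrt(8)
```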
$D = T M T'$
(Level shift first: subtract 128 from each array, because the DCT is designed to work on
pixel values ranging from -128 to 127.)
105
Discrete Cosine Transform Implementation Example
− DCT calculations are mathematically intensive.
− Easier to use simple matrix manipulation and a “look-up” matrix.
− “Look-up” matrix act like a key or look-up table.
− This “look-up” matrix is called the basis pictures.
− For a 2x2 DCT block the basis pictures are 4x4.
− For an 8x8 DCT block the basis pictures are 64x64.
Discrete Cosine Transform
106
107
8 × 8 DCT basis patterns
Discrete Cosine Transform (8×8 DCT basis patterns)
− The basis patterns for the 8 × 8 DCT are shown.
− The basis patterns are composed of
combinations of horizontal and vertical cosine
functions.
− Any image block may be reconstructed by
combining all N × N basis patterns, with each
basis multiplied by the appropriate weighting
factor (coefficient).
108
Discrete Cosine Transform (8×8 DCT basis patterns)
109
8x8 DCT Example
110
8x8 DCT Example
111
8x8 DCT Example
Note that:
– Low-low coefficients are much larger than high-high coefficients
– While pixel values change at all positions, DCT values are mainly larger at low frequency.
8x8 DCT Example
112
8x8 pixels are coded and the lowest N out of 64 coefficients are retained for inverse DCT
8x8 DCT Example
113
DCT coding with increasingly coarse quantization, block size 8x8
Typical DCT Coding Artifacts
Quantizer Stepsize For AC Coefficients: 25 Quantizer Stepsize For AC Coefficients: 100 Quantizer Stepsize For AC Coefficients: 200
114
115
Discrete Cosine Transform (4×4 DCT basis patterns)
4 × 4 DCT basis patterns
116
Image section showing 4 × 4 block
Original block DCT coefficients
4x4 DCT Example
117
Original block DCT coefficients
Block reconstructed from 1, 2, 3, 5 coefficients
4x4 DCT Example
118
4 × 4 DCT basis patterns
8 × 8 DCT basis patterns
Discrete Cosine Transform (DCT basis patterns Comparison)
Top Field and Bottom Field Pixels
119
120
Luminance MB structure in frame-organized DCT coding (for slow moving)
Luminance MB in field-organized DCT coding (for fast moving)
Blocks (8×8), MB (16×16)
Frame Type DCT vs. Field Type DCT
Blocks (8×8), MB (16×16)
121
Frame Type DCT vs. Field Type DCT
− The significant DCT coefficients of a block of
image or residual samples are typically the ‘low
frequency’ positions around the DC (0,0)
coefficient.
− Figure plots the probability of non-zero DCT
coefficients at each position in an 8 × 8 block.
− The non-zero DCT coefficients are clustered
around the top-left (DC) coefficient and the
distribution is roughly symmetrical in the
horizontal and vertical directions.
122
DCT Coefficient Distribution
8 × 8 DCT coefficient distribution (Frame)
− Histograms for 8x8 DCT coefficient amplitudes
measured for natural images (from
Mauersberger).
− DC coefficient is typically uniformly distributed.
− For the other coefficients, the distribution
resembles a Laplacian pdf.
123
Amplitude Distribution of the DCT Coefficients
− Figure plots the probability of non-zero DCT
coefficients for a residual field.
− The coefficients are clustered around the DC
position but are ‘skewed’, i.e. more non-zero
coefficients occur along the left-hand edge of
the plot.
− This is because a field picture may have a
stronger high-frequency component in the
vertical axis due to the subsampling in the
vertical direction, resulting in larger DCT
coefficients corresponding to vertical
frequencies.
124
DCT Coefficient Distribution
8 × 8 DCT coefficient distribution (Field)
− The zig-zag scan may not be ideal for a field block because of the skewed coefficient distribution, and a
modified scan order may be more effective for some field blocks, in which coefficients on the left hand
side of the block are scanned before the right hand side.
125
DCT Coefficient Scan
Zigzag scan example : frame block Zigzag scan example : field block
126
DCT Coefficient Scan, Ex.
127
Discrete Cosine Transform
Normally small numbers
Normally big numbers
Discrete Cosine Transform
128
Normally big numbers
Normally small numbers
Redundancy
Entropy
129
Discrete Cosine Transform
130
3-Dimensional DCT
− Remove spatiotemporal correlation
− Good for low motion video
− Bad for high motion video
− Frame storage → Large delay
$$F(u,v,w) = \frac{8}{N^3}\, C(u)\, C(v)\, C(w) \sum_{t=0}^{N-1} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y,t) \cos\frac{(2x+1)u\pi}{2N} \cos\frac{(2y+1)v\pi}{2N} \cos\frac{(2t+1)w\pi}{2N}$$
for $u = 0, \ldots, N-1$, $v = 0, \ldots, N-1$ and $w = 0, \ldots, N-1$,
where $N = 8$ and $C(k) = 1/\sqrt{2}$ for $k = 0$, 1 otherwise.
The transform should
– Minimize the correlation among resulting coefficients, so that scalar quantization can be employed
without losing too much in coding efficiency compared to vector quantization
– Compact the energy into as few coefficients as possible
Optimal transform
− Karhunan Loeve Transform (KLT)
• Signal statistics dependent
• It is an optimum transform, for complete decorrelation
Suboptimal transform
− Discrete Cosine transform (DCT): nearly as good as KLT for common image signals
− Hadamard transform with all elements of +1, -1.
131
Why DCT? What Block Size?
Properties of the DCT:
− Smoothly varying basis vector that matches natural
images better (better than Hadamard)
− Basis vectors are not sparse (better than DFT, that
has many zero valued coefficient at small block
sizes)
− Basis vectors closely match natural scenes as KLT,
but uses a fix and a fast transformation algorithm
(better than KLT).
132
Why DCT? What Block Size?
(Figure: mean-squared error (1% to 5%) vs. block size (4x4 to 64x64) for DFT, HT, and KLT & DCT, with an equal number of retained coefficients; KLT & DCT give the lowest error.)
Properties of the DCT:
− Efficiency as a function of block size NxN,
measured for 8 bit quantization in the original
domain and equivalent quantization in the
transform domain
− Block size 8x8 is a good compromise.
133
Efficiency
Why DCT? What Block Size?
− Wavelet is a non-periodic element, i.e. a mini wave.
− Uses a set of ‘mother wavelets’.
− Scale and transform actions possible.
− Better at high frequency capture.
− Less visual degradation than DCT.
− Graceful degradation at high compression.
− Good for audio compression.
Wavelet Coding
134
135
Wavelet
The ‘wavelet transform’ is based on sets of filters with coefficients that are equivalent to discrete wavelet
functions
− A pair of filters is applied to the signal to decompose it into a low frequency band (L) and a high
frequency band (H).
− Each band is subsampled by a factor of two, so that the two frequency bands each contain N/2 samples.
− With the correct choice of filters, this operation is reversible.
136
Wavelet
− This approach may be extended to apply to a 2-dimensional signal such as an intensity
image.
− Each row of a 2D image is filtered with a low-pass and a high-pass filter (Lx and Hx)
− The output of each filter is down-sampled by a factor of two to produce the intermediate
images L and H.
− L is the original image low-pass filtered and downsampled in the x-direction and H is the
original image high-pass filtered and downsampled in the x-direction.
− Each column of these new images is filtered with low- and high-pass filters (Ly and Hy)
− The output of each filter is down-sampled by a factor of two to produce four sub-images LL,
LH, HL and HH.
137
Wavelet
• ‘LL’ is the original image, low-pass filtered in
horizontal and vertical directions and subsampled
by a factor of two.
• ‘HL’ is high-pass filtered in the vertical direction and
contains residual vertical frequencies
• ‘LH’ is high-pass filtered in the horizontal direction
and contains residual horizontal frequencies
• ‘HH’ is high-pass filtered in both horizontal and
vertical directions.
− Between them, the four sub-band images contain all
of the information present in the original image but
the sparse nature of the LH, HL and HH sub-bands
makes them amenable to compression.
138
Wavelet
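A minimal sketch of one decomposition level, using the Haar filter pair as a stand-in for the unspecified wavelet filters (averages for low-pass, differences for high-pass):

```python
import numpy as np

def haar_level(img):
    """One 2D decomposition level: filter and downsample along x (rows),
    then along y (columns), producing the LL, LH, HL, HH sub-images."""
    L = (img[:, 0::2] + img[:, 1::2]) / 2.0   # horizontal low-pass
    H = (img[:, 0::2] - img[:, 1::2]) / 2.0   # horizontal high-pass
    LL = (L[0::2, :] + L[1::2, :]) / 2.0      # low-pass both directions
    HL = (L[0::2, :] - L[1::2, :]) / 2.0      # residual vertical frequencies
    LH = (H[0::2, :] + H[1::2, :]) / 2.0      # residual horizontal frequencies
    HH = (H[0::2, :] - H[1::2, :]) / 2.0      # high-pass both directions
    return LL, LH, HL, HH

img = np.tile(np.linspace(0, 255, 16), (16, 1))   # smooth horizontal ramp
for name, band in zip(("LL", "LH", "HL", "HH"), haar_level(img)):
    print(name, "energy:", round(float((band ** 2).sum()), 1))
# Nearly all energy lands in LL; LH picks up the horizontal variation;
# the sparse high-frequency bands are what makes this compressible.
```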
− In an image compression application, the 2-dimensional wavelet decomposition is applied again to the
‘LL’ image, forming four new sub-band images.
− The resulting low-pass image, always the top-left sub-band image, is iteratively filtered to create a tree of
sub-band images.
139
Wavelet
− Many of the samples (coefficients) in the higher-frequency sub-band images are close to zero, shown
here as near-black, and it is possible to achieve compression by removing these insignificant coefficients
prior to transmission.
− At the decoder, the original image is reconstructed by repeated up-sampling, filtering and addition,
reversing the order of operations.
140
Wavelet
141
Wavelet
LEVEL 3
LEVEL 2
LEVEL 1
LEVEL 0
Many coefficients in higher sub-bands, towards the bottom-right of
the figure, are near zero and may be quantized to zero without
significant loss of image quality.
− Non-zero coefficients tend to be related to structures in the
image; for example, the violin bow appears as a clear
horizontal structure in all the horizontal and diagonal sub-
bands.
− When a coefficient in a lower-frequency sub-band is non-zero,
there is a strong probability that coefficients in the
corresponding position in higher frequency sub-bands will also
be non-zero.
142
Wavelet Coefficient Scan
A typical distribution of 2D wavelet coefficients
We may consider a ‘tree’ of non-zero quantized coefficients,
starting with a ‘root’ in a low-frequency sub-band.
− A single coefficient in the LL band of layer 1 has one
corresponding coefficient in each of the other bands of layer
1, i.e. these four coefficients correspond to the same region
in the original image.
− The layer 1 coefficient position (parent coefficient) maps to
four corresponding child coefficient positions in each sub-
band at layer 2.
− Recall that the layer 2 sub-bands have twice the horizontal
and vertical resolution of the layer 1 sub-bands.
143
Wavelet Coefficient Scan
LL
child coefficient
child coefficient
child coefficient
root
Parent
coefficient
Parent
coefficient
Parent
coefficient
− Idea: Conditional coding of all descendants (incl.
children)
− significant coefficients: Coefficient magnitude > Threshold
− Four cases (the coefficients are coded by symbol POS, NEG,
ZTR, or IZ)
• ZTR (Zero Tree Root): coefficient and all descendants are not
significant
• IZ (Isolated Zero): coefficient is not significant, but some
descendants are significant
• POS: POSitive significant (greater than the given threshold)
• NEG: NEGative significant (greater than the given threshold )
144
Zero Tree Encoding (Embedded Zero-tree Wavelet Algorithm)
− It is desirable to encode the non-zero wavelet coefficients as compactly as possible prior to entropy
coding.
− An efficient way of achieving this is to encode each tree of non-zero coefficients starting from the lowest
or root level of the decomposition.
− A coefficient at the lowest layer is encoded, followed by its child coefficients at the next layer up, and so
on. The encoding process continues until the tree reaches a zero-valued coefficient.
− Further children of a zero-valued coefficient are likely to be zero themselves and so the remaining children
are represented by a single code that identifies a tree of zeros (zero tree).
− The decoder reconstructs the coefficient map starting from the root of each tree; non-zero coefficients
are decoded and reconstructed and when a zerotree code is reached, all remaining ‘children’ are set to
zero.
− This is the basis of the embedded zero tree (EZW) method of encoding wavelet coefficients.
145
Zero Tree Encoding
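A sketch of just the symbol classification on a toy three-level pyramid of one sub-band orientation (the full EZW bit-plane refinement loop is omitted):

```python
def children(level, i, j, n_levels):
    """Positions of the four child coefficients one level up the tree."""
    if level + 1 >= n_levels:
        return []
    return [(level + 1, 2*i + di, 2*j + dj) for di in (0, 1) for dj in (0, 1)]

def any_significant(pyr, level, i, j, T):
    """True if any descendant of (level, i, j) is significant (> T)."""
    for l, ci, cj in children(level, i, j, len(pyr)):
        if abs(pyr[l][ci][cj]) > T or any_significant(pyr, l, ci, cj, T):
            return True
    return False

def classify(pyr, level, i, j, T):
    c = pyr[level][i][j]
    if abs(c) > T:
        return "POS" if c > 0 else "NEG"
    return "IZ" if any_significant(pyr, level, i, j, T) else "ZTR"

# Toy 3-level pyramid of one sub-band orientation: 1x1 root, 2x2, 4x4.
pyr = [
    [[60]],
    [[30, 0],
     [0, -35]],
    [[9, 2, 0, 0],
     [1, 0, 0, 0],
     [0, -40, 0, 3],
     [0, 0, 2, 1]],
]
T = 25
print(classify(pyr, 0, 0, 0, T))  # POS: 60 > 25
print(classify(pyr, 1, 0, 1, T))  # ZTR: 0 and all descendants insignificant
print(classify(pyr, 1, 1, 0, T))  # IZ : 0 but descendant -40 is significant
```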
146
Video Source
Decompress
(Decode)
Compress
(Encode)
Video Display
Coded
video
ENCODER + DECODER = CODEC
Transformation does not result in compression on its own:
− Due to the linearity of transformation, energy in the pixel domain = energy in the transform domain
− But transformation concentrates the energy in a few transform coefficients
− It is the quantisation of transform coefficients that leads to compression (bit rate reduction)
− Small-valued transform coefficients are set to zero
Quantisation of DCT Coefficients
− A quantizer maps a signal with a range of values X to a quantized signal with a reduced range
of values Y.
− It should be possible to represent the quantized signal with fewer bits than the original since
the range of possible values is smaller.
− A scalar quantizer maps one sample of the input signal to one quantized output value
148
Quantisation
Quantizer
(Mapping)
X Y (with reduced range)
Y is presented with fewer bits
− A more general example of a uniform quantizer is:
$$FQ = \mathrm{Round}\left(\frac{X}{QP}\right), \qquad Y = FQ \cdot QP$$
where $QP$ is a quantization ‘step size’ and $FQ$ is the forward quantizer value.
149
Scalar Quantization
− In image and video compression CODECs, the quantization operation is usually made up of two parts, a
forward quantizer FQ in the encoder and an ‘inverse quantizer’ or ‘rescaler’ (IQ) in the decoder.
− If the step size is large, the range of quantized values is small and can therefore be efficiently represented
and hence highly compressed during transmission, but the re-scaled values are a crude approximation
to the original signal.
− If the step size is small, the re-scaled values match the original signal more closely but the larger range of
quantized values reduces compression efficiency.
150
Quantization
Encoder (FQ: Forward Quantizer): $FQ = \mathrm{Round}(X / QP)$
Decoder (IQ: Inverse Quantizer): $Y = FQ \cdot QP$
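A sketch of the FQ/IQ pair, showing the step-size trade-off described above:

```python
import numpy as np

def fq(x, qp):          # forward quantizer (encoder side)
    return np.round(x / qp).astype(int)

def iq(index, qp):      # inverse quantizer / rescaler (decoder side)
    return index * qp

x = np.array([-31.4, -6.2, -0.4, 3.3, 17.9, 60.0])
for qp in (2, 8, 24):
    y = iq(fq(x, qp), qp)
    print(f"QP={qp:2d}  rescaled={y}  max error={np.abs(x - y).max():.2f}")
# Large QP: few distinct values (compresses well) but crude approximation;
# small QP: close match but a wider index range, hence less compression.
```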
151
Linear and non-Linear Scalar Quantizer
− A vector quantizer maps a set of input data such as a block of image samples to a single value
(codeword) and at the decoder, each codeword maps to an approximation to the original set of input
data, a ‘vector’.
− The set of vectors are stored at the encoder and decoder in a codebook.
152
Vector Quantization
Vector Quantizer
(Mapping)
A Set Of Input Data A Single Value
(Codeword)
1. Partition the original image into regions such as N × N pixel blocks.
2. Choose a vector from the codebook that matches the current region as closely as possible.
3. Transmit an index that identifies the chosen vector to the decoder.
4. At the decoder, reconstruct an approximate copy of the region using the selected vector.
153
A typical application of Vector Quantization
− Here, quantization is applied in the image
(spatial) domain, i.e. groups of image
samples are quantized as vectors
− But it can equally be applied to motion
compensated and/or transformed data.
Key issues: the design of the codebook and
efficient searching of the codebook to find
the optimal vector.
154
1
2
n
.
.
.
source image
codebook
1
2
n
.
.
.
codebook
i index of nearest codeword
decoded image
Vector Quantization
155
Quantization
Equal distances between adjacent decision levels and between adjacent reconstruction levels:
$$t_l - t_{l-1} = r_l - r_{l-1} = q$$
• Parameters of uniform quantization
– R: bit resolution
– L: number of levels ($L = 2^R$)
– B: dynamic range of the input, $B = f_{max} - f_{min}$
– q: quantization interval (step size)
• Quantization function
156
$$q = \frac{B}{L} = B \cdot 2^{-R}$$
Uniform Quantization
Input signal is continuous
• The output of a Charge-Coupled Device (CCD) camera
is in the range of 0.0 to 5 volt.
• 𝑳 = 𝟐𝟓𝟔
– $q = 5/256$
– The output value in the interval $(l \times q, (l+1) \times q)$ is
represented by index $l$, $l = 0, \ldots, 255$.
– The reconstruction level:
$$Q(f) = \left\lfloor \frac{f - f_{min}}{q} \right\rfloor \times q + \frac{q}{2} + f_{min} \;\rightarrow\; r_l = l \times q + \frac{q}{2}, \quad l = 0, \ldots, 255$$
157
Example 1 of Uniform Quantizer
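A sketch of this example (0 to 5 V CCD output, 256 levels):

```python
f_min, f_max, R = 0.0, 5.0, 8        # CCD output range and bit resolution
L = 2 ** R                           # 256 levels
q = (f_max - f_min) / L              # step size = 5/256

def index(f):
    """Index l of the interval (l*q, (l+1)*q) containing f."""
    return min(int((f - f_min) / q), L - 1)

def reconstruct(l):
    """Reconstruction level r_l = l*q + q/2 + f_min (mid-point of the step)."""
    return l * q + q / 2 + f_min

for f in (0.0, 1.23, 2.5, 4.999):
    l = index(f)
    print(f"f = {f:5.3f} V  ->  index {l:3d}  ->  r_l = {reconstruct(l):.4f} V")
# The reconstruction error never exceeds q/2, roughly 0.0098 V here.
```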
Input signal is discrete
• Digital Image of 256 gray levels is quantize it into 4
levels
– $q = \frac{256}{4} = 64$
– The reconstruction level:
$$Q(f) = \left\lfloor \frac{f - f_{min}}{q} \right\rfloor \times q + \frac{q}{2} + f_{min} \;\rightarrow\; Q(f) = \left\lfloor \frac{f}{64} \right\rfloor \times 64 + 32$$
158
Example 2 of Uniform Quantizer
159
Uniform Quantization on Images
160
Uniform Threshold Quantiser (UTQ)
− The class of quantiser that has
been used in all standard video
codecs is based around the so-called
Uniform Threshold Quantiser (UTQ).
− It has equal step sizes with
reconstruction values pegged to
the centroid of the steps.
− The centroid value is typically
defined midway between
quantisation intervals.
161
Uniform Threshold Quantiser (UTQ) and Bit Rate Control
− The DC coefficient has a fairly uniform
distribution.
− Although AC transform coefficients
have nonuniform characteristics, and
hence could be better quantised with
nonuniform quantiser step sizes, bit
rate control is easier if they
are quantised linearly.
− Hence, a key property of UTQ is that
the step sizes can be easily adapted
to facilitate bit rate control.
162
Uniform Threshold Quantiser (UTQ)
Uniform Threshold Quantiser (UTQ) (a) with and (b) without dead zone
UTQ-DZ UTQ
163
Uniform Threshold Quantiser (UTQ)
− Typically, UTQ is used for quantising intraframe DC, F(0, 0), coefficients, while UTQ-DZ is used for the AC
and the DC coefficients of interframe prediction error.
− This is intended primarily to cause more nonsignificant AC coefficients to become zero, thus increasing
the compression.
UTQ: for quantising intraframe DC, F(0, 0), coefficients. UTQ-DZ: for quantising AC and the DC coefficients of interframe prediction error.
164
Uniform Threshold Quantiser (UTQ)
− Both quantisers are derived from the generic quantiser, where in UTQ th is set to zero, but in UTQ-DZ it is
set to q/2; in the innermost region th is allowed to vary between q/2 and q, just to increase the
number of zero-valued outputs. → Thus, the dead zone length can be from q to 2q.
− In some implementations (e.g. H.263 or MPEG-4), the decision and/or the reconstruction levels of the
UTQ-DZ quantiser might be shifted by q/4 or q/2.
th is allowed to vary between q/2 and q
− In practice, rather than transmitting a quantised coefficient $F(u,v)$ to the decoder, its ratio
to the quantiser step size, called the Quantisation Index $I$, is transmitted.
− The reason for defining the quantisation index is that it has a much smaller entropy than the
quantised coefficient. At the decoder, the reconstructed coefficients $F_q(u,v)$, after inverse
quantisation, are given by $F_q(u,v) = I(u,v) \cdot q$
− If required, depending on the polarity of the index, an addition or subtraction of half the
quantisation step is required to deliver the centroid representation, reflecting the quantisation
characteristics in the previous slide.
165
Quantization Index
− For the standard codecs, the quantiser step size q is fixed at 8
for UTQ, but varies from 2 to 62, in even step sizes, for the UTQ-
DZ (2,4,6,8,…,60,62).
− Hence, the entire quantiser range, or the quantiser parameter
Qp, can be defined with 5 bits (1–31).
− Uniform quantisers with and without dead zone can also be
used in DPCM coding of pixels. Here, threshold is set to zero,
th=0, and the quantisers are usually identified with odd and
even number of levels, respectively.
166
Quantization Step Size
even number of levels
odd number of levels
One of the main problems of linear quantisers in DPCM is that for lower bit
rates, the number of quantisation levels is limited and hence the quantiser
step size is large.
In coding of plain areas of the picture (in plain areas the DPCM output is near
zero):
− If a quantiser with even number of levels is used, then the reconstructed
pixels oscillate between -q/2 and +q/2.
− This type of noise at these areas, in particular at low luminance levels, is
visible and is called granular noise.
− Larger quantiser step sizes with the odd number of levels (dead zone)
reduce the granular noise, but cause loss of pixel resolution at the plain
areas.
− This type of noise when the quantiser step size is relatively large is
annoying and is called the contouring noise. 167
Granular and Contouring Noises
even number of levels
odd number of levels
Banding, Contouring
Granular noise
− It can be seen that when the original analog input signal has a relatively constant amplitude, the
reconstructed signal has variations that were not present in the original signal.
168
8 bits 256 Levels 10 bits 1024 Levels
Granular and Contouring Noises
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
238 -43 -12 -14 -6 0 -4 -8
39 12 -9 13 4 -2 -3 -4
-16 12 10 8 -3 7 5 0
-3 -7 1 -3 5 1 -1 0
-7 -12 8 -8 -1 -3 0 2
4 5 -7 1 5 -4 -1 0
-5 -4 2 -3 2 0 1 0
-1 7 -3 -2 1 0 0 0
169
DCT Coefficients ÷ Quantisation Matrix (different step sizes Q) = Quantised DCT Coefficients
238 -43 -12 -14 -6 0 -4 -8
39 12 -9 13 4 -2 -3 -4
-16 12 10 8 -3 7 5 0
-3 -7 1 -3 5 1 -1 0
-7 -12 8 -8 -1 -3 0 2
4 5 -7 1 5 -4 -1 0
-5 -4 2 -3 2 0 1 0
-1 7 -3 -2 1 0 0 0
Quantisation
238 -43 -12 -14 -6 0 -4 -8
39 12 -9 13 4 -2 -3 -4
-16 12 10 8 -3 7 5 0
-3 -7 1 -3 5 1 -1 0
-7 -12 8 -8 -1 -3 0 2
4 5 -7 1 5 -4 -1 0
-5 -4 2 -3 2 0 0 0
-1 7 -3 -2 1 0 0 0
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 2
1 1 1 1 1 1 2 2
1 1 1 1 1 2 2 2
238 -43 -12 -14 -6 0 -4 -8
39 12 -9 13 4 -2 -3 -4
-16 12 10 8 -3 7 5 0
-3 -7 1 -3 5 1 -1 0
-7 -12 8 -8 -1 -3 0 2
4 5 -7 1 5 -4 -1 0
-5 -4 2 -3 2 0 1 0
-1 7 -3 -2 1 0 0 0
170
DCT Coefficients ÷ Quantisation Matrix (different step sizes Q) = Quantised DCT Coefficients
Quantisation
238 -43 -12 -14 -6 0 -4 -8
39 12 -9 13 4 -2 -3 -4
-16 12 10 8 -3 7 5 0
-3 -7 1 -3 5 1 0 0
-7 -12 8 -8 -1 -1 0 1
4 5 -7 1 2 -2 0 0
-5 -4 2 -1 1 0 0 0
-1 7 -1 -1 0 0 0 0
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 2
1 1 1 1 1 1 2 2
1 1 1 1 1 2 2 2
1 1 1 1 2 2 2 4
1 1 1 2 2 2 4 4
1 1 2 2 2 4 4 4
238 -43 -12 -14 -6 0 -4 -8
39 12 -9 13 4 -2 -3 -4
-16 12 10 8 -3 7 5 0
-3 -7 1 -3 5 1 -1 0
-7 -12 8 -8 -1 -3 0 2
4 5 -7 1 5 -4 -1 0
-5 -4 2 -3 2 0 1 0
-1 7 -3 -2 1 0 0 0
171
DCT Coefficients ÷ Quantisation Matrix (different step sizes Q) = Quantised DCT Coefficients
Zig-zag Scanning for Separating Redundancy and Entropy
Quantisation
238 -43 -12 -14 -6 0 -2 -4
39 12 -9 13 4 -1 -1 -2
-16 12 10 8 -1 3 2 0
-3 -7 1 -1 2 0 0 0
-7 -12 4 -4 0 0 0 0
4 2 -3 0 1 -1 0 0
-2 -2 1 0 0 0 0 0
0 3 0 0 0 0 0 0
1 1 1 1 1 1 2 2
1 1 1 1 1 2 2 2
1 1 1 1 2 2 2 4
1 1 1 2 2 2 4 4
1 1 2 2 2 4 4 4
1 2 2 2 4 4 4 8
2 2 2 4 4 4 8 8
2 2 4 4 4 8 8 8
238 -43 -12 -14 -6 0 -4 -8
39 12 -9 13 4 -2 -3 -4
-16 12 10 8 -3 7 5 0
-3 -7 1 -3 5 1 -1 0
-7 -12 8 -8 -1 -3 0 2
4 5 -7 1 5 -4 -1 0
-5 -4 2 -3 2 0 1 0
-1 7 -3 -2 1 0 0 0
172
DCT Coefficients ÷ Quantisation Matrix (different step sizes Q) = Quantised DCT Coefficients
Zig-zag Scanning for Separating Redundancy and Entropy
Quantisation
1 1 1 2 2 2 4 4
1 1 2 2 2 4 4 4
1 2 2 2 4 4 4 8
2 2 2 4 4 4 8 8
2 2 4 4 4 8 8 8
2 4 4 4 8 8 8 16
4 4 4 8 8 8 16 16
4 4 8 8 8 16 16 16
238 -43 -12 -7 -3 0 -1 -2
39 12 -4 6 2 0 0 -1
-16 6 5 4 0 1 1 0
-1 -3 0 0 1 0 0 0
-3 -6 2 -2 0 0 0 0
2 1 -1 0 0 0 0 0
-1 -1 0 0 0 0 0 0
0 1 0 0 0 0 0 0
238 -43 -12 -14 -6 0 -4 -8
39 12 -9 13 4 -2 -3 -4
-16 12 10 8 -3 7 5 0
-3 -7 1 -3 5 1 -1 0
-7 -12 8 -8 -1 -3 0 2
4 5 -7 1 5 -4 -1 0
-5 -4 2 -3 2 0 1 0
-1 7 -3 -2 1 0 0 0
173
DCT Coefficients ÷ Quantisation Matrix (different step sizes Q) = Quantised DCT Coefficients
Zig-zag Scanning for Separating Redundancy and Entropy
Quantisation
1 2 2 4 4 4 8 8
2 2 4 4 4 8 8 8
2 4 4 4 8 8 8 16
4 4 4 8 8 8 16 16
4 4 8 8 8 16 16 16
4 8 8 8 16 16 16 32
8 8 4 16 16 16 32 32
8 8 16 16 16 32 32 32
238 -21 -6 -3 -1 0 0 -1
19 6 -2 3 1 0 0 0
-8 3 2 2 0 0 0 0
0 -1 0 0 0 0 0 0
-1 -3 1 -1 0 0 0 0
1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
238 -43 -12 -14 -6 0 -4 -8
39 12 -9 13 4 -2 -3 -4
-16 12 10 8 -3 7 5 0
-3 -7 1 -3 5 1 -1 0
-7 -12 8 -8 -1 -3 0 2
4 5 -7 1 5 -4 -1 0
-5 -4 2 -3 2 0 1 0
-1 7 -3 -2 1 0 0 0
174
Quantisation Matrix Quantised DCT CoefficientsDCT Coefficients
÷ =
DifferentStep-sizes(Q)
Zig-zag Scanning for Separating Redundancy and Entropy
Zig-zag Scanning
175
DC and low-frequency coefficients come first, and the high-frequency coefficients come last.
Zig-zag Scanning for Separating Redundancy and Entropy
176
[Figure: the scan splits the block into an Entropy region (the significant, low-frequency coefficients at the start) and a Redundancy region (the long run of zero, high-frequency coefficients at the end)]
DC and low-frequency coefficients come first, and the high-frequency coefficients come last.
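To make the scan order concrete, here is a small Python sketch (our own illustration; the function name zigzag_order is not from any standard) that generates the classic 8×8 zig-zag order and applies it to a block:

def zigzag_order(n=8):
    # Visit the block along its anti-diagonals, alternating direction,
    # so DC and low-frequency positions come first.
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        order.extend(diag if s % 2 else diag[::-1])
    return order

block = [[0] * 8 for _ in range(8)]
block[0][0] = 238          # e.g. only the DC coefficient is non-zero
scanned = [block[i][j] for i, j in zigzag_order()]

After the scan, all the significant coefficients cluster at the start of the 1-D array and the zeros form one long run at the end, which is exactly what the run-length coder needs.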
− A uniform quantiser is used for each coefficient.
− Different coefficients are quantised with different step-sizes (Q):
− The human eye is more sensitive to low-frequency components, so
• low-frequency coefficients get a smaller Q
• high-frequency coefficients get a larger Q
− The step-sizes are specified in a normalization matrix (standard quantization matrix)
− The normalization matrix can then be scaled by a scale factor
177
Different Step-sizes (Q)
(JPEG Standard Quantization Matrix)
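As a minimal sketch of the idea (the helper names quantise/dequantise are our own), forward quantisation divides each coefficient by its step size and truncates toward zero, and inverse quantisation multiplies back; this reproduces the worked matrices shown earlier:

def quantise(coeffs, qmatrix, scale=1):
    # Each coefficient is divided by its own step size; int() truncates toward zero.
    return [[int(c / (q * scale)) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, qmatrix)]

def dequantise(indices, qmatrix, scale=1):
    # Multiplying each index back by its step size gives the reconstructed coefficients.
    return [[i * q * scale for i, q in zip(irow, qrow)]
            for irow, qrow in zip(indices, qmatrix)]

With the all-ones matrix of slide 169 the coefficients pass through unchanged; with the coarser matrices of slides 170–174, more and more high-frequency coefficients collapse to zero.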
In JPEG we have quality levels from 1 to 100.
− With quality level 50 we get high compression and excellent decompressed image quality (the standard quantization matrix is used as-is).
− For a quality level greater than 50 (less compression, higher image quality), the standard quantization matrix is multiplied by (100 − Quality Level) / 50.
− For a quality level less than 50 (more compression, lower image quality), the standard quantization matrix is multiplied by 50 / Quality Level.
− The scaled quantization matrix is then rounded and clipped to positive integer values ranging from 1 to 255.
178
Ex: Different Quality Level in JPEG by Quantization Matrix
(JPEG Standard Quantization Matrix)
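A short sketch of this scaling rule (the same formulation used by common JPEG implementations; quality is assumed to be in 1–100 and q50 is the standard quality-50 matrix):

def scaled_quant_matrix(q50, quality):
    # quality < 50: multiply by 50/quality (coarser steps, more compression);
    # quality > 50: multiply by (100 - quality)/50 (finer steps, higher quality).
    s = 50.0 / quality if quality < 50 else (100.0 - quality) / 50.0
    # Round and clip each entry to a positive integer in 1..255.
    return [[min(255, max(1, round(q * s))) for q in row] for row in q50]

At quality 100 the scale factor becomes 0, so clipping forces every step size to 1, i.e. almost lossless quantisation.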
The block is first level-shifted by subtracting 128 from each pixel value, because the DCT is designed to work on pixel values ranging from −128 to 127.
The 2-D DCT is then computed in matrix form as D = T M Tᵀ, where M is the level-shifted block and T is the DCT transform matrix.
179
Ex: Different Quality Level in JPEG by Quantization Matrix
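For illustration, a short numpy sketch of the level shift and the matrix form of the 2-D DCT (the names T, M, D follow the slide; the sample block is hypothetical):

import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II transform matrix T, so that D = T M T^T.
    T = np.full((n, n), np.sqrt(1.0 / n))
    for i in range(1, n):
        for j in range(n):
            T[i, j] = np.sqrt(2.0 / n) * np.cos((2 * j + 1) * i * np.pi / (2 * n))
    return T

T = dct_matrix()
block = np.full((8, 8), 156.0)   # hypothetical flat 8x8 block of pixel value 156
M = block - 128.0                # level shift into the range -128..127
D = T @ M @ T.T                  # for a flat block only D[0, 0] is non-zero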
180
Ex: Different Quality Level in JPEG by Quantization Matrix
(Standard Quantization Matrix)
181
(Standard Quantization Matrix)
Ex: Quantization with Matrix Q50 in JPEG
182
Ex: Inverse Quantization in JPEG
(Standard Quantization Matrix)
183
Ex: Inverse DCT and adding 128 in JPEG
184
Ex: Comparison between Original and Decompressed Block
185
Ex: JPEG
DCT
186
Ex: JPEG
Quantized
187
Ex: JPEG
Original Quality 50, 84% Zeros
Example: Quantized Indices
Default Normalization
Matrix in JPEG
188
(Standard Quantization Matrix)
− The ratios of the coefficients to their quantizer step sizes QM(i,j) give the quantized indices.
− QM(i,j) are the elements of the normalization matrix shown previously.
Example: Quantized Indices
189
Multiplying the indices by their step sizes gives the quantized coefficient values to be used for the inverse transform.
Example: Quantized Coefficients
190
Original vs. Compressed
Compare pixel-wise
Example: Reconstructed Image
191
192
Quantization Noise and Bit Resolution
193
Quantization Noise
Zoom-in of the Staircase (the ideal transfer line has slope = 1)
− Pink dots show the analog range that maps to a single ADC value.
− Black arrows show the quantization error for two points.
PDF of the Quantization Error
− The quantization error is uniformly distributed.
− The PDF integrates to 1.
194
Quantization Noise
− For a quantiser step size Δ, the RMS value of the quantization noise is Δ/√12 (from the uniform error PDF).
− The RMS value for a full-scale sinusoidal input (peak-to-peak swing 2^B · Δ for B bits) is (2^B · Δ)/(2√2).
− Then:
195
Quantization Noise and SQNR

SQNR (dB) = 20 log [ ((2^B · Δ)/(2√2)) / (Δ/√12) ] ≈ 6B + 1.78
196
SQNR = 10 log ( RMS Signal Power / RMS Quantization Noise Power ) = 6B + 1.78

PSNR = 10 log ( Peak Signal Power / RMS Quantization Noise Power ) = ?

Peak Signal Power / RMS Quantization Noise Power
= ( Peak Signal Power / RMS Signal Power ) × ( RMS Signal Power / RMS Quantization Noise Power )

For a sine waveform with peak-to-peak swing 2A, the RMS amplitude is A/√2, so:

Peak Signal Power / RMS Signal Power = (2A)² / (A/√2)² = 8

PSNR = 10 log [ 8 × ( RMS Signal Power / RMS Quantization Noise Power ) ]
= 10 log 8 + (6B + 1.78) ≈ 6B + 11 (dB)
PSNR for a Sine Waveform (peak-to-peak signal swing 2A)
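− For example, with B = 10 bits: SQNR ≈ 6 × 10 + 1.78 ≈ 61.8 dB, and for a full-swing sine wave PSNR ≈ 6 × 10 + 11 ≈ 71 dB.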
197
Video Source
Decompress
(Decode)
Compress
(Encode)
Video Display
Coded
video
ENCODER + DECODER = CODEC
198
Elementary Information Theory
− How much information does a symbol convey?
− Intuitively, the more unpredictable or surprising it is, the more information is conveyed.
− Conversely, if we strongly expected something and it occurs, we have not learnt very much.
199
Elementary Information Theory
− If p is the probability that a symbol will occur
− The amount of information, I, conveyed is:
− The information, I, is measured in bits
− It is the optimum code length for the symbol
− The entropy, H, is the average information per symbol
− Provides a lower bound on the compression that can be achieved
I = log₂(1/p)

H = Σₛ p(s) · log₂(1/p(s))
200
Elementary Information Theory
A simple example
− Suppose we need to transmit four possible weather conditions:
1. Sunny
2. Cloudy
3. Rainy
4. Snowy
− If all conditions are equally likely, p(s)=0.25→H=2
– i.e. we need a minimum of 2 bits per symbol
201
Elementary Information Theory
A simple example
− Suppose we need to transmit four possible weather conditions:
1. Sunny 0.5 of the time
2. Cloudy 0.25 of the time
3. Rainy 0.125 of the time
4. Snowy 0.125 of the time
− Then the entropy is
H = 0.5 · log₂(1/0.5) + 0.25 · log₂(1/0.25) + 2 × 0.125 · log₂(1/0.125)
H = 0.5 × 1 + 0.25 × 2 + 2 × 0.125 × 3
H = 0.5 + 0.5 + 0.75 = 1.75
– i.e. we need a minimum of 1.75 bits per symbol
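Both results can be checked with a two-line Python sketch (entropy is our own helper):

from math import log2

def entropy(probs):
    # H = sum of p * log2(1/p) over all symbols with non-zero probability.
    return sum(p * log2(1 / p) for p in probs if p > 0)

print(entropy([0.25, 0.25, 0.25, 0.25]))   # 2.0 bits/symbol
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits/symbol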
– It reduces the amount of data or bit rate.
– Truly lossless.
– Different types:
• Fractal Coding
• Run Length Coding (RLC) or Run-Level Encoding
• Variable Length Coding (VLC) – i.e. Huffman/Arithmetic
• Wavelet Coding
– Compression systems often do not use all of them together.
– Some systems combine different types.
Entropy Coding
202
− Resulting from studies by Benoit Mandelbrot.
− Images are self-similar.
− Self-similar shapes are called fractals.
− Scale, stretch, rotate, mirror and skew actions are possible.
− Computationally intensive.
− Requires multiple sweeps.
− Difficult to do on video in real time.
Fractal Coding
203
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 → 19[2]
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 → 24[0]
2 3 4 5 6 7 8 9 10 11 12 13 14 15 → 2:15
8 7 4 5 6 8 6 8 0 2 3 8 9 3 0 2 3 5 4 7 2 1 5 2 → 5326214
(string dictionary: 5472 = 1, 023 = 2, 6868 = 3, 152 = 4, 8745 = 5, 893 = 6)
• Replaces runs of the same number with a code …
• … or particular strings of numbers with a code.
204
Run Length Coding
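A small sketch of run-of-the-same-number coding (the rle helper is our own), matching the 19[2] and 24[0] notation above:

def rle(seq):
    # Collapse each run of identical values into a (count, value) pair.
    runs = []
    for v in seq:
        if runs and runs[-1][1] == v:
            runs[-1][0] += 1
        else:
            runs.append([1, v])
    return [(count, value) for count, value in runs]

print(rle([2] * 19))   # [(19, 2)]  i.e. 19[2]
print(rle([0] * 24))   # [(24, 0)]  i.e. 24[0]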
205
Run Length Coding
Sample Block Zigzag Scanning (MPEG-2) for doing RLC
206
Run Length Coding
Sample Block Run-length Encoding (MPEG-2)
The output of the re-ordering process of the transform coefficients is an array that typically contains
one or more clusters of non-zero coefficients near the start, followed by strings of zero coefficients.
− The large number of zero values may be encoded to represent them more compactly.
− The array of re-ordered coefficients is represented as (run, level) pairs, where
run: indicates the number of zeros preceding a non-zero coefficient.
level: indicates the magnitude of the non-zero coefficient.
207
Run-Level Encoding
Example
1. Input array: 16,0,0,−3,5,6,0,0,0,0,−7
2. Output values: (0,16),(2,−3),(0,5),(0,6),(4,−7)
3. Each of these output values (run , level) is encoded as a separate symbol by the entropy encoder.
‘Three-dimensional’ Run-level Encoding
If ‘three-dimensional’ run-level encoding is used, each symbol encodes three quantities, run, level and
last.
In the example above, if –7 is the final non-zero coefficient, the 3-D values are:
(0, 16, 0), (2, −3, 0), (0, 5, 0), (0, 6, 0), (4, −7, 1)
The 1 in the final code indicates that this is the last non-zero coefficient in the block.
208
Run-Level Encoding
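Both variants fit in a few lines of Python (the helper name run_level is ours); this sketch reproduces the example above:

def run_level(coeffs, three_d=False):
    # Encode a zig-zag-scanned array as (run, level) pairs; with three_d=True
    # a 'last' flag marks the final non-zero coefficient.
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    if three_d and pairs:
        pairs = [(r, l, 0) for r, l in pairs]
        r, l, _ = pairs[-1]
        pairs[-1] = (r, l, 1)    # 1 marks the last non-zero coefficient
    return pairs

print(run_level([16, 0, 0, -3, 5, 6, 0, 0, 0, 0, -7]))
# [(0, 16), (2, -3), (0, 5), (0, 6), (4, -7)]
print(run_level([16, 0, 0, -3, 5, 6, 0, 0, 0, 0, -7], three_d=True))
# [(0, 16, 0), (2, -3, 0), (0, 5, 0), (0, 6, 0), (4, -7, 1)]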
No. Code
0 = 0
+1 = 101
-1 = 100
+2 = 1101
-2 = 1100
+3 = 11101
-3 = 11100
+4 = 111101
-4 = 111100
+5 = 1111101
-5 = 1111100
. .
. .
Code table
Original numbers: +1 -3 0 0 +4 -5 +2 -1 0 +1 +3
Codes: 101111000011110111111001101100010111101
11 × 8 bits = 88 bits → 39 bits
Variable Length Coding
209
Commonly
occurring numbers
Rare occurring
numbers
Code table
Codes: 101111000011110111111001101100010111101
Regenerated numbers: +1 -3 0 0 +4 -5 +2 -1 0 +1 +3
No. Code
0 = 0
+1 = 101
-1 = 100
+2 = 1101
-2 = 1100
+3 = 11101
-3 = 11100
+4 = 111101
-4 = 111100
+5 = 1111101
-5 = 1111100
. .
. .
Variable Length Coding
210
Commonly
occurring numbers
Rare occurring
numbers
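Because no codeword is a prefix of another, decoding can proceed greedily bit by bit; a sketch using the table above (the helper names are our own):

CODE_TABLE = {"0": 0, "101": 1, "100": -1, "1101": 2, "1100": -2,
              "11101": 3, "11100": -3, "111101": 4, "111100": -4,
              "1111101": 5, "1111100": -5}

def vlc_decode(bits):
    # Accumulate bits until they match a codeword, emit it, and start over.
    out, current = [], ""
    for b in bits:
        current += b
        if current in CODE_TABLE:
            out.append(CODE_TABLE[current])
            current = ""
    return out

print(vlc_decode("101111000011110111111001101100010111101"))
# [1, -3, 0, 0, 4, -5, 2, -1, 0, 1, 3]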
211
Variable Length Coding
Variable-length Encoding of Sample Block Coefficients (MPEG-2)
– True data reduction.
– Totally lossless.
– Replaces numbers with codes.
• Run length coding can also be called entropy coding.
– Commonly occurring numbers have a small code & rare numbers have a bigger code.
– Relies on common numbers occurring a lot.
Variable Length Coding
212
− The lengths of the codes should vary inversely with the probabilities of occurrence of the various symbols in VLC.
− The bit rate required to code a symbol is the base-2 logarithm of the inverse of its probability p, that is, log₂(1/p) bits.
− Hence, the entropy of the symbols, which is the minimum average bits required to code the symbols, can be calculated as shown below.
There are two types of VLC, Huffman and Arithmetic coding.
− It is noted that Huffman coding is a simple VLC, but its compression can never reach as low as the entropy, due to the constraint that the assigned codewords must have an integral number of bits.
− However, arithmetic coding can approach the entropy, since the symbols are not coded individually.
213
Variable Length Coding
H(x) = Σₛ p(s) · log₂(1/p(s)) = − Σᵢ Pᵢ · log₂ Pᵢ
Huffman Coding
− Huffman codes can be used to compress information
− Like WinZip – although WinZip doesn’t use the Huffman algorithm
− JPEGs do use Huffman as part of their compression process
− The basic idea is that instead of storing each character in a file as an 8-bit ASCII value, we will store the more frequently occurring characters using fewer bits and the less frequently occurring characters using more bits
− On average this should decrease the file size (often to roughly half)
214
Huffman Coding
− As an example, lets take the string:
“duke blue devils”
− First, a frequency count of the characters:
e:3, d:2, u:2, l:2, space:2, k:1, b:1, v:1, i:1, s:1
− Next, use a Greedy algorithm to build up a Huffman Tree
− We start with nodes for each character
e,3 d,2 u,2 l,2 sp,2 k,1 b,1 v,1 i,1 s,1
215
Huffman Coding
Pick the two nodes with the smallest frequencies and combine them to form a new node
– The selection of these nodes is the Greedy part
• The two selected nodes are removed from the set, but replaced by the combined node
• This continues until we have only 1 node left in the set
216
Starting nodes: e,3 d,2 u,2 l,2 sp,2 k,1 b,1 v,1 i,1 s,1
[Slides 216–226: the tree grows frame by frame; at each step the two lowest-frequency nodes are merged:]
1. i,1 + s,1 → node (2)
2. b,1 + v,1 → node (2)
3. k,1 + node(b,v),2 → node (3)
4. l,2 + sp,2 → node (4)
5. d,2 + u,2 → node (4)
6. node(i,s),2 + node(k,b,v),3 → node (5)
7. e,3 + node(d,u),4 → node (7)
8. node(l,sp),4 + node(5) → node (9)
9. node(7) + node(9) → root (16)
226
Huffman Coding
e 00
d 010
u 011
l 100
sp 101
i 1100
s 1101
k 1110
b 11110
v 11111
− Now we assign codes to the tree by placing
– 0 on every left branch
– 1 on every right branch
− A traversal of the tree from root to leaf gives the Huffman code for that particular leaf character
− Note that no code is the prefix of another code
227
e:3, d:2, u:2, l:2, space:2, k:1, b:1, v:1, i:1, s:1
[Figure: the Huffman tree with 0 on every left branch and 1 on every right branch; tracing from the Root to a leaf gives its code, e.g. l → 100]
Huffman Coding
− These codes are then used to encode the string
− Thus, “duke blue devils” turns into:
010 011 1110 00 101 11110 100 011 00 101 010 00 11111 1100 100 1101
− When grouped into 8-bit bytes:
01001111 10001011 11101000 11001010 10001111 11100100 1101xxxx
− Thus it takes 7 bytes of space (as compressed)
− Compare it to 16 characters with 1 byte/char → 16 bytes uncompressed
228
(Code table and Huffman tree as on the previous slide.)
Huffman Coding
− Uncompressing works by reading in the file bit by bit
• After getting the first bit, start from the root of the tree
• If a 0 is read, head left
• If a 1 is read, head right
• When a leaf is reached decode that character and
start over again at the root of the tree
− Thus, we need to save Huffman table information as a
header in the compressed file
• Doesn’t add a significant amount of size to the file for
large files (which are the ones you want to compress
anyway)
• Or we could use a fixed universal set of codes /
frequencies.
229
(Code table and Huffman tree as on slide 227.)
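The whole build/encode procedure is a compact sketch in Python using heapq (our own code; tie-breaking may produce different codewords than the slides, but any Huffman tree for these frequencies yields the same 52-bit total):

import heapq
from collections import Counter

def huffman_codes(text):
    # Each heap entry is (frequency, tiebreak_id, tree); a tree is either
    # a character or a (left, right) pair. Greedily merge the two smallest.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next_id, (left, right)))
        next_id += 1
    codes = {}
    def walk(tree, prefix=""):
        if isinstance(tree, tuple):        # internal node
            walk(tree[0], prefix + "0")    # 0 on the left branch
            walk(tree[1], prefix + "1")    # 1 on the right branch
        else:                              # leaf character
            codes[tree] = prefix or "0"
    walk(heap[0][2])
    return codes

codes = huffman_codes("duke blue devils")
bits = "".join(codes[ch] for ch in "duke blue devils")
print(len(bits))   # 52 bits -> 7 bytes, versus 16 bytes uncompressed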
− The table lists the probabilities of the most commonly-occurring motion vectors in the encoded sequence and their information content, log₂(1/p).
− To achieve optimum compression, each value should be represented with exactly 𝐥𝐨𝐠 𝟐 𝟏/𝐩
bits.
− ‘0’ is the most common value and the probability drops for larger motion vectors.
230
Example : Huffman Coding, Sequence of Motion Vectors
1. Generating the Huffman code tree
− To generate a Huffman code table for this set of
data, the following iterative procedure is carried out.
The procedure is repeated until there is a single ‘root’
node that contains all other nodes and data items
listed ‘beneath’ it.
1. Order the list of data in increasing order of probability.
2. Combine the two lowest-probability data items into a
‘node’ and assign the joint probability of the data items
to this node.
3. Re-order the remaining data items and node(s) in
increasing order of probability and repeat step 2.
231
Example : Huffman Coding, Sequence of Motion Vectors
1. Generating the Huffman code tree (Cont.)
− Original list:
The data items are shown as square boxes. Vectors (−2) and (+2) have the lowest
probability and these are the first candidates for merging to form node ‘A’.
− Stage 1:
The newly-created node ‘A’, shown as a circle, has a probability of 0.2, from the
combined probabilities of (−2) and (2).
There are now three items with probability 0.2.
Choose vectors (−1) and (1) and merge to form node ‘B’.
− Stage 2:
A now has the lowest probability (0.2) followed by B and the vector 0; choose A and B
as the next candidates for merging to form ‘C’.
− Stage 3:
Node C and vector (0) are merged to form ‘D’. Final tree: The data items have all
been incorporated into a binary ‘tree’ containing five data values and four nodes.
Each data item is a ‘leaf’ of the tree.
232
Example : Huffman Coding, Sequence of Motion Vectors
2. Encoding
− Each ‘leaf’ of the binary tree is mapped to a variable-length code. To find this code, the tree is traversed
from the root node, D in this case, to the leaf or data item.
− For every branch, a 0 or 1 is appended to the code, 0 for an upper branch, 1 for a lower branch.
− The lengths of the Huffman codes, each an integral number of bits, do not match the ideal lengths given
by log2 1/p.
− For example, the series of vectors (1, 0, −2) would be transmitted as the binary sequence 0111000.
233
Example : Huffman Coding, Sequence of Motion Vectors
3. Decoding
− The decoder must have a local copy of the Huffman code tree or look-up table (Note that once the tree
has been generated in Encoding, the codes may be stored in a look-up table).
− This may be achieved by transmitting the look-up table itself or by sending the list of data and probabilities
prior to sending the coded data.
− Each uniquely-decodeable code is converted back to the original data.
234
Example : Huffman Coding, Sequence of Motion Vectors
235
Example 2
− The Huffman coding process has two disadvantages for a practical video CODEC.
I. The encoder needs to transmit the information contained in the probability table before the decoder can decode the bit stream, and this extra overhead reduces compression efficiency, particularly for shorter video sequences.
II. The probability table for a large video sequence (needed to generate the Huffman tree) cannot be calculated until after the video data is encoded, which may introduce an unacceptable delay into the encoding process.
− For these reasons, image and video coding standards define sets of codewords based on the probability distributions of ‘generic’ video material.
− The main differences from ‘true’ Huffman coding are
I. The codewords are pre-calculated based on ‘generic’ probability distributions
II. In the case of TCOEF (Transform coefficient), only 102 commonly-occurring symbols have defined codewords and any other symbol is encoded using a fixed-length code.
236
Pre-calculated Huffman-based Coding
237
Pre-calculated Huffman-based Coding
MPEG4 TCOEF VLCs (partial) (some of the codes shown in the left table are represented in ‘tree’ form in this figure)
MPEG-4 Visual Transform Coefficient
(TCOEF) VLCs : partial, all codes < 9 bits
MPEG4 Motion Vector Difference (MVD) VLCs
− The following two examples of pre-calculated VLC tables are taken from MPEG-4 Visual (Simple Profile).
− The minimum codeword length is 1 bit, but for a highly probable symbol the ideal length can be much less, e.g.
− −log₂ 0.95 ≈ 0.07 bits
− A scheme using an integral number of bits for each data symbol, such as Huffman coding, is unlikely to come so close to the optimum number of bits
− Fractional bits can only be assigned if symbols are coded together:
− Some with more bits and some with (near-)ZERO bits
− This is possible if ZERO bits are assigned to highly probable symbols
− Arithmetic coding does this!
238
Problems with Huffman
– A form of variable length coding.
– Better than Huffman coding.
– Takes longer to do than Huffman coding.
– More delicate than Huffman coding.
– More limiting than Huffman coding.
– Subject to patents and royalty payments.
– IBM, AT&T, Mitsubishi.
Arithmetic Coding
239
The fundamental idea is to use a scale in which the coding intervals of real numbers between 0
and 1 are represented.
– This is in fact the cumulative probability density function of all the symbols which add up to 1.
– The interval needed to represent the message becomes smaller as the message becomes
longer, and the number of bits needed to specify that interval is increased.
– According to the symbol probabilities generated by the model, the size of the interval is
reduced by successive symbols of the message.
– The more likely symbols reduce the range less than the less likely ones and hence they
contribute fewer bits to the message.
Arithmetic Coding
240
– Once the symbol probability is known, each individual symbol needs to be assigned a portion of the [0, 1)
range that corresponds to its probability of appearance in the cumulative density function.
– The character range is [lower, upper).
– The most significant portion of an arithmetic coded message is the first symbol to be encoded.
– Ex: Message eaii! →
• The first symbol to be coded is e
• The symbol !, known by both decoder and encoder, is used as the end-of-message symbol; when it is decoded, the decoding process is terminated.
– After the first character is encoded, we know that the lower number and the upper number now bind our
range for the output.
– Each new symbol to be encoded will further restrict the possible range of the output number during the
rest of the encoding process.
Arithmetic Coding
241
Symbol Probability Range
a 0.2 [0.0, 0.2)
e 0.3 [0.2, 0.5)
i 0.1 [0.5, 0.6)
o 0.2 [0.6, 0.8)
u 0.1 [0.8, 0.9)
! 0.1 [0.9, 1.0)
New character Range
Initially: [0, 1)
After seeing a symbol: e [0.2, 0.5)
a [0.2, 0.26)
i [0.23, 0.236)
i [0.233, 0.2336)
! [0.23354, 0.2336)
Arithmetic Coding
242
Example 1: To code the set of symbols eaii!
– To explain how arithmetic coding works, a fixed-model arithmetic code is used in the example for easy
illustration.
– Suppose the alphabet is {a, e, i, o, u, !}, and the fixed model is used with the probabilities shown in Table.
– Ex: The final coded message has to be a number greater than or equal to 0.2 and less than 0.5 for e.
The final range, [0.23354, 0.2336), represents the message eaii!. This means that if we transmit any
number in the range of 0.23354 ≤ x < 0.2336, that number represents the whole message of eaii!.
[Figure: successive subdivision of the [0, 1) range as e, a, i, i, ! are coded]
Nothing coded: [0.0, 1.0)
After e: [0.0 + 0.2×1.0, 0.0 + 0.5×1.0) = [0.2, 0.5)
After a: [0.2 + 0.0×0.3, 0.2 + 0.2×0.3) = [0.2, 0.26)
After i: [0.2 + 0.5×0.06, 0.2 + 0.6×0.06) = [0.23, 0.236)
After i: [0.23 + 0.5×0.006, 0.23 + 0.6×0.006) = [0.233, 0.2336)
After !: [0.233 + 0.9×0.0006, 0.233 + 1.0×0.0006) = [0.23354, 0.2336)
Transmitted number: any value in [0.23354, 0.2336), e.g. 0.23355.
243
Symbol Probability Range
a 0.2 [0.0, 0.2)
e 0.3 [0.2, 0.5)
i 0.1 [0.5, 0.6)
o 0.2 [0.6, 0.8)
u 0.1 [0.8, 0.9)
! 0.1 [0.9, 1.0)
Ex 1: To code the set of symbols eaii!
[Figure: the sub-range ladder for each successive symbol; the interval width shrinks from 1.0 to 0.3, 0.06, 0.006, 0.0006 and finally 0.00006 as e, a, i, i, ! are coded]
The final range, [0.23354, 0.2336), represents the message eaii!. This means that if we transmit any
number in the range of 0.23354 ≤ x < 0.2336, that number represents the whole message of eaii!.
[Figure: successive subdivision of the [0, 1) range as a, i, i, ! are coded]
Nothing coded: [0.0, 1.0)
After a: [0.0 + 0.0×1.0, 0.0 + 0.2×1.0) = [0.0, 0.2)
After i: [0.0 + 0.5×0.2, 0.0 + 0.6×0.2) = [0.1, 0.12)
After i: [0.1 + 0.5×0.02, 0.1 + 0.6×0.02) = [0.11, 0.112)
After !: [0.11 + 0.9×0.002, 0.11 + 1.0×0.002) = [0.1118, 0.112)
Transmitted number: any value in [0.1118, 0.112), e.g. 0.1119.
244
Ex2: To code a set of symbols aii!
Symbol Probability Range
a 0.2 [0.0, 0.2)
e 0.3 [0.2, 0.5)
i 0.1 [0.5, 0.6)
o 0.2 [0.6, 0.8)
u 0.1 [0.8, 0.9)
! 0.1 [0.9, 1.0)
The final range, [0.1118, 0.112), represents the message aii!. This means that if we transmit any
number in the range of 0.1118 ≤ x < 0.112, that number represents the whole message of aii!.
Arithmetic Coding
Decoding for Ex. 1
− In general, the decoding process can be formulated as:

R(n+1) = (R(n) − L(n)) / (U(n) − L(n))

• where R(n) is a code within the range of lower value L(n) and upper value U(n) of the nth symbol.
• R(n+1) is the code for the next symbol.
Arithmetic Coding
245

Received code → Corresponding range [L(n), U(n)) → Output symbol
0.23355 → [0.2, 0.5) → e
(0.23355 − 0.2) / (0.5 − 0.2) ≈ 0.11183 → [0.0, 0.2) → a
(0.11183 − 0.0) / (0.2 − 0.0) ≈ 0.55917 → [0.5, 0.6) → i
(0.55917 − 0.5) / (0.6 − 0.5) ≈ 0.59167 → [0.5, 0.6) → i
(0.59167 − 0.5) / (0.6 − 0.5) ≈ 0.91667 → [0.9, 1.0) → !
Symbol Probability Range
a 0.2 [0.0, 0.2)
e 0.3 [0.2, 0.5)
i 0.1 [0.5, 0.6)
o 0.2 [0.6, 0.8)
u 0.1 [0.8, 0.9)
! 0.1 [0.9, 1.0)
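The whole encode/decode loop fits in a few lines of Python; this floating-point version (our own sketch; real codecs use integer arithmetic with renormalisation) reproduces the eaii! example:

def arith_encode(message, ranges):
    # Narrow [low, high) by each symbol's sub-range of the current interval.
    low, high = 0.0, 1.0
    for ch in message:
        lo, hi = ranges[ch]
        low, high = low + (high - low) * lo, low + (high - low) * hi
    return (low + high) / 2          # any number in [low, high) works

def arith_decode(code, ranges, end="!"):
    out = ""
    while not out.endswith(end):
        for ch, (lo, hi) in ranges.items():
            if lo <= code < hi:      # find the sub-range containing the code
                out += ch
                code = (code - lo) / (hi - lo)
                break
    return out

ranges = {"a": (0.0, 0.2), "e": (0.2, 0.5), "i": (0.5, 0.6),
          "o": (0.6, 0.8), "u": (0.8, 0.9), "!": (0.9, 1.0)}
x = arith_encode("eaii!", ranges)    # ~0.23357, inside [0.23354, 0.2336)
print(arith_decode(x, ranges))       # eaii!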
246
Example 3
Motion vectors, sequence 1: probabilities and sub-ranges
Vector −2: p = 0.1, sub-range [0.0, 0.1)
Vector −1: p = 0.2, sub-range [0.1, 0.3)
Vector 0: p = 0.4, sub-range [0.3, 0.7)
Vector +1: p = 0.2, sub-range [0.7, 0.9)
Vector +2: p = 0.1, sub-range [0.9, 1.0)
247
Example 3
Encoding Procedure for Vector Sequence (0, −1, 0, 2)
Start: [0.0, 1.0)
After (0): [0.0 + 0.3×1, 0.0 + 0.7×1) = [0.3, 0.7)
After (−1): [0.3 + 0.1×0.4, 0.3 + 0.3×0.4) = [0.34, 0.42)
After (0): [0.34 + 0.3×0.08, 0.34 + 0.7×0.08) = [0.364, 0.396)
After (+2): [0.364 + 0.9×0.032, 0.364 + 1.0×0.032) = [0.3928, 0.396)
Any number in [0.3928, 0.396), e.g. 0.394, encodes the sequence.
248
Example 3
Decoding Procedure

R(n+1) = (R(n) − L(n)) / (U(n) − L(n))

Received code → Corresponding range [L(n), U(n)) → Output symbol
0.394 → [0.3, 0.7) → 0
(0.394 − 0.3) / (0.7 − 0.3) = 0.235 → [0.1, 0.3) → −1
(0.235 − 0.1) / (0.3 − 0.1) = 0.675 → [0.3, 0.7) → 0
(0.675 − 0.3) / (0.7 − 0.3) = 0.9375 → [0.9, 1.0) → +2
The principal advantage of arithmetic coding
− The transmitted number, 0.394 in this case, which may be
represented as a fixed-point number with sufficient accuracy
using 9 bits, is not constrained to an integral number of bits for
each transmitted data symbol.
− To achieve optimal compression, the sequence of data symbols should be represented with:
−(log₂ P₀ + log₂ P₋₁ + log₂ P₀ + log₂ P₂) = 8.28 bits
− In this example, arithmetic coding achieves 9 bits, which is close
to optimum.
249
Example 3 (Cont.)
(0, −1, 0, 2)
Questions??
Discussion!!
Suggestions!!
Criticism!!
250

More Related Content

What's hot

An Introduction to Video Principles-Part 1
An Introduction to Video Principles-Part 1   An Introduction to Video Principles-Part 1
An Introduction to Video Principles-Part 1 Dr. Mohieddin Moradi
 
Video Compression, Part 2-Section 2, Video Coding Concepts
Video Compression, Part 2-Section 2, Video Coding Concepts Video Compression, Part 2-Section 2, Video Coding Concepts
Video Compression, Part 2-Section 2, Video Coding Concepts Dr. Mohieddin Moradi
 
An Introduction to HDTV Principles-Part 4
An Introduction to HDTV Principles-Part 4An Introduction to HDTV Principles-Part 4
An Introduction to HDTV Principles-Part 4Dr. Mohieddin Moradi
 
Latest Technologies in Production & Broadcasting
Latest  Technologies in Production & BroadcastingLatest  Technologies in Production & Broadcasting
Latest Technologies in Production & BroadcastingDr. Mohieddin Moradi
 
VIDEO QUALITY ENHANCEMENT IN BROADCAST CHAIN, OPPORTUNITIES & CHALLENGES
VIDEO QUALITY ENHANCEMENT IN BROADCAST CHAIN,   OPPORTUNITIES & CHALLENGESVIDEO QUALITY ENHANCEMENT IN BROADCAST CHAIN,   OPPORTUNITIES & CHALLENGES
VIDEO QUALITY ENHANCEMENT IN BROADCAST CHAIN, OPPORTUNITIES & CHALLENGESDr. Mohieddin Moradi
 
Serial Digital Interface (SDI), From SD-SDI to 24G-SDI, Part 2
Serial Digital Interface (SDI), From SD-SDI to 24G-SDI, Part 2Serial Digital Interface (SDI), From SD-SDI to 24G-SDI, Part 2
Serial Digital Interface (SDI), From SD-SDI to 24G-SDI, Part 2Dr. Mohieddin Moradi
 
An Introduction to Video Principles-Part 2
An Introduction to Video Principles-Part 2An Introduction to Video Principles-Part 2
An Introduction to Video Principles-Part 2Dr. Mohieddin Moradi
 
HDR and WCG Video Broadcasting Considerations
HDR and WCG Video Broadcasting ConsiderationsHDR and WCG Video Broadcasting Considerations
HDR and WCG Video Broadcasting ConsiderationsDr. Mohieddin Moradi
 
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video CodecsVideo Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video CodecsDr. Mohieddin Moradi
 
Serial Digital Interface (SDI), From SD-SDI to 24G-SDI, Part 1
Serial Digital Interface (SDI), From SD-SDI to 24G-SDI, Part 1Serial Digital Interface (SDI), From SD-SDI to 24G-SDI, Part 1
Serial Digital Interface (SDI), From SD-SDI to 24G-SDI, Part 1Dr. Mohieddin Moradi
 
Designing an 4K/UHD1 HDR OB Truck as 12G-SDI or IP-based
Designing an 4K/UHD1 HDR OB Truck as 12G-SDI or IP-basedDesigning an 4K/UHD1 HDR OB Truck as 12G-SDI or IP-based
Designing an 4K/UHD1 HDR OB Truck as 12G-SDI or IP-basedDr. Mohieddin Moradi
 
Video Compression Standards - History & Introduction
Video Compression Standards - History & IntroductionVideo Compression Standards - History & Introduction
Video Compression Standards - History & IntroductionChamp Yen
 
An Introduction to Eye Diagram, Phase Noise and Jitter
An Introduction to Eye Diagram, Phase Noise and JitterAn Introduction to Eye Diagram, Phase Noise and Jitter
An Introduction to Eye Diagram, Phase Noise and JitterDr. Mohieddin Moradi
 

What's hot (20)

An Introduction to Video Principles-Part 1
An Introduction to Video Principles-Part 1   An Introduction to Video Principles-Part 1
An Introduction to Video Principles-Part 1
 
Video Compression, Part 2-Section 2, Video Coding Concepts
Video Compression, Part 2-Section 2, Video Coding Concepts Video Compression, Part 2-Section 2, Video Coding Concepts
Video Compression, Part 2-Section 2, Video Coding Concepts
 
HDR and WCG Principles-Part 1
HDR and WCG Principles-Part 1HDR and WCG Principles-Part 1
HDR and WCG Principles-Part 1
 
Broadcast Lens Technology Part 2
Broadcast Lens Technology Part 2Broadcast Lens Technology Part 2
Broadcast Lens Technology Part 2
 
An Introduction to HDTV Principles-Part 4
An Introduction to HDTV Principles-Part 4An Introduction to HDTV Principles-Part 4
An Introduction to HDTV Principles-Part 4
 
Thinking about IP migration
Thinking about IP migration Thinking about IP migration
Thinking about IP migration
 
Latest Technologies in Production & Broadcasting
Latest  Technologies in Production & BroadcastingLatest  Technologies in Production & Broadcasting
Latest Technologies in Production & Broadcasting
 
HDR and WCG Principles-Part 3
HDR and WCG Principles-Part 3HDR and WCG Principles-Part 3
HDR and WCG Principles-Part 3
 
HDR and WCG Principles-Part 4
HDR and WCG Principles-Part 4HDR and WCG Principles-Part 4
HDR and WCG Principles-Part 4
 
VIDEO QUALITY ENHANCEMENT IN BROADCAST CHAIN, OPPORTUNITIES & CHALLENGES
VIDEO QUALITY ENHANCEMENT IN BROADCAST CHAIN,   OPPORTUNITIES & CHALLENGESVIDEO QUALITY ENHANCEMENT IN BROADCAST CHAIN,   OPPORTUNITIES & CHALLENGES
VIDEO QUALITY ENHANCEMENT IN BROADCAST CHAIN, OPPORTUNITIES & CHALLENGES
 
Serial Digital Interface (SDI), From SD-SDI to 24G-SDI, Part 2
Serial Digital Interface (SDI), From SD-SDI to 24G-SDI, Part 2Serial Digital Interface (SDI), From SD-SDI to 24G-SDI, Part 2
Serial Digital Interface (SDI), From SD-SDI to 24G-SDI, Part 2
 
HDR and WCG Principles-Part 6
HDR and WCG Principles-Part 6HDR and WCG Principles-Part 6
HDR and WCG Principles-Part 6
 
An Introduction to Video Principles-Part 2
An Introduction to Video Principles-Part 2An Introduction to Video Principles-Part 2
An Introduction to Video Principles-Part 2
 
HDR and WCG Video Broadcasting Considerations
HDR and WCG Video Broadcasting ConsiderationsHDR and WCG Video Broadcasting Considerations
HDR and WCG Video Broadcasting Considerations
 
Video Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video CodecsVideo Compression, Part 3-Section 1, Some Standard Video Codecs
Video Compression, Part 3-Section 1, Some Standard Video Codecs
 
Serial Digital Interface (SDI), From SD-SDI to 24G-SDI, Part 1
Serial Digital Interface (SDI), From SD-SDI to 24G-SDI, Part 1Serial Digital Interface (SDI), From SD-SDI to 24G-SDI, Part 1
Serial Digital Interface (SDI), From SD-SDI to 24G-SDI, Part 1
 
Designing an 4K/UHD1 HDR OB Truck as 12G-SDI or IP-based
Designing an 4K/UHD1 HDR OB Truck as 12G-SDI or IP-basedDesigning an 4K/UHD1 HDR OB Truck as 12G-SDI or IP-based
Designing an 4K/UHD1 HDR OB Truck as 12G-SDI or IP-based
 
Video Compression Standards - History & Introduction
Video Compression Standards - History & IntroductionVideo Compression Standards - History & Introduction
Video Compression Standards - History & Introduction
 
An Introduction to Eye Diagram, Phase Noise and Jitter
An Introduction to Eye Diagram, Phase Noise and JitterAn Introduction to Eye Diagram, Phase Noise and Jitter
An Introduction to Eye Diagram, Phase Noise and Jitter
 
Broadcast Lens Technology Part 1
Broadcast Lens Technology Part 1Broadcast Lens Technology Part 1
Broadcast Lens Technology Part 1
 

Similar to Video Compression, Part 2-Section 1, Video Coding Concepts

ITU-T Study Group 16 Meeting Achievements
ITU-T Study Group 16 Meeting AchievementsITU-T Study Group 16 Meeting Achievements
ITU-T Study Group 16 Meeting AchievementsITU
 
Standard standardization protocol
Standard standardization protocolStandard standardization protocol
Standard standardization protocolSutanu Kandar
 
Video Teleconferencing (VTC) Technology at the National ...
Video Teleconferencing (VTC) Technology at the National ...Video Teleconferencing (VTC) Technology at the National ...
Video Teleconferencing (VTC) Technology at the National ...Videoguy
 
QoS for Media Networks
QoS for Media NetworksQoS for Media Networks
QoS for Media NetworksAmine Choukir
 
Digital Industry Standards
Digital Industry StandardsDigital Industry Standards
Digital Industry StandardsChuck Gary
 
Standardisation In Media Formats
Standardisation In Media FormatsStandardisation In Media Formats
Standardisation In Media FormatsFITT
 
ITU-T Study Group 9 Introduction
ITU-T Study Group 9 IntroductionITU-T Study Group 9 Introduction
ITU-T Study Group 9 IntroductionITU
 
VVC tutorial at ICIP 2020 together with Benjamin Bross
VVC tutorial at ICIP 2020 together with Benjamin BrossVVC tutorial at ICIP 2020 together with Benjamin Bross
VVC tutorial at ICIP 2020 together with Benjamin BrossMathias Wien
 
FITT Toolbox: Standardisation in Media Formats
FITT Toolbox: Standardisation in Media FormatsFITT Toolbox: Standardisation in Media Formats
FITT Toolbox: Standardisation in Media FormatsFITT
 
VVC tutorial at VCIP 2020 together with Benjamin Bross
VVC tutorial at VCIP 2020 together with Benjamin BrossVVC tutorial at VCIP 2020 together with Benjamin Bross
VVC tutorial at VCIP 2020 together with Benjamin BrossMathias Wien
 
en_ETSI_302769v010101v
en_ETSI_302769v010101ven_ETSI_302769v010101v
en_ETSI_302769v010101vAniruddh Tyagi
 
en_ETSI_302769v010101v
en_ETSI_302769v010101ven_ETSI_302769v010101v
en_ETSI_302769v010101vaniruddh Tyagi
 

Similar to Video Compression, Part 2-Section 1, Video Coding Concepts (20)

ITU-T Study Group 16 Meeting Achievements
ITU-T Study Group 16 Meeting AchievementsITU-T Study Group 16 Meeting Achievements
ITU-T Study Group 16 Meeting Achievements
 
Standard standardization protocol
Standard standardization protocolStandard standardization protocol
Standard standardization protocol
 
Itu ngn-v2
Itu  ngn-v2Itu  ngn-v2
Itu ngn-v2
 
Video Teleconferencing (VTC) Technology at the National ...
Video Teleconferencing (VTC) Technology at the National ...Video Teleconferencing (VTC) Technology at the National ...
Video Teleconferencing (VTC) Technology at the National ...
 
QoS for Media Networks
QoS for Media NetworksQoS for Media Networks
QoS for Media Networks
 
PEMWN'21 - ANGELA
PEMWN'21 - ANGELAPEMWN'21 - ANGELA
PEMWN'21 - ANGELA
 
SDI to IP 2110 Transition Part 1
SDI to IP 2110 Transition Part 1SDI to IP 2110 Transition Part 1
SDI to IP 2110 Transition Part 1
 
Sohail's CV 090416
Sohail's CV 090416Sohail's CV 090416
Sohail's CV 090416
 
Digital Industry Standards
Digital Industry StandardsDigital Industry Standards
Digital Industry Standards
 
China OTT
China OTTChina OTT
China OTT
 
Standardisation In Media Formats
Standardisation In Media FormatsStandardisation In Media Formats
Standardisation In Media Formats
 
ITU-T Study Group 9 Introduction
ITU-T Study Group 9 IntroductionITU-T Study Group 9 Introduction
ITU-T Study Group 9 Introduction
 
VVC tutorial at ICIP 2020 together with Benjamin Bross
VVC tutorial at ICIP 2020 together with Benjamin BrossVVC tutorial at ICIP 2020 together with Benjamin Bross
VVC tutorial at ICIP 2020 together with Benjamin Bross
 
FITT Toolbox: Standardisation in Media Formats
FITT Toolbox: Standardisation in Media FormatsFITT Toolbox: Standardisation in Media Formats
FITT Toolbox: Standardisation in Media Formats
 
VVC tutorial at VCIP 2020 together with Benjamin Bross
VVC tutorial at VCIP 2020 together with Benjamin BrossVVC tutorial at VCIP 2020 together with Benjamin Bross
VVC tutorial at VCIP 2020 together with Benjamin Bross
 
en_302769v010101v
en_302769v010101ven_302769v010101v
en_302769v010101v
 
en_ETSI_302769v010101v
en_ETSI_302769v010101ven_ETSI_302769v010101v
en_ETSI_302769v010101v
 
en_302769v010101v
en_302769v010101ven_302769v010101v
en_302769v010101v
 
en_302769v010101v
en_302769v010101ven_302769v010101v
en_302769v010101v
 
en_ETSI_302769v010101v
en_ETSI_302769v010101ven_ETSI_302769v010101v
en_ETSI_302769v010101v
 

More from Dr. Mohieddin Moradi

An Introduction to HDTV Principles-Part 3
An Introduction to HDTV Principles-Part 3An Introduction to HDTV Principles-Part 3
An Introduction to HDTV Principles-Part 3Dr. Mohieddin Moradi
 
An Introduction to HDTV Principles-Part 2
An Introduction to HDTV Principles-Part 2An Introduction to HDTV Principles-Part 2
An Introduction to HDTV Principles-Part 2Dr. Mohieddin Moradi
 
Broadcast Camera Technology, Part 3
Broadcast Camera Technology, Part 3Broadcast Camera Technology, Part 3
Broadcast Camera Technology, Part 3Dr. Mohieddin Moradi
 
Broadcast Camera Technology, Part 1
Broadcast Camera Technology, Part 1Broadcast Camera Technology, Part 1
Broadcast Camera Technology, Part 1Dr. Mohieddin Moradi
 
An Introduction to Audio Principles
An Introduction to Audio Principles An Introduction to Audio Principles
An Introduction to Audio Principles Dr. Mohieddin Moradi
 
Video Compression, Part 4 Section 1, Video Quality Assessment
Video Compression, Part 4 Section 1,  Video Quality Assessment Video Compression, Part 4 Section 1,  Video Quality Assessment
Video Compression, Part 4 Section 1, Video Quality Assessment Dr. Mohieddin Moradi
 
Video Compression, Part 4 Section 2, Video Quality Assessment
Video Compression, Part 4 Section 2,  Video Quality Assessment Video Compression, Part 4 Section 2,  Video Quality Assessment
Video Compression, Part 4 Section 2, Video Quality Assessment Dr. Mohieddin Moradi
 

More from Dr. Mohieddin Moradi (9)

HDR and WCG Principles-Part 2
HDR and WCG Principles-Part 2HDR and WCG Principles-Part 2
HDR and WCG Principles-Part 2
 
SDI to IP 2110 Transition Part 2
SDI to IP 2110 Transition Part 2SDI to IP 2110 Transition Part 2
SDI to IP 2110 Transition Part 2
 
An Introduction to HDTV Principles-Part 3
An Introduction to HDTV Principles-Part 3An Introduction to HDTV Principles-Part 3
An Introduction to HDTV Principles-Part 3
 
An Introduction to HDTV Principles-Part 2
An Introduction to HDTV Principles-Part 2An Introduction to HDTV Principles-Part 2
An Introduction to HDTV Principles-Part 2
 
Broadcast Camera Technology, Part 3
Broadcast Camera Technology, Part 3Broadcast Camera Technology, Part 3
Broadcast Camera Technology, Part 3
 
Broadcast Camera Technology, Part 1
Broadcast Camera Technology, Part 1Broadcast Camera Technology, Part 1
Broadcast Camera Technology, Part 1
 
An Introduction to Audio Principles
An Introduction to Audio Principles An Introduction to Audio Principles
An Introduction to Audio Principles
 
Video Compression, Part 4 Section 1, Video Quality Assessment
Video Compression, Part 4 Section 1,  Video Quality Assessment Video Compression, Part 4 Section 1,  Video Quality Assessment
Video Compression, Part 4 Section 1, Video Quality Assessment
 
Video Compression, Part 4 Section 2, Video Quality Assessment
Video Compression, Part 4 Section 2,  Video Quality Assessment Video Compression, Part 4 Section 2,  Video Quality Assessment
Video Compression, Part 4 Section 2, Video Quality Assessment
 

Recently uploaded

Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdfKamal Acharya
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesPrabhanshu Chaturvedi
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...Call Girls in Nagpur High Profile
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Dr.Costas Sachpazis
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsRussian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdfKamal Acharya
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Bookingdharasingh5698
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfKamal Acharya
 

Recently uploaded (20)

Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
 
University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Glass Ceramics: Processing and Properties
Glass Ceramics: Processing and PropertiesGlass Ceramics: Processing and Properties
Glass Ceramics: Processing and Properties
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024Water Industry Process Automation & Control Monthly - April 2024
Water Industry Process Automation & Control Monthly - April 2024
 
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...Booking open Available Pune Call Girls Pargaon  6297143586 Call Hot Indian Gi...
Booking open Available Pune Call Girls Pargaon 6297143586 Call Hot Indian Gi...
 
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service NashikCollege Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
Structural Analysis and Design of Foundations: A Comprehensive Handbook for S...
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur EscortsRussian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
Russian Call Girls in Nagpur Grishma Call 7001035870 Meet With Nagpur Escorts
 
Online banking management system project.pdf
Online banking management system project.pdfOnline banking management system project.pdf
Online banking management system project.pdf
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdfONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
ONLINE FOOD ORDER SYSTEM PROJECT REPORT.pdf
 

Video Compression, Part 2-Section 1, Video Coding Concepts

  • 2. Section I − Video Compression History − A Generic Interframe Video Encoder − The Principle of Compression − Differential Pulse-Code Modulation (DPCM) − Transform Coding − Quantization of DCT Coefficients − Entropy Coding Section II − Still Image Coding − Prediction in Video Coding (Temporal and Spatial Prediction) − A Generic Video Encoder/Decoder − Some Motion Estimation Approaches 2 Outline
  • 4. Why Compression? SD-SDI 270 Mbps HD-SDI 1.5Gbps, 3Gbps 4K-UHD 12Gbps 8K-UHD 48Gbps 4
  • 6. 6 Coded video Coded audio Video format .264, .265, VP9… Container format MP4, MOV, WebM, MXF… Audio format .aac, .ogg, .mp3… Codec and Container Format − A container or wrapper format is a metafile format whose specification describes how different elements of data and metadata coexist in a computer file. − Wrappers serve two purposes mainly: • To gather programme material and related information • To identify those pieces of information Container
  • 7. 7 Codec and Container (Wrapper) Format Media Format Wrapper CODEC AVC-Intra Class 100 DNXHD, ProRes AVC LongG MXF OPAtom MXF OP1B, Quicktime, DVI P2, AVCHD HDCAM, Mini DV, SR LTO, HDD, BluRay Disc P2 Card, SD Card
  • 8. 8 Ex: MXF File Structure of AVC-LongG OP-1b and OP-1a
  • 9. Goal of Standards − Ensuring Interoperability − Enabling communication between devices made by different manufacturers − Promoting a technology or industry − Reducing costs 9 The Scope of Video Standardization Decoder Bitstream Encoder
  • 10. Goal of Standards − Ensuring Interoperability − Enabling communication between devices made by different manufacturers − Promoting a technology or industry − Reducing costs − Not the encoder, Not the decoder − Just the bitstream syntax and the decoding process (e.g. use IDCT, but not how to implement the IDCT) 10 The Scope of Video Standardization Decoder Bitstream Scope of Standardization Encoder (Decoding Processes)
  • 11. Only Specifications of the Bitstream, Syntax, and Decoding Processes are standardized: • Enables improved encoding & decoding strategies to be employed in a standard-compatible manner • Provides no guarantees of quality • Permits optimization beyond the obvious • Permits complexity reduction for implementability Pre-Processing Source Destination Post-Processing & Error Recovery Scope of Standard Encoding Decoding 11 CODEC (enCODer/DECoder) Standard defines this The Scope of Video Standardization
  • 12. 12 − This allows future encoders of better performance to remain compatible with existing decoders. − Also allows for commercially secret encoders to be compatible with standard decoders Today’s Ho-Hum Encoder Tomorrow’s Nifty Encoder Very Secret Encoder Today’s Decoder Today’s decoder still works! The Scope of Video Standardization • The international standard does not specify the design of the video encoders and decoders. • It only specifies the syntax and semantics of the bitstream and signal processing at the encoder/decoder interface. • Therefore, options are left open to the video codec manufacturers to trade-off cost, speed, picture quality and coding efficiency.
  • 13. JTC1 IEC ISO SC 29 RAAGM AG WG12WG11WG1 WG JBIG JPEG SG MHEG-5 Main- tenance MHEG-6 SG Audio SNHC System Video Requirements Implementation Studies Test SG Liaisons Advisory Group (AG) on Management (AGM) • To advise SC 29 and its WGs on matters of management that affect their works. Advisory Group (AG) on Registration Authority (RA) WG1: Still images, JPEG and JBIG • Joint Photographic Experts Group and Joint Bi-level Image Group WG11: Video, MPEG • Motion Picture Experts Group WG12: Multimedia, MHEG • Multimedia Hypermedia Experts Group International Standardization Organization Subcommittee 29 Title: “Coding of Audio, Picture, Multimedia and Hypermedia Information” Joint Technical Committee ISO/IEC JTC 1/SC 29 Structure and MPEG MPEG (Moving Picture Experts Group, 1988 ) To develop standards for coded representation of digital audio, video, 3D Graphics and other data International Electrotechnical Committee 13
  • 14. Telecommunication Standardization Advisory Group (TSAG) WTSA World Telecommunication Standardization Assembly SG Workshops, Seminars, Symposia … IPRs (Intellectual Property Rights) WP Questions: Develop Recommendations SG WP WP Q Focus Group VCEG (ITU-T SG16/Q6) ) • Study Group 16 Multimedia terminals, systems and applications • Working Party 3 Media coding • Question 6 Video coding Rapporteurs (R): Mr Gary SULLIVAN, Mr Thomas WIEGAND SG16 WP3 14 ITU-T structure and VCEG (Video Coding Experts Group or Visual Coding Experts Group) Administrative Entities Q Q Q Q Q Q Q Q Q Q Q6 VCEG
  • 15. 15 ITU, International Telecommunication Union structure − Founded in 1865, it is the oldest specialized agency of the United Nations system − ITU is an International organization where governments, industries, telecom operators, service providers and regulators work together to coordinate global telecommunication networks and services − Help the world communicate! What does ITU actually do? • Spectrum allocation and registration • Coordinate national spectrum planning • International telecoms/ICT standardization • Collaborate in international tariff-setting • Cooperate in telecommunications development assistance • Develop measures for ensuring safety of life • Provide policy reviews and information exchange • Insure and extend universal Telecom access
  • 16. 16 ITU, International Telecommunication Union structure − Plenipotentiary Conference: Key event, all ITU Member States decide on the future role of the organization (Held every four years) − ITU Council: The role of the Council is to consider, in the interval between Plenipotentiary Conferences, broad telecommunication policy issues to ensure that the Union's activities, policies and strategies fully respond to today's dynamic, rapidly changing telecommunication environment (held yearly)
  • 17. 17 ITU, International Telecommunication Union structure − General Secretariat: Coordinates and manages the administrative and financial aspects of the Union’s activities (provision of conference services, information services, legal advice, finance, personnel, etc.) − ITU-R: Coordinates radio communications, radio-frequency spectrum management and wireless services. − ITU-D: Technical assistance and deployment of telecom networks and services in developing and least developed countries to allow the development of telecommunication. − ITU-T: Telecommunication standardization on a world-wide basis. Ensures the efficient and on-time production of high quality standards covering all fields of telecommunications (technical, operating and tariff issues). (The Secretariat of ITU-T (TSB: Telecommunication Standardization Bureau) provides services to ITU-T Participants)
  • 18. 18 ITU, International Telecommunication Union structure Telecommunication Standardization Bureau (TSB) (Place des Nations, CH-1211 Geneva 20) − The TSB provides secretarial support for ITU-T and services for participants in ITU-T work (e.g. organization of meeting, publication of Recommendations, website maintenance etc.). − Disseminates information on international telecommunications and establishes agreements with many international SDOs. Mission of ITU-T Standardization Sector of ITU − Helping people all around the world to communicate and to equally share the advantages and opportunities of telecommunication reducing the digital divide by studying technical, operating and tariff matters to develop telecommunication standards (Recommendations) on a worldwide basis.
  • 19. 19 ITU, International Telecommunication Union structure World Telecommunication Standardization Assembly (WTSA) − WTSA sets the overall direction and structure for ITU-T, meets every four years and for the next four-year period: • Defines the general policy for the Sector • Establishes the study groups (SG) • Approves SG work programmes • Appoints SG chairmen and vice-chairmen Telecommunication Standardization Advisory Group (TSAG) − TSAG provides ITU-T with flexibility between WTSAs, and reviews priorities, programmes, operations, financial matters and strategies for the Sector (meets ~~ 9 months ) • Follows up on accomplishment of the work programme • Restructures and establishes ITU-T study groups • Provides guidelines to the study groups • Advises the TSB Director • Produces the A-series Recommendations on organization and working procedures
  • 20. • ISO/IEC MPEG = “Moving Picture Experts Group” (ISO/IEC JTC 1/SC 29/WG 11 = International Standardization Organization and International Electrotechnical Commission, Joint Technical Committee 1, Subcommittee 29, Working Group 11) • ITU-T VCEG = “Video Coding Experts Group” (ITU-T SG16/Q6 = International Telecommunications Union – Telecommunications Standardization Sector (ITU-T, a United Nations Organization, formerly CCITT), Study Group 16, Working Party 3, Question 6) • JVT = “Joint Video Team” Collaborative team of MPEG & VCEG, responsible for developing AVC (discontinued in 2009) • JCT-VC = “Joint Collaborative Team on Video Coding” Team of MPEG & VCEG , responsible for developing HEVC (established January 2010) • JVET = “Joint Video Experts Team” Exploring potential for new technology beyond HEVC (established Oct. 2015 as Joint Video Exploration Team, renamed Apr. 2018) 20 Video Coding Standardization Organizations
  • 21. 21 History of Video Coding Standardization (1985 ~ 2020) [Figure: timeline 1990–2020] − H.120 (1984–1988): video telephony − H.261 (1990+): ITU-T − MPEG-1 (1993): ISO/IEC, computer applications − H.262 / 13818-2 (MPEG-2) (1994/95–1998+): SD − H.263/+/++ (1995–2000+) − MPEG-4 Visual (1998–2001+) − H.264 / 14496-10 AVC (2003–2018+): HD; developed by the Joint Video Team (JVT) − H.265 / 23008-2 HEVC (2013–2018+): 4K UHD; developed by the Joint Collaborative Team on Video Coding (JCT-VC) − H.26x / 23090-3 VVC (2020–…): 8K, 360°, …; to be developed by the Joint Video Experts Team (JVET)
  • 22. 22 [Figure: timeline 1988–2010] ITU-T Standards: H.261 (Version 1), H.261 (Version 2), H.263, H.263+, H.263++ — Joint ITU-T/MPEG Standards: H.262/MPEG-2, H.264/MPEG-4 AVC, H.265/HEVC — MPEG Standards: MPEG-1, MPEG-4 (Version 1), MPEG-4 (Version 2) — H.261 Video Compression Standard
  • 23. 23 H-series recommendations are low-delay codecs for telecom applications (the International Telecommunication Union (ITU-T) developed several recommendations for video coding) − H.261 (1990): the first video codec specification, “Video Codec for Audio Visual Services at p x 64 kbps” − H.262 (1995): Infrastructure of audiovisual services—Coding of moving video − H.263 (1996): next conferencing solution, video coding for low bit rate communications − H.263+ (H.263V2) (1998) − H.263++ (H.263V3) (2000): follow-on solutions − H.26L: “long-term” solution for low bit-rate video coding for communication applications (not backward compatible with H.263+) − H.26L was completed in May 2003 and led to H.264: known as Advanced Video Coding (AVC) − H.265/HEVC (2013): High Efficiency Video Coding ITU H.26x History
  • 24. 24 Motion Picture Experts Group (MPEG) codecs are designed for storage/broadcast/streaming applications MPEG-1 (1992) • Started in 1988 by Leonardo Chiariglione • Compression standard for progressive frame-based video in SIF (360x240) formats • Applications: VCD MPEG-2 (1994-5) • Compression standard for interlaced frame-based video in CCIR-601 (720x480) and high definition (1920x1088i) formats • Applications: DVD, SVCD, DIRECTV, GA, DVB, HDTV Studio, DTV Broadcast, DVD, HD, video standards for television and telecommunications MPEG-4 (1999) • Multimedia standard for object-based video from natural or synthetic sources • Applications: Internet, cable TV, virtual studio, home LAN etc. • Object-oriented • Over-ambitious? MPEG History [Figure: MPEG-1, MPEG-2, MPEG-4, MPEG-7, MPEG-21]
  • 25. 25 Motion Picture Experts Group (MPEG) codecs are designed for storage/broadcast/streaming applications MPEG-7, 2001 • Standardized descriptions of multimedia information, formally called “Multimedia Content Description Interface” • Metadata for audio-video streams • Applications: Internet, video search engines, digital libraries MPEG-21, 2002 • Intellectual property rights protection purpose • Distribution, exchange, user access of multimedia data and intellectual property management AVC (2003), also known as MPEG-4 Part 10 • Conventional to HD • Emphasis on compression performance and loss resilience HEVC (2013) High Efficiency Video Coding MPEG History [Figure: MPEG-1, MPEG-2, MPEG-4, MPEG-7, MPEG-21]
  • 26. 26 ITU and MPEG (ISO/IEC) have also worked together on joint codecs: − MPEG-2 is also called H.262 − H.26L led to a codec now called: • H.264 in telecom • MPEG-4 Part 10 in broadcast • AVC (Advanced Video Coding) in broadcast • the Joint Video Team (JVT) codec − H.265/HEVC (2013) High Efficiency Video Coding Joint ITU/MPEG
  • 27. 27 The Story of MPEG and VCEG
  • 28. 28 ITU and MPEG (ISO/IEC) have also worked together on joint codecs: Joint ITU/MPEG [Figure: each joint generation, up to VVC in 2020, brings roughly a 50% bitrate saving for direct-to-home and a 30% bitrate saving for contribution]
  • 29. 29 Some Famous Codecs for HD (bitrates in Mbps for 1920×1080, 4:2:2, 10 bit)
  Codec Brand | Codec Name | Bitrate (Mbps) | Wrapper Type
  AVID | DNxHD 365x | 367 (50p) | MXF
  AVID | DNxHD 185x | 184 (50i) | MXF
  AVID | DNxHR HQX | 174 (50i) / 345 (50p), (12 bit) | MXF
  APPLE | ProRes 422 Proxy | 38 (50i) / 76 (50p) | MOV
  APPLE | ProRes 422 LT | 85 (50i) / 170 (50p) | MOV
  APPLE | ProRes 422 | 122 (50i) / 245 (50p) | MOV
  APPLE | ProRes 422 HQ | 184 (50i) / 367 (50p) | MOV
  SONY | XAVC Intra Class 100 | 112 (50i) / 223 (50p) [MXF] | MXF/MP4
  SONY | XAVC Intra Class 200 | 227 (50i) / 454 (50p) | MXF
  SONY | XAVC Long GOP 50 | 50 (50i, 50p) [MXF], max bit rate = 80 Mb/s | MXF/MP4
  SONY | XAVC Long GOP 35 | 35 (50i, 50p) [MXF], max bit rate = 80 Mb/s | MXF/MP4
  SONY | XAVC Long GOP 25 | 25 (50i) [MXF], max bit rate = 80 Mb/s | MXF/MP4
  PANASONIC | AVC-Intra 200 | 226 (50i) / 452 (50p) | MXF
  PANASONIC | AVC-Intra 100 | 111 (50i) / 222 (50p) | MXF
  PANASONIC | AVC-LongG 50 | 50 (50i) | MXF
  PANASONIC | AVC-LongG 25 | 25 (50i) / 50 (50p) | MXF
  • 31. 31 Spatial Domain − Elements are used “raw” in suitable combinations. − The frequency of occurrence of such combinations is used to influence the design of the coder so that shorter codewords are used for more frequent combinations and vice versa (entropy coding). Transform Domain − Elements are mapped onto a different domain (i.e. the frequency domain). − The resulting coefficients are quantised and entropy-coded. Hybrid − Combinations of the above. Classification of Compression Techniques
  • 32. 32 A Generic Interframe Video Encoder — used since the early days of video compression standards, e.g. MPEG-1/-2/-4, H.264/AVC, HEVC, and also in most proprietary codecs (VC-1, VP8 etc.). [Figure: current stage — Input Frame 1 enters the DCT/quantiser (Q) path]
  • 33. 33 A Generic Interframe Video Encoder [Figure: Input Frame 1 after the DCT]
  • 34. 34 A Generic Interframe Video Encoder [Figure: quantised DCT coefficients of Frame 1, entropy coded to the bitstream 010011101001…]
  • 35. 35 A Generic Interframe Video Encoder [Figure: Reconstructed Frame 1, obtained by inverse quantisation and inverse DCT, stored as the reference]
  • 36. 36 A Generic Interframe Video Encoder [Figure: Input Frame 2 arrives; Reconstructed Frame 1 is available as the reference]
  • 37. 37 A Generic Interframe Video Encoder [Figure: motion vectors (MVs) estimated between Input Frame 2 and Reconstructed Frame 1 are entropy coded into the bitstream]
  • 38. 38 A Generic Interframe Video Encoder [Figure: Reconstructed Frame 1 with motion compensation (MC) applied]
  • 39. 39 A Generic Interframe Video Encoder — If the motion prediction is successful, the energy in the residual is lower than in the original frame and can be represented with fewer bits. [Figure: Input Frame 2 minus the motion-compensated reference gives the residual with MC (Frames 1 & 2)]
  • 40. 40 A Generic Interframe Video Encoder [Figure: DCT of the motion-compensated residual]
  • 41. 41 A Generic Interframe Video Encoder [Figure: quantised DCT of the residual, entropy coded to the bitstream 010011101001…]
  • 42. 42 A Generic Interframe Video Encoder [Figure: reconstructed residual with MC (Frames 1 & 2), obtained by inverse quantisation and inverse DCT]
  • 43. 43 A Generic Interframe Video Encoder [Figure: Reconstructed Frame 1 with MC + reconstructed residual with MC = Reconstructed Frame 2]
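  The loop sketched in slides 32–43 can be condensed into a few lines of code. The following is a minimal sketch, not any standard's specification: it assumes numpy and scipy are available, uses an illustrative step size QP = 16, and reduces motion compensation to a zero-motion prediction (the reconstructed reference is used as-is).

  ```python
  import numpy as np
  from scipy.fft import dctn, idctn   # separable 2-D DCT-II and its inverse

  QP = 16  # quantiser step size (an illustrative value, not from any standard)

  def transform_quantise(block):
      """DCT then uniform quantisation -> integer indices for entropy coding."""
      return np.round(dctn(block, norm='ortho') / QP).astype(int)

  def rescale_inverse(indices):
      """Inverse quantisation (rescaling) then inverse DCT."""
      return idctn(indices * QP, norm='ortho')

  def code_inter_frame(frame, reference):
      """Code a frame against the previously *reconstructed* reference:
      residual -> DCT -> Q, then reconstruct exactly as the decoder will.
      (Motion estimation is omitted; the reference is used as-is, i.e. a
      zero motion vector everywhere.)"""
      residual = frame - reference
      indices = transform_quantise(residual)           # sent to entropy coder
      reconstructed = reference + rescale_inverse(indices)
      return indices, reconstructed                    # recon = next reference

  frame1 = np.random.rand(8, 8) * 255
  frame2 = frame1 + np.random.randn(8, 8)              # nearly identical frame
  idx1 = transform_quantise(frame1)                    # intra-coded first frame
  recon1 = rescale_inverse(idx1)
  idx2, recon2 = code_inter_frame(frame2, recon1)      # inter-coded second frame
  print(np.count_nonzero(idx1), np.count_nonzero(idx2))  # residual: far fewer
  ```

  The count of non-zero indices for the residual is much smaller than for the intra frame, which is exactly the energy reduction slide 39 describes.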
  • 45. − Spatial Redundancy Reduction (pixels inside a picture are similar) − Temporal Redundancy Reduction (Similarity between the frames) − Statistical Redundancy Reduction (more frequent symbols are assigned short code words and less frequent ones longer words) The Principle of Compression 45
  • 46. 46 − It arises when parts of a picture are often replicated within a single frame of video (with minor changes). Spatial Redundancy in Still Images [Figure: a frame dominated by sky blue — “This area is all blue”, “This area is half blue and half green”]
  • 47. − Take advantage of similarity between successive frames − It arises when successive frames of video display images of the same scene. 47 Temporal Redundancy in Moving Images This picture is the same as the previous one except for this area
  • 48. All signals & data have some redundancy and some entropy. – Data is compressed by keeping entropy and throwing away redundancy if possible! – Redundancy is the useless stuff. – Redundancy can be thrown away. – There is more redundancy in simple signals & data • Black and burst, colour bars, flat scenery, talking heads, quiet music, a 1 kHz sine test tone, bitmap images, database files, text files. – Entropy is the useful stuff. – Entropy is a term often used for ‘activity’ or ‘chaos’. – There is more entropy in complex signals & data • Multiburst and pathological test signals, a football match, white noise, executables (computer files that can be executed), DLL files. 48 The Principle of Compression
  • 49. 49 Redundancy & Entropy — A high compression ratio can lead to loss of entropy. [Figure: data or bandwidth vs. signal complexity — from simple to complex signals, redundancy shrinks and entropy grows; at 2:1 compression some entropy of the most complex signals is already lost]
  • 50. 50 Redundancy & Entropy — A high compression ratio can lead to loss of entropy. [Figure: the same plot at 4:1 compression — considerably more entropy is lost]
  • 51. Spatial Redundancy Reduction 51 Spatial Redundancy Reduction − Transform coding: Discrete Cosine Transform (DCT), Discrete Sine Transform (DST), Discrete Wavelet Transform (DWT), Hadamard Transform (HT) − Differential Pulse Code Modulation (DPCM)
  • 53. PCM was invented by the British engineer Alec Reeves in 1937 in France. − Pulse code modulation (PCM) is produced by an analog-to-digital conversion process. − As in the case of other pulse modulation techniques, the rate at which samples are taken and encoded must conform to the Nyquist sampling rate. − The sampling rate must be greater than twice the highest frequency in the analog signal: f_s > 2 f_max Pulse Code Modulation (PCM) 53
  • 54. Encoding in PCM 54 [Figure: each sample is mapped to the nearest allowed quantization level — 1.52 → 1.5, 1.08 → 1.1, 0.92 → 0.9, 0.56 → 0.6, 0.28 → 0.3, 0.27 → 0.3, 0.11 → 0.1]
  • 56. Regeneration (re-amplification, retiming, reshaping) Regeneration 56
  • 57. Advantages of PCM • Robustness to noise and interference • Efficient regeneration • Efficient SNR and bandwidth trade-off • Uniform format • Ease of add and drop • Security — DS0: • A basic digital signaling rate of 64 kbit/s. • To carry a typical phone call, the audio is digitized at an 8 kHz sample rate using 8-bit pulse-code modulation. Advantages of PCM 57
  • 58. − Encode information in terms of signal transition; a transition is used to designate Symbol 0. − Symbol 0→ Transition (0→1, 1→0) Differential Encoding 58
  • 59. − Usually PCM has a sampling rate higher than the Nyquist rate. − The encoded signal therefore contains redundant information. − DPCM can efficiently remove this redundancy. − Prediction error of m[n]: e[n] = m[n] − m̂[n] − Quantized value of m[n]: m_q[n] = e_q[n] + m̂[n] − Quantization error of e[n] is defined as: q[n] ≜ e[n] − e_q[n] − It follows that: m[n] − m_q[n] = (m̂[n] + e[n]) − (e_q[n] + m̂[n]) = e[n] − e_q[n] = q[n] Differential Pulse-Code Modulation (DPCM) 59 [Figure: DPCM block diagram — input m[n+1], prediction m̂[n+1], quantised prediction error e_q[n+1] and reconstructed output m_q[n]]
  • 60. 𝒎 𝒏 − 𝒎 𝒒 𝒏 = 𝒆 𝒏 − 𝒆 𝒒 𝒏 = 𝒒 𝒏 means that: − The pointwise coding error in the input sequence is exactly equal to q(n) that is equal to the quantization error in e(n) − With a reasonable predictor the mean square value of the differential signal e(n) is much smaller than that of m(n) − For the same mean square quantization error, e[n] requires fewer quantization bits than m[n] ⇒ The number of bits required for transmission has been reduced while the quantization error is kept the same. Differential Pulse-Code Modulation (DPCM) 60
  • 61. − An important aspect of DPCM is that the prediction is based on the output (the quantized samples) rather than the input (the unquantized samples). − This results in the predictor being in the “feedback loop” around the quantizer, so that the quantizer error at a given step is fed back to the quantizer input at the next step. − This has a “stabilizing effect” that prevents DC drift and accumulation of error in the reconstructed signal m_q[n]. Differential Pulse-Code Modulation (DPCM) 61
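  A minimal numerical sketch of the equations above, assuming numpy; the step size and test signal are illustrative. The predictor (here simply the previous reconstructed sample) sits in the feedback loop, so the coding error m − m_q equals the quantization error q of e[n] alone and stays bounded by half a step, with no drift.

  ```python
  import numpy as np

  def dpcm(m, step):
      """DPCM with the previous *reconstructed* sample as predictor m_hat.
      Because prediction runs on the quantised output (feedback loop),
      m[n] - m_q[n] = q[n], the quantisation error of e[n] alone."""
      e_q = np.zeros(len(m))
      m_q = np.zeros(len(m))
      pred = 0.0                               # m_hat[0]
      for n in range(len(m)):
          e = m[n] - pred                      # e[n] = m[n] - m_hat[n]
          e_q[n] = step * np.round(e / step)   # uniform quantiser
          m_q[n] = pred + e_q[n]               # m_q[n] = e_q[n] + m_hat[n]
          pred = m_q[n]                        # next prediction from the output
      return e_q, m_q

  m = np.cumsum(np.random.randn(1000))         # slowly varying test signal
  e_q, m_q = dpcm(m, step=0.5)
  print(np.abs(m - m_q).max())                 # <= step/2: no error accumulation
  ```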
  • 62. The output signal-to-noise ratio of the DPCM system is (SNR)_O = σ_M² / σ_Q², where σ_M² and σ_Q² are the variances of the zero-mean signals m[n] and q[n]. This can be factored as (SNR)_O = (σ_M² / σ_E²) · (σ_E² / σ_Q²) = G_P · (SNR)_Q, where σ_E² is the variance of the prediction errors and (SNR)_Q = σ_E² / σ_Q² is the signal-to-quantization-noise ratio. Processing Gain: G_P = σ_M² / σ_E². Design a prediction filter to maximize G_P (i.e. minimize σ_E²). Processing Gain 62
  • 63. 63 Predictive Coding (from previous symbol) Predictive Coding (generalised) − Prediction is based on combination of previous symbols − Prediction template needs to be “causal” i.e. template should contain only “previous” elements w.r.t the direction of scanning (shown with arrows). − This is important for coding applications as the decoder will need to have decoded the template elements first to perform the prediction of the current element.
  • 64. 64 Predictive Coding (previous symbol) − The previous symbol is used as a prediction of the current symbol − The prediction error is coded in a memoryless fashion − The prediction error alphabet and codebook are almost twice the size − e.g. symbol alphabet {1, 2, 3, 4} → prediction error alphabet {−3, −2, −1, 0, 1, 2, 3} − A good predictor will minimise the error (most occurrences will be zero)
  • 65. − If the frame is processed in raster order, then pixels A, B and C in the current and previous rows are available in both the encoder and the decoder since these should already have been decoded before X. − The decoder forms the same prediction and adds the decoded residual to reconstruct the pixel. 65 Predictive Image Coding Pixel X to be encoded P(X) is a prediction of X using A,B and C Residual R(X) = X − P(X) R(X) is encoded and transmitted 1 •Encoder forms a prediction for X based on some combination of previously coded pixels 2 •Then subtracts this prediction from X 3 •Then encodes the residual (the result of the subtraction)
  • 66. Example − Encoder prediction P(X) = (2A + B + C)/4 − Residual R(X) = X − P(X) is encoded and transmitted. − Decoder decodes R(X) and forms the same prediction: P(X) = (2A + B + C)/4 − Reconstructed pixel X = R(X) + P(X) 66 Predictive Image Coding Spatial prediction (DPCM) 1 •Encoder forms a prediction for X based on some combination of previously coded pixels 2 •Then subtracts this prediction from X 3 •Then encodes the residual (the result of the subtraction) By Encoder By Decoder
  • 67. − If the encoding process is lossy, i.e. if the residual is quantized to R′(X) • then the decoded pixels A′, B′ and C′ may not be identical to the original A, B and C due to losses during encoding, so the above process could lead to a cumulative mismatch or ‘drift’ between the encoder and decoder. − Hence the encoder uses the decoded pixels A′, B′ and C′ to form the prediction, i.e. P(X) = (2A′ + B′ + C′) / 4 in the above example. − The compression efficiency of this approach depends on the accuracy of the prediction P(X). 67 Predictive Image Coding — To avoid this, the encoder should itself decode the residual R′(X) and reconstruct each pixel. In this way, both encoder and decoder use the same prediction P(X) and drift is avoided. [Figure: quantizer producing R′(X) from R(X) = X − P(X)]
  • 68. − If the prediction is successful, the energy in the residual is lower than in the original frame and the residual can be represented with fewer bits (Motion compensation is an example of predictive coding). − Spatial Prediction involves predicting an image sample or region from previously-transmitted samples in the same image or frame and is sometimes described as ‘Differential Pulse Code Modulation’ (DPCM). 68 Predictive Image Coding Spatial Prediction in a Frame=DPCM
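  A minimal sketch of the raster-order spatial DPCM of the last few slides, assuming numpy and an illustrative step size qp; pixels outside the frame are replaced by 128, a hypothetical boundary convention. The prediction uses reconstructed pixels A′, B′, C′, so a decoder running the same loop on the received residuals produces exactly the same rec array and no drift occurs.

  ```python
  import numpy as np

  def spatial_dpcm(img, qp):
      """Raster-order DPCM with P(X) = (2A + B + C) / 4 formed from
      *reconstructed* pixels, so encoder and decoder stay in step."""
      h, w = img.shape
      rec = np.zeros((h, w))                   # decoded picture (both sides)
      res_q = np.zeros((h, w))                 # quantised residuals R'(X)
      for y in range(h):
          for x in range(w):
              a = rec[y, x - 1] if x > 0 else 128                # left, A'
              b = rec[y - 1, x] if y > 0 else 128                # above, B'
              c = rec[y - 1, x - 1] if x > 0 and y > 0 else 128  # above-left, C'
              p = (2 * a + b + c) / 4                            # prediction P(X)
              res_q[y, x] = qp * np.round((img[y, x] - p) / qp)  # quantised R(X)
              rec[y, x] = p + res_q[y, x]                        # decoded pixel
      return res_q, rec

  img = np.clip(128 + np.cumsum(np.random.randn(16, 16), axis=1), 0, 255)
  res_q, rec = spatial_dpcm(img, qp=4)
  print(np.abs(img - rec).max())               # bounded by qp/2: no drift
  ```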
  • 71. Fourier Series Recall 71 f(x) = a₀/2 + Σ_{n=1}^{∞} [aₙ cos(nx) + bₙ sin(nx)], where aₙ = (2/T) ∫_{−T/2}^{T/2} f(x) cos(2πnx/T) dx, bₙ = (2/T) ∫_{−T/2}^{T/2} f(x) sin(2πnx/T) dx, and a₀ = (1/T) ∫_{−T/2}^{T/2} f(x) dx
  • 75. 75 Fourier Series Recall [Figure: a square wave sw(t) approximated by partial Fourier sums with an increasing number of terms; the approximation improves as terms are added] Ideally we need infinitely many terms.
  • 76. How can transform coding lead to data compression? − Although each pixel x₁ or x₂ may take any value uniformly between 0 (black) and its maximum value 255 (white), since there is a high correlation (similarity) between them, their joint occurrences lie mainly on a 45-degree line. − The joint occurrences on the new coordinates have a uniform distribution along the y₁ axis, but are highly peaked around zero on the y₂ axis. − y₁ is called the average or DC value of x₁ and x₂ − y₂ represents the residual difference of x₁ and x₂ − The normalization factor of 1/√2 makes sure that the signal energy is not changed by the transformation (Parseval theorem). 76 Transform Coding Joint occurrences of a pair of pixels in one frame
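  The two-pixel case can be written out directly; the pixel values below are illustrative. The rotated coordinates concentrate the energy into y1 while y2 stays near zero, and the 1/√2 factor keeps the total energy unchanged (Parseval).

  ```python
  import numpy as np

  x1, x2 = 200.0, 196.0                 # two similar neighbouring pixels
  y1 = (x1 + x2) / np.sqrt(2)           # average ('DC') direction: large
  y2 = (x1 - x2) / np.sqrt(2)           # difference direction: near zero
  assert np.isclose(x1**2 + x2**2, y1**2 + y2**2)   # energy preserved
  print(round(y1, 2), round(y2, 2))     # 280.01 2.83
  ```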
  • 77. − Transform domain coding is mainly used to remove the spatial redundancies in images by mapping the pixels into a transform domain prior to data reduction. − The strength of transform coding in achieving data compression is that the image energy of most natural scenes is mainly concentrated in the low-frequency region, and hence into a few transform coefficients. − These coefficients can then be quantized with the aim of discarding insignificant coefficients, without significantly affecting the reconstructed image quality. 77 Transform Coding
  • 79. Through transformation, a group of correlated pixels is converted into a group of uncorrelated coefficients. − Only one coefficient becomes important, and the rest carry non-significant energy. − The larger the number of pixels transformed together, the better the compression efficiency. − If the pixel intensity variations match the transformation basis vectors, then only one coefficient (apart from DC) becomes significant (unitarity/orthonormality). 79 Transform Coding
  • 80. The choice of transform depends on a number of criteria: 1. Data in the transform domain should be decorrelated, i.e. separated into components with minimal inter-dependence, and compact, i.e. most of the energy in the transformed data should be concentrated into a small number of values. 2. The transform should be reversible. 3. The transform should be computationally tractable, e.g. low memory requirement, achievable using limited-precision arithmetic, low number of arithmetic operations, etc. 80 The Choice of Transform Coding
  • 81. − A group of U pixels in each line are 1-D transformed. − This is repeated for V lines. − A group of V coefficients in the vertical directions are transformed. − This is repeated for U columns. − The final output is UV 2-D transform coefficients. − Transform coefficients are quantized for compression. − Compressed coefficients are inverse transformed to reconstruct the image. 81 What Is a Two Dimensional Transform? One-dimensional transformation in the Horizontal direction One-dimensional transformation in the Vertical direction U V Normally U=V 2D Coeff. 1D Coeff.
  • 82. − No reduction in data, just replacement (Replaces the original pixel samples with coefficients). − Coefficients describe how the samples are changing. − Helps to separate entropy from redundancy. − DCT always performed on a block of samples. Discrete Cosine Transform 82
  • 83. Discrete Cosine Transform Smallest DCT block is a 2x2 block. • Top left coefficient is the DC coefficient → Describes the average of the 4 samples. • Top right coefficient is the horizontal coefficient → Describes how the 4 samples are changing horizontally. • Bottom left coefficient is the vertical coefficient → Describes how the 4 samples are changing vertically. • Bottom right coefficient is the diagonal coefficient → Describes how the 4 samples are changing diagonally. 83 Original pixel samples Original pixel samples DCT Inverse DCT DC Horizontal coefficient Diagonal coefficient Vertical coefficient
  • 84. 84 Discrete Cosine Transform [Figure: worked 2×2 examples on blocks of 255/0 pixel values; the sample energy Σᵢ Pᵢ² is preserved, and a white-to-black transition gives a coefficient of 127.5 while a black-to-white transition gives −127.5]
  • 85. 85 Discrete Cosine Transform [Figure: the corresponding 2×2 coefficient blocks — combinations of ±127.5 in the horizontal, vertical and diagonal coefficient positions]
  • 86. – DCT always performed on a block of samples. 86 Discrete Cosine Transform
  • 87. 87 Detail in a Block vs. DCT Coefficients Transmitted Discrete Cosine Transform
  • 88. Most compression systems use an 8x8 DCT block. • The top left coefficient is the DC coefficient. • Top row are horizontal coefficients → Low frequency changes to the left, high to the right. • Left column are vertical coefficients → Low frequency changes at the top, high at the bottom. • The other coefficients for different angle/frequencies → Low frequency to the top left, & high to the bottom right. Discrete Cosine Transform 88 Pixel Domain Frequency Domain
  • 89. 89 Discrete Cosine Transform [Figure: f(m,n) — a spatial 8×8 block of pixel values taking only the two levels 55 and 109, m = 0…7, n = 0…7]
  • 90. 90 Discrete Cosine Transform NINT = Nearest INteger Truncation [Figure: F(u,v) — the frequency-domain 8×8 block of transform values for the previous block, u, v = 0…7; the DC value 602 dominates, and the AC values fall away toward the high frequencies]
  • 98. − The Forward DCT (FDCT) of an N × N sample block X is given by Y = A X Aᵀ − The Inverse DCT (IDCT) is given by X = Aᵀ Y A − A is an N × N transform matrix. The elements of A are A(i, j) = Cᵢ cos[(2j + 1)iπ / 2N], with Cᵢ = √(1/N) for i = 0 and Cᵢ = √(2/N) for i > 0 − FDCT and IDCT may be written in summation form: Y(x, y) = Cₓ C_y Σᵢ₌₀..N−1 Σⱼ₌₀..N−1 X(i, j) cos[(2j + 1)yπ / 2N] cos[(2i + 1)xπ / 2N], and X(i, j) = Σₓ₌₀..N−1 Σ_y₌₀..N−1 Cₓ C_y Y(x, y) cos[(2j + 1)yπ / 2N] cos[(2i + 1)xπ / 2N] 98 Discrete Cosine Transform
  • 99. − Ex: The transform matrix A for a 4 × 4 DCT is: A = [ a a a a ; b c −c −b ; a −a −a a ; c −b b −c ], where a = 1/2, b = √(1/2)·cos(π/8) ≈ 0.653 and c = √(1/2)·cos(3π/8) ≈ 0.271. 99 Discrete Cosine Transform
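  A short sketch that builds A from the formula above and applies the forward and inverse DCT as matrix products (numpy assumed; the random block is illustrative). Writing the 2-D transform as A X Aᵀ is exactly the separable row-then-column procedure described on the earlier two-dimensional-transform slide.

  ```python
  import numpy as np

  def dct_matrix(n):
      """Build the N x N DCT transform matrix A from the slide's formula:
      A(i, j) = C_i * cos((2j + 1) * i * pi / (2N))."""
      A = np.zeros((n, n))
      for i in range(n):
          ci = np.sqrt(1.0 / n) if i == 0 else np.sqrt(2.0 / n)
          for j in range(n):
              A[i, j] = ci * np.cos((2 * j + 1) * i * np.pi / (2 * n))
      return A

  A = dct_matrix(4)
  X = np.random.randint(0, 256, (4, 4)).astype(float) - 128  # level-shifted block
  Y = A @ X @ A.T        # forward DCT, Y = A X A^T (rows, then columns)
  X_rec = A.T @ Y @ A    # inverse DCT, X = A^T Y A
  assert np.allclose(X, X_rec)                 # the transform is reversible
  print(np.round(A, 3))  # rows match a = 0.5, b ~ 0.653, c ~ 0.271 above
  ```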
  • 100. − The output of a 2-dimensional FDCT is a set of N × N coefficients representing the image block data in the DCT domain which can be considered as ‘weights’ of a set of standard basis patterns. − The basis patterns for the 4 × 4 DCT are shown. − The basis patterns are composed of combinations of horizontal and vertical cosine functions. − Any image block may be reconstructed by combining all N × N basis patterns, with each basis multiplied by the appropriate weighting factor (coefficient). 100 Discrete Cosine Transform
  • 101. 101 Discrete Cosine Transform (4×4 basis patterns) [Figure: the 16 basis patterns, indexed by u = 0…3 horizontally and v = 0…3 vertically]
  • 102. 102 Discrete Cosine Transform [Figure: the 8-point forward DCT equations] N = 8 and i, j = 0, …, 7. In real codecs: −2048 ≤ D(i, j) ≤ +2047
  • 103. 103 Discrete Cosine Transform [Figure: the 8-point inverse DCT equations] N = 8 and i, j = 0, …, 7
  • 104. Discrete Cosine Transform DCT: − Basis vectors: cos[kπ(2n + 1)/2N], k, n = 0, …, N − 1 − For orthonormality, transform coefficients are divided by √N − Both transforms are orthonormal, but the DCT has smoothly varying basis vectors that match natural images better. 104 DCT and Hadamard 8×8 Matrices [Figure: the 8×8 DCT matrix, whose rows are built from cos x, cos 2x, cos 3x, sin x, sin 2x, sin 3x terms, alongside the 8×8 Hadamard matrix of ±1 entries] Hadamard Transform: H_n = [ H_{n−1} H_{n−1} ; H_{n−1} −H_{n−1} ], with H₀ = 1
  • 105. Level-shift by subtracting 128 from each array entry, because the DCT is designed to work on pixel values ranging from −128 to 127. The transform is then D = T M Tᵀ. 105 Discrete Cosine Transform Implementation Example
  • 106. − DCT calculations are mathematically intensive. − Easier to use simple matrix manipulation and a “look-up” matrix. − “Look-up” matrix act like a key or look-up table. − This “look-up” matrix is called the basis pictures. − For a 2x2 DCT block the basis pictures are 4x4. − For an 8x8 DCT block the basis pictures are 64x64. Discrete Cosine Transform 106
  • 107. 107 [Figure: the 8 × 8 DCT basis patterns] Discrete Cosine Transform (8×8 DCT basis patterns) − The basis patterns for the 8 × 8 DCT are shown. − The basis patterns are composed of combinations of horizontal and vertical cosine functions. − Any image block may be reconstructed by combining all N × N basis patterns, with each basis multiplied by the appropriate weighting factor (coefficient).
  • 108. 108 Discrete Cosine Transform (8×8 DCT basis patterns)
  • 112. Note that: – Low-low coefficients are much larger than high-high coefficients – While pixel values change at all positions, DCT values are mainly larger at low frequency. 8x8 DCT Example 112
  • 113. 8x8 pixels are coded and the lowest N out of 64 coefficients are retained for inverse DCT 8x8 DCT Example 113
  • 114. DCT coding with increasingly coarse quantization, block size 8x8 Typical DCT Coding Artifacts Quantizer Stepsize For AC Coefficients: 25 Quantizer Stepsize For AC Coefficients: 100 Quantizer Stepsize For AC Coefficients: 200 114
  • 115. 115 Discrete Cosine Transform (4×4 DCT basis patterns) 4 × 4 DCT basis patterns
  • 116. 116 Image section showing 4 × 4 block Original block DCT coefficients 4x4 DCT Example
  • 117. 117 Original block DCT coefficients Block reconstructed from 1, 2, 3, 5 coefficients 4x4 DCT Example
  • 118. 118 4 × 4 DCT basis patterns 8 × 8 DCT basis patterns Discrete Cosine Transform (DCT basis patterns Comparison)
  • 119. Top Field and Bottom Field Pixels 119
  • 120. 120 Frame Type DCT vs. Field Type DCT [Figure: luminance macroblock (16×16) structure, split into 8×8 blocks, in frame-organized DCT coding (for slow motion) and in field-organized DCT coding (for fast motion)]
  • 121. 121 Frame Type DCT vs. Field Type DCT
  • 122. − The significant DCT coefficients of a block of image or residual samples are typically the ‘low frequency’ positions around the DC (0,0) coefficient. − Figure plots the probability of non-zero DCT coefficients at each position in an 8 × 8 block. − The non-zero DCT coefficients are clustered around the top-left (DC) coefficient and the distribution is roughly symmetrical in the horizontal and vertical directions. 122 DCT Coefficient Distribution 8 × 8 DCT coefficient distribution (Frame)
  • 123. − Histograms for 8x8 DCT coefficient amplitudes measured for natural images (from Mauersberger). − DC coefficient is typically uniformly distributed. − For the other coefficients, the distribution resembles a Laplacian pdf. 123 Amplitude Distribution of the DCT Coefficients
  • 124. − Figure plots the probability of non-zero DCT coefficients for a residual field. − The coefficients are clustered around the DC position but are ‘skewed’, i.e. more non-zero coefficients occur along the left-hand edge of the plot. − This is because a field picture may have a stronger high-frequency component in the vertical axis due to the subsampling in the vertical direction, resulting in larger DCT coefficients corresponding to vertical frequencies. 124 DCT Coefficient Distribution 8 × 8 DCT coefficient distribution (Field)
  • 125. − The zig-zag scan may not be ideal for a field block because of the skewed coefficient distribution, and a modified scan order may be more effective for some field blocks, in which coefficients on the left hand side of the block are scanned before the right hand side. 125 DCT Coefficient Scan Zigzag scan example : frame block Zigzag scan example : field block
  • 128. 128 Discrete Cosine Transform [Figure: after the DCT, the low-frequency corner holds normally big numbers and the high-frequency corner normally small numbers]
  • 129. 129 Discrete Cosine Transform [Figure: the same block annotated — the few big low-frequency numbers carry the entropy, while the many small high-frequency numbers are redundancy]
  • 130. 130 3-Dimensional DCT − Removes spatiotemporal correlation − Good for low-motion video − Bad for high-motion video − Frame storage → large delay F(u, v, w) = (8/N³) C(u) C(v) C(w) Σ_{t=0}^{N−1} Σ_{y=0}^{N−1} Σ_{x=0}^{N−1} f(x, y, t) cos[(2x + 1)uπ / 2N] cos[(2y + 1)vπ / 2N] cos[(2t + 1)wπ / 2N], for u = 0, …, N−1, v = 0, …, N−1 and w = 0, …, N−1, where N = 8 and C(k) = 1/√2 for k = 0, 1 otherwise
  • 131. The transform should – Minimize the correlation among the resulting coefficients, so that scalar quantization can be employed without losing too much coding efficiency compared to vector quantization – Compact the energy into as few coefficients as possible Optimal transform − Karhunen–Loève Transform (KLT) • Signal-statistics dependent • It is an optimum transform, giving complete decorrelation Suboptimal transforms − Discrete Cosine Transform (DCT): nearly as good as the KLT for common image signals − Hadamard transform, with all elements +1 or −1. 131 Why DCT? What Block Size?
  • 132. Properties of the DCT: − Smoothly varying basis vectors that match natural images better (better than Hadamard) − Basis vectors are not sparse (better than the DFT, which has many zero-valued coefficients at small block sizes) − Basis vectors closely match natural scenes as the KLT's do, but the DCT uses a fixed and fast transformation algorithm (better than the KLT). 132 Why DCT? What Block Size? [Figure: mean-squared error (1%–5%) vs. block size (4×4 to 64×64) for an equal number of retained coefficients — DFT and HT perform worse; KLT and DCT almost coincide]
  • 133. Properties of the DCT: − Efficiency as a function of block size NxN, measured for 8 bit quantization in the original domain and equivalent quantization in the transform domain − Block size 8x8 is a good compromise. 133 Efficiency Why DCT? What Block Size?
  • 134. − Wavelet is a non-periodic element, i.e. a mini wave. − Uses a set of ‘mother wavelets’. − Scale and transform actions possible. − Better at high frequency capture. − Less visual degradation than DCT. − Graceful degradation at high compression. − Good for audio compression. Wavelet Coding 134
  • 136. The ‘wavelet transform’ is based on sets of filters with coefficients that are equivalent to discrete wavelet functions − A pair of filters is applied to the signal to decompose it into a low frequency band (L) and a high frequency band (H). − Each band is subsampled by a factor of two, so that the two frequency bands each contain N/2 samples. − With the correct choice of filters, this operation is reversible. 136 Wavelet
  • 137. − This approach may be extended to apply to a 2-dimensional signal such as an intensity image. − Each row of a 2D image is filtered with a low-pass and a high-pass filter (Lx and Hx) − The output of each filter is down-sampled by a factor of two to produce the intermediate images L and H. − L is the original image low-pass filtered and downsampled in the x-direction and H is the original image high-pass filtered and downsampled in the x-direction. − Each column of these new images is filtered with low- and high-pass filters (Ly and Hy) − The output of each filter is down-sampled by a factor of two to produce four sub-images LL, LH, HL and HH. 137 Wavelet
  • 138. • ‘LL’ is the original image, low-pass filtered in horizontal and vertical directions and subsampled by a factor of two. • ‘HL’ is high-pass filtered in the vertical direction and contains residual vertical frequencies • ‘LH’ is high-pass filtered in the horizontal direction and contains residual horizontal frequencies • ‘HH’ is high-pass filtered in both horizontal and vertical directions. − Between them, the four sub-band images contain all of the information present in the original image but the sparse nature of the LH, HL and HH sub-bands makes them amenable to compression. 138 Wavelet
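  A minimal sketch of one decomposition level as just described, assuming numpy and substituting the simplest possible filter pair, the Haar average/difference, for Lx/Hx and Ly/Hy (real codecs use longer wavelet filters):

  ```python
  import numpy as np

  def haar_analysis_2d(img):
      """One 2-D decomposition level with the Haar pair (low-pass = average,
      high-pass = difference) standing in for the filters Lx/Hx and Ly/Hy."""
      L = (img[:, 0::2] + img[:, 1::2]) / 2   # row filter + downsample in x
      H = (img[:, 0::2] - img[:, 1::2]) / 2
      LL = (L[0::2, :] + L[1::2, :]) / 2      # column filter + downsample in y
      HL = (L[0::2, :] - L[1::2, :]) / 2      # residual vertical frequencies
      LH = (H[0::2, :] + H[1::2, :]) / 2      # residual horizontal frequencies
      HH = (H[0::2, :] - H[1::2, :]) / 2      # diagonal detail
      return LL, LH, HL, HH

  img = np.tile(np.linspace(0, 255, 16), (16, 1))   # smooth horizontal ramp
  LL, LH, HL, HH = haar_analysis_2d(img)
  print(np.abs(HL).max(), np.abs(HH).max())   # ~0: no vertical/diagonal detail
  ```

  Iterating haar_analysis_2d on the LL output yields the sub-band tree of the next slide.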
  • 139. − In an image compression application, the 2-dimensional wavelet decomposition is applied again to the ‘LL’ image, forming four new sub-band images. − The resulting low-pass image, always the top-left sub-band image, is iteratively filtered to create a tree of sub-band images. 139 Wavelet
  • 140. − Many of the samples (coefficients) in the higher-frequency sub-band images are close to zero, shown here as near-black, and it is possible to achieve compression by removing these insignificant coefficients prior to transmission. − At the decoder, the original image is reconstructed by repeated up-sampling, filtering and addition, reversing the order of operations. 140 Wavelet
  • 142. Many coefficients in higher sub-bands, towards the bottom-right of the figure, are near zero and may be quantized to zero without significant loss of image quality. − Non-zero coefficients tend to be related to structures in the image; for example, the violin bow appears as a clear horizontal structure in all the horizontal and diagonal sub- bands. − When a coefficient in a lower-frequency sub-band is non-zero, there is a strong probability that coefficients in the corresponding position in higher frequency sub-bands will also be non-zero. 142 Wavelet Coefficient Scan A typical distribution of 2D wavelet coefficients
  • 143. We may consider a ‘tree’ of non-zero quantized coefficients, starting with a ‘root’ in a low-frequency sub-band. − A single coefficient in the LL band of layer 1 has one corresponding coefficient in each of the other bands of layer 1, i.e. these four coefficients correspond to the same region in the original image. − The layer 1 coefficient position (parent coefficient) maps to four corresponding child coefficient positions in each sub-band at layer 2. − Recall that the layer 2 sub-bands have twice the horizontal and vertical resolution of the layer 1 sub-bands. 143 Wavelet Coefficient Scan [Figure: root and parent coefficients in the LL-level bands, each mapping to child coefficients in the higher-resolution sub-bands]
  • 144. − Idea: Conditional coding of all descendants (incl. children) − significant coefficients: Coefficient magnitude > Threshold − Four cases (The coefficients are coded by symbol P, N, ZTR, or IZ) • ZTR (Zero Tree Root): coefficient and all descendants are not significant • IZ (Isolated Zero): coefficient is not significant, but some descendants are significant • POS: POSitive significant (greater than the given threshold) • NEG: NEGative significant (greater than the given threshold ) 144 Zero Tree Encoding (Embedded Zero-tree Wavelet Algorithm)
  • 145. − It is desirable to encode the non-zero wavelet coefficients as compactly as possible prior to entropy coding. − An efficient way of achieving this is to encode each tree of non-zero coefficients starting from the lowest or root level of the decomposition. − A coefficient at the lowest layer is encoded, followed by its child coefficients at the next layer up, and so on. The encoding process continues until the tree reaches a zero-valued coefficient. − Further children of a zero-valued coefficient are likely to be zero themselves and so the remaining children are represented by a single code that identifies a tree of zeros (zero tree). − The decoder reconstructs the coefficient map starting from the root of each tree; non-zero coefficients are decoded and reconstructed and when a zerotree code is reached, all remaining ‘children’ are set to zero. − This is the basis of the embedded zero tree (EZW) method of encoding wavelet coefficients. 145 Zero Tree Encoding
  • 147. Transformation does not result in compression by itself − Due to the linearity of the transformation, the energy in the pixel domain equals the energy in the transform domain − But the transformation concentrates the energy into a few transform coefficients − It is the quantisation of the transform coefficients that leads to compression (bit rate reduction) − Small-valued transform coefficients are set to zero 147 Quantisation of DCT Coefficients
  • 148. − A quantizer maps a signal with a range of values X to a quantized signal with a reduced range of values Y. − It should be possible to represent the quantized signal with fewer bits than the original, since the range of possible values is smaller. − A scalar quantizer maps one sample of the input signal to one quantized output value 148 Quantisation Quantizer (Mapping) X → Y (with reduced range); Y is represented with fewer bits
  • 149. − A more general example of a uniform quantizer is: FQ = Round(X / QP), Y = FQ · QP, where QP is a quantization ‘step size’ and FQ is the forward quantizer output. 149 Scalar Quantization Quantizer (Mapping)
  • 150. − In image and video compression CODECs, the quantization operation is usually made up of two parts, a forward quantizer FQ in the encoder and an ‘inverse quantizer’ or ‘rescaler’ (IQ) in the decoder. − If the step size is large, the range of quantized values is small and can therefore be efficiently represented and hence highly compressed during transmission, but the re-scaled values are a crude approximation to the original signal. − If the step size is small, the re-scaled values match the original signal more closely but the larger range of quantized values reduces compression efficiency. 150 Quantization Encoder (FQ: Forward Quantizer ) Decoder (IQ: Inverse Quantizer) 𝑌 = 𝐹𝑄. 𝑄𝑃 𝐹𝑄 = 𝑅𝑜𝑢𝑛𝑑 ( 𝑋 𝑄𝑃 ) 𝑋
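  The FQ/IQ pair above in a few lines (numpy assumed; the sample values and step sizes are illustrative), showing the trade-off just described as QP grows:

  ```python
  import numpy as np

  def forward_quantise(x, qp):
      """Encoder side: FQ = Round(X / QP)."""
      return np.round(x / qp).astype(int)

  def inverse_quantise(fq, qp):
      """Decoder side: Y = FQ * QP (rescaling)."""
      return fq * qp

  x = np.array([-60.3, -3.2, 0.4, 7.9, 41.7])
  for qp in (2, 8, 24):
      y = inverse_quantise(forward_quantise(x, qp), qp)
      # small QP: close match but a wide index range; large QP: coarse but compact
      print(qp, y, float(np.abs(x - y).max()))
  ```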
  • 151. 151 Linear and non-Linear Scalar Quantizer
  • 152. − A vector quantizer maps a set of input data such as a block of image samples to a single value (codeword) and at the decoder, each codeword maps to an approximation to the original set of input data, a ‘vector’. − The set of vectors are stored at the encoder and decoder in a codebook. 152 Vector Quantization Vector Quantizer (Mapping) A Set Of Input Data A Single Value (Codeword)
  • 153. 1. Partition the original image into regions such as N × N pixel blocks. 2. Choose a vector from the codebook that matches the current region as closely as possible. 3. Transmit an index that identifies the chosen vector to the decoder. 4. At the decoder, reconstruct an approximate copy of the region using the selected vector. 153 A typical application of Vector Quantization − Here, quantization is applied in the image (spatial) domain, i.e. groups of image samples are quantized as vectors − But it can equally be applied to motion compensated and/or transformed data. Key issues: the design of the codebook and efficient searching of the codebook to find the optimal vector.
  • 154. 154 Vector Quantization [Figure: the encoder searches a codebook of n codewords for the nearest match to each source-image block and transmits only the index i of the nearest codeword; the decoder looks the codeword up in an identical codebook to build the decoded image]
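  A minimal sketch of steps 1–4, assuming numpy, with random blocks and a random codebook standing in for a properly designed one (codebook design and efficient search are the hard parts, as noted above):

  ```python
  import numpy as np

  def vq_encode(blocks, codebook):
      """Map each flattened N*N block to the index of the nearest codeword
      (squared Euclidean distance); only the indices are transmitted."""
      d = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
      return d.argmin(axis=1)

  def vq_decode(indices, codebook):
      """Decoder: reconstruct each region by codebook lookup."""
      return codebook[indices]

  codebook = np.random.rand(16, 16)     # 16 codewords for 4x4 blocks (flattened)
  blocks = np.random.rand(100, 16)      # 100 image regions as vectors
  idx = vq_encode(blocks, codebook)     # transmit only 100 small indices
  approx = vq_decode(idx, codebook)     # approximate copy of each region
  ```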
  • 156. Equal distances between adjacent decision levels and between adjacent reconstruction levels: t_l − t_{l−1} = r_l − r_{l−1} = q • Parameters of Uniform Quantization – R: bit resolution – L: levels (L = 2^R) – B: dynamic range of the input, B = f_max − f_min – q: quantization interval (step size), q = B/L = B · 2^(−R) • Quantization function 156 Uniform Quantization
  • 157. Input signal is continuous • The output of a Charge-Coupled Device (CCD) camera is in the range of 0.0 to 5 volts. • L = 256 – q = 5/256 – An output value in the interval (l × q, (l + 1) × q) is represented by index l, l = 0, …, 255. – The reconstruction level: Q(f) = ⌊(f − f_min)/q⌋ × q + q/2 + f_min → r_l = l × q + q/2, l = 0, …, 255. 157 Example 1 of Uniform Quantizer
  • 158. Input signal is discrete • A digital image of 256 gray levels is quantized into 4 levels – q = 256/4 = 64 – The reconstruction level: Q(f) = ⌊(f − f_min)/q⌋ × q + q/2 + f_min → Q(f) = ⌊f/64⌋ × 64 + 32 158 Example 2 of Uniform Quantizer
  • 160. 160 Uniform Threshold Quantiser (UTQ) − The class of quantiser that has been used in all standard video codecs is based around the so- called Uniform Threshold Quantiser (UTQ). − It has equal step sizes with reconstruction values pegged to the centroid of the steps. − The centroid value is typically defined midway between quantisation intervals. q q
  • 161. 161 Uniform Threshold Quantiser (UTQ) and Bit Rate Control − The DC coefficient has a fairly uniform distribution. − Although AC transform coefficients have nonuniform characteristics, and hence can be better quantised with nonuniform quantiser step sizes, but bit rate control would be easier if they were quantised linearly. − Hence, a key property of UTQ is that the step sizes can be easily adapted to facilitate bit rate control. q q
  • 162. 162 Uniform Threshold Quantiser (UTQ) Uniform Threshold Quantiser (UTQ) (a) with and (b) without dead zone UTQ-DZ UTQ
  • 163. 163 Uniform Threshold Quantiser (UTQ) − Typically, UTQ is used for quantising intraframe DC, F(0, 0), coefficients, while UTQ-DZ is used for the AC and the DC coefficients of the interframe prediction error. − This is intended primarily to cause more nonsignificant AC coefficients to become zero, thus increasing the compression. [Figure: UTQ, for quantising intraframe DC, F(0, 0), coefficients; UTQ-DZ, for quantising the AC and the DC coefficients of the interframe prediction error]
  • 164. 164 Uniform Threshold Quantiser (UTQ) − Both quantisers are derived from the generic quantiser: in UTQ, th is set to zero, but in UTQ-DZ it is set to q/2, and in the innermost region th is allowed to vary between q/2 and q, just to increase the number of zero-valued outputs. → Thus, the dead zone length can range from q to 2q. − In some implementations (e.g. H.263 or MPEG-4), the decision and/or the reconstruction levels of the UTQ-DZ quantiser might be shifted by q/4 or q/2.
  • 165. − In practice, rather than transmitting a quantised coefficient F(u, v) to the decoder, its ratio to the quantiser step size, called the Quantisation Index, is transmitted: I(u, v) = F(u, v) / q − The reason for defining the quantisation index is that it has a much smaller entropy than the quantised coefficient. At the decoder, the reconstructed coefficients, F_q(u, v), after inverse quantisation, are given by F_q(u, v) = I(u, v) · q − If required, depending on the polarity of the index, an addition or subtraction of half the quantisation step is needed to deliver the centroid representation, reflecting the quantisation characteristics in the previous slide. 165 Quantization Index Quantizer (Mapping)
  • 166. − For the standard codecs, the quantiser step size q is fixed at 8 for UTQ, but varies from 2 to 62, in even step sizes, for the UTQ- DZ (2,4,6,8,…,60,62). − Hence, the entire quantiser range, or the quantiser parameter Qp, can be defined with 5 bits (1–31). − Uniform quantisers with and without dead zone can also be used in DPCM coding of pixels. Here, threshold is set to zero, th=0, and the quantisers are usually identified with odd and even number of levels, respectively. 166 Quantization Step Size even number of levels odd number of levels
  • 167. One of the main problems of linear quantisers in DPCM is that at lower bit rates the number of quantisation levels is limited and hence the quantiser step size is large. In the coding of plain areas of the picture (in plain areas the DPCM output is near zero): − If a quantiser with an even number of levels is used, the reconstructed pixels oscillate between −q/2 and +q/2. − This type of noise in these areas, in particular at low luminance levels, is visible and is called granular noise. − Larger quantiser step sizes with an odd number of levels (dead zone) reduce the granular noise, but cause loss of pixel resolution in the plain areas. − This type of noise, when the quantiser step size is relatively large, is annoying and is called contouring noise. 167 Granular and Contouring Noises even number of levels odd number of levels
  • 168. Banding/contouring and granular noise − It can be seen that when the original analog input signal has a relatively constant amplitude, the reconstructed signal has variations that were not present in the original signal. 168 Granular and Contouring Noises [Figure: the same signal quantised at 8 bits (256 levels) and at 10 bits (1024 levels)]
  • 169. 169 Quantisation [Figure: an 8×8 block of DCT coefficients (DC = 238) divided element-by-element by a quantisation matrix of all 1s — the quantised coefficients are unchanged] Quantisation Matrix, Quantised DCT Coefficients, DCT Coefficients, Different Step-sizes (Q)
  • 170. 170 Quantisation [Figure: the same DCT coefficients divided by a matrix whose bottom-right (highest-frequency) step sizes grow to 2 — a few high-frequency values shrink]
  • 171. 171 Quantisation [Figure: step sizes now reach 4 in the high-frequency corner — more small coefficients become zero] Zig-zag Scanning for Separating Redundancy and Entropy
  • 172. 172 Quantisation [Figure: step sizes up to 8 — the non-zero values retreat further toward the DC corner] Zig-zag Scanning for Separating Redundancy and Entropy
  • 173. 173 Quantisation [Figure: step sizes up to 16 — only low-frequency coefficients survive] Zig-zag Scanning for Separating Redundancy and Entropy
  • 174. 174 Quantisation [Figure: step sizes up to 32 — almost all remaining energy sits in a handful of low-frequency coefficients] Zig-zag Scanning for Separating Redundancy and Entropy
  • 175. Zig-zag Scanning 175 DC and low-frequency coefficients come first and the high-frequency coefficients last.
  • 176. Zig-zag Scanning for Separating Redundancy and Entropy 176 [Figure: the scan starts in the entropy (low-frequency) region and ends in the redundancy (high-frequency) region] DC and low-frequency coefficients come first and the high-frequency coefficients last.
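  The zig-zag order itself is easy to generate; the following is a small sketch (pure Python) that sorts the block positions by anti-diagonal, alternating the traversal direction along each diagonal as in the MPEG-2 scan:

  ```python
  def zigzag_order(n=8):
      """Coordinates of an n x n block in zig-zag scan order: DC and the
      low-frequency coefficients first, high frequencies last."""
      return sorted(((i, j) for i in range(n) for j in range(n)),
                    key=lambda p: (p[0] + p[1],                 # anti-diagonal
                                   p[0] if (p[0] + p[1]) % 2    # odd: go down
                                   else p[1]))                  # even: go up

  print(zigzag_order(4)[:8])
  # [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2), (0, 3), (1, 2)]
  ```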
  • 177. − Use a uniform quantiser for each coefficient − Different coefficients are quantized with different step sizes (Q): − The human eye is more sensitive to low-frequency components • Low-frequency coefficients get a smaller Q • High-frequency coefficients get a larger Q − Specified in a normalization matrix (Standard Quantization Matrix) − The normalization matrix can then be scaled by a scale factor 177 Different Step-sizes (Q) (JPEG Standard Quantization Matrix)
  • 178. In JPEG we have quality levels from 1 to 100. − With a quality level of 50 we get high compression and excellent decompressed image quality (Standard Quantization Matrix). − For a quality level greater than 50 (less compression, higher image quality), the standard quantization matrix is multiplied by (100 − Quality Level)/50 − For a quality level less than 50 (more compression, lower image quality), the standard quantization matrix is multiplied by 50/Quality Level − The quantization matrix is then rounded and clipped to have positive integer values ranging from 1 to 255. 178 Ex: Different Quality Levels in JPEG by Quantization Matrix (JPEG Standard Quantization Matrix)
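  A sketch of the quality scaling rule just described, assuming numpy; the Q50 array below is the widely published JPEG luminance example matrix, assumed here to match the slides' standard quantization matrix.

  ```python
  import numpy as np

  # the widely published 8x8 JPEG luminance quantization matrix (Q50)
  Q50 = np.array([
      [16, 11, 10, 16,  24,  40,  51,  61],
      [12, 12, 14, 19,  26,  58,  60,  55],
      [14, 13, 16, 24,  40,  57,  69,  56],
      [14, 17, 22, 29,  51,  87,  80,  62],
      [18, 22, 37, 56,  68, 109, 103,  77],
      [24, 35, 55, 64,  81, 104, 113,  92],
      [49, 64, 78, 87, 103, 121, 120, 101],
      [72, 92, 95, 98, 112, 100, 103,  99]])

  def quality_matrix(quality):
      """Scale Q50 as on the slide: (100 - q)/50 above 50, 50/q below,
      then round and clip to positive integers in 1..255."""
      scale = (100 - quality) / 50 if quality > 50 else 50 / quality
      return np.clip(np.round(Q50 * scale), 1, 255).astype(int)

  print(quality_matrix(90)[0])   # finer steps, e.g. [ 3  2  2  3  5  8 10 12]
  ```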
  • 179. Level-shift by subtracting 128 from each array entry, because the DCT is designed to work on pixel values ranging from −128 to 127. The transform is then D = T M Tᵀ. 179 Ex: Different Quality Levels in JPEG by Quantization Matrix
  • 180. 180 Ex: Different Quality Level in JPEG by Quantization Matrix (Standard Quantization Matrix)
  • 181. 181 (Standard Quantization Matrix) Ex: Quantization with Matrix Q50 in JPEG
  • 182. 182 Ex: Inverse Quantitation in JPEG (Standard Quantization Matrix)
  • 183. 183 Ex: Inverse DCT and adding 128 in JPEG
  • 184. 184 Ex: Comparison between Original and Decompressed Block
  • 188. Example: Quantized Indices Default Normalization Matrix in JPEG 188 (Standard Quantization Matrix) QM(i,j)
  • 189. The ratios of the quantized coefficients to their quantizer step sizes (the elements of the previous normalization matrix) give the indices. Example: Quantized Indices 189
  • 190. Multiplying the indices by the step sizes gives the quantized coefficient values to be used for the inverse transform. Example: Quantized Coefficients 190
  • 192. 192 Quantization Noise and Bit Resolution
  • 193. 193 Quantization Noise [Figure: zoom-in of the quantiser staircase, slope = 1] − Pink dots show the analog range that maps to a single ADC value. − Black arrows show the quantization error for 2 points. [Figure: PDF of the quantization error]
  • 194. − Quantization error is uniformly distributed. − Its PDF integrates to 1. 194 Quantization Noise [Figure: transfer staircase with slope = 1 and the uniform error PDF]
  • 195. − The RMS value for a full-scale sinusoidal input spanning 2^N · Δ is (2^N · Δ)/(2√2), while the RMS quantization noise is Δ/√12. − Then SQNR (dB) = 20 log₁₀[(2^N · Δ / 2√2) / (Δ/√12)] ≈ 6.02 N + 1.76 dB 195 Quantization Noise and SQNR
  • 196. 196 PSNR for a Sine Waveform SQNR = 10 log [RMS Signal Power / RMS Quantization Noise Power] = 6B + 1.78 PSNR = 10 log [Peak Signal Power / RMS Quantization Noise Power] = ? For a sine of amplitude A (peak-to-peak 2A): Peak Signal Power / RMS Quantization Noise Power = [(2A)² / (A/√2)²] × [RMS Signal Power / RMS Quantization Noise Power] = 8 × [RMS Signal Power / RMS Quantization Noise Power] Hence PSNR = 10 log 8 + (6B + 1.78) ≈ 6B + 11 (dB)
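  Both formulas can be checked numerically; a small sketch assuming numpy, with B = 8 bits and a full-scale sine of amplitude A:

  ```python
  import numpy as np

  B, A = 8, 1.0                       # bits per sample; sine amplitude
  delta = 2 * A / 2**B                # step size over the full range 2A
  noise_rms = delta / np.sqrt(12)     # RMS of the uniform quantization error
  sine_rms = A / np.sqrt(2)           # RMS of a full-scale sinusoid
  sqnr = 20 * np.log10(sine_rms / noise_rms)
  psnr = 20 * np.log10(2 * A / noise_rms)   # peak signal = full range 2A
  print(round(sqnr, 2), round(psnr, 2))     # ~49.92 dB (~6B+1.8) and ~58.96 dB (~6B+11)
  ```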
  • 198. 198 Elementary Information Theory − How much information does a symbol convey? − Intuitively, the more unpredictable or surprising it is, the more information is conveyed. − Conversely, if we strongly expected something, and it occurs, we have not learnt very much
  • 199. 199 Elementary Information Theory − If p is the probability that a symbol will occur − The amount of information, I, conveyed is: − The information, I, is measured in bits − It is the optimum code length for the symbol − The entropy, H, is the average information per symbol − Provides a lower bound on the compression that can be achieved 𝑰 = 𝐥𝐨𝐠 𝟐 𝟏 𝒑 𝐻 = ෍ 𝑠 𝑝 𝑠 log2 1 𝑝(𝑠)
  • 200. 200 Elementary Information Theory A simple example − Suppose we need to transmit four possible weather conditions: 1. Sunny 2. Cloudy 3. Rainy 4. Snowy − If all conditions are equally likely, p(s)=0.25→H=2 – i.e. we need a minimum of 2 bits per symbol
  • 201. 201 Elementary Information Theory A simple example − Suppose we need to transmit four possible weather conditions: 1. Sunny 0.5 of the time 2. Cloudy 0.25 of the time 3. Rainy 0.125 of the time 4. Snowy 0.125 of the time − Then the entropy is H = 0.5 log₂(1/0.5) + 0.25 log₂(1/0.25) + 2 × 0.125 log₂(1/0.125) = 0.5 + 0.5 + 0.75 = 1.75 − i.e. we need a minimum of 1.75 bits per symbol
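  The entropy of both weather examples, as a small sketch in pure Python:

  ```python
  import math

  def entropy(probs):
      """H = sum over symbols of p * log2(1/p): minimum average bits/symbol."""
      return sum(p * math.log2(1 / p) for p in probs if p > 0)

  print(entropy([0.25, 0.25, 0.25, 0.25]))    # 2.0  (equally likely case)
  print(entropy([0.5, 0.25, 0.125, 0.125]))   # 1.75 (skewed case above)
  ```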
  • 202. – It reduces the amount of data or the bit rate. – Truly lossless. – Different types: • Fractal Coding • Run Length Coding (RLC) or Run-Level Encoding • Variable Length Coding (VLC) – [i.e. Huffman/Arithmetic] • Wavelet Coding – Compression systems often do not use all of them together. – Some systems combine different types. Entropy Coding 202
  • 203. − Resulting from studies by Benoit Mandelbrot. − Images are self-similar. − Self-similar shapes are called fractals. − Scale, stretch, rotate, mirror and skew actions are possible. − Computationally intensive. − Requires multiple sweeps. − Difficult to do on video in real time. Fractal Coding 203
  • 204. Run Length Coding • Replaces runs of the same number with a code … or … • particular strings of numbers with a code. 204 [Figure: e.g. a run of nineteen 2s is coded as 19[2] and a run of twenty-four 0s as 24[0]; alternatively, recurring strings such as 5472, 8745 and 6868 are each replaced by a short code]
  • 205. 205 Run Length Coding Sample Block Zigzag Scanning (MPEG-2) for doing RLC
  • 206. 206 Run Length Coding Sample Block Run-length Encoding (MPEG-2)
  • 207. The output of the re-ordering process of transform coefficient is an array that typically contains one or more clusters of non-zero coefficients near the start, followed by strings of zero coefficients. − The large number of zero values may be encoded to represent them more compactly. − The array of re-ordered coefficients are represented as (run,level) pairs where run: indicates the number of zeros preceding a non-zero coefficient. level: indicates the magnitude of the non-zero coefficient. 207 Run-Level Encoding
  • 208. Example 1. Input array: 16, 0, 0, −3, 5, 6, 0, 0, 0, 0, −7 2. Output values: (0,16), (2,−3), (0,5), (0,6), (4,−7) 3. Each of these output values (run, level) is encoded as a separate symbol by the entropy encoder. ‘Three-dimensional’ Run-level Encoding If ‘three-dimensional’ run-level encoding is used, each symbol encodes three quantities: run, level and last. In the example above, if −7 is the final non-zero coefficient, the 3-D values are: (0, 16, 0), (2, −3, 0), (0, 5, 0), (0, 6, 0), (4, −7, 1) The 1 in the final code indicates that this is the last non-zero coefficient in the block. 208 Run-Level Encoding
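  A minimal sketch of the run-level encoder (pure Python), reproducing the example above, plus the ‘three-dimensional’ variant with the last flag:

  ```python
  def run_level_encode(coeffs):
      """(run, level) pairs: run = number of zeros preceding each non-zero
      coefficient, level = its magnitude and sign, in scan order."""
      pairs, run = [], 0
      for c in coeffs:
          if c == 0:
              run += 1
          else:
              pairs.append((run, c))
              run = 0
      return pairs

  def run_level_last(coeffs):
      """'Three-dimensional' variant: last = 1 marks the final pair."""
      pairs = run_level_encode(coeffs)
      return [(r, l, 1 if k == len(pairs) - 1 else 0)
              for k, (r, l) in enumerate(pairs)]

  print(run_level_encode([16, 0, 0, -3, 5, 6, 0, 0, 0, 0, -7]))
  # [(0, 16), (2, -3), (0, 5), (0, 6), (4, -7)]
  print(run_level_last([16, 0, 0, -3, 5, 6, 0, 0, 0, 0, -7]))
  # [(0, 16, 0), (2, -3, 0), (0, 5, 0), (0, 6, 0), (4, -7, 1)]
  ```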
  • 209. Variable Length Coding 209 [Table: code table — 0 = 0, +1 = 101, −1 = 100, +2 = 1101, −2 = 1100, +3 = 11101, −3 = 11100, +4 = 111101, −4 = 111100, +5 = 1111101, −5 = 1111100, …] Original numbers: +1 −3 0 0 +4 −5 +2 −1 0 +1 +3 → Codes: 101111000011110111111001101100010111101 (11 numbers × 8 bits = 88 bits fixed-length vs. 39 bits with VLC) Commonly occurring numbers get short codes; rarely occurring numbers get longer codes.
  • 210. Variable Length Coding 210 [Figure: decoding the bit string 101111000011110111111001101100010111101 with the same code table regenerates the numbers one by one: +1, −3, 0, 0, +4, −5, +2, −1, 0, +1, +3] Commonly occurring numbers have short codes; rarely occurring numbers have longer codes.
  • 211. 211 Variable Length Coding Variable-length Encoding of Sample Block Coefficients (MPEG-2)
  • 212. – True data reduction. – Totally lossless. – Replaces numbers with codes. • Run length coding can also be called entropy coding. – Commonly occurring numbers have a small code & rare numbers have a bigger code. – Relies on common numbers occurring a lot. Variable Length Coding 212
• 213. Variable Length Coding − In VLC, the lengths of the codes should vary inversely with the probability of occurrence of the various symbols. − The bit rate required to code a symbol of probability p is the logarithm of the inverse of that probability, log₂(1/p) bits. − Hence the entropy of the symbols, which is the minimum average number of bits required to code them, can be calculated as H(x) = Σₛ p(s)·log₂(1/p(s)) = −Σᵢ Pᵢ·log₂ Pᵢ − There are two types of VLC: Huffman and arithmetic coding. − Huffman coding is a simple VLC, but its compression can never reach as low as the entropy, due to the constraint that each assigned codeword must have an integral number of bits. − Arithmetic coding, however, can approach the entropy, since the symbols are not coded individually.
• 214. Huffman Coding − Huffman codes can be used to compress information − Like WinZip – although WinZip doesn't use the Huffman algorithm − JPEGs do use Huffman as part of their compression process − The basic idea is that instead of storing each character in a file as an 8-bit ASCII value, we store the more frequently occurring characters using fewer bits and the less frequently occurring characters using more bits − On average this should decrease the file size (often to roughly half)
• 215. Huffman Coding − As an example, let's take the string: “duke blue devils” − First, a frequency count of the characters: e:3, d:2, u:2, l:2, space:2, k:1, b:1, v:1, i:1, s:1 − Next, use a greedy algorithm to build up a Huffman tree − We start with a node for each character: e,3 d,2 u,2 l,2 sp,2 k,1 b,1 v,1 i,1 s,1
• 216. Huffman Coding — Pick the two nodes with the smallest frequencies and combine them to form a new node – the selection of these nodes is the greedy part • The two selected nodes are removed from the set and replaced by the combined node • This continues until only one node is left in the set (first merge: i,1 and s,1)
• 217.–226. Huffman Coding — these slides step through the greedy merging graphically: (i,1)+(s,1) → 2; (b,1)+(v,1) → 2; (k,1)+(b,v) → 3; (d,2)+(u,2) → 4; (l,2)+(sp,2) → 4; (i,s)+(k,b,v) → 5; (e,3)+(d,u) → 7; (l,sp)+(i,s,k,b,v) → 9; and finally 7+9 → 16, the root node covering all 16 characters.
• 227. Huffman Coding − Now we assign codes to the tree by placing – 0 on every left branch – 1 on every right branch − A traversal of the tree from root to leaf gives the Huffman code for that leaf's character − Note that no code is the prefix of another code. Resulting code table (for e:3, d:2, u:2, l:2, space:2, k:1, b:1, v:1, i:1, s:1): e 00, d 010, u 011, l 100, sp 101, i 1100, s 1101, k 1110, b 11110, v 11111
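For reference, the greedy construction is only a few lines of Python. This is my own illustrative sketch; heap tie-breaking means the exact codewords may differ from the slide, although the total encoded length comes out equally optimal:

    import heapq
    from collections import Counter

    def huffman_codes(text):
        freq = Counter(text)
        # Heap entries: (frequency, tiebreak, tree); a tree is either a
        # leaf character or a (left, right) pair of subtrees.
        heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
        heapq.heapify(heap)
        count = len(heap)
        while len(heap) > 1:
            f1, _, left = heapq.heappop(heap)   # two lowest-frequency nodes
            f2, _, right = heapq.heappop(heap)
            heapq.heappush(heap, (f1 + f2, count, (left, right)))
            count += 1
        codes = {}
        def walk(node, prefix):
            if isinstance(node, tuple):
                walk(node[0], prefix + "0")     # 0 on every left branch
                walk(node[1], prefix + "1")     # 1 on every right branch
            else:
                codes[node] = prefix or "0"
        walk(heap[0][2], "")
        return codes

    codes = huffman_codes("duke blue devils")
    print(sum(len(codes[ch]) for ch in "duke blue devils"))
    # -> 52 bits, against 16 x 8 = 128 bits uncompressed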
• 228. Huffman Coding − These codes are then used to encode the string − Thus, “duke blue devils” turns into: 010 011 1110 00 101 11110 100 011 00 101 010 00 11111 1100 100 1101 − When grouped into 8-bit bytes: 01001111 10001011 11101000 11001010 10001111 11100100 1101xxxx − Thus it takes 7 bytes of space compressed (52 bits plus padding) − Compare that with 16 characters at 1 byte/char → 16 bytes uncompressed
• 229. Huffman Coding − Uncompressing works by reading the file bit by bit • Start from the root of the tree • If a 0 is read, head left • If a 1 is read, head right • When a leaf is reached, decode that character and start over again at the root of the tree − Thus, we need to save the Huffman table information as a header in the compressed file • This doesn't add a significant amount of size for large files (which are the ones you want to compress anyway) • Or we could use a fixed, universal set of codes/frequencies
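A minimal Python sketch of this bit-by-bit walk, assuming the code table has been recovered from the file header (the trie-building approach and names are my own illustration):

    def huffman_decode(bits, codes):
        # Build a trie from the code table: 0 heads left, 1 heads right.
        root = {}
        for ch, code in codes.items():
            node = root
            for b in code[:-1]:
                node = node.setdefault(b, {})
            node[code[-1]] = ch
        # Walk the trie; on reaching a leaf, emit it and restart at the root.
        out, node = [], root
        for b in bits:
            node = node[b]
            if not isinstance(node, dict):
                out.append(node)
                node = root
        return "".join(out)

    # Using the code table from the slide:
    table = {"e": "00", "d": "010", "u": "011", "l": "100", " ": "101",
             "i": "1100", "s": "1101", "k": "1110", "b": "11110", "v": "11111"}
    bits = "".join(table[ch] for ch in "duke blue devils")
    print(huffman_decode(bits, table))  # -> duke blue devils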
• 230. Example: Huffman Coding, Sequence of Motion Vectors − The table lists the probabilities of the most commonly occurring motion vectors in the encoded sequence and their information content, log₂(1/p). − To achieve optimum compression, each value should be represented with exactly log₂(1/p) bits. − '0' is the most common value, and the probability drops for larger motion vectors.
• 231. Example: Huffman Coding, Sequence of Motion Vectors — 1. Generating the Huffman code tree − To generate a Huffman code table for this set of data, the following iterative procedure is carried out; it is repeated until there is a single 'root' node that contains all other nodes and data items listed 'beneath' it. 1. Order the list of data in increasing order of probability. 2. Combine the two lowest-probability data items into a 'node' and assign the joint probability of the data items to this node. 3. Re-order the remaining data items and node(s) in increasing order of probability and repeat step 2.
• 232. Example: Huffman Coding, Sequence of Motion Vectors — 1. Generating the Huffman code tree (cont.) − Original list: The data items are shown as square boxes. Vectors (−2) and (+2) have the lowest probability and these are the first candidates for merging, to form node 'A'. − Stage 1: The newly created node 'A', shown as a circle, has a probability of 0.2, from the combined probabilities of (−2) and (+2). There are now three items with probability 0.2; choose vectors (−1) and (+1) and merge them to form node 'B'. − Stage 2: 'A' now has the lowest probability (0.2), followed by 'B' and the vector (0); choose 'A' and 'B' as the next candidates for merging, to form 'C'. − Stage 3: Node 'C' and vector (0) are merged to form 'D'. − Final tree: The data items have all been incorporated into a binary 'tree' containing five data values and four nodes. Each data item is a 'leaf' of the tree.
• 233. Example: Huffman Coding, Sequence of Motion Vectors — 2. Encoding − Each 'leaf' of the binary tree is mapped to a variable-length code. To find this code, the tree is traversed from the root node ('D' in this case) to the leaf or data item. − For every branch, a 0 or 1 is appended to the code: 0 for an upper branch, 1 for a lower branch. − The lengths of the Huffman codes, each an integral number of bits, do not match the ideal lengths given by log₂(1/p). − For example, the series of vectors (1, 0, −2) would be transmitted as the binary sequence 0111000.
• 234. Example: Huffman Coding, Sequence of Motion Vectors — 3. Decoding − The decoder must have a local copy of the Huffman code tree or look-up table. (Once the tree has been generated during encoding, the codes may be stored in a look-up table.) − This may be achieved by transmitting the look-up table itself, or by sending the list of data and probabilities prior to sending the coded data. − Each uniquely decodable code is converted back to the original data.
• 236. Pre-calculated Huffman-based Coding − The Huffman coding process has two disadvantages for a practical video CODEC. I. The encoder needs to transmit the information contained in the probability table before the decoder can decode the bit stream, and this extra overhead reduces compression efficiency, particularly for shorter video sequences. II. The probability table for a large video sequence (needed to generate the Huffman tree) cannot be calculated until after the video data is encoded, which may introduce an unacceptable delay into the encoding process. − For these reasons, image and video coding standards define sets of codewords based on the probability distributions of 'generic' video material. − The main differences from 'true' Huffman coding are: I. The codewords are pre-calculated based on 'generic' probability distributions. II. In the case of TCOEF (transform coefficients), only 102 commonly occurring symbols have defined codewords, and any other symbol is encoded using a fixed-length code.
• 237. Pre-calculated Huffman-based Coding − The following two examples of pre-calculated VLC tables are taken from MPEG-4 Visual (Simple Profile): MPEG-4 Visual Transform Coefficient (TCOEF) VLCs (partial — all codes < 9 bits) and MPEG-4 Motion Vector Difference (MVD) VLCs. (Some of the TCOEF codes shown in the table are also represented in 'tree' form in the accompanying figure.)
• 238. Problems with Huffman − The minimum codeword length that can be assigned is 1 bit, but the information content of a highly probable symbol can be much less (e.g. for p = 0.95, log₂(1/0.95) ≈ 0.074 bits). − A scheme using an integral number of bits for each data symbol, such as Huffman coding, is therefore unlikely to come close to the optimum number of bits. − Fractional bits can only be assigned if symbols are coded together: some with many bits and some with (close to) zero bits. − This is possible if nearly zero bits are assigned to the highly probable symbols. − Arithmetic coding does this!
• 239. Arithmetic Coding – A form of variable length coding. – Compresses better than Huffman coding. – Takes longer to do than Huffman coding. – More delicate than Huffman coding (a single bit error can corrupt the rest of the message). – More limiting than Huffman coding. – Subject to patents and royalty payments (IBM, AT&T, Mitsubishi).
• 240. Arithmetic Coding — The fundamental idea is to use a scale on which the coding intervals of real numbers between 0 and 1 are represented. – This is in fact the cumulative probability distribution of all the symbols, which adds up to 1. – The interval needed to represent the message becomes smaller as the message becomes longer, and the number of bits needed to specify that interval increases. – According to the symbol probabilities generated by the model, the size of the interval is reduced by successive symbols of the message. – The more likely symbols reduce the range less than the less likely ones, and hence they contribute fewer bits to the message.
• 241. Arithmetic Coding – Once the symbol probabilities are known, each individual symbol is assigned a portion of the [0, 1) range that corresponds to its probability of appearance in the cumulative distribution. – Each character's range is [lower, upper). – The most significant portion of an arithmetic-coded message is the first symbol to be encoded. – Ex: for the message eaii!, the first symbol to be coded is e; the symbol !, known to both encoder and decoder, marks the end of the message, and the decoding process terminates when it is decoded. – After the first character is encoded, the lower and upper numbers bound the range of the output. – Each new symbol to be encoded further restricts the possible range of the output number during the rest of the encoding process.
• 242. Arithmetic Coding — Example 1: to code the set of symbols eaii! – To explain how arithmetic coding works, a fixed-model arithmetic code is used for easy illustration. – Suppose the alphabet is {a, e, i, o, u, !} and the fixed model has the probabilities shown in the table:
Symbol | Probability | Range
a | 0.2 | [0.0, 0.2)
e | 0.3 | [0.2, 0.5)
i | 0.1 | [0.5, 0.6)
o | 0.2 | [0.6, 0.8)
u | 0.1 | [0.8, 0.9)
! | 0.1 | [0.9, 1.0)
New character → range: initially [0, 1); after e [0.2, 0.5); after a [0.2, 0.26); after i [0.23, 0.236); after i [0.233, 0.2336); after ! [0.23354, 0.2336).
– Ex: the final coded message has to be a number greater than or equal to 0.2 and less than 0.5, because of e. The final range, [0.23354, 0.2336), represents the message eaii!: if we transmit any number x with 0.23354 ≤ x < 0.2336, that number represents the whole message eaii!.
• 243. Example 1: to code the set of symbols eaii! — the interval is narrowed one symbol at a time (low′ = low + s_low × range, high′ = low + s_high × range):
e: low = 0.0 + 0.2×1.0 = 0.2, high = 0.0 + 0.5×1.0 = 0.5 → [0.2, 0.5), range 0.3
a: low = 0.2 + 0.0×0.3 = 0.2, high = 0.2 + 0.2×0.3 = 0.26 → [0.2, 0.26), range 0.06
i: low = 0.2 + 0.5×0.06 = 0.23, high = 0.2 + 0.6×0.06 = 0.236 → [0.23, 0.236), range 0.006
i: low = 0.23 + 0.5×0.006 = 0.233, high = 0.23 + 0.6×0.006 = 0.2336 → [0.233, 0.2336), range 0.0006
!: low = 0.233 + 0.9×0.0006 = 0.23354, high = 0.233 + 1.0×0.0006 = 0.2336 → [0.23354, 0.2336)
The final range, [0.23354, 0.2336), represents the message eaii!: any transmitted number in 0.23354 ≤ x < 0.2336 (e.g. 0.23355) represents the whole message.
• 244. Example 2: to code the set of symbols aii! (same symbol table):
a: low = 0.0 + 0.0×1.0 = 0.0, high = 0.0 + 0.2×1.0 = 0.2 → [0.0, 0.2), range 0.2
i: low = 0.0 + 0.5×0.2 = 0.1, high = 0.0 + 0.6×0.2 = 0.12 → [0.1, 0.12), range 0.02
i: low = 0.1 + 0.5×0.02 = 0.11, high = 0.1 + 0.6×0.02 = 0.112 → [0.11, 0.112), range 0.002
!: low = 0.11 + 0.9×0.002 = 0.1118, high = 0.11 + 1.0×0.002 = 0.112 → [0.1118, 0.112)
The final range, [0.1118, 0.112), represents the message aii!: any transmitted number in 0.1118 ≤ x < 0.112 (e.g. 0.1119) represents the whole message.
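Both narrowings above can be reproduced with a few lines of Python. This is a minimal sketch using the fixed model from the table (names are my own; real coders use integer arithmetic with renormalisation rather than floats, which would lose precision on long messages):

    # Fixed model from the table: symbol -> [lower, upper) sub-range
    RANGES = {"a": (0.0, 0.2), "e": (0.2, 0.5), "i": (0.5, 0.6),
              "o": (0.6, 0.8), "u": (0.8, 0.9), "!": (0.9, 1.0)}

    def arith_encode(message):
        # Narrow [low, high) once per symbol; any number in the final
        # interval identifies the whole message.
        low, high = 0.0, 1.0
        for ch in message:
            span = high - low
            s_low, s_high = RANGES[ch]
            low, high = low + s_low * span, low + s_high * span
        return low, high

    print(arith_encode("eaii!"))  # ~(0.23354, 0.2336), bar float rounding
    print(arith_encode("aii!"))   # ~(0.1118, 0.112)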
• 245. Arithmetic Coding — decoding for Example 1 − In general, the decoding process can be formulated as Rₙ₊₁ = (Rₙ − Lₙ) / (Uₙ − Lₙ) • where Rₙ is a code within the range [Lₙ, Uₙ) of the nth symbol, and Rₙ₊₁ is the code for the next symbol. Using the symbol ranges from the table above:
Received code = 0.23355 → in [0.2, 0.5) → output e
R = (0.23355 − 0.2)/(0.5 − 0.2) ≈ 0.11183 → in [0.0, 0.2) → output a
R = (0.11183 − 0.0)/(0.2 − 0.0) ≈ 0.55917 → in [0.5, 0.6) → output i
R = (0.55917 − 0.5)/(0.6 − 0.5) ≈ 0.59167 → in [0.5, 0.6) → output i
R = (0.59167 − 0.5)/(0.6 − 0.5) ≈ 0.91667 → in [0.9, 1.0) → output !
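A matching decoder sketch, directly implementing Rₙ₊₁ = (Rₙ − Lₙ)/(Uₙ − Lₙ). The symbol count is passed explicitly here for simplicity; in Example 1 the '!' symbol would serve as the natural stopping point:

    RANGES = {"a": (0.0, 0.2), "e": (0.2, 0.5), "i": (0.5, 0.6),
              "o": (0.6, 0.8), "u": (0.8, 0.9), "!": (0.9, 1.0)}

    def arith_decode(code, n_symbols):
        # Find the sub-range containing the code, emit its symbol,
        # then rescale the code back into [0, 1) and repeat.
        out = []
        for _ in range(n_symbols):
            for ch, (low, high) in RANGES.items():
                if low <= code < high:
                    out.append(ch)
                    code = (code - low) / (high - low)
                    break
        return "".join(out)

    print(arith_decode(0.23355, 5))  # -> eaii!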
• 246. Example 3 — Motion vectors, sequence 1: probabilities and sub-ranges (figure). The sub-ranges used in the worked example on the following slides are: vector −1 → [0.1, 0.3), 0 → [0.3, 0.7), +2 → [0.9, 1.0); the remaining gaps, [0.0, 0.1) and [0.7, 0.9), correspond to vectors −2 and +1.
• 247. Example 3 — encoding procedure for the vector sequence (0, −1, 0, 2):
0: low = 0 + 0.3×1 = 0.3, high = 0 + 0.7×1 = 0.7 → [0.3, 0.7), range 0.4
−1: low = 0.3 + 0.1×0.4 = 0.34, high = 0.3 + 0.3×0.4 = 0.42 → [0.34, 0.42), range 0.08
0: low = 0.34 + 0.3×0.08 = 0.364, high = 0.34 + 0.7×0.08 = 0.396 → [0.364, 0.396), range 0.032
+2: low = 0.364 + 0.9×0.032 = 0.3928, high = 0.364 + 1×0.032 = 0.396 → [0.3928, 0.396)
Any number in [0.3928, 0.396), e.g. 0.394, identifies the whole sequence.
• 248. Example 3 — decoding procedure, using Rₙ₊₁ = (Rₙ − Lₙ)/(Uₙ − Lₙ):
Received code = 0.394 → in [0.3, 0.7) → output 0
R = (0.394 − 0.3)/(0.7 − 0.3) = 0.235 → in [0.1, 0.3) → output −1
R = (0.235 − 0.1)/(0.3 − 0.1) = 0.675 → in [0.3, 0.7) → output 0
R = (0.675 − 0.3)/(0.7 − 0.3) = 0.9375 → in [0.9, 1.0) → output +2
(Note that nothing in the transmitted number itself marks the end of the sequence: the decoder must be told when to stop, e.g. by a known symbol count or an end-of-sequence symbol.)
• 249. Example 3 (cont.) — the principal advantage of arithmetic coding − The transmitted number (0.394 in this case), which may be represented as a fixed-point number with sufficient accuracy using 9 bits, is not constrained to an integral number of bits for each transmitted data symbol. − To achieve optimal compression, the sequence of data symbols (0, −1, 0, 2) should be represented with −(log₂P₀ + log₂P₋₁ + log₂P₀ + log₂P₂) = 8.28 bits. − In this example, arithmetic coding achieves 9 bits, which is close to optimum.
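As a check, taking the probabilities implied by the sub-range widths in this example (an inference from the figure: P₀ = 0.4, P₋₁ = 0.2, P₂ = 0.1), the optimum works out as −(log₂0.4 + log₂0.2 + log₂0.4 + log₂0.1) = 1.322 + 2.322 + 1.322 + 3.322 ≈ 8.288 bits, the 8.28 figure quoted above.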