Unit 6 Speech Signal
DR MINAKSHI PRADEEP ATRE
PVG’S COET & GKPIM PUNE
References
Book: Speech and Audio Processing by Dr Shaila Apte
PDF document: http://cs.haifa.ac.il/~nimrod/Compression/Speech/S1Basics2010.pdf
For speech samples:
https://www.signalogic.com/index.pl?page=speech_codec_wav_samples
Contents
Speech:
1. Basics of speech signal and its features
2. LTI representation of speech signal
3. LTV representation of speech signal
4. Estimation of fundamental frequency
5. Identification of voiced and unvoiced speech
6. Noise removal
Speech
The speech signal occurs naturally, so it is random in nature
Necessary to understand generalized human speech production
Simple linear time-invariant (LTI) model for speech production
Inherently time-varying nature of speech
Introduction to the linear time-varying (LTV) model of speech
Speech types: consonants, fricatives
Voiced and unvoiced (V/UV) speech
Speech Production Mechanism: Pipelines Model
[Figure: vocal tract]
Vocal Tract
The vocal tract is the cavity between the vocal cords and the lips. It acts as a resonator that spectrally shapes the periodic input, much like the cavity of a musical wind instrument.
A simple model of a steady-state vowel regards the vocal tract as a linear time-invariant (LTI) filter with a periodic, impulse-like input.
What is a speech signal?
Created at the vocal cords, travels through the vocal tract, and is emitted at the speaker's mouth
Reaches the listener's ear as a pressure wave
Non-stationary, but can be divided into sound segments that share common acoustic properties over a short time interval
Two major classes of phonemes: vowels and consonants
Phonemes
The basic sounds of a language (e.g. "a" in the word "father") are called phonemes
A typical speech utterance consists of a string of vowel and consonant phonemes whose temporal and spectral characteristics change with time
In addition, the time-varying source and system can interact nonlinearly in a complex way: the simple model is adequate for a steady vowel, but the sounds of speech are not always well represented by linear time-invariant systems
Vowel Production
In vowel production, air is forced from the lungs by contraction of
the muscles around the lung cavity
Air flows through the vocal cords, which are two masses of flesh,
causing periodic vibration of the cords whose rate gives the pitch of
the sound
Resulting periodic puffs of air act as an excitation input, or source,
to the vocal tract
Typical Vowels
Speech Production
A sound source excites a (vocal tract) filter
◦ Voiced: periodic source, created by the vocal cords
◦ Unvoiced: aperiodic, noisy source
Pitch is the fundamental frequency of vocal cord vibration (also called F0)
Above the pitch, at higher frequencies, lie 4-5 formants (F1 - F5)
For a uniform tube about 17 cm long, closed at the glottis and open at the lips, natural frequencies occur at odd multiples of about 500 Hz.
These resonant frequencies are called formants.
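The "odd multiples of 500 Hz" rule can be checked with a small sketch. It treats the vocal tract as a quarter-wavelength resonator, f_n = (2n - 1) c / (4 L). The tube length (0.17 m) and the speed of sound (340 m/s) are assumed round values, not figures from these slides, and the function name is illustrative.

```python
def tube_resonances(length_m=0.17, speed_of_sound=340.0, count=3):
    """Resonances of a uniform tube closed at one end and open at the other.

    Quarter-wavelength resonator: f_n = (2n - 1) * c / (4 * L).
    length_m and speed_of_sound are assumed round values.
    """
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, count + 1)]


if __name__ == "__main__":
    # Print the first three resonances of the assumed 17 cm tube.
    for n, f in enumerate(tube_resonances(), start=1):
        print(f"F{n} ~ {f:.0f} Hz")
```

Real vowels deviate from these values because the tract is not uniform; the measured formants in the table differ from vowel to vowel for that reason.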
Typical formant frequencies for selected vowels in Hz
This table shows the first three formants (F1, F2, F3)

Vowel   Adult Male            Adult Female
        F1    F2    F3        F1    F2    F3
(i)     255   2330  3000      340   2610  3210
(u)     290   940   2180      390   995   2585
(ae)    735   1625  2465      950   1955  2900
LTI Model for speech production
[Block diagram: Impulse Train Generator (Glottis) or Random Signal Generator, Impulse Response of Vocal Tract, Generated Speech]
Periodic excitation: the impulse train generator is used as the excitation signal when a voiced segment is produced (a vowel, e.g. “a”)
Basic assumption: the source of excitation and the vocal tract system are independent
LTI Model for speech production (continued)
[Block diagram: Impulse Train Generator (Glottis) or Random Signal Generator, Impulse Response of Vocal Tract, Generated Speech]
Random excitation: the random signal generator is used as the excitation signal when an unvoiced segment is produced (consonants, e.g. “s”)
The LTI model is used for a short segment of speech, about 10 ms, over which the parameters of the vocal tract can be assumed to remain constant
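A minimal sketch of the LTI source-filter idea follows: an excitation (impulse train for voiced, random noise for unvoiced) is passed through a fixed filter over one 10 ms segment. The single two-pole resonator standing in for the vocal tract, the 8 kHz sampling rate, and all numeric values are assumptions chosen for illustration.

```python
import math
import random


def impulse_train(n_samples, period):
    """Voiced excitation: a unit impulse every `period` samples."""
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def white_noise(n_samples, seed=0):
    """Unvoiced excitation: zero-mean uniform random samples."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


def resonator(x, freq_hz, bandwidth_hz, fs):
    """Two-pole LTI filter with a single resonance (stand-in for the vocal tract).

    y[n] = x[n] + 2 r cos(theta) y[n-1] - r^2 y[n-2]
    """
    r = math.exp(-math.pi * bandwidth_hz / fs)   # pole radius, < 1 so the filter is stable
    theta = 2.0 * math.pi * freq_hz / fs         # pole angle
    a1, a2 = 2.0 * r * math.cos(theta), -r * r
    y, y1, y2 = [], 0.0, 0.0
    for sample in x:
        out = sample + a1 * y1 + a2 * y2
        y.append(out)
        y1, y2 = out, y1
    return y


fs = 8000   # assumed sampling rate in Hz
n = 80      # 10 ms segment at 8 kHz

# Voiced: periodic excitation (period 64 samples = 125 Hz pitch), resonance at 735 Hz.
voiced = resonator(impulse_train(n, period=64), 735.0, 100.0, fs)
# Unvoiced: noisy excitation, resonance at 2500 Hz.
unvoiced = resonator(white_noise(n), 2500.0, 300.0, fs)
```

The filter coefficients stay fixed for the whole segment, which is exactly the LTI assumption; only the choice of excitation separates the voiced case from the unvoiced one.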
Nature of Speech Signal
Speech is generated by components such as the vocal cords and the vocal tract
It is not possible to generate a speech signal on its own
Speech is a random signal
Speech can have infinitely many features (as in the story of the blind people who each touch a different part of an elephant to describe what it looks like)
So it is a complex problem
Uttering different words is possible because humans can change the resonant modes of the vocal cavity and can also stretch the vocal cords to some extent, modifying the pitch period for different vowels
That is why we need the linear time-varying (LTV) model
Linear Time-varying Model: Speech production
[Block diagram labels: Impulse Train Generator, Random Signal Generator, Amplitude, Impulse Response of Vocal Tract, Generated Speech]
Pitch period is variable
Impulse response is variable
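One way to picture the LTV model is frame-by-frame synthesis: within each short frame the parameters are constant, but the pitch period, amplitude, and filter change from frame to frame. The sketch below assumes voiced excitation only, a single two-pole resonator, 8 kHz sampling, and 10 ms frames; the function name and parameter tuples are hypothetical.

```python
import math


def ltv_synthesize(frames, fs=8000, frame_len=80):
    """Synthesize voiced speech-like samples with per-frame parameters.

    frames: list of (pitch_period_samples, resonance_hz, bandwidth_hz, amplitude).
    The filter state is carried across frame boundaries so the output stays continuous.
    """
    y, y1, y2 = [], 0.0, 0.0
    next_pulse = 0  # sample index of the next glottal impulse
    for k, (period, resonance_hz, bw_hz, amplitude) in enumerate(frames):
        # Filter coefficients are recomputed for every frame (time-varying system).
        r = math.exp(-math.pi * bw_hz / fs)
        a1 = 2.0 * r * math.cos(2.0 * math.pi * resonance_hz / fs)
        a2 = -r * r
        for n in range(k * frame_len, (k + 1) * frame_len):
            x = 0.0
            if n == next_pulse:
                x = amplitude
                next_pulse += period  # pitch period of the current frame
            out = x + a1 * y1 + a2 * y2
            y.append(out)
            y1, y2 = out, y1
    return y


# Two 10 ms frames: pitch rises from 125 Hz to 160 Hz while the resonance moves.
speech = ltv_synthesize([(64, 700.0, 100.0, 1.0), (50, 900.0, 100.0, 0.8)])
```

Setting every frame to the same tuple reduces this to the LTI case.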
Speech Sound Categories
Periodic (Sonorants, Voiced)
Noisy (Fricatives, Unvoiced)
Impulsive (Plosives)
Example:
In the word “shop,” the “sh,” “o,” and “p” are generated from a
noisy, periodic, and impulsive source, respectively
Frequency Range
Speech:
Pitch frequency:
◦ male ~ 85-155 Hz;
◦ female ~ 165-255 Hz;
Singer’s vocal range, from bass to soprano: 80 Hz-1100 Hz
Pitch
Pitch period: The time duration of one glottal cycle
Pitch (fundamental frequency): The reciprocal of the pitch period.
Remember: we calculate the pitch for voiced segments
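Because pitch is the reciprocal of the pitch period, the frequency ranges quoted earlier convert directly into periods. The short sketch below does that conversion; the 8 kHz sampling rate used for the sample counts is an assumption, and the function names are illustrative.

```python
def pitch_period_ms(f0_hz):
    """Pitch period in milliseconds: T0 = 1 / F0."""
    return 1000.0 / f0_hz


def pitch_period_samples(f0_hz, fs_hz=8000):
    """Pitch period in samples at sampling rate fs_hz (8 kHz is an assumption)."""
    return round(fs_hz / f0_hz)


# Ends of the male and female pitch ranges given on the Frequency Range slide.
for label, f0 in [("male, low", 85), ("male, high", 155),
                  ("female, low", 165), ("female, high", 255)]:
    print(f"{label}: F0 = {f0} Hz, T0 = {pitch_period_ms(f0):.2f} ms, "
          f"{pitch_period_samples(f0)} samples at 8 kHz")
```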
Pitch Detection
The pitch period and V/UV decisions are fundamental to many speech coders
Many methods for the calculation:
◦ Autocorrelation function
◦ Zero-crossing rate (ZCR)
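Both methods can be sketched in a few lines. The autocorrelation estimate picks the lag with the largest autocorrelation inside an assumed search range of 80-400 Hz. The ZCR-based V/UV rule uses a threshold of 0.25, an illustrative value rather than one taken from these slides; practical coders usually combine ZCR with energy.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def autocorr_pitch(frame, fs, f_min=80.0, f_max=400.0):
    """Estimate F0 as fs / lag, where lag maximizes the autocorrelation.

    The search range f_min..f_max is an assumption.
    """
    lag_min = int(fs / f_max)
    lag_max = min(int(fs / f_min), len(frame) - 1)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        val = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag


def is_voiced(frame, zcr_threshold=0.25):
    """Crude V/UV decision: low ZCR suggests voiced (threshold is illustrative)."""
    return zero_crossing_rate(frame) < zcr_threshold


fs = 8000
# 40 ms test frame: a 125 Hz sine standing in for a voiced segment.
frame = [math.sin(2 * math.pi * 125.0 * n / fs) for n in range(320)]
f0 = autocorr_pitch(frame, fs)
zcr = zero_crossing_rate(frame)
```

The frame has to span at least two pitch periods for the autocorrelation peak to show up, which is why the test frame here is 40 ms rather than 10 ms.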
Features or categorization of speech sounds
Speech sounds are studied and classified from the following
perspectives:
1) The nature of the source: periodic, noisy, or impulsive, and
combinations of the three
2) The shape of the vocal tract
3) The time-domain waveform, which gives the pressure change with
time at the lips
4) The time-varying spectral characteristics revealed through the
spectrogram
Spectrogram
Time-varying spectral characteristics of the speech signal can be displayed graphically as a two-dimensional pattern
Vertical axis: frequency, Horizontal axis: time
The pseudo-color of the pattern is proportional to signal energy (red: high energy)
The resonance frequencies of the vocal tract show up as “energy bands”
Voiced intervals are characterized by a striated appearance (periodicity of the signal)
Unvoiced intervals are more solidly filled in
Yellow bands are formants
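A spectrogram can be computed as the energy of a short-time Fourier transform. The sketch below uses a Hann window and a direct DFT per frame to stay dependency-free; the frame length, hop size, and 8 kHz sampling rate are assumptions, and an FFT would normally be used for speed.

```python
import cmath
import math


def spectrogram(signal, frame_len=64, hop=32):
    """Return a list of frames; each frame lists the energy in every frequency bin."""
    # Hann window to reduce spectral leakage.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        energies = []
        for k in range(frame_len // 2 + 1):  # bins from 0 Hz up to fs / 2
            coeff = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
            energies.append(abs(coeff) ** 2)
        frames.append(energies)
    return frames


fs = 8000
# 1 kHz test tone; its energy should concentrate in one frequency band.
tone = [math.sin(2 * math.pi * 1000.0 * n / fs) for n in range(256)]
spec = spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * fs / 64  # bin index to Hz for frame_len = 64
```

Each inner list is one column of the picture (time on the horizontal axis, frequency bins on the vertical axis); mapping the energies to colors gives the pseudo-color display.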
Most common manners of articulation
Plosive, or oral stop, where there is complete occlusion (blockage) of both the oral and nasal
cavities of the vocal tract, and therefore no air flow. Examples include English /p t k/ (voiceless)
and /b d g/ (voiced)
Nasal stop, where there is complete occlusion of the oral cavity, and the air passes instead
through the nose. The shape and position of the tongue determine the resonant cavity that
gives different nasal stops their characteristic sounds. Examples include English /m, n/
Fricative, sometimes called spirant, where there is continuous frication (turbulent and noisy
airflow) at the place of articulation. Examples include English /f, s/ (voiceless), /v, z/ (voiced), etc
Most common manners of articulation (continued)
Sibilants are a type of fricative where the airflow is guided by a groove in the tongue toward the
teeth, creating a high-pitched and very distinctive sound. These are by far the most common
fricatives. English sibilants include /s/ and /z/
Affricate, which begins like a plosive, but this releases into a fricative rather than having a
separate release of its own. The English letters "ch" and "j" represent affricates
Trill, in which the articulator (usually the tip of the tongue) is held in place, and the airstream
causes it to vibrate. The double "r" of Spanish "perro" is a trill.
Approximant, where there is very little obstruction. Examples include English /w/ and /r/. Lateral
approximants, usually shortened to lateral, are a type of approximant pronounced with the side
of the tongue. English /l/ is a lateral.
Time for MATLAB Program
THANK YOU

More Related Content

What's hot

Detection and Binary Decision in AWGN Channel
Detection and Binary Decision in AWGN ChannelDetection and Binary Decision in AWGN Channel
Detection and Binary Decision in AWGN ChannelDrAimalKhan
 
Angle modulation .pptx
Angle modulation .pptxAngle modulation .pptx
Angle modulation .pptxswatihalunde
 
Superhetrodyne receiver
Superhetrodyne receiverSuperhetrodyne receiver
Superhetrodyne receiverlrsst
 
Specific features of hearing aids
Specific features of hearing aidsSpecific features of hearing aids
Specific features of hearing aidsPra_buddha
 
Optimal reception-of-digital-signals
Optimal reception-of-digital-signalsOptimal reception-of-digital-signals
Optimal reception-of-digital-signalsxyxz
 
Digital modulation
Digital modulationDigital modulation
Digital modulationIbrahim Omar
 
non parametric methods for power spectrum estimaton
non parametric methods for power spectrum estimatonnon parametric methods for power spectrum estimaton
non parametric methods for power spectrum estimatonBhavika Jethani
 
Analog Vs Digital Signals
Analog Vs Digital SignalsAnalog Vs Digital Signals
Analog Vs Digital Signalssajjad1996
 
Ppt on continuous phase modulation
Ppt on continuous phase modulationPpt on continuous phase modulation
Ppt on continuous phase modulationHai Venkat
 
Differential pulse code modulation
Differential pulse code modulationDifferential pulse code modulation
Differential pulse code modulationRamraj Bhadu
 
Design and Implementation of Log-Periodic Antenna
Design and Implementation of Log-Periodic AntennaDesign and Implementation of Log-Periodic Antenna
Design and Implementation of Log-Periodic AntennaShruti Nadkarni
 
Antennas and Wave Propagation
Antennas and Wave Propagation Antennas and Wave Propagation
Antennas and Wave Propagation VenkataRatnam14
 
Pulse code modulation and Demodulation
Pulse code modulation and DemodulationPulse code modulation and Demodulation
Pulse code modulation and DemodulationAbdul Razaq
 

What's hot (20)

Detection and Binary Decision in AWGN Channel
Detection and Binary Decision in AWGN ChannelDetection and Binary Decision in AWGN Channel
Detection and Binary Decision in AWGN Channel
 
Angle modulation .pptx
Angle modulation .pptxAngle modulation .pptx
Angle modulation .pptx
 
Superhetrodyne receiver
Superhetrodyne receiverSuperhetrodyne receiver
Superhetrodyne receiver
 
log periodic antenna
log periodic antennalog periodic antenna
log periodic antenna
 
Speech coding techniques
Speech coding techniquesSpeech coding techniques
Speech coding techniques
 
Specific features of hearing aids
Specific features of hearing aidsSpecific features of hearing aids
Specific features of hearing aids
 
Optimal reception-of-digital-signals
Optimal reception-of-digital-signalsOptimal reception-of-digital-signals
Optimal reception-of-digital-signals
 
Digital modulation
Digital modulationDigital modulation
Digital modulation
 
SPEECH CODING
SPEECH CODINGSPEECH CODING
SPEECH CODING
 
Frequency Modulation
Frequency ModulationFrequency Modulation
Frequency Modulation
 
Digital modulation
Digital modulationDigital modulation
Digital modulation
 
Spread spectrum modulation
Spread spectrum modulationSpread spectrum modulation
Spread spectrum modulation
 
non parametric methods for power spectrum estimaton
non parametric methods for power spectrum estimatonnon parametric methods for power spectrum estimaton
non parametric methods for power spectrum estimaton
 
Analog Vs Digital Signals
Analog Vs Digital SignalsAnalog Vs Digital Signals
Analog Vs Digital Signals
 
Ppt on continuous phase modulation
Ppt on continuous phase modulationPpt on continuous phase modulation
Ppt on continuous phase modulation
 
Differential pulse code modulation
Differential pulse code modulationDifferential pulse code modulation
Differential pulse code modulation
 
Design and Implementation of Log-Periodic Antenna
Design and Implementation of Log-Periodic AntennaDesign and Implementation of Log-Periodic Antenna
Design and Implementation of Log-Periodic Antenna
 
Antennas and Wave Propagation
Antennas and Wave Propagation Antennas and Wave Propagation
Antennas and Wave Propagation
 
Pulse code modulation and Demodulation
Pulse code modulation and DemodulationPulse code modulation and Demodulation
Pulse code modulation and Demodulation
 
Filters
FiltersFilters
Filters
 

Similar to Part1 speech basics

Speech signal processing lizy
Speech signal processing lizySpeech signal processing lizy
Speech signal processing lizyLizy Abraham
 
Principal characteristics of speech
Principal characteristics of speechPrincipal characteristics of speech
Principal characteristics of speechNikolay Karpov
 
speech processing basics
speech processing basicsspeech processing basics
speech processing basicssivakumar m
 
SodaBottles-licensing Copyright-Fix.pdf
SodaBottles-licensing Copyright-Fix.pdfSodaBottles-licensing Copyright-Fix.pdf
SodaBottles-licensing Copyright-Fix.pdfNga Trinh
 
Phonetics & Phonology Mine.pptx
Phonetics & Phonology Mine.pptxPhonetics & Phonology Mine.pptx
Phonetics & Phonology Mine.pptxKoukabKhan
 
Phoneticsphonology lecture 2
Phoneticsphonology  lecture 2Phoneticsphonology  lecture 2
Phoneticsphonology lecture 2Raj Wali Khan
 
Principal characteristics of speech
Principal characteristics of speechPrincipal characteristics of speech
Principal characteristics of speechNikolay Karpov
 
Phonetic and phonology pp2
Phonetic and phonology pp2Phonetic and phonology pp2
Phonetic and phonology pp2zhian fadhil
 
Phonetics ( Introduction to Linguistics )
Phonetics ( Introduction to Linguistics )Phonetics ( Introduction to Linguistics )
Phonetics ( Introduction to Linguistics )Romulo Mulianto
 
Class 09 emerson_phonetics_fall2014_phonemes_allophones_vot_epg
Class 09 emerson_phonetics_fall2014_phonemes_allophones_vot_epgClass 09 emerson_phonetics_fall2014_phonemes_allophones_vot_epg
Class 09 emerson_phonetics_fall2014_phonemes_allophones_vot_epgLisa Lavoie
 
Cube model Theory of acoustic phonetics
Cube model Theory of acoustic phonetics Cube model Theory of acoustic phonetics
Cube model Theory of acoustic phonetics KarloHammer
 
Acoustic phonetics
Acoustic phoneticsAcoustic phonetics
Acoustic phoneticsVivaAs
 
1 ESO Música - El so
1 ESO Música - El so1 ESO Música - El so
1 ESO Música - El soJoan Sèculi
 
Speech organ and manner of articulation
Speech organ and manner of articulationSpeech organ and manner of articulation
Speech organ and manner of articulationYanti95
 

Similar to Part1 speech basics (20)

Speech signal processing lizy
Speech signal processing lizySpeech signal processing lizy
Speech signal processing lizy
 
Phonetics
PhoneticsPhonetics
Phonetics
 
Linguistics
LinguisticsLinguistics
Linguistics
 
Principal characteristics of speech
Principal characteristics of speechPrincipal characteristics of speech
Principal characteristics of speech
 
English Mystery 2
English Mystery 2English Mystery 2
English Mystery 2
 
Phonetics
PhoneticsPhonetics
Phonetics
 
4455355.ppt
4455355.ppt4455355.ppt
4455355.ppt
 
speech processing basics
speech processing basicsspeech processing basics
speech processing basics
 
SodaBottles-licensing Copyright-Fix.pdf
SodaBottles-licensing Copyright-Fix.pdfSodaBottles-licensing Copyright-Fix.pdf
SodaBottles-licensing Copyright-Fix.pdf
 
Phonetics & Phonology Mine.pptx
Phonetics & Phonology Mine.pptxPhonetics & Phonology Mine.pptx
Phonetics & Phonology Mine.pptx
 
Phoneticsphonology lecture 2
Phoneticsphonology  lecture 2Phoneticsphonology  lecture 2
Phoneticsphonology lecture 2
 
Principal characteristics of speech
Principal characteristics of speechPrincipal characteristics of speech
Principal characteristics of speech
 
Phonetic and phonology pp2
Phonetic and phonology pp2Phonetic and phonology pp2
Phonetic and phonology pp2
 
Phonetics ( Introduction to Linguistics )
Phonetics ( Introduction to Linguistics )Phonetics ( Introduction to Linguistics )
Phonetics ( Introduction to Linguistics )
 
Class 09 emerson_phonetics_fall2014_phonemes_allophones_vot_epg
Class 09 emerson_phonetics_fall2014_phonemes_allophones_vot_epgClass 09 emerson_phonetics_fall2014_phonemes_allophones_vot_epg
Class 09 emerson_phonetics_fall2014_phonemes_allophones_vot_epg
 
Cube model Theory of acoustic phonetics
Cube model Theory of acoustic phonetics Cube model Theory of acoustic phonetics
Cube model Theory of acoustic phonetics
 
Acoustic phonetics
Acoustic phoneticsAcoustic phonetics
Acoustic phonetics
 
Class 4
Class 4Class 4
Class 4
 
1 ESO Música - El so
1 ESO Música - El so1 ESO Música - El so
1 ESO Música - El so
 
Speech organ and manner of articulation
Speech organ and manner of articulationSpeech organ and manner of articulation
Speech organ and manner of articulation
 

More from Minakshi Atre

Signals&Systems: Quick pointers to Fundamentals
Signals&Systems: Quick pointers to FundamentalsSignals&Systems: Quick pointers to Fundamentals
Signals&Systems: Quick pointers to FundamentalsMinakshi Atre
 
Unit 4 Statistical Learning Methods: EM algorithm
Unit 4 Statistical Learning Methods: EM algorithmUnit 4 Statistical Learning Methods: EM algorithm
Unit 4 Statistical Learning Methods: EM algorithmMinakshi Atre
 
Inference in HMM and Bayesian Models
Inference in HMM and Bayesian ModelsInference in HMM and Bayesian Models
Inference in HMM and Bayesian ModelsMinakshi Atre
 
Artificial Intelligence: Basic Terminologies
Artificial Intelligence: Basic TerminologiesArtificial Intelligence: Basic Terminologies
Artificial Intelligence: Basic TerminologiesMinakshi Atre
 
2)local search algorithms
2)local search algorithms2)local search algorithms
2)local search algorithmsMinakshi Atre
 
Performance appraisal/ assessment in higher educational institutes (HEI)
Performance appraisal/ assessment in higher educational institutes (HEI)Performance appraisal/ assessment in higher educational institutes (HEI)
Performance appraisal/ assessment in higher educational institutes (HEI)Minakshi Atre
 
Artificial intelligence agents and environment
Artificial intelligence agents and environmentArtificial intelligence agents and environment
Artificial intelligence agents and environmentMinakshi Atre
 
Unit 6: DSP applications
Unit 6: DSP applications Unit 6: DSP applications
Unit 6: DSP applications Minakshi Atre
 
Unit 6: DSP applications
Unit 6: DSP applicationsUnit 6: DSP applications
Unit 6: DSP applicationsMinakshi Atre
 
Learning occam razor
Learning occam razorLearning occam razor
Learning occam razorMinakshi Atre
 
Waltz algorithm in artificial intelligence
Waltz algorithm in artificial intelligenceWaltz algorithm in artificial intelligence
Waltz algorithm in artificial intelligenceMinakshi Atre
 
Perception in artificial intelligence
Perception in artificial intelligencePerception in artificial intelligence
Perception in artificial intelligenceMinakshi Atre
 
Popular search algorithms
Popular search algorithmsPopular search algorithms
Popular search algorithmsMinakshi Atre
 
Artificial Intelligence Terminologies
Artificial Intelligence TerminologiesArtificial Intelligence Terminologies
Artificial Intelligence TerminologiesMinakshi Atre
 
composite video signal
composite video signalcomposite video signal
composite video signalMinakshi Atre
 
Basic terminologies of television
Basic terminologies of televisionBasic terminologies of television
Basic terminologies of televisionMinakshi Atre
 

More from Minakshi Atre (20)

Signals&Systems: Quick pointers to Fundamentals
Signals&Systems: Quick pointers to FundamentalsSignals&Systems: Quick pointers to Fundamentals
Signals&Systems: Quick pointers to Fundamentals
 
Unit 4 Statistical Learning Methods: EM algorithm
Unit 4 Statistical Learning Methods: EM algorithmUnit 4 Statistical Learning Methods: EM algorithm
Unit 4 Statistical Learning Methods: EM algorithm
 
Inference in HMM and Bayesian Models
Inference in HMM and Bayesian ModelsInference in HMM and Bayesian Models
Inference in HMM and Bayesian Models
 
Artificial Intelligence: Basic Terminologies
Artificial Intelligence: Basic TerminologiesArtificial Intelligence: Basic Terminologies
Artificial Intelligence: Basic Terminologies
 
2)local search algorithms
2)local search algorithms2)local search algorithms
2)local search algorithms
 
Performance appraisal/ assessment in higher educational institutes (HEI)
Performance appraisal/ assessment in higher educational institutes (HEI)Performance appraisal/ assessment in higher educational institutes (HEI)
Performance appraisal/ assessment in higher educational institutes (HEI)
 
DSP preliminaries
DSP preliminariesDSP preliminaries
DSP preliminaries
 
Artificial intelligence agents and environment
Artificial intelligence agents and environmentArtificial intelligence agents and environment
Artificial intelligence agents and environment
 
Unit 6: DSP applications
Unit 6: DSP applications Unit 6: DSP applications
Unit 6: DSP applications
 
Unit 6: DSP applications
Unit 6: DSP applicationsUnit 6: DSP applications
Unit 6: DSP applications
 
Learning occam razor
Learning occam razorLearning occam razor
Learning occam razor
 
Learning in AI
Learning in AILearning in AI
Learning in AI
 
Waltz algorithm in artificial intelligence
Waltz algorithm in artificial intelligenceWaltz algorithm in artificial intelligence
Waltz algorithm in artificial intelligence
 
Perception in artificial intelligence
Perception in artificial intelligencePerception in artificial intelligence
Perception in artificial intelligence
 
Popular search algorithms
Popular search algorithmsPopular search algorithms
Popular search algorithms
 
Artificial Intelligence Terminologies
Artificial Intelligence TerminologiesArtificial Intelligence Terminologies
Artificial Intelligence Terminologies
 
composite video signal
composite video signalcomposite video signal
composite video signal
 
Basic terminologies of television
Basic terminologies of televisionBasic terminologies of television
Basic terminologies of television
 
Mpeg 2
Mpeg 2Mpeg 2
Mpeg 2
 
Beginning of dtv
Beginning of dtvBeginning of dtv
Beginning of dtv
 

Recently uploaded

KIT-601 Lecture Notes-UNIT-3.pdf Mining Data Stream
KIT-601 Lecture Notes-UNIT-3.pdf Mining Data StreamKIT-601 Lecture Notes-UNIT-3.pdf Mining Data Stream
KIT-601 Lecture Notes-UNIT-3.pdf Mining Data StreamDr. Radhey Shyam
 
Arduino based vehicle speed tracker project
Arduino based vehicle speed tracker projectArduino based vehicle speed tracker project
Arduino based vehicle speed tracker projectRased Khan
 
internship exam ppt.pptx on embedded system and IOT
internship exam ppt.pptx on embedded system and IOTinternship exam ppt.pptx on embedded system and IOT
internship exam ppt.pptx on embedded system and IOTNavyashreeS6
 
Research Methodolgy & Intellectual Property Rights Series 2
Research Methodolgy & Intellectual Property Rights Series 2Research Methodolgy & Intellectual Property Rights Series 2
Research Methodolgy & Intellectual Property Rights Series 2T.D. Shashikala
 
Furniture showroom management system project.pdf
Furniture showroom management system project.pdfFurniture showroom management system project.pdf
Furniture showroom management system project.pdfKamal Acharya
 
Construction method of steel structure space frame .pptx
Construction method of steel structure space frame .pptxConstruction method of steel structure space frame .pptx
Construction method of steel structure space frame .pptxwendy cai
 
Peek implant persentation - Copy (1).pdf
Peek implant persentation - Copy (1).pdfPeek implant persentation - Copy (1).pdf
Peek implant persentation - Copy (1).pdfAyahmorsy
 
Digital Signal Processing Lecture notes n.pdf
Digital Signal Processing Lecture notes n.pdfDigital Signal Processing Lecture notes n.pdf
Digital Signal Processing Lecture notes n.pdfAbrahamGadissa
 
An improvement in the safety of big data using blockchain technology
An improvement in the safety of big data using blockchain technologyAn improvement in the safety of big data using blockchain technology
An improvement in the safety of big data using blockchain technologyBOHRInternationalJou1
 
ENERGY STORAGE DEVICES INTRODUCTION UNIT-I
ENERGY STORAGE DEVICES  INTRODUCTION UNIT-IENERGY STORAGE DEVICES  INTRODUCTION UNIT-I
ENERGY STORAGE DEVICES INTRODUCTION UNIT-IVigneshvaranMech
 
Activity Planning: Objectives, Project Schedule, Network Planning Model. Time...
Activity Planning: Objectives, Project Schedule, Network Planning Model. Time...Activity Planning: Objectives, Project Schedule, Network Planning Model. Time...
Activity Planning: Objectives, Project Schedule, Network Planning Model. Time...Lovely Professional University
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单
一比一原版(UofT毕业证)多伦多大学毕业证成绩单一比一原版(UofT毕业证)多伦多大学毕业证成绩单
一比一原版(UofT毕业证)多伦多大学毕业证成绩单tuuww
 
Dairy management system project report..pdf
Dairy management system project report..pdfDairy management system project report..pdf
Dairy management system project report..pdfKamal Acharya
 
Natalia Rutkowska - BIM School Course in Kraków
Natalia Rutkowska - BIM School Course in KrakówNatalia Rutkowska - BIM School Course in Kraków
Natalia Rutkowska - BIM School Course in Krakówbim.edu.pl
 
The battle for RAG, explore the pros and cons of using KnowledgeGraphs and Ve...
The battle for RAG, explore the pros and cons of using KnowledgeGraphs and Ve...The battle for RAG, explore the pros and cons of using KnowledgeGraphs and Ve...
The battle for RAG, explore the pros and cons of using KnowledgeGraphs and Ve...Roi Lipman
 
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdfA CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdfKamal Acharya
 
retail automation billing system ppt.pptx
retail automation billing system ppt.pptxretail automation billing system ppt.pptx
retail automation billing system ppt.pptxfaamieahmd
 
School management system project report.pdf
School management system project report.pdfSchool management system project report.pdf
School management system project report.pdfKamal Acharya
 
İTÜ CAD and Reverse Engineering Workshop
İTÜ CAD and Reverse Engineering WorkshopİTÜ CAD and Reverse Engineering Workshop
İTÜ CAD and Reverse Engineering WorkshopEmre Günaydın
 
Online blood donation management system project.pdf
Online blood donation management system project.pdfOnline blood donation management system project.pdf
Online blood donation management system project.pdfKamal Acharya
 

Recently uploaded (20)

KIT-601 Lecture Notes-UNIT-3.pdf Mining Data Stream
KIT-601 Lecture Notes-UNIT-3.pdf Mining Data StreamKIT-601 Lecture Notes-UNIT-3.pdf Mining Data Stream
KIT-601 Lecture Notes-UNIT-3.pdf Mining Data Stream
 
Arduino based vehicle speed tracker project
Arduino based vehicle speed tracker projectArduino based vehicle speed tracker project
Arduino based vehicle speed tracker project
 
internship exam ppt.pptx on embedded system and IOT
internship exam ppt.pptx on embedded system and IOTinternship exam ppt.pptx on embedded system and IOT
internship exam ppt.pptx on embedded system and IOT
 
Research Methodolgy & Intellectual Property Rights Series 2
Research Methodolgy & Intellectual Property Rights Series 2Research Methodolgy & Intellectual Property Rights Series 2
Research Methodolgy & Intellectual Property Rights Series 2
 
Furniture showroom management system project.pdf
Furniture showroom management system project.pdfFurniture showroom management system project.pdf
Furniture showroom management system project.pdf
 
Construction method of steel structure space frame .pptx
Construction method of steel structure space frame .pptxConstruction method of steel structure space frame .pptx
Construction method of steel structure space frame .pptx
 
Peek implant persentation - Copy (1).pdf
Peek implant persentation - Copy (1).pdfPeek implant persentation - Copy (1).pdf
Peek implant persentation - Copy (1).pdf
 
Digital Signal Processing Lecture notes n.pdf
Digital Signal Processing Lecture notes n.pdfDigital Signal Processing Lecture notes n.pdf
Digital Signal Processing Lecture notes n.pdf
 
An improvement in the safety of big data using blockchain technology
An improvement in the safety of big data using blockchain technologyAn improvement in the safety of big data using blockchain technology
An improvement in the safety of big data using blockchain technology
 
ENERGY STORAGE DEVICES INTRODUCTION UNIT-I
ENERGY STORAGE DEVICES  INTRODUCTION UNIT-IENERGY STORAGE DEVICES  INTRODUCTION UNIT-I
ENERGY STORAGE DEVICES INTRODUCTION UNIT-I
 
Activity Planning: Objectives, Project Schedule, Network Planning Model. Time...
Activity Planning: Objectives, Project Schedule, Network Planning Model. Time...Activity Planning: Objectives, Project Schedule, Network Planning Model. Time...
Activity Planning: Objectives, Project Schedule, Network Planning Model. Time...
 
一比一原版(UofT毕业证)多伦多大学毕业证成绩单
一比一原版(UofT毕业证)多伦多大学毕业证成绩单一比一原版(UofT毕业证)多伦多大学毕业证成绩单
一比一原版(UofT毕业证)多伦多大学毕业证成绩单
 
Dairy management system project report..pdf
Dairy management system project report..pdfDairy management system project report..pdf
Dairy management system project report..pdf
 
Natalia Rutkowska - BIM School Course in Kraków
Natalia Rutkowska - BIM School Course in KrakówNatalia Rutkowska - BIM School Course in Kraków
Natalia Rutkowska - BIM School Course in Kraków
 
The battle for RAG, explore the pros and cons of using KnowledgeGraphs and Ve...
The battle for RAG, explore the pros and cons of using KnowledgeGraphs and Ve...The battle for RAG, explore the pros and cons of using KnowledgeGraphs and Ve...
The battle for RAG, explore the pros and cons of using KnowledgeGraphs and Ve...
 
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdfA CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
A CASE STUDY ON ONLINE TICKET BOOKING SYSTEM PROJECT.pdf
 
retail automation billing system ppt.pptx
retail automation billing system ppt.pptxretail automation billing system ppt.pptx
retail automation billing system ppt.pptx
 
School management system project report.pdf
School management system project report.pdfSchool management system project report.pdf
School management system project report.pdf
 
İTÜ CAD and Reverse Engineering Workshop
İTÜ CAD and Reverse Engineering WorkshopİTÜ CAD and Reverse Engineering Workshop
İTÜ CAD and Reverse Engineering Workshop
 
Online blood donation management system project.pdf
Online blood donation management system project.pdfOnline blood donation management system project.pdf
Online blood donation management system project.pdf
 

Part1 speech basics

  • 1. Unit 6 Speech Signal DR MINAKSHI PRADEEP ATRE PVG’S COET & GKPIM PUNE
  • 2. References Book: Speech and Audio Processing by Dr Shaila Apte madam Pdf document: http://cs.haifa.ac.il/~nimrod/Compression/Speech/S1Basics2010.pdf For speech samples: https://www.signalogic.com/index.pl?page=speech_codec_wav_samples
  • 3. Contents Speech: 1. Basics of speech signal and its features 2. LTI representation of speech signal 3. LTV representation of speech signal 4. Estimation of fundamental frequency 5. identification of voiced and unvoiced speech 6. and noise removal
  • 4. Speech Speech signal is generated by nature Naturally occurring so random in nature Necessary to understand the generalized human speech production Simple linear time invariant (LTI) model for speech production Inherently time varying nature of speech Introduction to linear time variant (LTV) model of speech Speech type: consonants, fricatives Voiced and unvoiced (V/UV) speech
  • 5. Speech Production Mechanism: Pipelines Model Vocal Tract
  • 6. Vocal Tract: The vocal tract is the cavity between the vocal cords and the lips. It acts as a resonator that spectrally shapes the periodic input, much like the cavity of a musical wind instrument. A simple model of a steady-state vowel regards the vocal tract as a linear time-invariant (LTI) filter with a periodic impulse-like input.
  • 7. What is a speech signal? It is created at the vocal cords, travels through the vocal tract, and is produced at the speaker's mouth. It reaches the listener's ear as a pressure wave. It is non-stationary, but can be divided into sound segments that share common acoustic properties over a short time interval. Phonemes fall into two major classes: vowels and consonants.
  • 8. Phonemes: The basic sounds of a language (e.g. "a" in the word "father") are called phonemes. A typical speech utterance consists of a string of vowel and consonant phonemes whose temporal and spectral characteristics change with time. In addition, the time-varying source and system can interact nonlinearly in a complex way: our simple model is correct for a steady vowel, but the sounds of speech are not always well represented by linear time-invariant systems.
  • 9. Vowel Production: In vowel production, air is forced from the lungs by contraction of the muscles around the lung cavity. Air flows through the vocal cords, which are two masses of flesh, causing periodic vibration of the cords; the rate of vibration gives the pitch of the sound. The resulting periodic puffs of air act as an excitation input, or source, to the vocal tract.
  • 11. Speech Production: A sound source excites a (vocal tract) filter. ◦ Voiced: periodic source, created by the vocal cords ◦ Unvoiced: aperiodic and noisy source. Pitch is the fundamental frequency of the vocal cord vibration (also called F0); it is followed by 4-5 formants (F1-F5) at higher frequencies. For a uniform (neutral) vocal tract, natural frequencies occur at odd multiples of about 500 Hz. These resonant frequencies are called formants. The table gives typical values of the first three formants for selected vowels, in Hz:
        Vowel    Adult Male              Adult Female
                 F1     F2     F3        F1     F2     F3
        (i)      255    2330   3000      340    2610   3210
        (u)      290    940    2180      390    995    2585
        (ae)     735    1625   2465      950    1955   2900
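The "odd multiples of 500 Hz" figure on this slide matches the resonances of a uniform tube closed at one end (glottis) and open at the other (lips), F_k = (2k - 1) c / (4 L). The sketch below is only an illustration: the tract length of 17 cm and the sound speed of 340 m/s are assumed values, not numbers taken from the slides, and `tube_formants` is a hypothetical helper name.

```python
def tube_formants(length_m=0.17, c=340.0, count=4):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    length_m and c are assumed illustrative values (17 cm tract, 340 m/s).
    """
    return [(2 * k - 1) * c / (4.0 * length_m) for k in range(1, count + 1)]


formants = tube_formants()
rounded = [round(f) for f in formants]
print(rounded)
```

Real vowels depart from this pattern because the tract is not uniform, which is why the tabulated F1-F3 values differ from vowel to vowel.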
  • 12. LTI Model for speech production. Block diagram: Impulse Train Generator (Glottis) or Random Signal Generator -> Impulse Response of Vocal Tract -> Generated Speech. The impulse train generator (periodic source) is used as the excitation signal when a voiced segment is produced, e.g. the vowel “a”. Basic assumption: the source of excitation and the vocal tract system are independent.
  • 13. LTI Model for speech production. Block diagram: Impulse Train Generator (Glottis) or Random Signal Generator -> Impulse Response of Vocal Tract -> Generated Speech. The random signal generator (random source) is used as the excitation signal when an unvoiced segment is produced, e.g. the consonant “s”. The LTI model is applied to a short segment of speech, about 10 ms, over which the parameters of the vocal tract can be assumed to remain constant.
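The two excitation cases of the LTI model can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the program used in the course: the vocal tract is reduced to a single two-pole resonator at 735 Hz (the male F1 of (ae) from the table), and the 8 kHz sampling rate, 100 Hz bandwidth, 64-sample pitch period and all function names are assumed for the example.

```python
import math
import random


def impulse_train(n_samples, pitch_period):
    """Voiced excitation: a unit impulse every pitch_period samples."""
    return [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]


def white_noise(n_samples, seed=0):
    """Unvoiced excitation: uniformly distributed random samples."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


def resonator(x, freq_hz, bandwidth_hz, fs):
    """Two-pole LTI filter with one resonance (a crude stand-in for the vocal tract)."""
    r = math.exp(-math.pi * bandwidth_hz / fs)
    a1 = -2.0 * r * math.cos(2.0 * math.pi * freq_hz / fs)
    a2 = r * r
    y, y1, y2 = [], 0.0, 0.0
    for sample in x:
        out = sample - a1 * y1 - a2 * y2  # y[n] = x[n] - a1*y[n-1] - a2*y[n-2]
        y.append(out)
        y2, y1 = y1, out
    return y


FS = 8000   # assumed sampling rate in Hz
N = 80      # 10 ms segment at 8 kHz

voiced = resonator(impulse_train(N, pitch_period=64), 735.0, 100.0, FS)
unvoiced = resonator(white_noise(N), 735.0, 100.0, FS)
```

The same filter is used for both segments, which reflects the slide's assumption that the source and the vocal tract system are independent.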
  • 14. Nature of Speech Signal: Speech is generated by components such as the vocal cords and the vocal tract. A speech signal cannot be generated on its own; speech is a random signal. Speech can have an unlimited number of features (recall the story of the blind men who each touch a different part of an elephant and describe it differently), so analysing it is a complex problem. Different words can be uttered because humans can change the resonant modes of the vocal cavity and can also stretch the vocal cords to some extent, which modifies the pitch period for different vowels. That is why we need the linear time-varying (LTV) model.
  • 15. Linear Time-varying Model: Speech production. Block diagram: Impulse Train Generator or Random Signal Generator -> Amplitude -> Impulse Response of Vocal Tract -> Generated Speech. The pitch period is variable and the impulse response is variable.
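One common way to approximate the LTV model is to hold the parameters constant within a short frame and change them from frame to frame. The sketch below does this for the voiced branch only; the frame parameters (gain, pitch period, single resonance frequency), the 8 kHz sampling rate and the name `ltv_synthesize` are all assumptions made for illustration.

```python
import math


def ltv_synthesize(frames, fs=8000, frame_len=80, bandwidth_hz=100.0):
    """Piecewise-constant LTV synthesis of a voiced sound.

    frames: list of (gain, pitch_period_samples, resonance_hz), one tuple per frame.
    The filter state is carried across frame boundaries so the output stays continuous.
    """
    r = math.exp(-math.pi * bandwidth_hz / fs)
    y, y1, y2 = [], 0.0, 0.0
    n = 0            # absolute sample index
    next_pulse = 0   # sample index of the next glottal impulse
    for gain, pitch_period, resonance_hz in frames:
        # Filter coefficients change once per frame: the "variable impulse response".
        a1 = -2.0 * r * math.cos(2.0 * math.pi * resonance_hz / fs)
        a2 = r * r
        for _ in range(frame_len):
            x = 0.0
            if n == next_pulse:
                x = gain                    # variable amplitude
                next_pulse += pitch_period  # variable pitch period
            out = x - a1 * y1 - a2 * y2
            y.append(out)
            y2, y1 = y1, out
            n += 1
    return y


# Three 10 ms frames with rising pitch and a moving resonance (assumed values).
frames = [(1.0, 64, 700.0), (0.8, 60, 900.0), (0.6, 56, 1100.0)]
speech = ltv_synthesize(frames)
```

Within each frame the system behaves like the LTI model; the time variation comes entirely from updating the parameters at frame boundaries.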
  • 16. Speech Sound Categories: periodic (sonorants, voiced); noisy (fricatives, unvoiced); impulsive (plosives). Example: in the word “shop,” the “sh,” “o,” and “p” are generated from a noisy, periodic, and impulsive source, respectively.
  • 17. Frequency Range Speech: Pitch frequency: ◦ male ~ 85-155 Hz; ◦ female ~ 165-255 Hz; Singer’s vocal range: from bass to soprano: 80 Hz-1100 Hz
  • 18. Pitch. Pitch period: the time duration of one glottal cycle. Pitch (fundamental frequency): the reciprocal of the pitch period. Remember: pitch is calculated only for voiced segments.
  • 19. Pitch Detection: The pitch period and the V/UV decision are fundamental to many speech coders. Common methods for the calculation: ◦ Autocorrelation function ◦ Zero-crossing rate (ZCR)
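Both methods named on this slide can be sketched in plain Python. The example uses a normalized variant of the autocorrelation so that the peak does not depend on how many samples overlap at each lag, and it makes the V/UV decision with a simple ZCR threshold. The 8 kHz sampling rate, the 80-400 Hz search range, the 0.25 threshold and the test signals are assumptions for illustration, not values from the slides.

```python
import math
import random


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def autocorr_pitch(frame, fs, f_min=80.0, f_max=400.0):
    """Pitch estimate (Hz) from the lag that maximizes the normalized autocorrelation."""
    lag_min = int(fs / f_max)
    lag_max = min(int(fs / f_min), len(frame) - 1)
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        head = frame[:len(frame) - lag]
        tail = frame[lag:]
        num = sum(a * b for a, b in zip(head, tail))
        den = math.sqrt(sum(a * a for a in head) * sum(b * b for b in tail))
        val = num / den if den > 0.0 else 0.0
        if val > best_val:
            best_lag, best_val = lag, val
    return fs / best_lag


def is_voiced(frame, zcr_threshold=0.25):
    """Hypothetical V/UV rule: voiced frames cross zero far less often than noise."""
    return zero_crossing_rate(frame) < zcr_threshold


FS = 8000
N = 240  # 30 ms frame

# Synthetic "voiced" frame: 125 Hz fundamental plus its second harmonic.
voiced = [math.sin(2 * math.pi * 125 * n / FS) + 0.5 * math.sin(2 * math.pi * 250 * n / FS)
          for n in range(N)]
# Synthetic "unvoiced" frame: white noise.
rng = random.Random(0)
unvoiced = [rng.uniform(-1.0, 1.0) for _ in range(N)]

pitch = autocorr_pitch(voiced, FS) if is_voiced(voiced) else None
```

Pitch is estimated only when the frame is classified as voiced, in line with the reminder on the previous slide. Real speech usually needs an energy check alongside ZCR, since silence and low-level noise can confuse a threshold on ZCR alone.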
  • 20. Features or categorization of speech sound: Speech sounds are studied and classified from the following perspectives: 1) The nature of the source: periodic, noisy, or impulsive, and combinations of the three 2) The shape of the vocal tract 3) The time-domain waveform, which gives the pressure change with time at the output of the lips 4) The time-varying spectral characteristics revealed through the spectrogram
  • 21. Spectrogram: The time-varying spectral characteristics of the speech signal can be displayed graphically as a two-dimensional pattern. Vertical axis: frequency; horizontal axis: time. The pseudo-color of the pattern is proportional to signal energy (red: high energy). The resonance frequencies of the vocal tract show up as “energy bands”. Voiced intervals are characterized by a striated appearance (due to the periodicity of the signal). Unvoiced intervals are more solidly filled in.
  • 23. Most common manners of articulation: Plosive, or oral stop, where there is complete occlusion (blockage) of both the oral and nasal cavities of the vocal tract, and therefore no air flow. Examples include English /p t k/ (voiceless) and /b d g/ (voiced). Nasal stop, where there is complete occlusion of the oral cavity, and the air passes instead through the nose. The shape and position of the tongue determine the resonant cavity that gives different nasal stops their characteristic sounds. Examples include English /m, n/. Fricative, sometimes called spirant, where there is continuous frication (turbulent and noisy airflow) at the place of articulation. Examples include English /f, s/ (voiceless), /v, z/ (voiced), etc.
  • 24. Most common manners of articulation (continued): Sibilants are a type of fricative where the airflow is guided by a groove in the tongue toward the teeth, creating a high-pitched and very distinctive sound. These are by far the most common fricatives. English sibilants include /s/ and /z/. Affricate, which begins like a plosive, but releases into a fricative rather than having a separate release of its own. The English letters "ch" and "j" represent affricates. Trill, in which the articulator (usually the tip of the tongue) is held in place, and the airstream causes it to vibrate. The double "r" of Spanish "perro" is a trill. Approximant, where there is very little obstruction. Examples include English /w/ and /r/. Lateral approximants, usually shortened to lateral, are a type of approximant pronounced with the side of the tongue. English /l/ is a lateral.
  • 25. Time for MATLAB Program