SlideShare a Scribd company logo
1 of 3
Download to read offline
Shrinkage Methods in Linear Regression
June 29, 2018
Benno Geißelmann
1 Motivation
Linear Regression has the aim to predict a dependent target variable by a
combination of predictor variables (features) together with their according
coefficients. The linear regression function is of the following form:
ˆyi = w0 + w1xi1 + · · · + wpxip + εi (1)
where yi is the ith target value (ˆyi is the ith predicted value) and xi is the
according ith feature vector. w0 · · · wp are the p coefficients of the regression
function which need to be determined. εi is some random noise. P is the
number of features and N is the number of training data. In some scenarios
P can be very large, sometimes even larger than N. Therefore in these
scenarios there needs to be performed some kind of feature selection in order
to shrink the number of used features for the regression model. Otherwise the
huge amount of predictor variables would lead to overfitting on the training
data and therefore modelling the noise of the data rather than the underlying
connections. By shrinking of the coefficients, the variance in the model is
reduced. At the same time the bias of the model is increased (bias-variance-
dilemma) with the aim to build a model which is able to perform well on
unseen data. Lasso and Ridge regression are methods to shrink the amount
of features or their according coefficients. In the next sections the concepts
of Ridge and Lasso is introduced.
2 The Ridge model
The Rigde regression analysis method defines an approach by which it is
possible to shrink the size of the coefficients of the regression function. Due to
1
this, overfitting is avoided and therefore the ability for generalization beyond
the traing set of the regression model is improved. The optimization goal
which Ridge is trying to achieve can be formulated like this:
min
w0,w
1
N
N
i=1
(yi − w0 − xT
i w)2
+ λ
P
j=1
w2
j (2)
The first part of the sum is the ordinary least squares (OLS) approach and in
fact if you set λ = 0 you get OLS. The second part of the sum (penatly term)
is introduced by Ridge. It defines a further restriction on the sum over all
coefficients, weighted by some parameter λ. By λ there can be handled how
big the influence of the penalty term is for the optimization. The bigger it
is, the more influence the penalty term is going to have in the optimization.
3 The Lasso model
Lasso (least absolute shrinkage and selection operator) is comparable to
Ridge except the difference, that the penatly term uses the absolute value
instead of the square number:
min
w0,w
1
N
N
i=1
(yi − w0 − xT
i w)2
+ λ
P
j=1
|wj| (3)
4 Difference between Ridge and Lasso Model
As stated before the difference between the Ridge approach and the Lasso
approach the the usage of square number respectively the absolute value. Due
to this, the Ridge approach tends to create small but non null coefficients
(because squares of numbers less than 1 are small but bigger than 0) where
as the Lasso approach tends to set more coefficients to 0. Therefore Lasso is
stronger in doing actual feature selection. This might be an important reason
why Lasso is the more popular approach for shrinkage in linear regression
today.
2
5 Gradient descent for optimization of the
Ridge approach
For finding a feasable solution for the above described optimization problem
of the Ridge regression, there can be used gradient descent. The following
algorithm describes this approach:
Algorithm 1: Gradient Descent for Ridge Regression
Input: iterations
Input: η
Result: the coefficients w0 · · · wn
Cost ← 1
N
N
i=1(yi − w0 − xT
i w)2
+ λ P
j=1 w2
j ;
for it in interations do
for wi in w do
wi ← wi − η∂Cost
∂wi
end
end
The algorithm gets the number of iterations and the step size η as input.
In each iteration there is calculated the partial derivative of the cost func-
tion and a certain coefficient wi. Using the negative value of this derivative
multiplied by the stepsize, we are descending on the cost function.
6 Coordinate descent for optimization of the
Lasso approach
Due to the usage of the absolute value in the optimization function of Lasso
we cannot directly use gradient descent because the function is not differ-
entiable at wj = 0. There is an alternative approach for approximating a
solution for this optimization problem called coordinate descent.
3

More Related Content

What's hot

Lossless image compression via by lifting scheme
Lossless image compression via by lifting schemeLossless image compression via by lifting scheme
Lossless image compression via by lifting schemeSubhashini Subramanian
 
Mini project boston housing dataset v1
Mini project   boston housing dataset v1Mini project   boston housing dataset v1
Mini project boston housing dataset v1Wyendrila Roy
 
Kinetic line plane
Kinetic line planeKinetic line plane
Kinetic line planeMark Hilbert
 
G11ppt one2one inversefunction
G11ppt one2one inversefunctionG11ppt one2one inversefunction
G11ppt one2one inversefunctionabmullessoj
 
Factor label method
Factor label methodFactor label method
Factor label methodaimorales
 
NUMERICAL INTEGRATION AND ITS APPLICATIONS
NUMERICAL INTEGRATION AND ITS APPLICATIONSNUMERICAL INTEGRATION AND ITS APPLICATIONS
NUMERICAL INTEGRATION AND ITS APPLICATIONSGOWTHAMGOWSIK98
 
Transportation Problem In Linear Programming
Transportation Problem In Linear ProgrammingTransportation Problem In Linear Programming
Transportation Problem In Linear ProgrammingMirza Tanzida
 
Kuliah teori dan analisis jaringan - linear programming
Kuliah teori dan analisis jaringan - linear programmingKuliah teori dan analisis jaringan - linear programming
Kuliah teori dan analisis jaringan - linear programmingHarun Al-Rasyid Lubis
 
Mba i qt unit-1.2_transportation, assignment and transshipment problems
Mba i qt unit-1.2_transportation, assignment and transshipment problemsMba i qt unit-1.2_transportation, assignment and transshipment problems
Mba i qt unit-1.2_transportation, assignment and transshipment problemsRai University
 
Assignment Problem
Assignment ProblemAssignment Problem
Assignment ProblemAmit Kumar
 
Scheduling advertisements on a web page to maximize revenue
Scheduling advertisements on a web page to maximize revenueScheduling advertisements on a web page to maximize revenue
Scheduling advertisements on a web page to maximize revenueShu-Jeng Hsieh
 
Transportation and assignment_problem
Transportation and assignment_problemTransportation and assignment_problem
Transportation and assignment_problemAnkit Katiyar
 
Pat05 ppt 0201
Pat05 ppt 0201Pat05 ppt 0201
Pat05 ppt 0201wzuri
 
Deepika(14 mba5012) transportation ppt
Deepika(14 mba5012)  transportation pptDeepika(14 mba5012)  transportation ppt
Deepika(14 mba5012) transportation pptDeepika Bansal
 
Transportation problem
Transportation problemTransportation problem
Transportation problemShubhagata Roy
 

What's hot (20)

Lossless image compression via by lifting scheme
Lossless image compression via by lifting schemeLossless image compression via by lifting scheme
Lossless image compression via by lifting scheme
 
Mini project boston housing dataset v1
Mini project   boston housing dataset v1Mini project   boston housing dataset v1
Mini project boston housing dataset v1
 
Kinetic line plane
Kinetic line planeKinetic line plane
Kinetic line plane
 
G11ppt one2one inversefunction
G11ppt one2one inversefunctionG11ppt one2one inversefunction
G11ppt one2one inversefunction
 
Factor label method
Factor label methodFactor label method
Factor label method
 
Integration
IntegrationIntegration
Integration
 
NUMERICAL INTEGRATION AND ITS APPLICATIONS
NUMERICAL INTEGRATION AND ITS APPLICATIONSNUMERICAL INTEGRATION AND ITS APPLICATIONS
NUMERICAL INTEGRATION AND ITS APPLICATIONS
 
Transportation Problem In Linear Programming
Transportation Problem In Linear ProgrammingTransportation Problem In Linear Programming
Transportation Problem In Linear Programming
 
Transportation winston
Transportation winstonTransportation winston
Transportation winston
 
Kuliah teori dan analisis jaringan - linear programming
Kuliah teori dan analisis jaringan - linear programmingKuliah teori dan analisis jaringan - linear programming
Kuliah teori dan analisis jaringan - linear programming
 
Mba i qt unit-1.2_transportation, assignment and transshipment problems
Mba i qt unit-1.2_transportation, assignment and transshipment problemsMba i qt unit-1.2_transportation, assignment and transshipment problems
Mba i qt unit-1.2_transportation, assignment and transshipment problems
 
Transportation problems
Transportation problemsTransportation problems
Transportation problems
 
Calc 4.6
Calc 4.6Calc 4.6
Calc 4.6
 
Assignment Problem
Assignment ProblemAssignment Problem
Assignment Problem
 
Scheduling advertisements on a web page to maximize revenue
Scheduling advertisements on a web page to maximize revenueScheduling advertisements on a web page to maximize revenue
Scheduling advertisements on a web page to maximize revenue
 
Transportation and assignment_problem
Transportation and assignment_problemTransportation and assignment_problem
Transportation and assignment_problem
 
Pat05 ppt 0201
Pat05 ppt 0201Pat05 ppt 0201
Pat05 ppt 0201
 
Deepika(14 mba5012) transportation ppt
Deepika(14 mba5012)  transportation pptDeepika(14 mba5012)  transportation ppt
Deepika(14 mba5012) transportation ppt
 
Assign transportation
Assign transportationAssign transportation
Assign transportation
 
Transportation problem
Transportation problemTransportation problem
Transportation problem
 

Similar to Shrinkage Methods in Linear Regression

Application of Graphic LASSO in Portfolio Optimization_Yixuan Chen & Mengxi J...
Application of Graphic LASSO in Portfolio Optimization_Yixuan Chen & Mengxi J...Application of Graphic LASSO in Portfolio Optimization_Yixuan Chen & Mengxi J...
Application of Graphic LASSO in Portfolio Optimization_Yixuan Chen & Mengxi J...Mengxi Jiang
 
Regression analysis and its type
Regression analysis and its typeRegression analysis and its type
Regression analysis and its typeEkta Bafna
 
Get Multiple Regression Assignment Help
Get Multiple Regression Assignment Help Get Multiple Regression Assignment Help
Get Multiple Regression Assignment Help HelpWithAssignment.com
 
Machine learning session4(linear regression)
Machine learning   session4(linear regression)Machine learning   session4(linear regression)
Machine learning session4(linear regression)Abhimanyu Dwivedi
 
ASCE_ChingHuei_Rev00..
ASCE_ChingHuei_Rev00..ASCE_ChingHuei_Rev00..
ASCE_ChingHuei_Rev00..butest
 
ASCE_ChingHuei_Rev00..
ASCE_ChingHuei_Rev00..ASCE_ChingHuei_Rev00..
ASCE_ChingHuei_Rev00..butest
 
Machine-Learning-with-Ridge-and-Lasso-Regression.pdf
Machine-Learning-with-Ridge-and-Lasso-Regression.pdfMachine-Learning-with-Ridge-and-Lasso-Regression.pdf
Machine-Learning-with-Ridge-and-Lasso-Regression.pdfAyadIliass
 
Linear programming class 12 investigatory project
Linear programming class 12 investigatory projectLinear programming class 12 investigatory project
Linear programming class 12 investigatory projectDivyans890
 
Supervised Learning.pdf
Supervised Learning.pdfSupervised Learning.pdf
Supervised Learning.pdfgadissaassefa
 
Machine Learning Interview Question and Answer
Machine Learning Interview Question and AnswerMachine Learning Interview Question and Answer
Machine Learning Interview Question and AnswerLearnbay Datascience
 
Machine Learning - Regression model
Machine Learning - Regression modelMachine Learning - Regression model
Machine Learning - Regression modelRADO7900
 
Regression vs Neural Net
Regression vs Neural NetRegression vs Neural Net
Regression vs Neural NetRatul Alahy
 
LP linear programming (summary) (5s)
LP linear programming (summary) (5s)LP linear programming (summary) (5s)
LP linear programming (summary) (5s)Dionísio Carmo-Neto
 
Simple lin regress_inference
Simple lin regress_inferenceSimple lin regress_inference
Simple lin regress_inferenceKemal İnciroğlu
 
unit 3 regression.pptx
unit 3 regression.pptxunit 3 regression.pptx
unit 3 regression.pptxssuser5c580e1
 
Data Science - Part XII - Ridge Regression, LASSO, and Elastic Nets
Data Science - Part XII - Ridge Regression, LASSO, and Elastic NetsData Science - Part XII - Ridge Regression, LASSO, and Elastic Nets
Data Science - Part XII - Ridge Regression, LASSO, and Elastic NetsDerek Kane
 

Similar to Shrinkage Methods in Linear Regression (20)

MF Presentation.pptx
MF Presentation.pptxMF Presentation.pptx
MF Presentation.pptx
 
Application of Graphic LASSO in Portfolio Optimization_Yixuan Chen & Mengxi J...
Application of Graphic LASSO in Portfolio Optimization_Yixuan Chen & Mengxi J...Application of Graphic LASSO in Portfolio Optimization_Yixuan Chen & Mengxi J...
Application of Graphic LASSO in Portfolio Optimization_Yixuan Chen & Mengxi J...
 
Regression analysis and its type
Regression analysis and its typeRegression analysis and its type
Regression analysis and its type
 
Get Multiple Regression Assignment Help
Get Multiple Regression Assignment Help Get Multiple Regression Assignment Help
Get Multiple Regression Assignment Help
 
Machine learning session4(linear regression)
Machine learning   session4(linear regression)Machine learning   session4(linear regression)
Machine learning session4(linear regression)
 
ASCE_ChingHuei_Rev00..
ASCE_ChingHuei_Rev00..ASCE_ChingHuei_Rev00..
ASCE_ChingHuei_Rev00..
 
ASCE_ChingHuei_Rev00..
ASCE_ChingHuei_Rev00..ASCE_ChingHuei_Rev00..
ASCE_ChingHuei_Rev00..
 
Machine-Learning-with-Ridge-and-Lasso-Regression.pdf
Machine-Learning-with-Ridge-and-Lasso-Regression.pdfMachine-Learning-with-Ridge-and-Lasso-Regression.pdf
Machine-Learning-with-Ridge-and-Lasso-Regression.pdf
 
Regression ppt.pptx
Regression ppt.pptxRegression ppt.pptx
Regression ppt.pptx
 
Linear programming class 12 investigatory project
Linear programming class 12 investigatory projectLinear programming class 12 investigatory project
Linear programming class 12 investigatory project
 
Supervised Learning.pdf
Supervised Learning.pdfSupervised Learning.pdf
Supervised Learning.pdf
 
Machine Learning Interview Question and Answer
Machine Learning Interview Question and AnswerMachine Learning Interview Question and Answer
Machine Learning Interview Question and Answer
 
Machine Learning - Regression model
Machine Learning - Regression modelMachine Learning - Regression model
Machine Learning - Regression model
 
Regression vs Neural Net
Regression vs Neural NetRegression vs Neural Net
Regression vs Neural Net
 
LP linear programming (summary) (5s)
LP linear programming (summary) (5s)LP linear programming (summary) (5s)
LP linear programming (summary) (5s)
 
Simple lin regress_inference
Simple lin regress_inferenceSimple lin regress_inference
Simple lin regress_inference
 
working with python
working with pythonworking with python
working with python
 
Linear regression
Linear regressionLinear regression
Linear regression
 
unit 3 regression.pptx
unit 3 regression.pptxunit 3 regression.pptx
unit 3 regression.pptx
 
Data Science - Part XII - Ridge Regression, LASSO, and Elastic Nets
Data Science - Part XII - Ridge Regression, LASSO, and Elastic NetsData Science - Part XII - Ridge Regression, LASSO, and Elastic Nets
Data Science - Part XII - Ridge Regression, LASSO, and Elastic Nets
 

Recently uploaded

Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 sciencefloriejanemacaya1
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...ssifa0344
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsSérgio Sacani
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfSumit Kumar yadav
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCEPRINCE C P
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)PraveenaKalaiselvan1
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsSumit Kumar yadav
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxgindu3009
 
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxBroad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxjana861314
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfSumit Kumar yadav
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfSumit Kumar yadav
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​kaibalyasahoo82800
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...jana861314
 

Recently uploaded (20)

Boyles law module in the grade 10 science
Boyles law module in the grade 10 scienceBoyles law module in the grade 10 science
Boyles law module in the grade 10 science
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroidsHubble Asteroid Hunter III. Physical properties of newly found asteroids
Hubble Asteroid Hunter III. Physical properties of newly found asteroids
 
Chemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdfChemistry 4th semester series (krishna).pdf
Chemistry 4th semester series (krishna).pdf
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCESTERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
STERILITY TESTING OF PHARMACEUTICALS ppt by DR.C.P.PRINCE
 
Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)Recombinant DNA technology (Immunological screening)
Recombinant DNA technology (Immunological screening)
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
Presentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptxPresentation Vikram Lander by Vedansh Gupta.pptx
Presentation Vikram Lander by Vedansh Gupta.pptx
 
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptxBroad bean, Lima Bean, Jack bean, Ullucus.pptx
Broad bean, Lima Bean, Jack bean, Ullucus.pptx
 
Botany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdfBotany 4th semester file By Sumit Kumar yadav.pdf
Botany 4th semester file By Sumit Kumar yadav.pdf
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
Zoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdfZoology 4th semester series (krishna).pdf
Zoology 4th semester series (krishna).pdf
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Nanoparticles synthesis and characterization​ ​
Nanoparticles synthesis and characterization​  ​Nanoparticles synthesis and characterization​  ​
Nanoparticles synthesis and characterization​ ​
 
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
Traditional Agroforestry System in India- Shifting Cultivation, Taungya, Home...
 

Shrinkage Methods in Linear Regression

  • 1. Shrinkage Methods in Linear Regression June 29, 2018 Benno Geißelmann 1 Motivation Linear Regression has the aim to predict a dependent target variable by a combination of predictor variables (features) together with their according coefficients. The linear regression function is of the following form: ˆyi = w0 + w1xi1 + · · · + wpxip + εi (1) where yi is the ith target value (ˆyi is the ith predicted value) and xi is the according ith feature vector. w0 · · · wp are the p coefficients of the regression function which need to be determined. εi is some random noise. P is the number of features and N is the number of training data. In some scenarios P can be very large, sometimes even larger than N. Therefore in these scenarios there needs to be performed some kind of feature selection in order to shrink the number of used features for the regression model. Otherwise the huge amount of predictor variables would lead to overfitting on the training data and therefore modelling the noise of the data rather than the underlying connections. By shrinking of the coefficients, the variance in the model is reduced. At the same time the bias of the model is increased (bias-variance- dilemma) with the aim to build a model which is able to perform well on unseen data. Lasso and Ridge regression are methods to shrink the amount of features or their according coefficients. In the next sections the concepts of Ridge and Lasso is introduced. 2 The Ridge model The Rigde regression analysis method defines an approach by which it is possible to shrink the size of the coefficients of the regression function. Due to 1
  • 2. this, overfitting is avoided and therefore the ability for generalization beyond the traing set of the regression model is improved. The optimization goal which Ridge is trying to achieve can be formulated like this: min w0,w 1 N N i=1 (yi − w0 − xT i w)2 + λ P j=1 w2 j (2) The first part of the sum is the ordinary least squares (OLS) approach and in fact if you set λ = 0 you get OLS. The second part of the sum (penatly term) is introduced by Ridge. It defines a further restriction on the sum over all coefficients, weighted by some parameter λ. By λ there can be handled how big the influence of the penalty term is for the optimization. The bigger it is, the more influence the penalty term is going to have in the optimization. 3 The Lasso model Lasso (least absolute shrinkage and selection operator) is comparable to Ridge except the difference, that the penatly term uses the absolute value instead of the square number: min w0,w 1 N N i=1 (yi − w0 − xT i w)2 + λ P j=1 |wj| (3) 4 Difference between Ridge and Lasso Model As stated before the difference between the Ridge approach and the Lasso approach the the usage of square number respectively the absolute value. Due to this, the Ridge approach tends to create small but non null coefficients (because squares of numbers less than 1 are small but bigger than 0) where as the Lasso approach tends to set more coefficients to 0. Therefore Lasso is stronger in doing actual feature selection. This might be an important reason why Lasso is the more popular approach for shrinkage in linear regression today. 2
  • 3. 5 Gradient descent for optimization of the Ridge approach For finding a feasable solution for the above described optimization problem of the Ridge regression, there can be used gradient descent. The following algorithm describes this approach: Algorithm 1: Gradient Descent for Ridge Regression Input: iterations Input: η Result: the coefficients w0 · · · wn Cost ← 1 N N i=1(yi − w0 − xT i w)2 + λ P j=1 w2 j ; for it in interations do for wi in w do wi ← wi − η∂Cost ∂wi end end The algorithm gets the number of iterations and the step size η as input. In each iteration there is calculated the partial derivative of the cost func- tion and a certain coefficient wi. Using the negative value of this derivative multiplied by the stepsize, we are descending on the cost function. 6 Coordinate descent for optimization of the Lasso approach Due to the usage of the absolute value in the optimization function of Lasso we cannot directly use gradient descent because the function is not differ- entiable at wj = 0. There is an alternative approach for approximating a solution for this optimization problem called coordinate descent. 3