A Unified Approach to Interpreting Model Predictions.
Scott M. Lundberg, Su-In Lee.
Understanding why a model makes a certain prediction can be as crucial as the prediction's accuracy in many applications. However, the highest accuracy for large modern datasets is often achieved by complex models that even experts struggle to interpret, such as ensemble or deep learning models, creating a tension between accuracy and interpretability. In response, various methods have recently been proposed to help users interpret the predictions of complex models, but it is often unclear how these methods are related and when one method is preferable over another. To address this problem, we present a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations). SHAP assigns each feature an importance value for a particular prediction. Its novel components include: (1) the identification of a new class of additive feature importance measures, and (2) theoretical results showing there is a unique solution in this class with a set of desirable properties. The new class unifies six existing methods, notable because several recent methods in the class lack the proposed desirable properties. Based on insights from this unification, we present new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.
9. Introduction
We replace each input to the summation you would get in a linear model with a term that represents the importance of that feature in the complicated model.
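In the paper's notation this is the additive explanation model over simplified binary inputs (Equation 1 of the paper); phi_i is the importance assigned to feature i and phi_0 is the base value:

```latex
% Additive feature attribution: a linear explanation model over simplified inputs
g(z') = \phi_0 + \sum_{i=1}^{M} \phi_i z'_i , \qquad z'_i \in \{0, 1\}
```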
10–11. Overview of existing explanation methods (diagram). The six methods shown, with the one-line descriptions from the slide:
• Shapley regression values: feature importances for linear models in the presence of multicollinearity. Requires retraining the model on all feature subsets and assigns each feature an importance value representing the effect on the model prediction of including that feature.
• Shapley sampling values: explain any model by applying sampling approximations to the Shapley regression equation and by approximating the effect of removing a variable from the model by integrating over samples from the training dataset.
• Quantitative Input Influence: proposes a sampling approximation to Shapley values that is nearly identical to Shapley sampling values.
• LIME: interprets individual model predictions by locally approximating the model around a given prediction.
• DeepLIFT: a recursive prediction explanation method for deep learning.
• Layer-Wise Relevance Propagation: interprets the predictions of deep networks.
12–13. Additive Feature Attribution Methods. The diagram groups the same methods by trade-off: the Shapley-based methods (Shapley regression values, Shapley sampling values, Quantitative Input Influence) have better theoretical grounding but slower computation, while LIME, DeepLIFT, and Layer-Wise Relevance Propagation have faster estimation but fewer guarantees.
19. What we have to explain is this 35 percent difference between the model's expected output and the prediction for this individual. So how can we do this?
20. We start from the expected value of the output of our model (the base rate), then introduce features one at a time into that conditional expectation. Conditioning on the fact that John is 20, his risk jumps up by 15 percent.
21. Conditioning on his very risky profession jumps the risk up to 70 percent.
22–23. He made a ton of money in the stock market last year, so conditioning on his capital gains pushes him down to 55 percent.
24. That 35 percent difference is what we had to explain, and we have now divided it up: we got from the base rate to the final prediction by conditioning on the features one at a time until we had conditioned on all of them.
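Written out for one particular ordering of the features, this walk from the base rate to the prediction is a telescoping sum in which each bracketed term is one feature's contribution for that ordering:

```latex
% Decomposition of the prediction for the ordering x_1, x_2, ..., x_M;
% the terms telescope because E[f(X) | x_1, ..., x_M] = f(x).
f(x) - E[f(X)] = \big(E[f(X)\mid x_1] - E[f(X)]\big)
               + \big(E[f(X)\mid x_1, x_2] - E[f(X)\mid x_1]\big)
               + \dots
               + \big(f(x) - E[f(X)\mid x_1, \dots, x_{M-1}]\big)
```

Different orderings generally split the same total differently; averaging over all orderings is what produces the SHAP values described below.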
29. Shapley Properties
• Local accuracy: the output of the explanation model matches the original model for the prediction being explained.
• Missingness: features missing in the original input have no impact.
• Consistency: if you change the original model such that a feature has a larger impact in every possible ordering, then that input's attribution (importance) should not decrease.
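Restated in the paper's notation, where g is the explanation model, x' the simplified input, f_x(z') = f(h_x(z')) is the original model evaluated on a simplified input, and z' \ i denotes setting z'_i = 0:

```latex
% Local accuracy: the attributions sum to the original prediction
f(x) = g(x') = \phi_0 + \sum_{i=1}^{M} \phi_i x'_i

% Missingness: absent simplified features receive zero attribution
x'_i = 0 \;\Rightarrow\; \phi_i = 0

% Consistency: if feature i's contribution never decreases when moving from f to f',
% its attribution does not decrease either
f'_x(z') - f'_x(z' \setminus i) \,\ge\, f_x(z') - f_x(z' \setminus i)\ \ \forall z'
\;\Rightarrow\; \phi_i(f', x) \ge \phi_i(f, x)
```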
30. SHAP values arise from averaging the φi values across all possible orderings of the features, which makes them very painful to compute exactly.
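Equivalently, in the subset form used in the paper, where F is the set of all features and f_S denotes the model's output using only the features in S:

```latex
\phi_i = \sum_{S \subseteq F \setminus \{i\}}
         \frac{|S|!\,\left(|F| - |S| - 1\right)!}{|F|!}
         \left[ f_{S \cup \{i\}}\big(x_{S \cup \{i\}}\big) - f_S\big(x_S\big) \right]
```

The sum runs over all subsets of the remaining features, which is why exact computation is exponential in the number of features.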
32. 1. Model-Agnostic Approximations
1.1 Shapley sampling values
1.2 Kernel SHAP (Linear LIME + Shapley values)
Linear LIME (which uses a linear explanation model) fits a linear model locally to the original model that we are trying to explain. Since Shapley values are the only solution that satisfies Properties 1–3 (local accuracy, missingness, and consistency), choosing the LIME loss, weighting kernel, and regularizer appropriately means we can estimate the Shapley values using a weighted linear regression (see the sketch below).
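As a rough illustration of that regression view, here is a minimal, self-contained sketch of the Kernel SHAP idea for a small number of features. The helper name kernel_shap, the single background reference row, and the large weights used to pin down the empty and full coalitions are simplifications of mine, not the paper's exact algorithm:

```python
import itertools
from math import comb

import numpy as np


def kernel_shap(f, x, background, M):
    """Estimate SHAP values for one instance x (length-M array) against a single
    background reference row, via weighted linear regression over all 2^M coalitions."""
    masks, weights, outputs = [], [], []
    for size in range(M + 1):
        for idx in itertools.combinations(range(M), size):
            z = np.zeros(M)
            z[list(idx)] = 1.0
            # Map the coalition back to input space: present features come from x,
            # missing features from the background reference.
            h = np.where(z == 1.0, x, background)
            # Shapley kernel weight; the empty and full coalitions get a huge
            # weight so local accuracy is (approximately) enforced.
            if size == 0 or size == M:
                w = 1e6
            else:
                w = (M - 1) / (comb(M, size) * size * (M - size))
            masks.append(z)
            weights.append(w)
            outputs.append(float(f(h.reshape(1, -1))[0]))
    Z = np.column_stack([np.ones(len(masks)), np.array(masks)])  # intercept + coalition indicators
    W = np.diag(weights)
    y = np.array(outputs)
    # Weighted least squares; coefficients 1..M are the estimated SHAP values.
    phi = np.linalg.solve(Z.T @ W @ Z, Z.T @ W @ y)
    return phi[0], phi[1:]


# Toy usage: a small nonlinear model with four features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
model = lambda A: A[:, 0] * 2.0 + A[:, 1] * A[:, 2]
base, phi = kernel_shap(model, X[0], X.mean(axis=0), M=4)
print(base, phi, model(X[0:1])[0])  # base + phi.sum() ≈ the model's prediction
```

In practice coalitions are sampled rather than fully enumerated; the Shapley kernel weighting is what distinguishes this regression from a plain LIME fit.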
33. 2. Model-Specific Approximations
2.1 Linear SHAP: for linear models (assuming independent input features), SHAP values can be approximated directly from the model's weight coefficients (see the sketch after this list).
2.2 Low-Order SHAP
2.3 Max SHAP: calculates the probability that each input will increase the maximum value over every other input.
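For the Linear SHAP case, a tiny sketch of the closed form under the independence assumption mentioned above (the toy model and data here are illustrative, not from the paper):

```python
# Linear SHAP sketch: for a linear model f(x) = w.x + b with (assumed) independent
# features, the SHAP value of feature i is w_i * (x_i - E[x_i]).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))               # toy data standing in for the training set
w, b = np.array([2.0, -1.0, 0.5]), 0.1       # toy linear model
x = X[0]

phi = w * (x - X.mean(axis=0))               # per-feature SHAP values
phi0 = w @ X.mean(axis=0) + b                # expected model output (base value)
assert np.isclose(phi0 + phi.sum(), w @ x + b)   # local accuracy holds exactly
```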
34. 2. Model-Specific Approximations (continued)
2.4 Deep SHAP (DeepLIFT + Shapley values)
DeepLIFT is a recursive prediction explanation method for deep learning that satisfies local accuracy and missingness, and we know that Shapley values are the attribution values that also satisfy consistency. This motivates adapting DeepLIFT to become a compositional approximation of SHAP values, leading to Deep SHAP.
36. 1. Computational Efficiency
Comparing Shapley sampling, SHAP, and LIME on both dense and sparse decision tree models illustrates both the improved sample efficiency of Kernel SHAP and the fact that values from LIME can differ significantly from the SHAP values, which satisfy local accuracy and consistency.
37. 2. Consistency with Human Intuition
(Figure: (A) attributions of sickness score; (B) attributions of profit among three men.)
Participants were asked to assign importance for the output (the sickness score or money won) among the inputs (i.e., symptoms or players). We found a much stronger agreement between human explanations and SHAP than with other methods.
38. 3. Explaining Class Differences
Explaining the output of a convolutional network trained on the MNIST digit dataset.
(A) Red areas increase the probability of that class, and blue areas decrease it. The masked image removes pixels in order to change an 8 into a 3.
(B) The change in log odds when masking over 20 random images supports the use of better estimates of SHAP values.
39. Conclusion
• The growing tension between the accuracy and interpretability of model
predictions has motivated the development of methods that help users
interpret predictions.
• The SHAP framework identifies the class of additive feature importance
methods (which includes six previous methods) and shows there is a
unique solution in this class that adheres to desirable properties.
• We presented several different estimation methods for SHAP values, along
with proofs and experiments showing that these values are desirable.