2. DISCRIMINANT ANALYSIS
• Discriminant analysis (DA) is a statistical procedure that classifies cases into the separate categories to which they belong, on the basis of a set of independent variables called predictors or discriminant variables.
• The target variable (the one determining allocation into groups) is qualitative (nominal or ordinal), while the characteristics are measured by quantitative variables.
• In its basic form, DA looks at the discrimination between two groups.
• Multiple discriminant analysis (MDA) allows for classification into three or more groups.
3. APPLICATIONS OF DA
DA is especially useful for understanding the differences between consumers and the factors that lead them to make different choices. This allows marketers to develop strategies which take proper account of the role of the predictors.
Examples:
• Determinants of customer loyalty
• Shopper profiling and segmentation
• Determinants of purchase and non-purchase
4. EXAMPLE ON THE TRUST DATA-SET
• Purchasers of Chicken at the Butcher's Shop
• Respondents may belong to one of two groups:
  • Those who purchase chicken at the butcher's shop
  • Those who do not
• Discrimination between these groups is based on a set of consumer characteristics:
  • Expenditure on chicken in a standard week
  • Age of the respondent
  • Whether respondents agree (on a seven-point ranking scale) that butchers sell safe chicken
  • Trust (on a seven-point ranking scale) towards supermarkets
• Does a linear combination of these four characteristics allow one to discriminate between those who buy chicken at the butcher's and those who do not?
5. DISCRIMINANT ANALYSIS (DA)
• With two groups only, there is a single discriminating value (discriminating score).
• For each respondent a score is computed using the estimated linear combination of the predictors (the discriminant function).
• Respondents with a score above the discriminating value are expected to belong to one group, those below it to the other group.
• When the discriminant score is standardized to have zero mean and unit variance, it is called the Z score.
• DA also provides information about the discriminating power of each of the original predictors.
6. MULTIPLE DISCRIMINANT ANALYSIS (MDA) (1)
Discriminant analysis may involve more than
two groups, in which case it is termed
multiple discriminant analysis (MDA).
Example from the Trust data-set
• Dependent variable: type of chicken purchased "in a typical week", choosing among four categories: value (good value for money), standard, organic and luxury
• Predictors: age, stated relevance of taste, value for money and animal welfare, plus an indicator of income
7. MULTIPLE DISCRIMINANT ANALYSIS (2)
• In this case there will be more than one discriminant function.
• The number of discriminant functions is equal to either g-1, where g is the number of categories in the classification, or k, the number of independent variables, whichever is smaller.
• Trust example: with four groups and five explanatory variables, the number of discriminant functions is three (that is, g-1 = 3, which is smaller than k = 5).
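The counting rule above can be written in a few lines. This is only a minimal sketch; the function name `n_discriminant_functions` is an illustrative choice and does not come from the Trust example or from SPSS.

```python
def n_discriminant_functions(g: int, k: int) -> int:
    """Number of discriminant functions for g groups and k predictors."""
    if g < 2 or k < 1:
        raise ValueError("need at least two groups and one predictor")
    # The smaller of (groups minus one) and the number of predictors.
    return min(g - 1, k)


# Trust example: four types of chicken, five predictors.
print(n_discriminant_functions(4, 5))
# Two-group case (butcher yes/no, four predictors): a single function.
print(n_discriminant_functions(2, 4))
```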
8. THE OUTPUT OF MDA
Similarities with factor (principal component) analysis:
• The first discriminant function is the most relevant for discriminating across groups, the second is the second most relevant, etc.
• The discriminant functions are also independent, which means that the resulting scores are uncorrelated.
• Once the coefficients of the discriminant functions are estimated and standardized, they are interpreted in a similar fashion to factor loadings.
• The larger the standardized coefficients (in absolute terms), the more relevant the respective variables are to discriminating between groups.
There is no single discriminant score in MDA:
• Group means (centroids) are computed for each of the discriminant functions to give a clearer view of the classification rule.
9. RUNNING DISCRIMINANT ANALYSIS (2 GROUPS)
Discriminant function (target variable: purchasers of chicken at the butcher's shop):
$$z = \alpha_0 + \alpha_1 x_1 + \alpha_2 x_2 + \alpha_3 x_3 + \alpha_4 x_4$$
Here $z$ is the discriminant score and $x_1, \dots, x_4$ are the predictors:
• weekly expenditure on chicken
• age
• safety of butcher's chicken
• trust in supermarkets
The discriminant coefficients $\alpha$ need to be estimated.
10. FISHER'S LINEAR DISCRIMINANT ANALYSIS
The discriminant function is the starting point.
Two key assumptions behind linear DA:
(a) the predictors are normally distributed;
(b) the covariance matrices for the predictors within each of the groups are equal.
Departure from condition (a) should suggest the use of alternative methods.
Departure from condition (b) requires the use of different discriminant techniques (usually quadratic discriminant functions).
In most empirical cases, the use of linear DA is appropriate.
11. ESTIMATION
The first step is the estimation of the α coefficients, also termed discriminant coefficients or weights.
Estimation is similar to factor analysis or PCA, as the coefficients are those which maximize the variability between groups (relative to the variability within groups).
In MDA, the first discriminating function is the one with the highest between-group variability, the second discriminating function is independent of the first and maximizes the remaining between-group variability, and so on.
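In matrix terms, the estimation criterion is usually written as a ratio of the between-group to the within-group variability of the scores. The symbols below are notation introduced here for illustration and do not appear on the slides: $B$ and $W$ denote the between-group and pooled within-group sums-of-squares-and-cross-products matrices of the predictors, and $\alpha$ is the vector of slope coefficients.

```latex
\max_{\alpha}\; \lambda(\alpha) = \frac{\alpha^{\top} B\,\alpha}{\alpha^{\top} W\,\alpha}
\quad\Longrightarrow\quad
W^{-1} B\,\alpha_j = \lambda_j\,\alpha_j ,
\qquad \lambda_1 \ge \lambda_2 \ge \dots
```

Under this formulation each discriminant function corresponds to an eigenvector $\alpha_j$ of $W^{-1}B$, and its eigenvalue $\lambda_j$ is the between-to-within ratio achieved by that function, which is why the functions come out ordered by discriminating power.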
12. SPSS - TWO GROUPS CASE
1. Choose the target variable
2. Define the range of the dependent variable
3. Select the predictors
14. CLASSIFICATION OPTIONS
Decide whether prior probabilities are equal across groups or whether group sizes reflect different allocation probabilities.
These are diagnostic indicators to evaluate how well the discriminant function predicts the groups.
15. SAVE CLASSIFICATION
Create new variables in the data-set, containing the predicted group membership and/or the discriminant score for each case and each function.
16. OUTPUT - COEFFICIENT ESTIMATES
Canonical Discriminant Function Coefficients (unstandardized coefficients)

| Predictor | Function 1 |
| --- | --- |
| In a typical week how much do you spend on fresh or frozen chicken (Euro)? | .095 |
| From the butcher | .454 |
| Supermarkets | -.297 |
| Age | .025 |
| (Constant) | -2.515 |

Standardized Canonical Discriminant Function Coefficients

| Predictor | Function 1 |
| --- | --- |
| In a typical week how much do you spend on fresh or frozen chicken (Euro)? | .378 |
| From the butcher | .748 |
| Supermarkets | -.453 |
| Age | .394 |

• Unstandardized coefficients depend on the measurement unit; standardized coefficients do not.
• Most important predictor: "From the butcher" (largest standardized coefficient in absolute terms, .748).
• Trust in supermarkets has a negative sign (thus it reduces the discriminant score).
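To make the role of the unstandardized coefficients concrete, the sketch below computes the discriminant score for one respondent and compares it with a cutoff. The coefficients are copied from the SPSS output above; the respondent's answers, the dictionary keys and the function names are hypothetical, and the cutoff of zero follows the two-group rule stated on the centroids slide rather than SPSS's internal classification procedure.

```python
# Unstandardized coefficients copied from the SPSS output above.
COEFFICIENTS = {
    "expenditure": 0.095,         # weekly spend on chicken (Euro)
    "butcher_safety": 0.454,      # "From the butcher", seven-point scale
    "supermarket_trust": -0.297,  # "Supermarkets", seven-point scale
    "age": 0.025,
}
CONSTANT = -2.515


def discriminant_score(respondent: dict) -> float:
    """Linear combination of the predictors plus the constant."""
    return CONSTANT + sum(
        COEFFICIENTS[name] * respondent[name] for name in COEFFICIENTS
    )


def predicted_group(score: float, cutoff: float = 0.0) -> str:
    """Scores above the cutoff fall on the side of the 'yes' (butcher) centroid."""
    return "yes" if score > cutoff else "no"


# Hypothetical respondent, invented for illustration only.
respondent = {
    "expenditure": 10,
    "butcher_safety": 6,
    "supermarket_trust": 3,
    "age": 40,
}
z = discriminant_score(respondent)
print(round(z, 3), predicted_group(z))
```

Raising the trust-in-supermarkets answer lowers the score, which is the effect of the negative coefficient noted on the slide.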
17. CENTROIDS
Prior Probabilities for Groups

| Butcher | Prior | Cases used in analysis (unweighted) | Cases used in analysis (weighted) |
| --- | --- | --- | --- |
| no | .660 | 277 | 277.000 |
| yes | .340 | 143 | 143.000 |
| Total | 1.000 | 420 | 420.000 |

Functions at Group Centroids

| Butcher | Function 1 |
| --- | --- |
| no | -.307 |
| yes | .594 |

Unstandardized canonical discriminant functions evaluated at group means.

• These are the means of the discriminant score for each of the two groups. Thus, the group of those not purchasing chicken at the butcher's shop has a negative centroid.
• With two groups, the discriminating score is zero. This can be computed by weighting the centroids with the prior probabilities: -0.307 × 0.66 + 0.594 × 0.34 ≈ 0.
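The arithmetic behind the priors and the discriminating score can be reproduced from the group sizes and centroids in the output. This is a minimal sketch; the variable names are illustrative, and the priors are assumed to be computed from group sizes, as the table suggests.

```python
# Group sizes and centroids copied from the SPSS output above.
group_sizes = {"no": 277, "yes": 143}
centroids = {"no": -0.307, "yes": 0.594}

# Priors proportional to group sizes.
total = sum(group_sizes.values())
priors = {group: n / total for group, n in group_sizes.items()}

# Prior-weighted mean of the centroids: the discriminating score.
weighted_mean = sum(priors[group] * centroids[group] for group in centroids)

for group, prior in priors.items():
    print(group, round(prior, 3))
print(weighted_mean)
```

The weighted mean is not exactly zero because the centroids are reported to three decimals, but it is zero to within that rounding.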
18. OUTPUT - CLASSIFICATION SUCCESS
Classification Results (a)

| Original | Butcher | Predicted: no | Predicted: yes | Total |
| --- | --- | --- | --- | --- |
| Count | no | 244 | 33 | 277 |
| Count | yes | 88 | 55 | 143 |
| Count | Ungrouped cases | 1 | 1 | 2 |
| % | no | 88.1 | 11.9 | 100.0 |
| % | yes | 61.5 | 38.5 | 100.0 |
| % | Ungrouped cases | 50.0 | 50.0 | 100.0 |

a. 71.2% of original grouped cases correctly classified.

Using the discriminant function, it is possible to correctly classify 71.2% of original cases: (244 no-no + 55 yes-yes) / 420.
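The hit ratio and the row percentages can be recomputed from the counts in the classification table. The sketch below assumes a simple nested-dictionary layout (actual group first, predicted group second); the function names are illustrative.

```python
# Counts copied from the classification table: counts[actual][predicted].
counts = {
    "no": {"no": 244, "yes": 33},
    "yes": {"no": 88, "yes": 55},
}


def hit_ratio(table: dict) -> float:
    """Share of grouped cases whose predicted group equals the actual group."""
    correct = sum(table[group][group] for group in table)
    total = sum(sum(row.values()) for row in table.values())
    return correct / total


def row_percentages(table: dict) -> dict:
    """Percentage of each actual group allocated to each predicted group."""
    return {
        group: {pred: 100 * n / sum(row.values()) for pred, n in row.items()}
        for group, row in table.items()
    }


print(round(100 * hit_ratio(counts), 1))
for group, row in row_percentages(counts).items():
    print(group, {pred: round(pct, 1) for pred, pct in row.items()})
```

Reading the rows separately is worthwhile here: the overall figure is driven by the larger "no" group, while only 38.5% of actual butcher purchasers are classified correctly.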
19. DIAGNOSTICS (1)
• Box's M test. This tests whether covariances are equal across groups.
• Wilks' Lambda (or U statistic) tests discrimination between groups. It is related to analysis of variance.
  • Individual Wilks' Lambda for each of the predictors in a discriminant function: a univariate ANOVA (are there significant differences in the predictor's means between the groups?), with the p-value taken from the F distribution.
  • Wilks' Lambda for the function as a whole: are there significant differences in the group means for the discriminant function? The p-value is taken from the Chi-square distribution.
• The overall Wilks' Lambda is especially helpful in multiple discriminant analysis, as it allows one to discard those functions which do not contribute towards explaining differences between groups.
20. DIAGNOSTICS (2)
DA returns one eigenvalue (or more eigenvalues for MDA) of the discriminant function.
These can be interpreted as in principal component analysis.
In MDA (more than one discriminant function), eigenvalues are used to compute how much each function contributes to explaining variability.
The canonical correlation measures the intensity of the relationship between the groups and the single discriminant function.
21. TRUST EXAMPLE: DIAGNOSTICS
| Statistic | Value | P-value |
| --- | --- | --- |
| Box's M statistic | 37.3 | 0.000 |
| Overall Wilks' Lambda | 0.85 | 0.000 |
| Wilks' Lambda for Expenditure | 0.98 | 0.002 |
| Wilks' Lambda for Age | 0.97 | 0.001 |
| Wilks' Lambda for Safer for Butcher | 0.91 | 0.000 |
| Wilks' Lambda for Trust in Supermarket | 0.98 | 0.002 |
| Eigenvalue | 0.18 | |
| Canonical correlation | 0.39 | |
| % of correct predictions | 71.2% | |

• Box's M: covariance matrices are not equal.
• Overall Wilks' Lambda: the overall discriminating power of the DF is statistically significant.
• Individual Wilks' Lambdas: all of the predictors are relevant to discriminating between the two groups.
• Eigenvalue: the ratio between the variance between groups and the variance within groups (the larger the better).
• Canonical correlation: the square root of the ratio between the between-group variability and the total variability.
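With a single discriminant function, the eigenvalue, the canonical correlation and the overall Wilks' Lambda are three views of the same between/within decomposition, so the reported figures can be cross-checked against each other. The sketch below uses the standard identities for the one-function case and the rounded eigenvalue from the table; the variable names are illustrative.

```python
import math

# Eigenvalue (between-group / within-group variability) from the table.
eigenvalue = 0.18

# Canonical correlation: sqrt(between / total) = sqrt(eigenvalue / (1 + eigenvalue)).
canonical_correlation = math.sqrt(eigenvalue / (1 + eigenvalue))

# Wilks' Lambda for one function: within / total = 1 / (1 + eigenvalue).
wilks_lambda = 1 / (1 + eigenvalue)

print(round(canonical_correlation, 2), round(wilks_lambda, 2))
```

Both values agree with the reported 0.39 and 0.85 once rounded to two decimals.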
22. MDA
To run MDA in SPSS the only difference is that the range has more than two categories.
23. PREDICTORS
Test Results

| Statistic | Value |
| --- | --- |
| Box's M | 65.212 |
| F: Approx. | 1.382 |
| F: df1 | 45 |
| F: df2 | 53286.386 |
| F: Sig. | .045 |

Tests null hypothesis of equal population covariance matrices.

Tests of Equality of Group Means

| Predictor | Wilks' Lambda | F | df1 | df2 | Sig. |
| --- | --- | --- | --- | --- | --- |
| Age | .981 | 1.798 | 3 | 282 | .148 |
| Tasty food | .971 | 2.761 | 3 | 282 | .042 |
| Value for money | .960 | 3.878 | 3 | 282 | .010 |
| Animal welfare | .982 | 1.679 | 3 | 282 | .172 |
| Please indicate your gross annual household income range | .919 | 8.272 | 3 | 282 | .000 |

Only three predictors appear to be relevant in discriminating among preferred types of chicken.
Null rejected at 95% c.l., but not at 99% c.l.
24. DISCRIMINANT FUNCTIONS
Eigenvalues

| Function | Eigenvalue | % of Variance | Cumulative % | Canonical Correlation |
| --- | --- | --- | --- | --- |
| 1 | .102 (a) | 61.0 | 61.0 | .304 |
| 2 | .051 (a) | 30.8 | 91.8 | .221 |
| 3 | .014 (a) | 8.2 | 100.0 | .116 |

a. First 3 canonical discriminant functions were used in the analysis.

Three discriminant functions (four groups minus one) can be estimated.

Wilks' Lambda

| Test of Function(s) | Wilks' Lambda | Chi-square | df | Sig. |
| --- | --- | --- | --- | --- |
| 1 through 3 | .851 | 45.098 | 15 | .000 |
| 2 through 3 | .938 | 17.904 | 8 | .022 |
| 3 | .986 | 3.818 | 3 | .282 |

The first two discriminant functions have significant discriminating power.
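The two tables are linked: each function's share of variance is its eigenvalue divided by the sum of the eigenvalues, and the Wilks' Lambda for "functions k through 3" is the product of 1/(1 + eigenvalue) over those functions. The sketch below recomputes both from the rounded eigenvalues in the output, so small differences from the printed percentages are expected; the helper name `wilks_from` is illustrative.

```python
# Eigenvalues copied (rounded to three decimals) from the SPSS output above.
eigenvalues = [0.102, 0.051, 0.014]

# Share of the total discriminating variability explained by each function.
total = sum(eigenvalues)
percent_of_variance = [100 * e / total for e in eigenvalues]


def wilks_from(k: int) -> float:
    """Wilks' Lambda testing functions k+1 through the last (k is 0-based)."""
    value = 1.0
    for e in eigenvalues[k:]:
        value *= 1 / (1 + e)
    return value


wilks = [wilks_from(k) for k in range(len(eigenvalues))]

for k, (pct, lam) in enumerate(zip(percent_of_variance, wilks), start=1):
    print(k, round(pct, 1), round(lam, 3))
```

The recomputed Lambdas match the reported .851, .938 and .986; a value close to one for the last row is what signals that the third function adds little.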
25. COEFFICIENTS
Discriminant functions' coefficients

| Predictor | Unstandardized, function 1 | Unstandardized, function 2 | Standardized, function 1 | Standardized, function 2 |
| --- | --- | --- | --- | --- |
| Value for money | -.043 | .603 | -.053 | .746 |
| Age | -.009 | -.013 | -.148 | -.208 |
| Tasty food | .169 | .416 | .152 | .374 |
| Animal welfare | .186 | -.132 | .313 | -.222 |
| Please indicate your gross annual household income range | .652 | -.033 | .870 | -.044 |
| (Constant) | -2.298 | -4.868 | | |

• Income is very relevant for the first function.
• Value for money is very relevant for the second function.
26. STRUCTURE MATRIX
Structure Matrix

| Predictor | Function 1 | Function 2 | Function 3 |
| --- | --- | --- | --- |
| Please indicate your gross annual household income range | .929* | -.021 | .078 |
| Animal welfare | .390* | -.206 | .125 |
| Value for money | -.010 | .891* | .168 |
| Tasty food | .241 | .660* | .273 |
| Age | -.217 | -.204 | .944* |

Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions.
Variables ordered by absolute size of correlation within function.
*. Largest absolute correlation between each variable and any discriminant function.

• The values in the structure matrix are the correlations between the individual predictors and the scores computed on the discriminant functions. For example, the income variable has a strong correlation with the scores of the first function.
• The structure matrix helps in interpreting the functions: function 1 reflects income, function 2 value and taste, function 3 age.
27. CENTROIDS
Functions at Group Centroids

| In a typical week, what type of fresh or frozen chicken do you buy for your household's home consumption? | Function 1 | Function 2 | Function 3 |
| --- | --- | --- | --- |
| 'Value' chicken | -.673 | -.262 | -.040 |
| 'Standard' chicken | .058 | .156 | -.065 |
| 'Organic' chicken | .525 | -.470 | -.030 |
| 'Luxury' chicken | .003 | .052 | .242 |

Unstandardized canonical discriminant functions evaluated at group means.

• The first function discriminates well between value and organic (income matters to organic buyers).
• The second allows some discrimination between standard and organic, value and standard, and organic and luxury (taste and value matter).
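One way to see which groups a function separates is to compare the distance between their centroids on that function. The sketch below is an illustrative reading aid only, not part of the SPSS procedure: it ranks all pairs of groups by the absolute gap between their centroids, using the values from the output above. The short group labels and the helper name `gaps` are choices made here.

```python
from itertools import combinations

# Centroids copied from the output: (function 1, function 2, function 3).
centroids = {
    "Value": (-0.673, -0.262, -0.040),
    "Standard": (0.058, 0.156, -0.065),
    "Organic": (0.525, -0.470, -0.030),
    "Luxury": (0.003, 0.052, 0.242),
}


def gaps(function_index: int) -> dict:
    """Absolute centroid gap for every pair of groups, largest first."""
    pairs = {
        (a, b): abs(centroids[a][function_index] - centroids[b][function_index])
        for a, b in combinations(centroids, 2)
    }
    return dict(sorted(pairs.items(), key=lambda item: item[1], reverse=True))


for index in range(3):
    print(f"Function {index + 1}")
    for pair, gap in gaps(index).items():
        print("  ", pair, round(gap, 3))
```

On function 1 the widest gap is between value and organic, and on function 2 the three widest gaps are the three pairs named above, which is consistent with the interpretation given on the slide.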
28. PLOT OF TWO FUNCTIONS
The "territorial map" shows the scores for the first two functions considering all groups.
Tick "separate-groups" to show graphs of the first two functions for each individual group.
29. PLOTS: INDIVIDUAL GROUPS
Example: organic chicken
Most cases tend to be relatively
high on function 1 (income)
31. PREDICTION RESULTS
Classification Results (a)

Rows: original group (type of fresh or frozen chicken bought in a typical week for the household's home consumption). Columns: predicted group membership.

| Original | Group | 'Value' chicken | 'Standard' chicken | 'Organic' chicken | 'Luxury' chicken | Total |
| --- | --- | --- | --- | --- | --- | --- |
| Count | 'Value' chicken | 3 | 38 | 0 | 0 | 41 |
| Count | 'Standard' chicken | 2 | 154 | 1 | 0 | 157 |
| Count | 'Organic' chicken | 1 | 30 | 4 | 0 | 35 |
| Count | 'Luxury' chicken | 1 | 51 | 1 | 0 | 53 |
| Count | Ungrouped cases | 0 | 51 | 3 | 0 | 54 |
| % | 'Value' chicken | 7.3 | 92.7 | .0 | .0 | 100.0 |
| % | 'Standard' chicken | 1.3 | 98.1 | .6 | .0 | 100.0 |
| % | 'Organic' chicken | 2.9 | 85.7 | 11.4 | .0 | 100.0 |
| % | 'Luxury' chicken | 1.9 | 96.2 | 1.9 | .0 | 100.0 |
| % | Ungrouped cases | .0 | 94.4 | 5.6 | .0 | 100.0 |

a. 56.3% of original grouped cases correctly classified.

The functions do not predict well: most units are allocated to standard chicken, and overall only 56.3% of the cases are allocated correctly.
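The claim that most units end up in the standard group can be checked by summing the columns of the count table (grouped cases only). The sketch below also compares the hit ratio with the share of respondents who actually buy standard chicken; that comparison is an addition made here for context, not something reported in the SPSS output.

```python
groups = ["Value", "Standard", "Organic", "Luxury"]

# counts[i][j]: cases in actual group i predicted as group j (grouped cases only).
counts = [
    [3, 38, 0, 0],
    [2, 154, 1, 0],
    [1, 30, 4, 0],
    [1, 51, 1, 0],
]

total = sum(sum(row) for row in counts)
correct = sum(counts[i][i] for i in range(len(groups)))

# Column sums: how many cases the functions allocate to each group.
predicted_totals = [sum(row[j] for row in counts) for j in range(len(groups))]

hit_ratio = 100 * correct / total
share_predicted_standard = 100 * predicted_totals[1] / total
share_actual_standard = 100 * sum(counts[1]) / total

print(dict(zip(groups, predicted_totals)))
print(round(hit_ratio, 1), round(share_predicted_standard, 1), round(share_actual_standard, 1))
```

Allocating every case to standard chicken would already classify about 54.9% of cases correctly, so the 56.3% hit ratio is only a marginal improvement, and no case at all is predicted as luxury.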
32. STEPWISE DISCRIMINANT ANALYSIS
As in linear regression, it is possible to decide whether all predictors should appear in the equation regardless of their role in discriminating (the Enter option), or whether a sub-set of predictors is chosen on the basis of their contribution to discriminating between groups (the Stepwise method).
33. THE STEP-WISE METHOD
1. A one-way ANOVA test is run on each of the predictors, where the target grouping variable determines the treatment levels. The ANOVA test provides a criterion value and a test statistic (usually the Wilks' Lambda). According to the criterion value, it is possible to identify the predictor which is most relevant in discriminating between the groups.
2. The predictor with the lowest Wilks' Lambda (or which meets an alternative optimality criterion) enters the discriminating function, provided the p-value is below the set threshold (for example 5%).
3. An ANCOVA test is run on each of the remaining predictors, where the target grouping variable is the factor and the predictors that have already entered the model are the covariates. The Wilks' Lambda is computed for each of these ANCOVA tests.
4. Again, the criterion and the p-value determine which variable (if any) enters the discriminating function (and possibly whether some of the entered variables should leave the model).
5. The procedure goes back to step 3 and continues until none of the excluded variables has a p-value below the threshold and none of the entered variables has a p-value above the threshold (the stopping rule is met).
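The control flow of the five steps can be sketched independently of how the tests themselves are computed. In the sketch below the statistical work is delegated to a user-supplied function `p_value(variable, others)`, assumed to return the p-value of a predictor's contribution given the other predictors in the model. That function, the name `stepwise_select` and the toy p-values are all hypothetical. Selecting on the p-value alone, with a single threshold for entry and removal, is a simplification of the Wilks' Lambda criterion described above and is not SPSS's implementation.

```python
from typing import Callable, FrozenSet, List, Sequence

PValue = Callable[[str, FrozenSet[str]], float]


def stepwise_select(candidates: Sequence[str], p_value: PValue,
                    threshold: float = 0.05, max_steps: int = 100) -> List[str]:
    """Forward entry with a removal check, following steps 1-5 above."""
    entered: List[str] = []
    for _ in range(max_steps):  # guard against cycling
        changed = False
        # Entry: the best excluded predictor enters if it is significant.
        excluded = [v for v in candidates if v not in entered]
        if excluded:
            current = frozenset(entered)
            best = min(excluded, key=lambda v: p_value(v, current))
            if p_value(best, current) < threshold:
                entered.append(best)
                changed = True
        # Removal: entered predictors that are no longer significant leave.
        for v in list(entered):
            others = frozenset(entered) - {v}
            if p_value(v, others) > threshold:
                entered.remove(v)
                changed = True
        if not changed:  # stopping rule met
            break
    return entered


# Toy p-values, invented purely to exercise the loop.
TOY_P = {
    ("income", frozenset()): 0.001,
    ("value", frozenset()): 0.010,
    ("age", frozenset()): 0.150,
    ("value", frozenset({"income"})): 0.010,
    ("age", frozenset({"income"})): 0.200,
    ("income", frozenset({"value"})): 0.001,
    ("age", frozenset({"income", "value"})): 0.300,
}


def toy_p_value(variable: str, others: FrozenSet[str]) -> float:
    return TOY_P.get((variable, others), 1.0)


selected = stepwise_select(["age", "value", "income"], toy_p_value)
print(selected)
```

In a real analysis `p_value` would wrap the ANOVA/ANCOVA tests from steps 1 and 3; keeping it as a parameter makes the stopping rule easy to inspect in isolation.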
36. OUTPUT OF THE STEP-WISE METHOD
Variables in the Analysis

| Step | Variable | Tolerance | F to Remove | Wilks' Lambda |
| --- | --- | --- | --- | --- |
| 1 | Please indicate your gross annual household income range | 1.000 | 8.272 | |
| 2 | Please indicate your gross annual household income range | 1.000 | 8.241 | .960 |
| 2 | Value for money | 1.000 | 3.863 | .919 |

Variables Not in the Analysis

| Step | Variable | Tolerance | Min. Tolerance | F to Enter | Wilks' Lambda |
| --- | --- | --- | --- | --- | --- |
| 0 | Age | 1.000 | 1.000 | 1.798 | .981 |
| 0 | Tasty food | 1.000 | 1.000 | 2.761 | .971 |
| 0 | Value for money | 1.000 | 1.000 | 3.878 | .960 |
| 0 | Animal welfare | 1.000 | 1.000 | 1.679 | .982 |
| 0 | Please indicate your gross annual household income range | 1.000 | 1.000 | 8.272 | .919 |
| 1 | Age | .988 | .988 | 1.507 | .905 |
| 1 | Tasty food | .991 | .991 | 2.437 | .896 |
| 1 | Value for money | 1.000 | 1.000 | 3.863 | .883 |
| 1 | Animal welfare | .992 | .992 | 1.052 | .909 |
| 2 | Age | .987 | .987 | 1.549 | .868 |
| 2 | Tasty food | .821 | .821 | .793 | .875 |
| 2 | Animal welfare | .992 | .992 | 1.057 | .873 |

Only two predictors are kept in the model.
37. APPLICATIONS IN MARKETING:
Having covered the technical aspects of the technique, we can conclude that DA has the following applications in the field of marketing:
• Discriminant analysis, a multivariate technique used for market segmentation and for predicting group membership, is often used for these problems because of its ability to classify individuals or experimental units into two or more uniquely defined populations.
38. (continued)
• Product research: distinguish between heavy, medium, and light users of a product in terms of their consumption habits and lifestyles.
• Perception/Image research: distinguish between customers who exhibit favorable perceptions of a store or company and those who do not.
• Advertising research: identify how market segments differ in media consumption habits.
• Direct marketing: identify the characteristics of consumers who will respond to a direct marketing campaign and those who will not.