Improving Correlation with Human Judgments by Integrating Second-Order Vectors with Semantic Similarity
1. Improving Correlation with Human Judgments by Integrating Second-Order Vectors with Semantic Similarity
Bridget T. McInnes, PhD
Virginia Commonwealth University
Ted Pedersen, PhD
University of Minnesota, Duluth
tpederse@d.umn.edu
http://www.d.umn.edu/~tpederse
2. Measuring Similarity & Relatedness
● Similarity != Relatedness (!!!)
● Assign scores to pairs of concepts
● Compare to scores decided on by humans
● Measure correlation
– Often by rank, because scales differ
– Spearman's rank correlation coefficient
● Think about ways to do better
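The evaluation loop above (score pairs, rank them, correlate with human ranks) can be sketched in a few lines. This is a minimal illustration with invented scores, using the tie-free form of Spearman's rank correlation coefficient:

```python
# Sketch: comparing a measure's scores to human judgments with
# Spearman's rank correlation (pair scores below are invented).

def ranks(values):
    """Rank values from 1..n (no tie handling, for simplicity)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def spearman(xs, ys):
    """Spearman's rho via the rank-difference formula (distinct values):
    rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))"""
    n = len(xs)
    rx, ry = ranks(xs), ranks(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

human   = [4.0, 3.5, 2.0, 1.0]   # human judgments, one per concept pair
measure = [0.9, 0.7, 0.4, 0.1]   # a measure's scores for the same pairs

print(spearman(human, measure))  # perfect rank agreement -> 1.0
```

Ranking before correlating is what makes the different score scales (e.g. 0–1 cosines vs. 1–5 human ratings) comparable.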
3. Contribution of this Work?
● We show that integrating a similarity measure into a second-order measure of relatedness improves correlation with human judgments
– Compare impact of various similarity measures
– Compare to other methods including word2vec
● Focus is on UMLS and medical concepts
although ideas apply more generally
4. Similar or Related?
● Similarity based on is-a relations
– How much is X like Y?
– Share ancestor in is-a hierarchy
● LCS : least common subsumer
● The closer / deeper the ancestor, the more similar
● Tetanus and strep_throat are similar
– both are kinds-of bacterial infections
6. Measures of Similarity
● Path based
– Is-a hierarchy
● Path + Depth
– Is-a hierarchy
● Feature
– Is-a hierarchy
● Information Content
– Is-a hierarchy + corpus
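The path-based and information-content families above can be sketched over a toy is-a hierarchy. Everything below (the hierarchy, the corpus probabilities) is invented for illustration; Lin's measure is used as one representative information-content measure, with IC(c) = -log p(c):

```python
import math

# Toy is-a hierarchy: child -> parent (invented for illustration)
parent = {
    "tetanus": "bacterial_infection",
    "strep_throat": "bacterial_infection",
    "bacterial_infection": "infection",
    "infection": "disease",
}

def ancestors(c):
    """Path from a concept up to the root, inclusive."""
    path = [c]
    while c in parent:
        c = parent[c]
        path.append(c)
    return path

def lcs(a, b):
    """Least common subsumer: closest shared ancestor on the is-a paths."""
    anc_b = set(ancestors(b))
    for node in ancestors(a):     # walks upward from a, so first hit is closest
        if node in anc_b:
            return node
    return None

def path_sim(a, b):
    """Path-based similarity: inverse of the node count on the path via the LCS."""
    l = lcs(a, b)
    return 1 / (ancestors(a).index(l) + ancestors(b).index(l) + 1)

# Information content from (made-up) corpus probabilities: IC(c) = -log p(c)
prob = {"tetanus": 0.01, "strep_throat": 0.02,
        "bacterial_infection": 0.05, "infection": 0.2, "disease": 0.9}
IC = {c: -math.log(p) for c, p in prob.items()}

def lin(a, b):
    """Lin (1998): 2 * IC(LCS) / (IC(a) + IC(b))."""
    return 2 * IC[lcs(a, b)] / (IC[a] + IC[b])

print(lcs("tetanus", "strep_throat"))  # bacterial_infection
```

The corpus only enters through the probabilities, which is why the information-content measures need both a hierarchy and a corpus.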
7. Similar or Related?
● Relatedness more general
– How much is X related to Y?
– Many ways to be related
● is-a, part-of, treats, affects, symptom-of, …
● Tetanus and puncture_wound are related but
they really aren't similar
– (puncture wounds can cause tetanus)
● All similar concepts are related, but not all
related concepts are similar
9. Definition Based Relatedness
● Related concepts defined using many of the
same terms
● Concepts don't need to be connected via relations or paths to measure their relatedness
– Lesk, 1986
– Adapted Lesk, Banerjee & Pedersen, 2003
10. BUT! ...
● Definitions are brief, potentially inconsistent
– Alopecia : … a result of cancer_treatment
– Thrush : … a side_effect of chemotherapy
● Lesk matching won't recognize the similarity
between result and side_effect, or between
cancer_treatment and chemotherapy
– Will find alopecia and thrush totally unrelated
11. Gloss Vector Measure
● Rely on co-occurrences of terms
● Allows for a fuzzier notion of matching
● Exploits second order co-occurrences
– Friend of a friend relation
– Suppose cancer_treatment and chemotherapy
don't occur in text with each other. But,
suppose that “survival” occurs with each.
– cancer_treatment and chemotherapy are
second order co-occurrences via “survival”
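The "friend of a friend" idea on this slide is easy to show concretely. In this minimal sketch (all counts invented), the two terms never co-occur directly, but their first-order co-occurrence sets overlap:

```python
# Second-order co-occurrence: cancer_treatment and chemotherapy never
# appear together in the (invented) corpus, but both co-occur with
# "survival", linking them at second order.

cooc = {
    "cancer_treatment": {"survival": 12, "hospital": 5},
    "chemotherapy":     {"survival": 9,  "nausea": 7},
}

# First order: do the two terms co-occur with each other directly?
direct = "chemotherapy" in cooc["cancer_treatment"]          # False

# Second order: do they share any co-occurring term?
shared = set(cooc["cancer_treatment"]) & set(cooc["chemotherapy"])

print(direct, shared)  # False {'survival'}
```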
12. Gloss Vector Measure
● Replace words or terms in definitions with
vector of co-occurrence counts from corpus
● Represent defined concept by the average of
all the vectors of the words in its definition
● Measure relatedness of concepts via cosine
between their respective vectors
● Patwardhan and Pedersen, 2006 (vector)
– Schütze, 1998
– Latent Semantic Analysis
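The three steps on this slide (replace words with co-occurrence vectors, average, compare with cosine) can be sketched as follows. The vocabulary, counts, and definition words are invented stand-ins, not data from the actual measure:

```python
import math

# Gloss Vector sketch: each definition word maps to a co-occurrence
# vector over a fixed vocabulary ["survival", "nausea", "hospital"];
# a concept is the average of its definition words' vectors.

cooc = {  # word -> co-occurrence counts over the vocabulary (invented)
    "cancer_treatment": [12, 3, 5],
    "chemotherapy":     [9, 7, 2],
    "hair":             [0, 1, 0],
    "loss":             [1, 2, 1],
}

def gloss_vector(definition_words):
    """Average the co-occurrence vectors of the words in a definition."""
    vecs = [cooc[w] for w in definition_words if w in cooc]
    n = len(vecs)
    return [sum(col) / n for col in zip(*vecs)]

def cosine(u, v):
    """Relatedness of two concepts via the cosine of their vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

alopecia = gloss_vector(["hair", "loss", "cancer_treatment"])
thrush   = gloss_vector(["chemotherapy"])
print(cosine(alopecia, thrush))
```

Note how the two definitions share no words at all, yet the cosine is high because their words co-occur with the same things; this is the fuzzier matching that plain Lesk overlap misses.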
13. Can We Improve Gloss Vector?
● Instead of constructing second order vectors
using frequency counts or measures of
association...
● Use semantic similarity measures!
– Not all pairs of concepts will have similarity
values, but some do!
– Weight co-occurrences based on how similar
they are...
14. Integrated Second-Order Vector
● Construct a co-occurrence matrix from an
external corpus
– NLM Medline Bigram data
– https://mbr.nlm.nih.gov
– Bigram counts from 2014 Medline baseline
● 44 million bigrams
● Replace co-occurrence counts in matrix with
similarity measure scores
– UMLS::Similarity
– http://umls-similarity.sourceforge.net
15. Integrated Second-Order Vector
● Build second order vector for each concept
● Obtain definitions of concept (from UMLS)
– Augment with definitions of parents (PAR),
children (CHD), broader than (RB), and
narrower than (RN) relations
– Look up vector for each word in definition
– Average these vectors together
– Resulting averaged vector represents the
concept
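Slides 13–15 together describe one pipeline: a matrix whose cells hold similarity scores rather than co-occurrence counts, and a concept vector built by averaging the rows for its definition words. A minimal sketch, with invented scores standing in for what UMLS::Similarity would supply and invented words standing in for the UMLS definitions:

```python
# Integrated second-order vector sketch. Rows give the similarity of a
# word's concept to each concept in the vocabulary ["treatment", "drug",
# "symptom"]; 0.0 where no similarity value exists. All values invented.

sim_matrix = {
    "chemotherapy": [0.8, 0.6, 0.0],
    "infection":    [0.3, 0.0, 0.7],
    "hair":         [0.0, 0.0, 0.2],
}

def concept_vector(definition_words):
    """Second-order vector for a concept: average the similarity rows of
    its definition words (words contributed by PAR/CHD/RB/RN-related
    definitions would simply be appended to this list)."""
    rows = [sim_matrix[w] for w in definition_words if w in sim_matrix]
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

vec = concept_vector(["hair", "chemotherapy"])
print(vec)  # [0.4, 0.3, 0.1]
```

Relatedness between two concepts would then be the cosine between their resulting vectors, exactly as in the original Gloss Vector measure.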
18. Reference Standards
● UMNSRS : 587 pairs ranked for relatedness,
566 for similarity, both by medical residents
– We used subsets of 430 and 401 pairs
– ICC > .7
● MayoSRS : 101 pairs ranked by physicians
and (separately) by medical coders
– MiniMayoSRS – 30 pair subset
● http://www.people.vcu.edu/~btmcinnes/
20. Thresholds
● Remove all similarity scores less than a given
threshold
● Experiments with thresholds using the res and faith measures showed that results improved significantly at some threshold settings
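The thresholding step itself is a simple filter applied to each row of similarity scores before the second-order vectors are built; a minimal sketch with invented values:

```python
# Thresholding sketch: similarity scores below the cutoff are zeroed out,
# so only confidently-similar co-occurrences contribute to the vectors.

def apply_threshold(row, threshold):
    """Keep only similarity scores at or above the threshold."""
    return [s if s >= threshold else 0.0 for s in row]

scores = [0.05, 0.42, 0.91, 0.10]
print(apply_threshold(scores, 0.25))  # [0.0, 0.42, 0.91, 0.0]
```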
22. Discussion
● Information content measures fared well!
● Clear advantage to filtering low similarity
scores, but ...
– How low, and how do we set?
– Values of thresholds vary with measures
● With reference standard?
● With corpora used for co-occurrences?
23. Related Work
● Various studies using word2vec
– UMNSRS, MayoSRS, and MiniMayoSRS
– CBOW and / or skip-gram models with various kinds of corpora
● Vector retrofitting (Yu et al., 2016) very related!
– Map terms to MeSH terms, build vectors
based on documents assigned those terms
– Include semantically related words from UMLS
– MiniMayoSRS
25. Future Work
● Why more improvement on similarity results?
● Regularize comparisons (!!!)
● Which corpora for co-occurrences?
● Which definitions to represent concepts?
● How can threshold be automatically set?
● What about WordNet and general English?