RESQU: A FRAMEWORK FOR AUTOMATIC EVALUATION OF
KNOWLEDGE-DRIVEN AUTOMATIC SUMMARIZATION
MASTERS THESIS DEFENSE
NISHITA JAYKUMAR
MAY 26, 2016
MASTERS COMMITTEE
AMIT P. SHETH (ADVISOR)
THOMAS C. RINDFLESCH (NIH)
DELROY CAMERON (APPLE INC.)
KRISHNAPRASAD THIRUNARAYAN
1
Main Issue: Indirect Information access
PubMed Search Service
2
Acetaminophen TREATS Migraine Disorders
Sumatriptan TREATS Migraine Disorders
Topiramate PREVENTS Migraine Disorders
More direct Information access
Semantic MEDLINE
3
Thesis Motivation
• Automatically evaluate summaries in Semantic MEDLINE.
• Identify features that impact summary quality.
• Improve the semantic summaries it generates.
4
Outline
• Automatic summarization
- Extractive, abstractive
- Summarization in Semantic MEDLINE and ResQu
• Automatic summarization evaluation
- Intrinsic, extrinsic
• Datasets
- UMLS, SemRep, MetaMap
• Approach
- Summary transformation
- Semantic similarity
• Experimental evaluation
• Conclusion
5
• What is an effective summary?
- Saliency
- Compressed format
• Approaches to Automatic Summarization
Automatic Summarization
Extractive Abstractive
6
Extractive summary
A randomized, placebo-controlled trial of
acetaminophen for treatment of migraine
headache.
Long-term evaluation of sumatriptan and
naproxen sodium for the acute treatment of
migraine in adolescents.
…………….
Mapping from disease-specific measures to
health-state utility values in individuals with
migraine.
Abstractive summary
Acetaminophen TREATS Migraine Disorders
Sumatriptan TREATS Migraine Disorders
…………….
Migraine Disorders PROCESS_OF Individuals
Semantic MEDLINE Summarization System Overview
[Diagram: Source Documents → Semantic Predications (Conceptual Representation) → Semantic Predications (Conceptual Condensate) → Semantic Summary]
Stage labels: Interpretation (SemRep), Transformation, Reduction, Generalization
Feature application:
• Relevance
• Connectivity
• Novelty
• Saliency
7
Aspirin TREATS Coronary artery disease
Coronary artery disease COEXISTS_WITH Inflammation
Coronary artery disease ISA Vascular disease
tomography DIAGNOSIS Coronary artery disease
Intrinsic Evaluation:
- Compared to a human-curated gold standard.
- Using document similarity measures.
• Evaluating Summary Quality
Evaluating Summaries
Extrinsic evaluation:
- Based on a secondary task.
- Through a discrete scoring system.
8
Intrinsic Evaluation of Extractive Summarization
• Pyramid Approach [Nenkova et al., 2004]
- Summary Content Units (SCU)
• Louis et al. [2009]
• Distribution of terms
• Kullback-Leibler
• Jensen-Shannon
Nenkova, Ani, and Rebecca Passonneau. "Evaluating content selection in summarization: The pyramid method." (2004).
Louis, Annie, and Ani Nenkova. "Automatic summary evaluation without human models." Notebook Papers and Results, Text Analysis Conference (TAC-2008), Gaithersburg, Maryland (USA). 2008.
9
• Information Misalignment
• Semantic summary – structured background knowledge.
• Gold standard – textual.
• Proposed Solution
• Summary transformation: predications to text.
• Semantic similarity computation.
Intrinsic Evaluation of Abstractive Summarization
10
Approach: ResQu
We can use the words that co-occur with the semantic predications in a summary to represent
the meaning of the semantic predications based on distributional semantics.
By generating multiple summaries with features held-out, we can effectively evaluate the impact
of each feature.
Word Co-occurrence
Leave-one-Out
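A minimal Python sketch of the leave-one-out idea: each configuration drops exactly one summarization feature, so one summary can be generated per held-out feature. The four feature names are taken from the Semantic MEDLINE overview slide; `held_out_configurations` is a hypothetical helper for illustration, not part of ResQu.

```python
# Feature names as listed on the Semantic MEDLINE overview slide.
FEATURES = ["relevance", "connectivity", "novelty", "saliency"]


def held_out_configurations(features):
    """Map each held-out feature to the list of features that stay active."""
    return {f: [g for g in features if g != f] for f in features}


configs = held_out_configurations(FEATURES)
for held_out, active in configs.items():
    # One summary would be generated per configuration (summarizer not shown).
    print(f"leave-out-{held_out}: active features = {active}")
```

Each held-out summary can then be compared with the gold standard to see how much the missing feature matters.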
11
A semantic summary can be understood and potentially improved by leveraging distributional
statistics between the structured knowledge that comprises the semantic summary and the
words with which these structured constructs co-occur, across the corpus.
Thesis Statement
12
valproic acid TREATS migraine
Sumatriptan TREATS Migraine Disorders
lamotrigine TREATS Migraine with Aura
Dihydroergotamine TREATS Migraine Disorders
Acetaminophen TREATS Migraine Disorders
Aspirin TREATS Migraine Disorders
zolmitriptan TREATS Migraine Disorders
eletriptan TREATS Migraine Disorders
Analgesics TREATS Migraine Disorders
ziconotide TREATS Migraine Disorders
Semantic Predications
…
…
Proposed Solution (ResQu)
Co-occurring arguments
Semantic summary
vector
13
…
• Similarity between SS and GS
- Cosine similarity, Euclidean distance, Jensen-Shannon divergence
• Root Mean-Squared Error
• For each summary generated with a feature held-out
Measuring Similarity
The feature whose held-out summary is least similar to the gold standard is the most important feature.
14
Assertional
Knowledge
Definitional
Knowledge
Complementary, Disjoint
65 Attributes: 62 Provenance Metadata, 3 Semantic Attributes
MEDLINE
(1865 – 2015)
Largest Biomedical
Knowledgebase,
>25 million abstracts,
PubMed, PMC
Semantic Predications
Medical Subject Headings (MeSH)
15 Unique Trees, Max Depth – 15
~27,000 Terms
SPECIALIST Lexicon
Semantic Network
Metathesaurus
>300k concepts
>100 Vocabularies
9 million triples
134 Types
15 Groups
54 predicates
Unified Medical Language System (UMLS)
MeSH Indexing
d1
d2
d3
dn
Resource-Rich
Biomedical Knowledge
15
ResQu System Architecture
[Architecture diagram components:]
• User Query Processor
• Document Selector
• Predication Mapper
• Concept Mapper
• Summarizer (Schema Summarizer)
• Vectorizer
• Predication Extractor (SemRep)
• Graph Generator
• ResQu
• Summary Vectors
• MEDLINE
• Jericho Crawler
• Gold standard Vectors
• Similarity Computation Module
• Gold standard creation module
User Query
• l: label of an entity (or concept) in the UMLS,
- Migraine Disorders: C0149931
• c1: Humans[MH] and c2: Clinical Trial [PTYP]
• dt: the date range of documents
• ub: the upper bound (default = 5000)
q = (l, c1, c2, dt, ub)
17
q = (Migraine Disorders[MH] AND Humans[MH] AND Clinical Trial[PTYP] AND 1860/01:2014/08[DCOM])
User Query Instance
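As a sketch, the query tuple q = (l, c1, c2, dt, ub) can be turned into the search string shown on this slide as follows. The class `UserQuery` and the function `to_search_term` are hypothetical names introduced for illustration; the field tags ([MH], [PTYP], [DCOM]) are copied from the slide.

```python
from typing import NamedTuple


class UserQuery(NamedTuple):
    """Hypothetical container for q = (l, c1, c2, dt, ub)."""
    label: str               # l: label of a UMLS concept
    c1: str                  # first constraint, e.g. "Humans[MH]"
    c2: str                  # second constraint, e.g. "Clinical Trial[PTYP]"
    date_range: str          # dt: date range of documents
    upper_bound: int = 5000  # ub: default taken from the slide


def to_search_term(q: UserQuery) -> str:
    """Join the query parts with AND, tagging the label and the date range."""
    return (f"({q.label}[MH] AND {q.c1} AND {q.c2} "
            f"AND {q.date_range}[DCOM])")


q = UserQuery("Migraine Disorders", "Humans[MH]",
              "Clinical Trial[PTYP]", "1860/01:2014/08")
term = to_search_term(q)
print(term)
```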
18
• Query from the User Query Processor.
• Retrieves the set of MEDLINE documents.
• D = {d1, d2, . . . , dn}
• Uses the MEDLINE Entrez Search API.
Document Selection
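One way the document set D might be requested is through the NCBI E-utilities ESearch endpoint. The sketch below only builds the request URL and makes no network call. Whether ResQu calls the API in exactly this way is an assumption, and `esearch_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term: str, upper_bound: int = 5000) -> str:
    """Build an ESearch URL for PubMed; retmax caps the number of returned IDs."""
    params = {"db": "pubmed", "term": term, "retmax": upper_bound}
    return f"{ESEARCH}?{urlencode(params)}"


url = esearch_url("Migraine Disorders[MH] AND Humans[MH] AND Clinical Trial[PTYP]")
print(url)
```

The upper bound ub from the user query maps naturally onto `retmax` in this sketch.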
20
Document Selection
21
Semantic Predications Extractor
22
A randomized, placebo-controlled trial of acetaminophen for treatment of migraine disorders
Acetaminophen → treats → Migraine disorders
Automatic Summarizer
[Figure: block of MEDLINE abstract text given as input to the summarizer]
[Figure: semantic summary graph. Nodes: Ibuprofen, Topiramate, Acetaminophen, Headache, Pain, Vestibule, Migraine Disorders. Edge labels: TREATS, PREVENTS, ISA, LOCATION_OF]
24
Semantic Summary
25
Step 1: get all documents for each concept in the semantic summary.
Step 2: create a bag-of-words for each concept (term frequency).
Step 3: aggregate the bags-of-words of all concepts in the semantic summary.
Step 4: use the IDF of each word in the corpus to create the tf-idf vector for the semantic summary.
Summary Transformation
$\mathrm{tfidf}(t, d, D) = \mathrm{tf}(t, d) \cdot \log \dfrac{N}{n_t}$
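The four transformation steps can be sketched in Python as below. This is an illustration under stated assumptions, not the ResQu implementation. N is taken to be the number of documents in the corpus and n_t the number of documents containing term t. The toy corpus, the concept-to-document mapping, and the name `summary_tfidf` are hypothetical. A document that mentions several concepts is counted once per concept.

```python
import math
from collections import Counter


def summary_tfidf(concept_docs, corpus):
    """Turn a semantic summary into a tf-idf vector (a dict: term -> weight).

    concept_docs: {concept: [tokenized documents mentioning the concept]}
    corpus:       list of all tokenized documents, used for the idf part
    """
    # Steps 1-3: per-concept bags-of-words, aggregated over the whole summary.
    tf = Counter()
    for docs in concept_docs.values():
        for tokens in docs:
            tf.update(tokens)
    # Step 4: weight each term by log(N / n_t) computed on the corpus.
    n_docs = len(corpus)
    df = Counter()
    for tokens in corpus:
        df.update(set(tokens))
    return {t: c * math.log(n_docs / df[t]) for t, c in tf.items() if df[t] > 0}


corpus = [
    ["acetaminophen", "treats", "migraine"],
    ["sumatriptan", "treats", "migraine"],
    ["topiramate", "prevents", "migraine"],
]
concept_docs = {"Acetaminophen": [corpus[0]], "Migraine Disorders": corpus}
vector = summary_tfidf(concept_docs, corpus)
print(vector)
```

A word that occurs in every document, such as "migraine" in this toy corpus, gets weight 0, which is the usual effect of the idf term.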
26
Bag-of-words Model
We used hemofiltration to treat a patient with digoxin overdose that was
complicated by refractory hyperkalemia.
bow = [(we,1), (used,1), . . ., (hyperkalemia,1)]
bow_sparse_vector = [(678,1), (2,1), . . ., (999,1)]
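A small Python sketch of how a sentence can be mapped to a sparse bag-of-words through a term dictionary. The helpers `build_dictionary` and `to_sparse_bow` are hypothetical. The integer indices depend on the dictionary, so they will not match the illustrative indices (678, 2, 999) on the slide.

```python
from collections import Counter


def build_dictionary(documents):
    """Assign each distinct token an integer index in first-seen order."""
    index = {}
    for tokens in documents:
        for tok in tokens:
            index.setdefault(tok, len(index))
    return index


def to_sparse_bow(tokens, index):
    """Return sorted (term index, count) pairs for tokens found in the dictionary."""
    counts = Counter(tok for tok in tokens if tok in index)
    return sorted((index[tok], n) for tok, n in counts.items())


sentence = ("We used hemofiltration to treat a patient with digoxin overdose "
            "that was complicated by refractory hyperkalemia.")
# Naive tokenization: lowercase, drop the final period, split on whitespace.
tokens = sentence.lower().rstrip(".").split()
index = build_dictionary([tokens])
sparse = to_sparse_bow(tokens, index)
print(sparse)
```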
27
Dictionary Creation
28
Term        Index   Document ids
ibuprofen   0       1, 3, …, 3000
.
.
.
migraine    5       5, 6, …, 475
Documents
ibuprofen is …. migraine
Ibuprofen is effective in treating Migraine
Gold Standard Dataset
29
Gold Standard Vectorization
Step 1: iterate over each document in the gold standard.
Step 2: tokenize each sentence.
Step 3: create the bag-of-words model.
Step 4: use the IDF of each word from the dictionary to create the tf-idf vector for the gold standard.
Problem: data sparsity.
30
Gold Standard Vectorization Enhancement
Step 1: run MetaMap on the gold standard document.
Step 2: create a bag-of-words for each concept (term frequency).
Step 3: aggregate the per-concept bags-of-words into a single bag-of-words.
Step 4: use the IDF of each word from the dictionary to create the tf-idf vector for the gold standard.
Solution: enhance with context clues from corpus.
31
Step 1: select 20 diseases as topics for an information need.
Step 2: use each query to generate a semantic summary.
Step 3: transform each semantic summary into a semantic summary vector.
Step 4: transform each gold standard into a gold standard tf-idf vector.
Step 5: compute the similarity between each semantic summary vector and its associated gold standard vector under different features.
Step 6: determine the features that generate the most informative summary in each scenario.
Evaluation: Overall Approach
32
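Summarization Evaluation Metrics
• Cosine similarity: $\mathrm{cosine}(s, T) = \dfrac{s \cdot T}{\lVert s \rVert \, \lVert T \rVert}$
• Euclidean distance: $e(s, T) = \sqrt{\sum_{i=1}^{n} (\omega_i - t_i)^2}$
• Jensen-Shannon divergence: $JSD(s \,\|\, T) = \frac{1}{2}\left[ KL(s \,\|\, M) + KL(T \,\|\, M) \right]$, where $M = \frac{1}{2}(s + T)$ and $KL(s \,\|\, T) = \sum_{i=1}^{n} P(w_i) \log \dfrac{P(w_i)}{P(t_i)}$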
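The three metrics can be sketched in plain Python as follows. The vectors `s` and `t` are hypothetical toy values, not data from the thesis. For the Jensen-Shannon divergence, normalizing each vector to a probability distribution first is an assumption of this sketch.

```python
import math


def cosine(s, t):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(a * b for a, b in zip(s, t))
    norm = math.sqrt(sum(a * a for a in s)) * math.sqrt(sum(b * b for b in t))
    return dot / norm if norm else 0.0


def euclidean(s, t):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t)))


def kl(p, q):
    """Kullback-Leibler divergence; terms with p_i = 0 contribute nothing."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


def jsd(s, t):
    """Jensen-Shannon divergence of the two vectors after normalization."""
    p = [a / sum(s) for a in s]
    q = [b / sum(t) for b in t]
    m = [(a + b) / 2 for a, b in zip(p, q)]  # mixture M = (p + q) / 2
    return 0.5 * (kl(p, m) + kl(q, m))


s = [3.0, 2.0, 0.0, 0.0]  # toy semantic summary vector (hypothetical)
t = [5.0, 1.0, 3.0, 2.0]  # toy gold standard vector (hypothetical)
print(cosine(s, t), euclidean(s, t), jsd(s, t))
```

Cosine is a similarity (higher means closer), while the Euclidean and Jensen-Shannon values are distances (lower means closer), which matters when ranking held-out summaries.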
33
Cosine Similarity
[Figure: gold standard vector T and semantic summary vector s, with one component per word in the vocabulary W = {w1, w2, . . . , wn}]
$\mathrm{cosine}(s, T) = \dfrac{s \cdot T}{\lVert s \rVert \, \lVert T \rVert}$
Euclidean Distance
[Figure: gold standard vector T and semantic summary vector s, with one component per word in the vocabulary W = {w1, w2, . . . , wn}]
$e(s, T) = \sqrt{\sum_{i=1}^{n} (\omega_i - t_i)^2}$
$= \sqrt{(3-5)^2 + (2-1)^2 + (3-0)^2 + (2-0)^2 + (0-3)^2 + (0-2)^2 + (0-5)^2 + \dots + (0-2)^2}$
$= \sqrt{122} \approx 11.05$
Semantic Similarity Comparison
34
Root Mean-Squared Error
35
$E = (e_1, e_2, \dots, e_{20})$
$E_S = (S'_1, S'_2, \dots, S'_{20})$
$E_T = (T'_1, T'_2, \dots, T'_{20})$
$\cos(E_S, E_T) = (\cos_1, \cos_2, \dots, \cos_{20})$
$\mathrm{eud}(E_S, E_T) = (\mathrm{eud}_1, \mathrm{eud}_2, \dots, \mathrm{eud}_{20})$
$JS(E_S, E_T) = (js_1, js_2, \dots, js_{20})$
Root Mean-Squared Error
36
$SIM = \{sim_1, sim_2, \dots, sim_{20}\}$
$RMSE(SIM) = \sqrt{\dfrac{\sum_{i=1}^{n} sim_i^2}{n}}$
Method                   Cosine-RMSE   Euclidean-RMSE   JS-RMSE
Leave-out-relevancy      0.263         0.315            0.187
Leave-out-connectivity   0.263         0.335            0.143
Leave-out-novelty        0.254         0.329            0.252
Leave-out-saliency       0.237         0.333            0.281
Evaluation
Saliency is the most important feature: holding it out gives the lowest cosine RMSE (0.237) and the highest JS RMSE (0.281).
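The aggregation and the reading of the results table can be sketched as follows. The `rmse` function follows the formula on the slide (root of the mean of squared similarity values). The numbers in `RESULTS` are copied from the table. The dictionary layout and variable names are hypothetical.

```python
import math


def rmse(sims):
    """Root of the mean of squared similarity values, as defined on the slide."""
    return math.sqrt(sum(x * x for x in sims) / len(sims))


# RMSE values per held-out feature, copied from the results table.
RESULTS = {
    "relevancy":    {"cosine": 0.263, "euclidean": 0.315, "js": 0.187},
    "connectivity": {"cosine": 0.263, "euclidean": 0.335, "js": 0.143},
    "novelty":      {"cosine": 0.254, "euclidean": 0.329, "js": 0.252},
    "saliency":     {"cosine": 0.237, "euclidean": 0.333, "js": 0.281},
}

# Least similar to the gold standard = lowest similarity or highest distance.
lowest_cosine = min(RESULTS, key=lambda f: RESULTS[f]["cosine"])
highest_js = max(RESULTS, key=lambda f: RESULTS[f]["js"])
highest_euclidean = max(RESULTS, key=lambda f: RESULTS[f]["euclidean"])
print(lowest_cosine, highest_js, highest_euclidean)
```

On the table's numbers, the cosine and Jensen-Shannon columns both single out saliency, while the Euclidean column places connectivity (0.335) marginally above saliency (0.333).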
37
• We propose a method for intrinsic evaluation of abstractive summarization.
• We transform semantic summaries into an equivalent textual representation.
• We adopt a leave-one-out strategy to identify and evaluate the features that impact automatically generated semantic summaries.
• We evaluate the impact of these features using three similarity metrics (cosine, Euclidean, Jensen-Shannon).
Contributions
38
Limitations and Future Work
1. Query diversity
- 20 disease treatments
2. Concept-based bag-of-words
3. Gold standard impurities
- Diluted quality based on co-occurrence
39
Use machine learning and a larger query set
Involve more domain experts and consider other gold standard creation techniques
Use facts instead of concepts
40
THANK YOU!
Acknowledgements
Prof. Amit P. Sheth (Advisor)
Prof. Krishnaprasad Thirunarayan
Thomas C. Rindflesch
Delroy Cameron

Machine Learning in Biology and Why It Doesn't Make Sense - Theo Knijnenburg,...Machine Learning in Biology and Why It Doesn't Make Sense - Theo Knijnenburg,...
Machine Learning in Biology and Why It Doesn't Make Sense - Theo Knijnenburg,...Seattle DAML meetup
 
Digital Pathology, FDA Approval and Precision Medicine
Digital Pathology, FDA Approval and Precision MedicineDigital Pathology, FDA Approval and Precision Medicine
Digital Pathology, FDA Approval and Precision MedicineJoel Saltz
 

ResQu: A Framework for Automatic Evaluation of Knowledge-Driven Automatic Summarization

  • 1. RESQU: A FRAMEWORK FOR AUTOMATIC EVALUATION OF KNOWLEDGE-DRIVEN AUTOMATIC SUMMARIZATION. MASTER'S THESIS DEFENSE. NISHITA JAYKUMAR, MAY 26, 2016. MASTER'S COMMITTEE: AMIT P. SHETH (ADVISOR), THOMAS C. RINDFLESCH (NIH), DELROY CAMERON (APPLE INC.), KRISHNAPRASAD THIRUNARAYAN
  • 2. Main Issue: Indirect information access (PubMed Search Service)
  • 3. More direct information access (Semantic MEDLINE): Acetaminophen TREATS Migraine Disorders; Sumatriptan TREATS Migraine Disorders; Topiramate PREVENTS Migraine Disorders
  • 4. Thesis Motivation • Automatically evaluate summaries in Semantic MEDLINE. • Identify features that impact summary quality. • Improve the semantic summaries it generates.
  • 5. Outline • Automatic summarization - Extractive, abstractive - Summarization in Semantic MEDLINE and ResQu • Automatic summarization evaluation - Intrinsic, extrinsic • Datasets - UMLS, SemRep, MetaMap • Approach - Summary transformation - Semantic similarity • Experimental evaluation • Conclusion
  • 6. Automatic Summarization • What is an effective summary? - Saliency - Compressed format • Approaches to automatic summarization: extractive, abstractive • Extractive summary: "A randomized, placebo-controlled trial of acetaminophen for treatment of migraine headache."; "Long-term evaluation of sumatriptan and naproxen sodium for the acute treatment of migraine in adolescents."; …; "Mapping from disease-specific measures to health-state utility values in individuals with migraine." • Abstractive summary: Acetaminophen TREATS Migraine Disorders; Sumatriptan TREATS Migraine Disorders; …; Migraine Disorders PROCESS_OF Individuals
  • 7. Semantic MEDLINE Summarization System Overview [Diagram labels: Source Documents; Conceptual Representation (Semantic Predications); Conceptual Condensate (Semantic Predications); Semantic Summary; stages: Interpretation (SemRep), Transformation (Reduction, Generalization), Generalization] • Feature application: Relevance, Connectivity, Novelty, Saliency • Example predications: Aspirin TREATS Coronary artery disease; Coronary artery disease COEXISTS_WITH Inflammation; Coronary artery disease ISA Vascular disease; tomography DIAGNOSIS Coronary artery disease
  • 8. Evaluating Summaries • Evaluating summary quality • Intrinsic evaluation: - Compared to a human-curated gold standard. - Using document similarity measures. • Extrinsic evaluation: - Based on a secondary task. - Through a discrete scoring system.
  • 9. Intrinsic Evaluation of Extractive Summarization • Pyramid Approach [Nenkova et al., 2004] - Summary Content Units (SCU) • Louis et al. [2009] • Distribution of terms • Kullback-Leibler • Jensen-Shannon • References: Nenkova, Ani, and Rebecca Passonneau. "Evaluating content selection in summarization: The pyramid method." (2004). Louis, Annie, and Ani Nenkova. "Automatic summary evaluation without human models." Notebook Papers and Results, Text Analysis Conference (TAC-2008), Gaithersburg, Maryland (USA). 2008.
  • 10. Intrinsic Evaluation of Abstractive Summarization • Information misalignment - Semantic summary: structured background knowledge. - Gold standard: textual. • Proposed solution - Summary transformation: predications to text. - Semantic similarity computation.
  • 11. Approach: ResQu • Word co-occurrence: We can use the words that co-occur with the semantic predications in a summary to represent the meaning of the semantic predications, based on distributional semantics. • Leave-one-out: By generating multiple summaries, each with one feature held out, we can effectively evaluate the impact of each feature.
  • 12. Thesis Statement: A semantic summary can be understood and potentially improved by leveraging distributional statistics between the structured knowledge that comprises the semantic summary and the words with which these structured constructs co-occur across the corpus.
  • 13. Proposed Solution (ResQu) • Semantic predications: valproic acid TREATS migraine; Sumatriptan TREATS Migraine Disorders; lamotrigine TREATS Migraine with Aura; Dihydroergotamine TREATS Migraine Disorders; Acetaminophen TREATS Migraine Disorders; Aspirin TREATS Migraine Disorders; zolmitriptan TREATS Migraine Disorders; eletriptan TREATS Migraine Disorders; Analgesics TREATS Migraine Disorders; ziconitide TREATS Migraine Disorders; … • Co-occurring arguments • Semantic summary vector
  • 14. Measuring Similarity • Similarity between the semantic summary (SS) and the gold standard (GS) - Cosine similarity, Euclidean distance, Jensen-Shannon divergence • Root Mean-Squared Error - For each summary generated with a feature held out • The held-out summary that is least similar to the gold standard identifies the most important feature, namely the one that was held out.
  • 15. Resource-Rich Biomedical Knowledge [Diagram labels: Assertional Knowledge; Definitional Knowledge; Complementary; Disjoint; 65 Attributes: 62 Provenance Metadata, 3 Semantic Attributes; MEDLINE (1865 – 2015): largest biomedical knowledgebase, >25 million abstracts, PubMed, PMC; Semantic Predications; Medical Subject Headings (MeSH): 15 unique trees, max depth 15, ~27,000 terms; Unified Medical Language System (UMLS): SPECIALIST Lexicon, Semantic Network, Metathesaurus; >300k concepts; >100 vocabularies; 9 million triples; 134 types; 15 groups; 54 predicates; MeSH Indexing of documents d1, d2, d3, …, dn]
  • 16. ResQu System Architecture [Diagram components: User Query Processor; Document Selector; Predication Mapper; Concept Mapper; Summarizer (Schema Summarizer); Vectorizer; Predication Extractor (SemRep); Graph Generator; ResQu Summary Vectors; MEDLINE; Jericho Crawler; Gold standard Vectors; Similarity Computation Module; Gold standard creation module]
  • 17. User Query: q = (l, c1, c2, dt, ub) • l: label of an entity (or concept) in the UMLS, e.g. Migraine Disorders: C0149931 • c1: Humans[MH] and c2: Clinical Trial [PTYP] • dt: the date range of documents • ub: the upper bound (default = 5000)
  • 18. User Query Instance: q = (Migraine Disorders[MH] AND Humans[MH] AND Clinical Trial [PTYP] AND 1860/01:2014/08[DCOM])
  • 19. Document Selection • Takes the query from the User Query Processor. • Retrieves the set of MEDLINE documents D = {d1, d2, …, dn}. • Uses the MEDLINE Entrez Search API.
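The query tuple and the document selection step can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis code: `UserQuery`, `build_term` and `esearch_url` are hypothetical names, treating `ub` as the number of documents to return (the E-utilities `retmax` parameter) is an assumption, and the sketch only builds the request URL without sending it.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint for PubMed/MEDLINE.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


@dataclass
class UserQuery:
    """Hypothetical container for q = (l, c1, c2, dt, ub)."""
    label: str                          # l: concept label, e.g. a MeSH heading
    c1: str = "Humans[MH]"              # c1: species constraint
    c2: str = "Clinical Trial[PTYP]"    # c2: publication-type constraint
    date_range: str = "1860/01:2014/08"  # dt: date range of documents
    ub: int = 5000                      # ub: upper bound (assumed: max documents)


def build_term(q: UserQuery) -> str:
    """Join the query parts into one boolean PubMed search term."""
    return f"{q.label}[MH] AND {q.c1} AND {q.c2} AND {q.date_range}[DCOM]"


def esearch_url(q: UserQuery) -> str:
    """Build (but do not send) an ESearch request for the query."""
    params = {"db": "pubmed", "term": build_term(q), "retmax": q.ub}
    return f"{ESEARCH}?{urlencode(params)}"


q = UserQuery(label="Migraine Disorders")
print(build_term(q))
print(esearch_url(q))
```

Keeping the term builder separate from the URL builder makes it easy to inspect the boolean query before any request is issued.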
  • 21. Semantic Predications Extractor • Input sentence: "A randomized, placebo-controlled trial of acetaminophen for treatment of migraine disorders" • Extracted predication: Acetaminophen TREATS Migraine disorders
  • 22. Automatic Summarizer Inflammation mediated by the immune system is known to be important in carcinogenesis and, specifically, T helper 17 cells have been reported to play a role in tumor progression by promoting neo-angiogenesis. The aim of this study was to investigate whether inflammatory cytokines and vascular endothelial growth factor (VEGF) levels in exhaled breath condensate (EBC) and in serum were related to tumor size in patients with non-small cell lung cancer (NSCLC). Il-6, IL-17, TNF-α and VEGF levels were measured in EBC and serum of 15 patients with stage I-IIA NSCLC and in 30 healthy controls by immunoassay. The tumor size was measured by a CT scan. The concentrations of IL-6, IL-17 and VEGF were significantly higher in EBC of patients with lung cancer, compared with controls, while only serum IL-6 concentration was higher in patients compared to controls. A significant correlation (r = 0.78, p = 0.001) was observed between EBC levels of IL-6 and IL-17; IL-17 was also correlated to EBC levels of the VEGF (r = 0.83, p < 0.001) and TNF-α (r = 0.62, p = 0.014). The tumor diameter was significantly correlated with EBC concentrations of VEGF (r = 0.58, p = 0.039), IL-6 (r = 0.67, p = 0.013) and IL-17 (r = 0.66, p = 0.017). Our results show a significant relationship between inflammatory and angiogenic markers, measured in EBC by a non-invasive method, and tumor mass. To assess whether polymorphisms of the interleukin-23 receptor (IL23R) gene are associated with bladder transitional cell carcinoma because chronic inflammation contributes to bladder cancer and the IL23R is known to be critically involved in the carcinogenesis of various malignant tumors. 226 patients with bladder cancer and 270 age-matched controls were involved in the study. Polymerase chain reaction-restriction fragment length polymorphism was used for genotyping. Genotype distribution and allelic frequencies between patients and controls were compared. 
In all three single nucleotide polymorphisms of IL23R studied, the distribution of genotype and allele frequencies of rs10889677 differed significantly between patients and controls. The frequency of allele C of rs10889677 was significantly increased in cases compared with controls (0.2898 vs. 0.1833, odds ratio 1.818, 95 % confidence interval 1.349-2.449). The result indicates that IL23R may play an important role in the susceptibility of bladder cancer in Chinese population. For over a century, inactivated or attenuated bacteria have been employed in the clinic as immunotherapies to treat cancer, starting with the Coley's vaccines in the 19th century and leading to the currently approved bacillus Calmette-Guérin vaccine for bladder cancer. While effective, the inflammation induced by these therapies is transient and not designed to induce long-lasting tumor-specific cytolytic T lymphocyte (CTL) responses that have proven so adept at eradicating tumors. Therefore, in order to maintain the benefits of bacteria-induced acute inflammation but gain long-lasting anti-tumor immunity, many groups have constructed recombinant bacteria expressing tumor-associated antigens (TAAs) for the purpose of activating tumor-specific CTLs. One bacterium has proven particularly adept at inducing powerful anti-tumor immunity, Listeria monocytogenes (Lm). Lm is a gram-positive bacterium that selectively infects antigen-presenting cells wherein it is able to efficiently deliver tumor antigens to both the MHC Class I and II antigen presentation pathways for activation of tumor- targeting CTL-mediated immunity. Lm is a versatile bacterial vector as evidenced by its ability to induce therapeutic immunity against a wide-array of TAAs and specifically infect and kill tumor cells directly. It is for these reasons, among others, that Lm-based immunotherapies have delivered impressive therapeutic efficacy in preclinical models of cancer for two decades and are now showing promise clinically. 
[Figure: summary graph produced from the source abstracts; node labels: Ibuprofen, Topiramate, Headache, Acetaminophen, Migraine Disorders, Vestibule, Pain; edge labels: TREATS, PREVENTS, ISA, LOCATION_OF]
  • 24. Summary Transformation Step 1: get all documents for each concept in the semantic summary. Step 2: create a bag-of-words for each concept (term frequency). Step 3: aggregate the bags-of-words of all concepts in the semantic summary. Step 4: use the idf of each word in the corpus to create the tf-idf vector for the given semantic summary: $\mathrm{tfidf}(t, d, D) = \mathrm{tf}(t, d) \cdot \log \frac{N}{n_t}$, where $N$ is the number of documents in $D$ and $n_t$ is the number of documents that contain term $t$.
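The four transformation steps can be sketched in a few lines. This is a toy illustration, not the thesis implementation: the whitespace tokenizer, the three-sentence corpus, and the helper names `idf_table` and `summary_tfidf` are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    """Naive whitespace tokenizer (an assumption; the slides do not specify one)."""
    return text.lower().split()


def idf_table(corpus):
    """idf(t) = log(N / n_t), with N documents and n_t documents containing t."""
    n_docs = len(corpus)
    doc_freq = Counter()
    for doc in corpus:
        doc_freq.update(set(tokenize(doc)))
    return {t: math.log(n_docs / n_t) for t, n_t in doc_freq.items()}


def summary_tfidf(concept_docs, idf):
    """Steps 2-4: per-concept bags, aggregated, then weighted by idf."""
    summary_bow = Counter()
    for docs in concept_docs.values():
        concept_bow = Counter()              # Step 2: term frequencies per concept
        for doc in docs:
            concept_bow.update(tokenize(doc))
        summary_bow.update(concept_bow)      # Step 3: aggregate over the summary
    # Step 4: tf * idf for every term known to the corpus dictionary
    return {t: tf * idf[t] for t, tf in summary_bow.items() if t in idf}


# Toy corpus (hypothetical sentences, not MEDLINE data).
corpus = [
    "acetaminophen treats migraine",
    "sumatriptan treats migraine",
    "topiramate prevents migraine",
]
# Step 1 (toy): documents retrieved for each concept in the summary.
concept_docs = {
    "Acetaminophen": [corpus[0]],
    "Migraine Disorders": corpus,
}
idf = idf_table(corpus)
vec = summary_tfidf(concept_docs, idf)
print(vec)
```

In this toy corpus "migraine" occurs in every document, so its idf, and therefore its tf-idf weight, is zero; rarer terms such as "acetaminophen" dominate the vector.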
  • 25. Bag-of-words Model • Sentence: "We used hemofiltration to treat a patient with digoxin overdose that was complicated by refractory hyperkalemia." • bow = [(we,1), (used,1), . . ., (hyperkalemia,1)] • bow_sparse_vector = [(678,1), (2,1), . . ., (999,1)]
  • 26. Dictionary Creation • Documents: "ibuprofen is … migraine"; "Ibuprofen is effective in treating Migraine" • Dictionary (Term | Index | Document id): ibuprofen | 0 | 1,3,…,3000; . . .; migraine | 5 | 5,6,…,475
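A minimal sketch of the dictionary and the sparse bag-of-words vector from the two slides above. The tokenizer, the two example documents with their ids, and the function names are assumptions for illustration; terms that are missing from the dictionary are simply skipped.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, drop periods, split on whitespace (an assumed tokenizer)."""
    return text.lower().replace(".", "").split()


def build_dictionary(documents):
    """Map each term to (index, list of ids of the documents containing it)."""
    dictionary = {}
    for doc_id, text in documents.items():
        for term in tokenize(text):
            # A new term receives the next free index.
            _, doc_ids = dictionary.setdefault(term, (len(dictionary), []))
            if doc_id not in doc_ids:
                doc_ids.append(doc_id)
    return dictionary


def bow_sparse_vector(text, dictionary):
    """Return sorted (term index, count) pairs for the terms in the dictionary."""
    counts = Counter(tokenize(text))
    return sorted((dictionary[t][0], n) for t, n in counts.items() if t in dictionary)


# Hypothetical two-document corpus keyed by document id.
documents = {
    1: "Ibuprofen is effective in treating migraine.",
    2: "We used hemofiltration to treat a patient with digoxin overdose.",
}
dictionary = build_dictionary(documents)
print(dictionary["ibuprofen"])
print(bow_sparse_vector("ibuprofen treats migraine", dictionary))
```

Storing document ids next to each index makes the document frequency needed for idf available as the length of that list.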
  • 28. Gold Standard Vectorization Step 1: iterate over each document in the gold standard. Step 2: tokenize each sentence. Step 3: create the bag-of-words model. Step 4: use the idf of each word from the dictionary to create the tf-idf vector for the gold standard. Problem: data sparsity.
  • 29. Gold Standard Vectorization Enhancement Step 1: run MetaMap on the gold standard document. Step 2: create a bag-of-words for each concept (term frequency). Step 3: aggregate the per-concept bags-of-words into a single bag-of-words for the summary. Step 4: use the idf of each word from the dictionary to create the tf-idf vector for the gold standard. Solution: enhance with context clues from the corpus.
  • 30. Evaluation: Overall Approach Step 1: select 20 diseases as topics for an information need. Step 2: use each query to generate a semantic summary. Step 3: transform each semantic summary into a semantic summary vector. Step 4: transform each gold standard into a gold standard tf-idf vector. Step 5: compute the similarity between a semantic summary vector and its associated gold standard vector under different features. Step 6: determine the features that generate the most informative summary in each scenario.
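  • 31. Summarization Evaluation Metrics. • Cosine similarity: $\mathrm{cosine}(s, T) = \frac{s \cdot T}{\lVert s \rVert \, \lVert T \rVert}$ • Euclidean distance: $e(s, T) = \sqrt{\sum_{i=1}^{n} (\omega_i - t_i)^2}$ • Jensen-Shannon distance: $\mathrm{JSD}(s \,\|\, T) = \frac{1}{2}\left[\mathrm{KL}(s \,\|\, M) + \mathrm{KL}(T \,\|\, M)\right]$, with $\mathrm{KL}(s \,\|\, T) = \sum_{i=1}^{n} P(w_i) \log\frac{P(w_i)}{P(t_i)}$ and $M = \frac{1}{2}(s + T)$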
  • 32. Cosine Similarity. [Figure: the gold standard vector T and the semantic summary vector s shown as term-weight vectors over the vocabulary W = {w1, w2, . . . , wn}.] $\mathrm{cosine}(s, T) = \frac{s \cdot T}{\lVert s \rVert \, \lVert T \rVert}$
  • 33. Euclidean Distance. [Figure: the gold standard vector T and the semantic summary vector s shown as term-weight vectors over the vocabulary W = {w1, w2, . . . , wn}.] $e(s, T) = \sqrt{\sum_{i=1}^{n} (\omega_i - t_i)^2} = \sqrt{(3-5)^2 + (2-1)^2 + (3-0)^2 + (2-0)^2 + (0-3)^2 + (0-2)^2 + (0-5)^2 + \dots + (0-2)^2} = \sqrt{122} \approx 11.05$
  • 35. Root Mean-Squared Error. $E = (e_1, e_2, \dots, e_{20})$; $E_S = (S'_1, S'_2, \dots, S'_{20})$; $E_T = (T'_1, T'_2, \dots, T'_{20})$; $\cos(E_S, E_T) = (\cos_1, \cos_2, \dots, \cos_{20})$; $\mathrm{eud}(E_S, E_T) = (\mathrm{eud}_1, \mathrm{eud}_2, \dots, \mathrm{eud}_{20})$; $\mathrm{JS}(E_S, E_T) = (js_1, js_2, \dots, js_{20})$
  • 36. Root Mean-Squared Error. $SIM = \{sim_1, sim_2, \dots, sim_{20}\}$; $\mathrm{RMSE}(SIM) = \sqrt{\frac{\sum_{i=1}^{n} sim_i^2}{n}}$
  • 37. Evaluation
      Method                   Cosine-RMSE   Euclidean-RMSE   JS-RMSE
      Leave-out-relevancy      0.263         0.315            0.187
      Leave-out-connectivity   0.263         0.335            0.143
      Leave-out-novelty        0.254         0.329            0.252
      Leave-out-saliency       0.237         0.333            0.281
      Saliency is the most important feature.
  • 38. Contributions. • We propose a method for intrinsic evaluation of abstractive summarization. • We transform semantic summaries into an equivalent textual representation. • We evaluate the impact of these features using numerous similarity metrics. • We adopt a leave-one-out strategy to identify and evaluate the features that impact automatically generated semantic summaries.
  • 39. Limitations and Future Work. Limitations: 1. Query diversity - 20 disease treatments. 2. Concept-based bag-of-words. 3. Gold standard impurities - diluted quality based on co-occurrence. Future work: use machine learning and a larger query set; involve more domain experts and consider other gold standard creation techniques; use facts instead of concepts.
  • 40. Acknowledgements: Prof. Amit P. Sheth (Advisor), Prof. Krishnaprasad Thirunarayan, Thomas C. Rindflesch, Delroy Cameron. THANK YOU!
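
The summary transformation on slides 24 to 26 (bag-of-words, dictionary creation, tf-idf weighting) can be sketched roughly as follows. This is a minimal illustration rather than the ResQu implementation: the function names, the whitespace tokenizer, and the three-sentence toy corpus are assumptions made for the example, and only the weighting tf(t, d) · log(N / n_t) is taken from the slides.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, trim punctuation (hypothetical tokenizer)."""
    tokens = (tok.strip(".,;:()").lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def build_dictionary(corpus):
    """Assign each term an index and count how many documents contain it."""
    term_index = {}
    doc_freq = Counter()
    for doc in corpus:
        seen = set()
        for term in tokenize(doc):
            term_index.setdefault(term, len(term_index))
            if term not in seen:  # count each document at most once per term
                seen.add(term)
                doc_freq[term] += 1
    return term_index, doc_freq


def summary_tfidf(concept_docs, term_index, doc_freq, n_docs):
    """Aggregate the per-concept bags of words, then weight each term by idf."""
    tf = Counter()
    for docs in concept_docs.values():  # Steps 1-3: collect and aggregate
        for doc in docs:
            tf.update(tokenize(doc))
    vector = [0.0] * len(term_index)
    for term, count in tf.items():      # Step 4: tf * log(N / n_t)
        if term in term_index:
            vector[term_index[term]] = count * math.log(n_docs / doc_freq[term])
    return vector


# Toy corpus (illustrative only, not data from the thesis).
corpus = [
    "Ibuprofen is effective in treating migraine",
    "Sumatriptan treats migraine",
    "Ibuprofen reduces fever",
]
term_index, doc_freq = build_dictionary(corpus)

# Hypothetical mapping from each summary concept to the documents mentioning it.
concept_docs = {
    "Ibuprofen": [corpus[0], corpus[2]],
    "Migraine Disorders": [corpus[0], corpus[1]],
}
vector = summary_tfidf(concept_docs, term_index, doc_freq, len(corpus))
```

A document that mentions several concepts contributes to the aggregate once per concept, which is one reading of Step 3; a term that occurs in every document of the corpus receives weight zero under this idf.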

Editor's Notes

  1. Hello everyone, good morning, and thank you for gathering here. Today I am going to talk about my work titled: “” This is the work that I started as part of my internship at NLM with Dr. Rindflesch and his team.
  2. I am sure all of us are aware.. For those of us who are not… PubMed is the search service that queries the MEDLINE database to retrieve relevant documents for a user’s information need. MEDLINE itself… So what is the problem with PubMed? Well, the problem with PubMed is that it presents the information as a list. So if a user wanted to find information on migraine disorders, and he constructed this specific query and presented it to PubMed, it would retrieve 2171 results; then he would have to search and sift through this entire collection to find relevant answers. Though the information is contained in the result set, it is not directly accessible.
  3. To alleviate this problem, Tom and his team developed a tool called Sem. Med., which is used for automatically summarizing biomedical literature. So as we see here, for the same user query, in addition to presenting the information as a list, Sem. Med. extracts the salient information from the search result set as facts and represents them as a graph. This set of facts is called semantic predications or triples. This provides more direct access to the information, and from this graph the user can understand the following facts on migraine disorders.. amongst many others.
  4. The motivation for this work is very specific to Semantic MEDLINE and, in particular, we want to automatically evaluate the summaries.
  5. So this is the outline for the remainder of the talk. We will discuss automatic summarization and its types, then automatic summarization evaluation and its types, and later summarization in Sem. Med. and ResQu. Then we will very briefly discuss the different datasets used for this work, and then we move on to the core of the approach. Finally, we talk about experimental evaluation.
  6. In this general scope of automatic summarization, an interesting question is what an effective summary is in the first place, and how one might quantify that in order to evaluate it. So, what is an effective summary? An effective summary is something that conveys the most important, or salient, information from the search result set in a compressed and concise format. Extractive: the most important information from the source is added to the summary in an unaltered format. Abstractive: the summary is a condensed abstract representation of the source, and the content is usually rephrased or paraphrased. Semantic MEDLINE performs abstractive summarization. Saliency - conveys the most important information from the SOURCE.
  7. So, how does summarization take place in Semantic MEDLINE in the broad sense? Well, first we have ……….. SemRep is a program that extracts semantic predications (subject-relation-object triples) from biomedical free text. Then a series of 4 features is applied in the reduction step to produce a summary. Relevancy: a knowledge-based feature derived by selecting semantic predications that address the user-selected seed topic for the summary. Connectivity: a feature that ensures the summary will also include “useful” additional predications, for example based on the connectedness of relevant concepts. Novelty: a knowledge-based feature that uses the hierarchical structure of the Metathesaurus to eliminate predications with generic (and hence uninformative) arguments. Saliency: a feature that assigns bias to semantic predications that occur frequently.
  8. One of the critical limitations of Sem. Med. is that it is difficult to evaluate the quality of the automatically generated summary. Now, how might one actually evaluate the quality of an automatic summary, or summaries in general? Well…
  9. Some of the popular work in this area is the work by Nenkova et al. titled the pyramid approach, where they focus on creating summaries using SCUs, which are extracts. They evaluate using an intrinsic … Similarly, in another work by Louis et al., they perform intrinsic evaluation of the extractive summaries by comparing the distribution of terms in the summary to that of the input using different …
  10. So we propose a possible solution to this problem, which involves summary transformation.
  11. Specifically, we approach this problem with two broad ideas
  12. Here we have a semantic summary, which is a list of predications. We take each summary produced by Semantic MEDLINE and, for each of the facts in it, we express them as a distribution of terms in which the predications co-occur. Then we aggregate these words to create a bag-of-words model; this way we are able to represent the semantic summary as a vector on which we can then perform similarity scoring.
  13. Once we have the semantic summary vector and the gold standard vector, we evaluate them using different similarity scoring techniques as suggested by the literature. Further, to understand which feature is influential in the quality of the summary, we perform the RMSE computation using a leave-one-out approach, and we state that
  14. The Medical Subject Headings (MeSH) is a controlled vocabulary and thesaurus of biomedical terms, organized in a hierarchical structure. Subject headings in MeSH are often used as search terms in PubMed to retrieve relevant documents. In terms of organization, the semantic network is comparable to an ontology schema, while the Metathesaurus is comparable to the instances in the ontology. The Metathesaurus is the biggest component of the UMLS. It is a large biomedical thesaurus. The SPECIALIST Lexicon is a large syntactic lexicon of biomedical and general English terms, designed to provide the information needed for information extraction by various tools and natural language processing systems.
  15. So here is the overall system architecture for ResQu that we have developed. First we have the UQP: this module is used for constructing the query based on the user's input, and it then passes this on to the DS. The DS is responsible for retrieving all documents that match the query; for this we use the MEDLINE ENTREZ API. Then this set of PubMed articles is sent to the Predication Extractor; in this module we use the SemRep API to extract the semantic predications, or facts, from the articles. Then we use the Summarizer, which is responsible for applying the features we previously mentioned (relevance, novelty, connectivity and saliency) to create a focused summary. The Concept Mapper and the Predication Mapper are responsible for transforming the semantic summaries into their textual representation. These components create the initial model, which is fed into the Vectorizer, which vectorizes the summary to create the ResQu summary vectors.
  16. So what is a user query in ResQu? Well, we represent a user query as a tuple with 5 elements. Migraine is mapped to Migraine Disorders: C0149931, Migraine Disorders[MH]. c1 and c2 are MeSH filters; these are 2 MeSH (Medical Subject Headings) indexing terms. Citations in PubMed are indexed using these MeSH terms, which are assigned by humans.
  17. From the search results, the PubMed identifier (or PMID) of each article in D is then passed to the Semantic Predication Extractor.
  18. In ResQu we evaluate the summaries using 20 scenarios. This is a carefully chosen list of diseases that contains both well-known and rarely occurring diseases, with an upper bound of 5000 documents per disease. For this study we were mainly interested in understanding the drug treatments for these diseases.
  19. For example, in this sentence, the semantic predication extractor will first try to do this… then use the indicator rules and.. The predication graph is then delivered as input to the Summarizer, which applies various features to filter out non-informative semantic predications and create a more concise semantic summary reflective of the salient aspects of the search result set.
  20. By the application of the reduction transformation rules, here we see some of the predications that it accepts, while these are some of the predications it rejects.
  21. Finally, we end up with this semantic summary.
  22. However…. as we have noted earlier, it is challenging to evaluate such a summary; hence, here are the steps for summary transformation. We implement the summary transformation in the following 4 steps: First 1) then aggregate the bag-of-words for each concept in the entire summary 2) then we use the idf, the inverse document frequency, of each word in the corpus to create the tf-idf vector for a summary 3) to create both the bow_sparse_vector and the idfs, we create a dictionary for the corpus 4) at this stage we have a semantic summary represented as a transformed
  23. Here is a simple bag-of-words model for this snippet of text. For every document, a bag-of-words model simply creates a list of tuples, each holding a word and its frequency. The bag-of-words model can be used as a sparse vector for a document by simply replacing each word with the id of the word in a feature space.
  24. Step 1: Iterate over each document in the corpus. Step 2: Tokenize each sentence. Step 3: Add each token to the dictionary with a unique id (index position). Step 4: Keep track of the document frequency for each token (the ids of the documents that the term occurs in). At this point we have a semantic summary transformed into a summary vector.
  25. At this point we have a semantic summary, represented as a vector. 1) These were the 3 resources that were considered for the creation of the gold standard. 3) These were resources that were selected by domain experts from NLM as authoritative sources of drug treatments for diseases. 4) We use the Jericho crawler to extract text present in structured and unstructured formats in these resources.
  26. We found that the gold standard vectors were sparse.
  27. To overcome this data sparsity problem, we enhance the gold standard vectors using contextual clues from the corpus and repeat steps 2, 3 and 4 as before.
  28. using the RMSE, which we will discuss in the next section
  29. If we let the semantic summary be S-prime and the gold standard summary be T, then the cosine similarity between the GS-vector and an SS-vector is computed as shown in equation 1, which is nothing but the dot product between the 2 vectors divided by the product of their magnitudes, each magnitude being the square root of the sum of the squares of that vector's weights. For the Euclidean distance, the distance is computed as the square root of the sum of the squared differences between each of the corresponding points in the vectors. JS-divergence is a bit more complicated; it is computed as a symmetric function of the Kullback-Leibler divergence. KL: taking the SS-vector as S-prime and the GS-vector as T, the KL-divergence is the sum, over each word w-i in the semantic summary and the corresponding word in the GS, of the probability of the word in the SS times the log of the probability of the word in the SS divided by the probability of the word in the GS. Given the KL-divergence, we can then compute JS as follows.
  30. First we have the baseline summary, which is a summary with no feature left out and which is compared to the gold standard. Then we compute the semantic similarity with the relevancy feature removed, which is in green, and similarly for all other features. It is difficult to discern which of the features is important by just looking at these graphs. So in order to assess more quantitatively which of these features is important, we instead compute the RMSE for each of the distributions.
  31. So just to put things into perspective, let's take the baseline dataset. We have 20 queries, so for the baseline we generate 20 semantic summaries, and we will also have 20 gold standard summary vectors, 20 cosine similarity values, 20 Euclidean distance values and 20 Jensen-Shannon values.
  32. It is the square root of the mean of the squared similarity scores. Now, when we compute the RMSE for just the baseline, that number in isolation is not very informative; however, if we compute the RMSE for each of the held-out features, then we are able to estimate the importance of each feature.
  33. So what we would like to see for cosine similarity is that, when the most important feature has been removed and we compute the cosine similarity across all 20 queries, the RMSE value becomes very low, which is what we see. For the Euclidean distance we expect to see the opposite. So having done this across the 20 queries and these different metrics, what we see is that for cosine similarity and JS, saliency is the most important feature, and for the Euclidean distance, connectivity is the highest and saliency is the second highest, which leads us to conclude that saliency is the most important feature for generating semantic summaries.
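
The three metrics and the RMSE aggregation described in notes 29 to 33 can be sketched as below. This is a hedged illustration, not the ResQu code: the function names and toy vectors are assumptions, the Jensen-Shannon computation normalizes the tf-idf vectors into probability distributions before applying KL (the slides do not spell out that step), and natural logarithms are assumed.

```python
import math


def cosine(s, t):
    """Dot product divided by the product of the two vector magnitudes."""
    dot = sum(a * b for a, b in zip(s, t))
    norm = math.sqrt(sum(a * a for a in s)) * math.sqrt(sum(b * b for b in t))
    return dot / norm if norm else 0.0


def euclidean(s, t):
    """Square root of the sum of squared component differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t)))


def _normalize(v):
    """Scale a non-negative, non-zero vector so its entries sum to 1."""
    total = sum(v)
    return [x / total for x in v]


def kl(p, q):
    """Kullback-Leibler divergence; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(s, t):
    """Average KL divergence of each distribution from their midpoint M."""
    p, q = _normalize(s), _normalize(t)
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * (kl(p, m) + kl(q, m))


def rmse(scores):
    """Root of the mean of the squared per-query scores (slide 36)."""
    return math.sqrt(sum(x * x for x in scores) / len(scores))


def rmse_by_configuration(pairs_by_config, metric):
    """RMSE of a metric over (summary, gold) vector pairs, per configuration."""
    return {
        name: rmse([metric(s, t) for s, t in pairs])
        for name, pairs in pairs_by_config.items()
    }


# Toy vectors only; these are not values from the thesis experiments.
pairs_by_config = {
    "baseline": [([3.0, 2.0, 0.0], [5.0, 1.0, 0.0])],
    "leave-out-saliency": [([0.0, 2.0, 3.0], [5.0, 1.0, 0.0])],
}
cosine_rmse = rmse_by_configuration(pairs_by_config, cosine)
```

Under the reading in note 33, the feature whose removal yields the lowest cosine RMSE, or the highest distance RMSE, is treated as the most important one.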