Keynote talk at HILDA'2023 at SIGMOD on June 18, 2023.
Abstract: The ability to build large-scale knowledge bases that capture and extend the implicit knowledge of human experts is the foundation for many AI systems. We use an ontology-driven approach for building, growing, and serving such knowledge bases. This approach relies on several well-known building blocks: document conversion, natural language processing, entity resolution, data transformation, and fusion. In this talk, I will discuss a wide range of real-world challenges related to building these blocks and present our work to address these challenges via better human-machine cooperation.
6. Human-in-the-Loop Throughout the Entire Life Cycle
of KG Construction, Growth, and Services
Data Labeling Development Deployment
Learner
raw data labeled data
1. Improve quality
2. Increase efficiency
3. Decrease skill requirements
7. Example 1: Scale Fact Collection
Missing / stale facts
Baseline (diagram): Missing Facts → Query Synthesizer → QA System → candidate facts → New Facts
8. Example 1: Scale Fact Collection
Missing / stale facts
Query-by-Committee (diagram): Missing Facts → Query Synthesizer → queries Q1 … Qn → QA System → AnswerSet1 … AnswerSetn → QbC Selector → candidate facts → New Facts
[EMNLP-DaSH’2022] Improving Human Annotation Effectiveness for Fact Collection by Identifying the Most Relevant Answers
Success Rate
fact collection
25%
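The Query-by-Committee step can be sketched as a vote over the answer sets returned for Q1 … Qn. This is a minimal illustration that assumes plain agreement counting; the slide does not describe how the QbC Selector actually scores answers, and `qbc_select`, `min_votes`, `top_k`, and the toy answer sets are hypothetical.

```python
from collections import Counter


def qbc_select(answer_sets, min_votes=2, top_k=3):
    """Rank candidate answers by how many committee members returned them.

    Assumption: agreement across answer sets is used as the relevance signal.
    """
    votes = Counter()
    for answers in answer_sets:
        # Count each answer once per answer set so a verbose set cannot dominate.
        votes.update(set(answers))
    # Sort by vote count (descending), then alphabetically for a stable order.
    ranked = sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))
    return [answer for answer, n in ranked if n >= min_votes][:top_k]


# Toy answer sets (made up for illustration), one per synthesized query.
answer_sets = [
    ["1998", "1997"],  # AnswerSet1, from Q1
    ["1998"],          # AnswerSet2, from Q2
    ["1998", "2001"],  # AnswerSet3, from Q3
]
selected = qbc_select(answer_sets)
print(selected)
```

Only answers that several queries agree on are passed to human annotators, which is one plausible way to reduce the number of irrelevant candidates they have to review.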
9. Example 1: Scale Fact Collection
Open Domain Knowledge Extraction
[SIGMOD’23] Growing and Serving Large Open-domain Knowledge Graphs.
Throughput vs. manual fact collection: >100x
Diagram: Missing Facts → Query Synthesizer → Web Search → Knowledge Extractor → candidate facts w/ lower confidence → Fact Corroboration → New Facts
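One simple way to picture Fact Corroboration is to combine the confidences of the same candidate fact extracted from several sources. The sketch below uses a noisy-OR combination and assumes the sources are independent; the slide does not say how corroboration is actually computed, and the function name, threshold, and toy candidates are hypothetical.

```python
from collections import defaultdict


def corroborate(candidates, threshold=0.9):
    """Accept facts whose combined (noisy-OR) confidence reaches the threshold.

    candidates: iterable of (fact, source, confidence) triples.
    Assumption: sources are independent, so evidence combines via noisy-OR.
    """
    best = defaultdict(dict)
    for fact, source, conf in candidates:
        # Keep the highest confidence per (fact, source) so repeated
        # extractions from one source are not counted as new evidence.
        best[fact][source] = max(conf, best[fact].get(source, 0.0))

    accepted = {}
    for fact, per_source in best.items():
        miss = 1.0
        for conf in per_source.values():
            miss *= 1.0 - conf  # probability that every source is wrong
        score = 1.0 - miss
        if score >= threshold:
            accepted[fact] = score
    return accepted


# Toy candidates (made up for illustration).
fact_a = ("entity_a", "founded_in", "1998")
fact_b = ("entity_b", "founded_in", "2001")
candidates = [
    (fact_a, "site1", 0.6),
    (fact_a, "site2", 0.7),
    (fact_a, "site3", 0.5),
    (fact_a, "site1", 0.4),  # repeated source, lower confidence: ignored
    (fact_b, "site1", 0.8),  # single source: stays below the threshold
]
accepted = corroborate(candidates)
print(accepted)
```

Here `fact_a` is accepted because three lower-confidence extractions agree, while `fact_b` would be left for further evidence or human review.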
11. Example 2a: Crowd-in-the-Loop Curation
A hybrid approach
Diagram: Corpus (raw data) → Corpus (predicted annotations) → Annotation Task → Task Router → Corpus (curated annotations)
Difficult tasks are curated by experts
Easier tasks are curated by crowd
[EMNLP’17] CROWD-IN-THE-LOOP: A Hybrid Approach for Annotating Semantic Roles
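The routing idea can be sketched as a threshold on a per-task difficulty signal. This is an illustration only: it assumes the confidence of the predicted annotation is used as the signal, which the slide does not state, and `AnnotationTask`, `model_confidence`, and `crowd_threshold` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class AnnotationTask:
    task_id: str
    # Hypothetical difficulty signal: confidence of the predicted annotation.
    model_confidence: float


def route(tasks, crowd_threshold=0.8):
    """Send confident (easier) tasks to the crowd and the rest to experts."""
    queues = {"crowd": [], "expert": []}
    for task in tasks:
        target = "crowd" if task.model_confidence >= crowd_threshold else "expert"
        queues[target].append(task.task_id)
    return queues


# Toy tasks (made up for illustration).
tasks = [
    AnnotationTask("t1", 0.95),
    AnnotationTask("t2", 0.40),
    AnnotationTask("t3", 0.80),
]
queues = route(tasks)
print(queues)
```

Raising `crowd_threshold` shifts more work to experts and lowering it shifts more to the crowd, so the threshold is the knob that trades expert effort against annotation quality.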
13. Example 2b: Better Workflow Performs Even Better
vs. SRL model
↑
Expert efforts
↓
10% F1
vs. SRL model
↑ 87.3%
Expert efforts
↓
Filter: unlikely options
Select: from likely options
Expert: resolve hard cases
[EMNLP’20 (Findings)] A Novel Workflow for Accurately and Efficiently Crowdsourcing Predicate Senses and Argument Labels
14. Human-in-the-Loop Throughout the Entire Life Cycle
of KG construction, growth, and services
Data Labeling Development Deployment
Learner
Scale data labeling
raw data labeled data
IDE
Better IDE for model building
15. Different Tooling for Different Users
Full-fledged IDE
AI Engineers AI Engineers/Data Scientists
Visual IDE
[ACL’12] WizIE: A Best Practices Guided Development Environment for Information Extraction
[CHI’13] I can do text analytics!: designing development tools for novice developers
[VLDB’15] VINERy: A Visual IDE for Information Extraction
[KDD’19] Declarative Text Understanding with SystemT. (hands-on tutorial)
Entity Extraction in AIOps https://www.ibm.com/cloud/blog/entity-extraction-in-aiops
IBM InfoSphere BigInsights Text Analytics Eclipse Tooling
IBM Watson Knowledge Studio. Advanced Rule Editor http://ibm.biz/VineryIE
16. Human-in-the-Loop Throughout the Entire Life Cycle
of KG construction, growth, and services
Data Labeling Development Deployment
Learner
Scale data labeling
raw data labeled data
IDE
Better IDE for model building
Learner
Human-machine co-creation
17. Transparent Linguistic Models for Contract Understanding
Watson Discovery Content Intelligence
[NAACL’21] Development of an Enterprise-Grade Contract Understanding System, (Industry Track)
18. HEIDL: Human & Machine Co-Creation via Neural-Symbolic AI
[ACL’19] HEIDL: Learning Linguistic Expressions with Deep Learning and Human-in-the-Loop.
[EMNLP’20] Learning Explainable Linguistic Expressions with Neural Inductive Logic Programming for Sentence Classification
In use for major IBM customer engagements
Raises the abstraction level for domain experts to interact with
20. Human-in-the-Loop Throughout the Entire Life Cycle
of KG construction, growth, and services
Data Labeling Development Deployment
Learner
Scale data labeling
raw data labeled data
IDE
Better IDE for model building
Learner
Human-machine co-creation
Learner
Curb data hunger with interactive learning
21. Case 1: Example-Driven Extraction
Via pattern induction
[CHI’17] SEER: Auto-Generating Information Extraction Rules from User-Specified Examples
[SIGMOD’17] Synthesizing Extraction Rules from User Examples with SEER.
[AAAI’22 (demo)] InteractEva: A Simulation-based Evaluation Framework for Interactive AI Systems
[AAAI’22] A Simulation-Based Evaluation Framework for Interactive AI Systems and Its Application.
IBM Watson Discovery (Beta in Plus since Oct. 2021) http://ibm.biz/SEER_IE, https://ibm.biz/WDSPressReleaseNov
25. Case 2: Entity Normalization & Variant Generation
Learning Structured Representations
Capture Entity Semantic Structure
[COLING’2018] Exploiting Structure in Representation of Named Entities using Active Learning.
[ICDE’2018] LUSTRE: An Interactive System for Entity Structured Representation and Variant Generation.
Generated normalizers for Watson Discovery
[AAAI’2020] PARTNER: Human-in-the-Loop Entity Name Understanding with Deep Learning.
[EMNLP’2020] Learning Structured Representations of Entity Names using Active Learning and Weak Supervision.
“Bank of America N.A.” “Bank of America National Association”
Synthesizing Normalization and
Variant Generation Functions
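The “Bank of America N.A.” example can be sketched as follows: once a name is split into a structured representation (a core name plus a legal form), both normalization and variant generation become simple functions over that structure. This is a hand-written illustration; the systems cited above learn such structures interactively, and the `LEGAL_FORMS` table and the function names here are hypothetical.

```python
# Hypothetical lookup of legal-form abbreviations; a real system would learn these.
LEGAL_FORMS = {
    "N.A.": "National Association",
    "Inc.": "Incorporated",
    "Corp.": "Corporation",
}


def parse(name):
    """Split an entity name into a structured form: core name + legal form."""
    for abbr, full in LEGAL_FORMS.items():
        for form in (abbr, full):
            if name.endswith(" " + form):
                return {"core": name[: -len(form) - 1], "legal_form": abbr}
    return {"core": name, "legal_form": None}


def normalize(name):
    """Map every variant to one canonical string (core + expanded legal form)."""
    s = parse(name)
    if s["legal_form"] is None:
        return s["core"]
    return f'{s["core"]} {LEGAL_FORMS[s["legal_form"]]}'


def variants(name):
    """Generate surface variants from the structured representation."""
    s = parse(name)
    out = {s["core"]}
    if s["legal_form"] is not None:
        out.add(f'{s["core"]} {s["legal_form"]}')
        out.add(f'{s["core"]} {LEGAL_FORMS[s["legal_form"]]}')
    return out


short = "Bank of America N.A."
long_form = "Bank of America National Association"
print(normalize(short), variants(short))
```

Both surface forms from the slide normalize to the same canonical string, and either one can be used to generate the other.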
26. Case 3: Deep Document Understanding
Document Ingestion
[WACV 2021] Global Table Extractor (GTE): A Framework for Joint Table Identification and Cell Structure Recognition Using Visual Context.
[AAAI’21] KAAPA: Knowledge Aware Answers from PDF Analysis.
[ACL-CORD-19’21] CORD-19: The COVID-19 Open Research Dataset
Bringing IBM NLP capabilities to the CORD-19 Dataset. http://ibm.biz/CORD19-IBM
IBM Watson Discovery
JSON/HTML
Wide Variety in PDF Tables
- Table with graphic lines
- Table with visual clues only
- Complex table with multi-row/column headers
- Table interleaved with text and charts
27. Case 3: TableLab
TableLab: Easy Customization via Adaptive Deep Learning
[IUI’2021] TableLab: An Interactive Table Extraction System with Adaptive Deep Learning.
29. Case 3: Customization via TableLab
Preliminary Results
Table Boundary Detection
Method               CEDAR  EDGAR  Invoices  Appraisals  Health Docs
GTE                  0.94   0.84   0.47      0.85        0.93
GTE with Retraining  0.96   0.91   0.92      0.96        0.98
Cell Adjacency Detection
Method               CEDAR  EDGAR  Invoices  Appraisals  Health Docs
GTE                  0.88   0.62   0.42      0.71        0.55
GTE with Retraining  0.90   0.82   0.68      0.90        0.77
Dataset
20 pages with tables per category: 10 for retraining, 10 for testing
Evaluation Metric
F1 metric for Table Boundary and Cell Adjacency as defined in [1]
[1] Göbel et al. “A Methodology for Evaluating Algorithms for Table Understanding in PDF Documents”. DocEng '12
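A cell-adjacency F1 can be sketched by turning each table into a set of neighbor relations and comparing predicted relations with gold ones. This is a simplified illustration in the spirit of [1]: it assumes rectangular grids of cell strings and ignores blank and spanning cells, which a full implementation of the metric would need to handle. The function names and toy grids are hypothetical.

```python
def adjacency_relations(grid):
    """Return the set of (cell, neighbor, direction) relations in a table grid.

    Simplification: every cell is a plain string; blank and spanning
    cells are not treated specially.
    """
    rels = set()
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if c + 1 < len(row):
                rels.add((cell, row[c + 1], "right"))
            if r + 1 < len(grid) and c < len(grid[r + 1]):
                rels.add((cell, grid[r + 1][c], "down"))
    return rels


def prf1(predicted, gold):
    """Precision, recall, and F1 over two sets of adjacency relations."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy tables (made up for illustration): one cell is recognized incorrectly.
gold_grid = [["a", "b"], ["c", "d"]]
pred_grid = [["a", "b"], ["c", "x"]]
scores = prf1(adjacency_relations(pred_grid), adjacency_relations(gold_grid))
print(scores)
```

One wrong cell breaks both relations it takes part in, so the metric penalizes structural errors more than a per-cell accuracy would.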
30. Case 4: Label Sleuth
An open-source no-code system for text annotation and building text classifiers
[EMNLP’2022] Label Sleuth: From Unlabeled Text to a Classifier in a Few Hours
https://www.label-sleuth.org
1. From task definition to working model in hours!
2. Extensible backend to integrate new model architectures or active learning techniques
31. Human-in-the-Loop Throughout the Entire Life Cycle
of KG construction, growth, and services
Data Labeling Development Deployment
Learner
Scale data labeling
raw data labeled data
IDE
Better IDE for model building
Learner
Human-machine co-creation
Learner
Curb data hunger with interactive learning
AutoML
Scale model building via AutoML
32. AutoAI for Text
AutoText
[AAAI’21] AutoText: An End-to-End AutoAI Framework for Text.
[NeurIPS 2022] AUTOMATA: Gradient Based Data Subset Selection for Compute-Efficient Hyper-parameter Tuning.
IBM Developer API https://developer.ibm.com/learningpaths/get-started-autoai-for-text-api
Example Use Case: Scale ML Product Model for Text Classification
>30% reduction in combined training and prediction time
Auto weight tuning & HPO
>10x speed-up in training at comparable quality
Auto classifier selection
34. Human-in-the-Loop Throughout the Entire Life Cycle
of KG construction, growth, and services
Data Labeling Development Deployment
Learner
Scale data labeling
raw data labeled data
IDE
Better IDE for model building
Learner
Human-machine co-creation
Learner
Curb data hunger with interactive learning
AutoML
Scale model building via AutoML
Query Log, Tickets
User feedback influences the entire life cycle
35. Quality Evaluation
The key requirements
1. Measure what matters for end users
2. Identify the root cause of failures
3. Track improvements in individual components as they evolve
- Example: “Who won the Paris” (Paris, France vs. Paris Masters)
36. Overall Evaluation Framework
A Human-in-the-Loop Process
Annotation Quality Metrics
Dataset Collection
Tooling and annotation guidelines for graders
Evaluation
Human in the loop to annotate/grade queries
Logs
Synthetic Queries
Knowledge Graph Metrics
End to End Metrics
Query Understanding Metrics
37. Visual Tooling of Metrics and Loss Buckets
- Example Errors:
- Entity Prediction Error: “Who won Paris” (Paris Masters/Paris–Roubaix)
- Missing Fact: “When is the oscars in 2026” (fact is not present because the date/location is not published yet)
- Unrecognized Entity in KG: “Who is princess noor horse”
Facilitate Opportunity Analysis
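The loss-bucket idea can be sketched as a tally of graded failures by root cause. This is a minimal illustration: the record layout (`query`, `correct`, `bucket`) and the bucket names are assumptions, the first three queries come from the example errors above, and the last one is made up to stand for a correctly answered query.

```python
from collections import Counter


def loss_buckets(graded):
    """Count failed queries per bucket and report each bucket's share of all queries."""
    total = len(graded)
    counts = Counter(g["bucket"] for g in graded if not g["correct"])
    return {bucket: (n, n / total) for bucket, n in counts.items()}


graded = [
    {"query": "Who won Paris", "correct": False, "bucket": "entity_prediction_error"},
    {"query": "When is the oscars in 2026", "correct": False, "bucket": "missing_fact"},
    {"query": "Who is princess noor horse", "correct": False, "bucket": "unrecognized_entity"},
    # Made-up record standing in for a query that was answered correctly.
    {"query": "placeholder correct query", "correct": True, "bucket": None},
]
buckets = loss_buckets(graded)
for bucket, (n, share) in buckets.items():
    print(f"{bucket}: {n} failures ({share:.0%} of graded queries)")
```

Sorting buckets by share gives a rough ranking of which component to improve first, which is the kind of opportunity analysis the slide refers to.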
39. ModelLens
Visual interactive tool for model improvement
[CSCW’19] ModelLens: An Interactive System to Support the Model Improvement Practices of Data Science Teams.
40. So how will EVERYTHING change with LLMs?
Many exciting challenges and opportunities
41. Thanks!
IBM (including interns):
• Shivakumar Vaithyanathan
• Lucian Popa
• Ron Fagin
• Sriram Raghavan
• Rajasekar Krishnamurthy
• Fred Reiss
• Laura Chiticariu
• Benny Kimelfeld
• Mauricio Hernández
• Eser Kandogan
• Huaiyu Zhu
• Kun Qian
• Dakuo Wang
• Maeda Hanafi
Many amazing collaborators and interns …
Apple (including interns):
• Ihab Ilyas
• Theodoros Rekatsinas
• Umar Farooq Minhas
• Ali Mousavi
• Jeffrey Pound
• Anil Pacaci
• Shihabur R. Chowdhury
• Hongyu Ren
• Jason Mohoney
• Kun Qian
• Yiwen Sun
• Yisi Sang
• Saloni Potdar
• … …
Universities:
• Azza Abouzied (NYU Abu Dhabi)
• H. V. Jagadish (U. of Michigan)
• Fei Xia (U. of Washington)
• Kevin Chen-Chuan Chang (UIUC)
• ChengXiang Zhai (UIUC)
• Domenico Lembo (Sapienza University of Rome)
• Dragomir R. Radev (Yale)
• Jonathan K. Kummerfeld (U. of Michigan)
• Walter S. Lasecki (U. of Michigan)
• Toby Li (U. of Notre Dame)
• Rishabh Iyer (UT Dallas)
• Eduard C. Dragut (Temple Univ.)
• … …
IBM (including interns), continued:
• Douglas Burdick
• Alan Akbik
• Nancy Wang
• Prithviraj Sen
• Marina Danilevsky
• Poornima Chozhiyath Raman
• Sudarshan Rangarajan
• Ramiya Venkatachalam
• Kiran Kate
• Eyal Shnarch
• Ishan Jindal
• Yiwei Yang
• Nikita Bhutani
• … ….