Temporal Relations in Queries of Electronic Patient Records. Our main scenario covers the patient identification and recruitment process for clinical trials. For this purpose, an extension of the EHR4CR workbench to support patient recruitment was created. This workbench covers the following requirements:
Need for built-in privacy protection.
Patient identification and recruitment tracking.
Availability at clinical sites in the form of a workbench with a user-friendly interface.
Each participating clinical site has its own installation only used locally.
Ability to generate queries with temporal relations and constraints for eligibility criteria to find candidate patients.
Our development is based on the fact that queries in EHRs often have a temporal component, but available user interfaces allow only the generation of simple queries with basic temporal relations. Time points and time intervals are therefore the main concepts that must be considered. Time points relate to instantaneous events (e.g. a single myocardial infarction), time intervals to situations lasting for a span of time (e.g. a drug therapy for 2 weeks). Intervals can be represented using time points by their lower and upper temporal boundaries: the start and the end. Temporal relations (e.g. before, after) can be expressed via additional anchors. The dates of these anchor events can be retrieved, and event dates relative to an anchor event can be calculated. EHR4CR decided to build its workbench upon a simple time-stamp database concept: each patient attribute was assigned a time-stamp corresponding to the time of the attribute’s occurrence. The processing of temporal intervals is necessary for EHR4CR because many inclusion/exclusion criteria involve complex temporal periods. A graphical interface that uses boxes for querying with temporal relations was therefore created. The idea is that the easiest way to specify temporal operators is a user interface based on the combination of boxes. Temporal operators based on Allen’s algebra were included. Expressions are displayed as graphic boxes and combined by operators. Events are specified and a temporal operator is selected from a predefined list.
Temporal relations in queries of EHR data for research
1. Temporal Relations in Queries
of Electronic Patient Records
Joint Semantic Group and CRI Group Seminar
UDUS, Duesseldorf, Germany
25.9.2016
W. Kuchinke
3. Participation in EU projects
● TRANSFoRm – Information model, GCP-compliance validation, privacy model
● p-medicine – legal framework, collaboration, interoperability, system validation
● EHR4CR – Requirement engineering, business model, evaluation of pilot installation
● BioMedBridges – LAT, CRIM, usage scenarios of data bridges, interoperability
● ECRIN-IA – data management
W. Kuchinke (2016)
4. EHR4CR Project
● 5 years (2011–2015) with a budget of about 16
million Euros
● 34 academic and private partners (10 pharma
companies)
● Developing adaptable, reusable and scalable
solutions (tools and services) for reusing data
from EHR systems for clinical research purposes
● The consortium also includes 11 hospital sites in
France, Germany, Poland, Switzerland and the
United Kingdom.
5. Data from Electronic Health Records
(EHR)
[Diagram: EHR data; clinical care (medical treatment); clinical research]
6. Patient identification and recruitment
● The main scenario: Patient identification and recruitment
● Extension of EHR4CR workbench to support the patient recruitment
process
● Need for built-in privacy protection
– No individual patient-level information is exchanged or leaves the hospital
– Only aggregated numbers are shared
– Additional data perturbation is used to avoid potential patient re-identification
● Patient identification and recruitment tracking tools have been made
available to the clinical sites in the form of the workbench
● Each participating clinical site has its own installation only used
locally
● Ability to generate queries for eligibility criteria to find candidate
patients
7. Using electronic health records for
clinical research
● The use of Electronic Health Records (EHRs) is
spreading, and EHR usage is extending to clinical
research
● Queries in EHRs often have a temporal
component
– For example: Find all patients who were discharged from the
hospital and admitted again within 2 weeks
● Currently available user interfaces allow only
simple queries with simple temporal relations
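The readmission example above can be sketched over a flat list of time-stamped events. This is only an illustration: the record layout, the sample data and the function name `readmitted_within` are assumptions and not part of the EHR4CR workbench.

```python
from datetime import date, timedelta

# Hypothetical time-stamped records: (patient_id, event, date).
records = [
    ("p1", "discharge", date(2016, 1, 10)),
    ("p1", "admission", date(2016, 1, 20)),
    ("p2", "discharge", date(2016, 2, 1)),
    ("p2", "admission", date(2016, 3, 15)),
]

def readmitted_within(records, max_gap=timedelta(weeks=2)):
    """Return ids of patients admitted again within max_gap after a discharge."""
    discharges, admissions = {}, {}
    for pid, event, day in records:
        if event == "discharge":
            discharges.setdefault(pid, []).append(day)
        elif event == "admission":
            admissions.setdefault(pid, []).append(day)
    hits = set()
    for pid, discharge_days in discharges.items():
        for d in discharge_days:
            for a in admissions.get(pid, []):
                # Assumption: the readmission must be strictly after the discharge.
                if timedelta(0) < a - d <= max_gap:
                    hits.add(pid)
    return sorted(hits)

print(readmitted_within(records))  # ['p1']
```

Even this simple criterion needs a pairwise comparison of two event types per patient, which is the kind of relation that simple query interfaces do not offer.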
8. EHR4CR’s use of electronic health records
● The EHR4CR project aims to build a platform that will
allow the use of data from hospital EHR systems
● This requires full compliance with the ethical,
regulatory and data protection policies and
requirements of each participating country
● The EHR4CR platform supports distributed querying to
assist in clinical trials feasibility assessment and
patient recruitment
● A workbench enables clinical researchers to obtain key
information about a patient’s health and healthcare
history before the patient arrives for the visit
9. EHR4CR technical framework
[Diagram: EHR legacy systems; interface layer (semantic transformation, patient ID management, mapping); datasource endpoint (DB, local application modules); service platform (services); user]
11. Temporal dimensions in data mining
● Clinical databases store large amounts of information about
patients and their medical conditions
● Data mining techniques can extract relationships and patterns
● The typical structure of medical data is a sequence of
observations of clinical parameters taken at different time points
● The temporal dimension of data is a fundamental variable that
must be taken into account in the mining process
● The classical framework of sequential pattern mining is limiting
– It focuses on the sequentiality of events, without extracting the exact time elapsing
between different events
– Time-annotated sequences (IAS) is a mining technique that solves this problem
● In IAS each transition between two events is annotated with a
typical transition time that is found in the data
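The annotation idea in the last bullet can be illustrated with a small sketch. It is not the mining algorithm itself: it only annotates each transition between consecutive events with a typical elapsed time. The median is used as the "typical" value, which is an assumption, as are the sample sequences and the function name.

```python
from collections import defaultdict
from statistics import median

# Hypothetical per-patient sequences of (event, day offset) observations.
sequences = [
    [("admission", 0), ("surgery", 2), ("discharge", 9)],
    [("admission", 0), ("surgery", 4), ("discharge", 10)],
    [("admission", 0), ("surgery", 3), ("discharge", 12)],
]

def annotate_transitions(sequences):
    """Map each consecutive event pair to the median elapsed time in days."""
    gaps = defaultdict(list)
    for seq in sequences:
        ordered = sorted(seq, key=lambda item: item[1])
        for (a, ta), (b, tb) in zip(ordered, ordered[1:]):
            gaps[(a, b)].append(tb - ta)
    return {pair: median(values) for pair, values in gaps.items()}

print(annotate_transitions(sequences))
# {('admission', 'surgery'): 3, ('surgery', 'discharge'): 7}
```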
12. Time points and time intervals
● The concepts of time point and time interval are of importance in
medical informatics
● They relate to instantaneous events (e.g. a single myocardial infarction)
or to situations lasting for a span of time (e.g. a drug therapy for 2
weeks)
● They are associated with clinical entities, such as symptoms, therapies, and
pathologies
● Example: a myocardial infarction can be an instantaneous event within
the overall clinical history of the patient, or an interval-based concept if
observed during a stay in the intensive care unit
● Time points are often adopted as the basic time entities
● Intervals are then represented by their upper and lower temporal
bounds (start and end time points)
● Most systems employed in medical informatics applications have used a
time-point-based approach, similar to McDermott’s approach
13. Relative and absolute time
● Absolute position on the time axis: e.g. the calendar date
4 November 2029
● This is a common approach adopted by data models underlying a
temporal clinical database
● Relative time references: e.g. several episodes of headache
during puberty
● Incorporation of purely relative time-oriented, interval-based
information within a standard temporal database is still difficult
● In modeling temporal relationships, Allen's interval algebra has
been widely used
● Temporal relationships include two main types: qualitative
(angina before headache) and quantitative (angina 2 hours after
headache)
14. Temporal baseline
● Temporal queries are used in many situations, from
clinical trial recruitment and clinical research to patient care
● Clinicians are always concerned about changes from
some baseline state
● Example: a blood pressure of 90/60, which is normal for a
25-year-old female, represents hypotension in a 65-year-old
male hypertensive patient whose blood pressure
during visits was 160/100
● In these scenarios, changes from the baseline determine
whether or not an intervention should be taken
15. Time-anchors to organise events
● Temporal relations (e.g. before, after) can be
expressed via anchors
● The dates of these anchor events can be
retrieved and event dates relative to this
anchor event can be calculated
– e.g. event 2 happened at least one year after event 1
– Example: find all women diagnosed with breast cancer
at an age between 60 and 70 years
● The year of the breast cancer diagnosis = “age anchor”
● Calculate “age anchor” minus “year of birth”; the result must be between 60 and 70
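The age-anchor calculation can be written out as a short sketch. The field names (`year_of_birth`, `breast_cancer_dx_year`) and the sample patients are hypothetical; only the rule "anchor year minus year of birth between 60 and 70" comes from the slide.

```python
# Hypothetical patient records; field names are illustrative only.
patients = [
    {"id": "p1", "sex": "female", "year_of_birth": 1950, "breast_cancer_dx_year": 2014},
    {"id": "p2", "sex": "female", "year_of_birth": 1970, "breast_cancer_dx_year": 2015},
    {"id": "p3", "sex": "female", "year_of_birth": 1940, "breast_cancer_dx_year": None},
]

def diagnosed_between_ages(patients, anchor_field, min_age, max_age):
    """Select patients whose age at the anchor event lies in [min_age, max_age]."""
    matches = []
    for p in patients:
        anchor_year = p.get(anchor_field)
        if anchor_year is None:
            continue  # no anchor event recorded for this patient
        age_at_anchor = anchor_year - p["year_of_birth"]
        if min_age <= age_at_anchor <= max_age:
            matches.append(p["id"])
    return matches

women = [p for p in patients if p["sex"] == "female"]
print(diagnosed_between_ages(women, "breast_cancer_dx_year", 60, 70))  # ['p1']
```

Working with years only gives an approximate age; a real implementation would probably compare full dates.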
16. Necessity to implement temporal
relations
● EHR4CR approach for temporal relations in queries
● First step: specification of requirements for the patient
identification and recruitment project scenario
● Two different approaches to deal with temporal relations in
queries were identified
– Creation of a sophisticated internal representation of time in the
database, with a temporal query language
– Use of a simple, time-stamped database with a temporal query language
● Because EHR4CR uses data collected from EHRs based on
standard databases the second approach was followed
● The possibilities of a query language to define temporal
relations depend heavily on the temporal data model
17. Analysis of eligibility criteria: 40% with
temporal relations
● We analysed eligibility criteria of clinical trials in a random sample of
studies from ClinicalTrials.gov
● These eligibility criteria are the basis for querying patients for eligibility
● Almost 40% of the eligibility criteria contained temporal relations
– In 1/3 of them the timing of clinical assessments or interventions was well defined
– But in 2/3 of them timing was not precisely defined
● For query generation two conclusions were drawn
– Specification of temporal relations must be possible
– Need for flexibility in defining temporal relations
● Many query generators do not support temporal
relations
● Several query generators capable of temporal relations were evaluated
for use in the EHR4CR workbench
18. Conclusions for EHR4CR
● Iterative operations on the data are often difficult
● User interfaces for generating queries are often not
user-friendly
● The introduction of anchors allows adequate representation of
temporal operators
● A more straightforward implementation of temporal operators can
be achieved via an intuitive user interface
● It should be possible to specify temporal sequences between any
number of queries
● Need for modern, user-friendly interfaces for the generation of
queries
● Limiting queries to clinical trials only (not EHR) is too restrictive
● Limitations on binary operators exist
19. Relations of temporal intervals
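Relation    Meaning (time intervals X and Y)
before      X ends before Y starts
meets       X ends exactly when Y starts
overlaps    X starts before Y starts and ends during Y
during      X lies entirely within Y
starts      X and Y start together, X ends first
finishes    X and Y end together, X starts later
equals      X and Y start and end together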
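The seven relations listed on this slide can be checked mechanically once intervals are stored as start and end points. The following is a minimal sketch, assuming proper intervals (start < end) on a numeric time axis; the `Interval` type and the function name are illustrative, and the six inverse relations of Allen's algebra are lumped together as "other".

```python
from typing import NamedTuple

class Interval(NamedTuple):
    start: float
    end: float

def allen_relation(x: Interval, y: Interval) -> str:
    """Name the basic Allen relation of x to y ("other" if none of the seven holds)."""
    if x.end < y.start:
        return "before"
    if x.end == y.start:
        return "meets"
    if x.start == y.start and x.end == y.end:
        return "equals"
    if x.start == y.start and x.end < y.end:
        return "starts"
    if x.end == y.end and x.start > y.start:
        return "finishes"
    if x.start > y.start and x.end < y.end:
        return "during"
    if x.start < y.start < x.end < y.end:
        return "overlaps"
    return "other"  # one of the six inverse relations (e.g. after, met-by)

print(allen_relation(Interval(1, 5), Interval(3, 8)))  # overlaps
```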
20. Example 1: for temporal relation
Event = coughing
Event point = start
Relation = before
Direction is minus
Anchor = admission
Anchor point = start
21. Example 2: for temporal relation
On April 10 we did a nasopharyngeal speculum which showed nasopharyngeal ulcer
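[Annotation diagram: “April 10” is tagged as Time: Date, “nasopharyngeal speculum” as Event: Test, and “nasopharyngeal ulcer” as Event: Problem; the links between them are labelled with the relations before and overlap]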
22. Time-stamps for EHR4CR
● The progress of a disease is normally stored in databases with time-stamps
● But the vast majority of data tend to be recorded with only a single time
stamp
● Time period (interval) data can be derived if two time-stamps are
available for an event’s start and end time
● EHR4CR decided to build upon a simple time-stamp database
– Each patient attribute was assigned a time-stamp corresponding to the time of the
attribute’s occurrence
– Many temporal operations can be performed on this form of data
● But temporal intervals cannot be derived from such a simple schema
without the use of application-level processing
● The processing of temporal intervals is necessary for EHR4CR since many
questions dealing with inclusion / exclusion criteria involve complex
temporal periods
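The application-level step mentioned above, deriving intervals from pairs of time-stamps, might look like the following sketch. The row layout with an explicit "start"/"end" event point is an assumption, and the sketch handles only one episode per patient and attribute.

```python
from datetime import date

# Hypothetical time-stamped rows: (patient_id, attribute, event point, date).
rows = [
    ("p1", "chemotherapy", "start", date(2015, 3, 1)),
    ("p1", "chemotherapy", "end", date(2015, 6, 30)),
    ("p1", "creatinine", "start", date(2015, 4, 2)),  # single time-stamp only
]

def derive_intervals(rows):
    """Pair start/end time-stamps per (patient, attribute) into intervals."""
    points = {}
    for pid, attribute, point, day in rows:
        points.setdefault((pid, attribute), {})[point] = day
    intervals = {}
    for key, marks in points.items():
        # Only attributes with both time-stamps yield an interval.
        if "start" in marks and "end" in marks and marks["start"] <= marks["end"]:
            intervals[key] = (marks["start"], marks["end"])
    return intervals

print(derive_intervals(rows))
```

Attributes recorded with a single time-stamp, such as the laboratory value here, stay time points and produce no interval.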
24. Electronic Health Records and temporal
patterns in patient histories
● Queries for patient eligibility often have a temporal component
– For example: Find all patients who were discharged from the emergency department and
then admitted again within one week
– Find all patients who had a normal serum creatinine test less than 2 days before a
radiology test with intravenous contrast, followed by an increase in serum
creatinine by more than 50% and of more than 1.0 mg/dl within 5 days after the
contrast administration
● Currently available user interfaces for query generation allow only
simple queries such as “Find patients who had a radiology test
with contrast and a high value of creatinine”
● Specifying temporal queries in SQL is difficult even for specialists
● Database researchers have made progress in representing
temporal abstractions and executing complex temporal queries
25. Proposal for a graphical interface
● The easiest way to specify temporal operators is an
adequate user interface based on the combination of boxes
● This makes restrictions on the operators (such as anchors or
binary operators) superfluous
● Temporal operators based on Allen’s algebra are specified
● Expressions are displayed as graphic boxes and combined by
operators
● Events are specified and a temporal operator selected from a
predefined list
● Events and the temporal operator are bound graphically in a box
● Extension is possible by adding another event with another time
operator or by including bounded temporal information in a
Boolean expression
26. Temporal Expressiveness in Querying a
Time-stamp
● The temporal representation of most health care databases
consists of time-stamped events, such as time of admission, time
a laboratory specimen was received, time a laboratory result was
reported, and time medication was dispensed
● Far rarer are databases in which intervals are represented,
such as the start and end of the administration of a medication
● TimeLine SQL and the Chronus System support interval semantics
● These systems provide a framework for both the storage and the
manipulation of time-based clinical data, which includes
specification of intervals of time
● This approach calls for the addition of a temporal dimension at
the level of the database tuple, with an associated temporal
extension to SQL to allow queries of complex temporal features
27. Discover suitable patients for clinical
trials
● The biggest bottleneck in clinical trials is slow and costly patient
recruitment
● The time dedicated to patient recruitment represents about 30 %
of the total length of clinical trials
● The project EHR4CR aimed to improve the design of clinical trials
by developing a platform that provides access to existing patient
electronic health record systems (EHRs)
● Suitable patients must have the particular condition that is under
investigation
● Finding a group that fits all criteria can be time-consuming and
expensive
● EHR4CR allows researchers to search medical records in hospitals
across Europe to discover potentially suitable patients
28. Requirements for temporal relations in
EHR
● Temporal filters for events (event-anchors) should be possible
● Identification of the earliest, most recent, any event or all events
● Age-anchor should exist (e.g. diagnosis between 60 and 70
years)
● Handling of temporal relations between events (time-instant
based data) or time-interval based data should be possible
● The following operators are essential: before, after, within x prior to,
within y following, equal
● No substantial restriction on the definition of Boolean
combinations within one inclusion/exclusion criterion
● Combination of single events or groups of combined events
should be possible using a time operator
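The operators and temporal filters named in these requirements can be sketched for time-point data as follows. The function names are illustrative, and treating the anchor date itself as inside a "within" window is an assumption, not something the requirements state.

```python
from datetime import date, timedelta

def before(event, anchor):
    return event < anchor

def after(event, anchor):
    return event > anchor

def equal(event, anchor):
    return event == anchor

def within_prior_to(event, anchor, span):
    """True if the event lies at most `span` before the anchor (anchor date included)."""
    return timedelta(0) <= anchor - event <= span

def within_following(event, anchor, span):
    """True if the event lies at most `span` after the anchor (anchor date included)."""
    return timedelta(0) <= event - anchor <= span

def select(dates, which):
    """Temporal filter: 'earliest', 'latest' (most recent) or 'all' events."""
    ordered = sorted(dates)
    if not ordered or which == "all":
        return ordered
    if which == "earliest":
        return ordered[:1]
    if which == "latest":
        return ordered[-1:]
    raise ValueError(f"unknown filter: {which}")

admission = date(2016, 1, 20)
lab_dates = [date(2016, 1, 10), date(2015, 11, 2)]
recent = select(lab_dates, "latest")[0]
print(within_prior_to(recent, admission, timedelta(days=14)))  # True
```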
29. Using boxes to create queries with
temporal relations
[Diagram: Box 1 (Event 1, with time span or time point) is linked by a relation to Box 2 (Event 2, with time span or time point)]
30. Graphical representation of a query
[Screenshot: Boolean combination of inclusion/exclusion criteria. Elements shown: Boolean Group 3; Lab Values (Creatinin >= 1 mg/dl); Diagnosis; Sex (female); radiotherapy, chemotherapy; operators AND, OR, ¬; “Add sub group”]
[Diagram: Box 1 (inclusion / exclusion criterion 1) is joined by a connector to Box 2 (inclusion / exclusion criterion 2)]
31. Graphical representation of a query
[Screenshot: Boolean combination with temporal filter and temporal relations. Temporal Group 1 contains EG 1 (Malignant of...), EG 2 (radiotherapy OR chemotherapy) and EG 3 (HBA1c <= 6 mg/dl); relations shown: follows WITHIN 1 year, follows BY 5 years; temporal filters: any, earliest, latest; “Add follow up”]
[Diagram: Box 1 (inclusion / exclusion criterion 1, with temporal condition) is linked by a temporal relation to Box 2 (inclusion / exclusion criterion 2, with temporal condition)]
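One way to read the box model is as a small data structure: each box holds an event with a temporal filter, and a temporal relation such as "follows WITHIN 1 year" connects two boxes. The sketch below is an interpretation under these assumptions; the `Box` class, the `follows_within` function and the sample history are hypothetical and do not reproduce the EHR4CR implementation.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Box:
    """One box: an event plus a temporal filter on its occurrences."""
    event: str
    pick: str = "any"  # "earliest", "latest" or "any"

    def dates(self, history):
        found = sorted(history.get(self.event, []))
        if not found or self.pick == "any":
            return found
        return [found[0]] if self.pick == "earliest" else [found[-1]]

def follows_within(first: Box, second: Box, span: timedelta, history) -> bool:
    """True if a selected `second` date follows a selected `first` date within span."""
    return any(
        timedelta(0) < b - a <= span
        for a in first.dates(history)
        for b in second.dates(history)
    )

# Hypothetical event history of a single patient.
history = {
    "diagnosis": [date(2014, 5, 1)],
    "radiotherapy": [date(2014, 9, 15), date(2016, 1, 10)],
}
dx = Box("diagnosis", pick="earliest")
rt = Box("radiotherapy", pick="any")
print(follows_within(dx, rt, timedelta(days=365), history))  # True
```

Changing the filter of the second box to "latest" makes the same relation fail for this patient, which shows why the temporal filter and the temporal relation have to be specified together.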
33. Use of Ontologies
● One of the most challenging tasks for EHR systems is to
achieve semantic interoperability
● Interoperable EHRs can be achieved by using an ontology-based
approach to facilitate the exchange of information
● Ability to automatically sort data items, e.g. diagnoses
based on properties such as anatomical location, has
inspired developments in clinical vocabularies,
specifically the SNOMED Clinical Terms (SNOMED CT)
● Important: ontologies should be used already during data
collection
34. Ontology involvement already during
data capture
[Diagram elements: EHR with ontology 1 (data input from patient); EDC with ontology 2 (data input from trial participants); data warehouse]
35. Example: Time Ontology
• OWL-Time is an ontology of temporal concepts
• It provides a vocabulary for expressing facts about topological
relations among instants and intervals, together with
information about durations, and temporal expressions like
date-time information
• https://www.w3.org/TR/owl-time/
• Relations between intervals are based on Allen's analysis
• Thirteen elementary relations
36. Ontology of temporal terms
Simplified from: Tao C, et al. CNTRO 2.0. Jt Summits Transl Sci Proc.
37. Example: CNTRO Ontology for temporal
relations
• CNTRO describes the temporal relations of clinical events
• Temporal relations are between two events, or an event and a
time
• Complication: In many clinical narratives, the relation
between two events is described without time stamps
– Example: The patient’s Melantoin is elevated after the
second cycle of chemotherapy
• The Event class connects to the time-associated classes by an
object property called hasTimeStamp
• It is possible to infer temporal relations between two events
based on their associated temporal information
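When events carry no time stamps, relations can still be derived from relations that are already known. The sketch below illustrates this with the transitivity of "before"; it is a generic illustration and not the reasoning mechanism of CNTRO, and the event names are hypothetical.

```python
def infer_before(pairs):
    """Transitive closure of a set of (earlier, later) event pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # a before b and b before d implies a before d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# Relations stated in a narrative without time stamps (hypothetical).
known = {
    ("first chemotherapy cycle", "second chemotherapy cycle"),
    ("second chemotherapy cycle", "elevated lab value"),
}
inferred = infer_before(known) - known
print(inferred)  # {('first chemotherapy cycle', 'elevated lab value')}
```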