Abstract: Artificial Intelligence is an important topic in the fight against cancer. Clinical trials are at the frontier of innovation. I will discuss techniques, data sets and platforms we use at Deep 6 to bring patients to clinical trials. The focus will be on practical, repeatable methods I've developed at MySpace, Greenplum, UCLA and the US Intelligence Community.
A Practical Use of Artificial Intelligence in the Fight Against Cancer by Brian Dolan
1. PRACTICAL
ANALYSIS IN THE
FIGHT AGAINST
CANCER:
Advice for data scientists
deep6.ai
Brian Dolan, Chief Scientist + Co-founder
2. WHAT WE’LL TALK ABOUT
• Deep 6 AI and the fight against cancer
• Why graphs for massive data?
• Applications of graphs
• Guidance on NLP
• Sage-like wisdom
• Final thoughts / Q+A
4. “DEEP 6 AI IS A
GAME-CHANGER.”
CUSTOMER TESTIMONIAL
WEBINAR
TOP 100 MOST DISRUPTIVE
COMPANIES IN THE WORLD
Deep 6 AI applies AI and NLP to clinical
data to find patients for clinical trials in
minutes, not months.
ALUM OF TECHSTARS, STARTX,
AND HEALTHBOX ACCELERATORS
WIN AT SXSW 2017 ACCELERATOR:
ENTERPRISE + SMART DATA
5. INNOVATION HAPPENS IN CLINICAL TRIALS
… BUT TOO FEW PEOPLE PARTICIPATE
3.7M PATIENT
SHORTFALL
5.9M
goal
2015
SOURCES: clinicaltrials.gov, CISCRP
(2017: 6.7M)
2.2M
trial participants
11. WHY GRAPHS?
> Not mole
> Not mole
> Not mole
! Stage IV mole attack
Hidden correlation structures make a huge
difference in mole attacks
The field of Algebraic Graph Theory is quite well
developed and offers a lot of machinery for analysis
12. SOME GRAPH ANALYTICS
• Basic descriptions include:
• Connectedness: can you go from any node to another node?
• Degree of a vertex
• Transitivity
• Betweenness
• Community detection is a variety of methods to find
dense sub-networks
• Read “Statistical Analysis of Network Data” by Kolaczyk
(supplement by Csardi is really good, too)
• Strong body of Algebraic Graph Theory (next!)
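A few of the basic descriptions above can be sketched in plain Python. This is a minimal illustration, not the tooling used at Deep 6: the toy graph (a triangle with one extra node hanging off it) and all function names are assumptions made for the example, and a library such as igraph would normally do this work.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected toy graph (not from the talk): a triangle a-b-c plus a tail c-d.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]

def build_adjacency(edge_list):
    """Map every node to the set of its neighbours."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degrees(adj):
    """Degree of a vertex = number of neighbours."""
    return {node: len(nbrs) for node, nbrs in adj.items()}

def is_connected(adj):
    """Connectedness: breadth-first search from one node must reach all nodes."""
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append(nbr)
    return len(seen) == len(adj)

def transitivity(adj):
    """Fraction of connected triples (u - node - v) that are closed by an edge u - v."""
    closed = 0
    triples = 0
    for nbrs in adj.values():
        for u, v in combinations(sorted(nbrs), 2):
            triples += 1
            if v in adj[u]:
                closed += 1
    return closed / triples if triples else 0.0

adj = build_adjacency(EDGES)
print(degrees(adj))
print(is_connected(adj))
print(transitivity(adj))
```

Betweenness and community detection need more machinery (shortest-path counting, modularity or spectral methods) and are left to a graph library.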
13. ADJACENCY MATRICES
[Figure: directed graph on nodes A, B, C, D, E, F, G and its adjacency matrix]
    A B C D E F G
A   0 1 0 0 1 0 0
B   0 0 0 0 0 0 0
C   0 0 0 0 0 0 0
D   0 0 0 0 0 0 0
E   0 1 0 0 0 0 0
F   0 0 0 0 0 0 1
G   0 0 0 1 0 0 0
# {Length n Paths}
Directed edge = Asymmetric matrix
Transformation of Graph by Graph
• Just like that, we have turned an arbitrary collection of objects into a
Linear Algebra problem
• Any PCA, Spectral Decomp you do will translate into the Edge space
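As a small worked example of the "# {Length n Paths}" idea: entry (i, j) of the n-th power of an adjacency matrix counts the walks of length n from node i to node j. The sketch below uses the 7 x 7 matrix from this slide, reading rows as source nodes and columns as targets (an assumption about the slide's convention); the helper names are illustrative only.

```python
# Adjacency matrix copied from the slide; row = source node, column = target node.
NODES = ["A", "B", "C", "D", "E", "F", "G"]
ADJACENCY = [
    [0, 1, 0, 0, 1, 0, 0],  # A -> B, A -> E
    [0, 0, 0, 0, 0, 0, 0],  # B
    [0, 0, 0, 0, 0, 0, 0],  # C
    [0, 0, 0, 0, 0, 0, 0],  # D
    [0, 1, 0, 0, 0, 0, 0],  # E -> B
    [0, 0, 0, 0, 0, 0, 1],  # F -> G
    [0, 0, 0, 1, 0, 0, 0],  # G -> D
]

def matmul(left, right):
    """Plain matrix product of two lists of rows."""
    return [[sum(left[i][k] * right[k][j] for k in range(len(right)))
             for j in range(len(right[0]))]
            for i in range(len(left))]

def matrix_power(matrix, exponent):
    """Multiply the matrix by itself `exponent` times, starting from the identity."""
    size = len(matrix)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(exponent):
        result = matmul(result, matrix)
    return result

def nonzero_entries(matrix):
    """Return {(source, target): count} for every nonzero entry."""
    return {(NODES[i], NODES[j]): value
            for i, row in enumerate(matrix)
            for j, value in enumerate(row) if value}

two_step = nonzero_entries(matrix_power(ADJACENCY, 2))
three_step = nonzero_entries(matrix_power(ADJACENCY, 3))
print(two_step)    # walks of length 2: A -> E -> B and F -> G -> D
print(three_step)  # empty: this graph has no walk of length 3
```

The matrix is not equal to its transpose, which is the "Directed edge = Asymmetric matrix" point on the slide.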
14. BIGRAPH: PATIENTS / SYMPTOM
PAIRS
Assume X is n x m.
[Figure: bigraph with nodes A, B, C, D, E and W, X, Y, Z]
• X is a rectangular matrix (rows are Patients, columns are Symptoms); in our
case, very tall
• X’X counts the Patients that each pair of Symptoms has in common
and corresponds to a Symptom Graph
• X X’ counts the Symptoms that each pair of Patients has in common
and corresponds to a Patient Graph
• Both matrices are square
• Both can be analyzed as directed graphs
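The two products can be checked on a tiny example. This is a sketch under stated assumptions: X is a hypothetical 4 x 3 binary matrix (4 patients, 3 symptoms, entry 1 when the patient has the symptom), not data from Deep 6, and the helper names are made up for the illustration.

```python
# Hypothetical patient-by-symptom incidence matrix: rows p1..p4, columns s1..s3.
X = [
    [1, 1, 0],  # p1 has s1, s2
    [1, 0, 0],  # p2 has s1
    [1, 1, 1],  # p3 has s1, s2, s3
    [0, 0, 1],  # p4 has s3
]

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

def matmul(left, right):
    """Plain matrix product of two lists of rows."""
    return [[sum(left[i][k] * right[k][j] for k in range(len(right)))
             for j in range(len(right[0]))]
            for i in range(len(left))]

# X'X is 3 x 3: entry (i, j) = number of patients showing both symptom i and symptom j.
symptom_graph = matmul(transpose(X), X)

# X X' is 4 x 4: entry (i, j) = number of symptoms shared by patient i and patient j.
patient_graph = matmul(X, transpose(X))

for row in symptom_graph:
    print(row)
for row in patient_graph:
    print(row)
```

The diagonals hold the per-symptom patient counts and per-patient symptom counts. Both products equal their own transpose, so the edge weights are the same in both directions.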
15. EXAMPLE: SUPERVISED LEARNING IN
GRAPHS
Discoloration (mole)
Mole
(rodent)
Lyme
Disease
Malignant
neoplasms
Neoplasms of
the lung
Neoplasms of
the skin
Malignancies
Disease
vector
Lexically related
Semantically related
+ User labels - User labels
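One way to read this picture is as label propagation: a user marks a few nodes as positive or negative, and the scores spread along the edges to the unlabeled concepts. The sketch below is an assumption about the general idea, not the method used at Deep 6. The node names come from the slide, but the edge list and the choice of which two nodes carry user labels are hypothetical.

```python
# Hypothetical edges between the concepts on the slide (lexical or semantic links).
EDGES = [
    ("Discoloration (mole)", "Mole (rodent)"),          # lexically related
    ("Discoloration (mole)", "Neoplasms of the skin"),  # semantically related
    ("Neoplasms of the skin", "Malignant neoplasms"),
    ("Neoplasms of the lung", "Malignant neoplasms"),
    ("Malignant neoplasms", "Malignancies"),
    ("Mole (rodent)", "Disease vector"),
    ("Disease vector", "Lyme Disease"),
]

# Hypothetical user labels: +1 = relevant to the query, -1 = not relevant.
SEED_LABELS = {"Malignant neoplasms": 1.0, "Disease vector": -1.0}

def neighbors(edge_list):
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def propagate(edge_list, seeds, iterations=200):
    """Repeatedly set each unlabeled node to the mean score of its neighbours."""
    adj = neighbors(edge_list)
    scores = {node: seeds.get(node, 0.0) for node in adj}
    for _ in range(iterations):
        updated = {}
        for node, nbrs in adj.items():
            if node in seeds:
                updated[node] = seeds[node]  # user labels stay fixed
            else:
                updated[node] = sum(scores[n] for n in nbrs) / len(nbrs)
        scores = updated
    return scores

scores = propagate(EDGES, SEED_LABELS)
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{score:+.2f}  {node}")
```

In this toy setup the skin neoplasm node ends up positive, the rodent node negative, and the ambiguous "Discoloration (mole)" node sits in between because the lexical link pulls it toward the rodent sense.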
17. DECIDE THE DOMAIN
• Analyze X’X to find patterns in Symptoms.
• Unlike methods like k-means, you are operating on the
relationships between the objects, not the objects themselves
• By default, everything you do is context-sensitive.
“This thing makes more or less sense in the presence of that thing”
• That is semantic analysis at a primitive, but extremely practical
and effective level.
19. PITFALLS OF NLP IN PRACTICE
“D. tested negative for the following: sepsis,
secondary infection, metastatic nodules.”
Negations are VERY hard and the subject of active
research. They are ubiquitous in non-trivial domains, i.e., not
Twitter or movie reviews
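The quoted sentence shows why this matters. A naive keyword match reports all three conditions as present, although the note says the opposite. The second function below is a deliberately crude cue-based check, included only to illustrate the problem; the cue list and function names are assumptions for this sketch, and a rule this simple breaks quickly on real clinical text.

```python
NOTE = ("D. tested negative for the following: sepsis, "
        "secondary infection, metastatic nodules.")
CONDITIONS = ["sepsis", "secondary infection", "metastatic nodules"]

# Hypothetical, incomplete list of negation cues.
NEGATION_CUES = ("tested negative for", "no evidence of", "denies")

def naive_mentions(text, terms):
    """Keyword match: every term that appears in the text counts as a finding."""
    lowered = text.lower()
    return [term for term in terms if term in lowered]

def negated_mentions(text, terms, cues=NEGATION_CUES):
    """Crude rule: a term is negated if any cue appears somewhere before it."""
    lowered = text.lower()
    negated = []
    for term in terms:
        position = lowered.find(term)
        if position == -1:
            continue
        if any(cue in lowered[:position] for cue in cues):
            negated.append(term)
    return negated

found = naive_mentions(NOTE, CONDITIONS)
negated = negated_mentions(NOTE, CONDITIONS)
affirmed = [term for term in found if term not in negated]
print(found)     # all three conditions "found" by keyword match
print(affirmed)  # none of them is actually affirmed
```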
tf/idf rewards the wrong things, ignores contextual
cues and has few theoretical underpinnings.
Latent Dirichlet Allocation assumes topics can be
expressed as permutations of tokens.
Because there will always be domains of knowledge, there will always be domains in NLP.
And it follows that there will always be some degree of feature engineering. In humans, this
is analogous to “college.”
20. BUT BRIAN, WHAT ABOUT DEEP LEARNING?
• Pretty cool results in limited domains
• Almost certainly require more data than
you have in your domain
• Long Short-Term Memory (LSTM) assumes you
want to predict the “next token” or mimic a
series of tokens
• The corpus needs to provide similar context
with different tokens A LOT of times
• There are always relationships that appear
to be errors, but actually occur in the data
• Violates own promise of “no feature
engineering”
21. WISDOM OF THE ANCIENT
• Indexing data is not analyzing data
• Storing data is just kicking the can to the next guy
• We must try to be smarter, better and more relevant to the world
• Let’s generate universal truths if we can
• Don’t ask your software
what analyses you should
do
• Learn the math from first
principles
• Take time to align the
methods to the problem,
don’t rely on mental
furniture
Don’t ask your barber if you
need a haircut
The map is not the terrain
22. THINGS HOLDING MY INTEREST NOW
igraph
Politics family ice hockey robots
coffee tacos PDEs Irish music
management theory VCs
vacation with my wife naps,
solar energy health care for all
marine ecosystems Blender
biking Markov Chains Americana
Sleeping in a Wigwam! AYSO
sales gun control…
THE FUTURE IS MY
RESPONSIBILITY
Why Graphs?
Very well studied mathematically, making a comeback with modern computing
Diseases are expressed as clusters or constellations of symptoms
The feature space of symptoms shifts over time
Systems of relationship define the status of an illness, not just the symptoms
You can describe a graph with n nodes as an n x n matrix with the edge strengths as the entries
You can take a matrix X and make it a graph G
Because of this, you can multiply a Graph with another Graph
And your favorite Markov Chain is also a graph
Directed graphs, including bigraphs, have asymmetric matrices*
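To make the Markov Chain remark concrete: a transition matrix is the weighted adjacency matrix of a directed graph, and multiplying it by itself gives the two-step transition probabilities. The three states and their probabilities below are hypothetical and chosen only for the illustration.

```python
# Hypothetical 3-state chain; row = current state, column = next state.
STATES = ["healthy", "symptomatic", "diagnosed"]
TRANSITIONS = [
    [0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5],
    [0.0, 0.0, 1.0],
]

def matmul(left, right):
    """Plain matrix product of two lists of rows."""
    return [[sum(left[i][k] * right[k][j] for k in range(len(right)))
             for j in range(len(right[0]))]
            for i in range(len(left))]

# The chain read as a graph: one weighted, directed edge per nonzero probability.
edges = [(STATES[i], STATES[j], p)
         for i, row in enumerate(TRANSITIONS)
         for j, p in enumerate(row) if p > 0]

# Graph times graph: entry (i, j) is the probability of going from i to j in two steps.
two_step = matmul(TRANSITIONS, TRANSITIONS)

for edge in edges:
    print(edge)
for row in two_step:
    print(row)
```

Each row of the product still sums to 1, so the result is again a Markov Chain and again a directed graph with an asymmetric matrix.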
Semantic Analysis
Term co-opted from linguists by computer scientists
Now generally means “understanding context of data points”
Think of going from a graph with no edges to a graph with edges
Deep Learning Techniques
Pretty cool results
Almost certainly require more data than you have in your domain
Not bad, but: cool results on some domains, and requires a lot of data
Let’s be realistic about how much “Science” we are doing.
Science has always been about data, hypothesis testing and peer review
Many people in that role now are simply throwing pre-packaged routines against data, and they haven’t checked the assumptions of the models
That job title is going to be obviated by better software packages
We must try to be smarter, better and more relevant to the world
Let’s generate universal truths if we can