SlideShare a Scribd company logo
1 of 41
Using Cheminformatics Approaches to
Develop a Structure Searchable
Database of Analytical Methods
Antony Williams, Greg Janesch, Sakuntala
Sivasupramaniam, Brian Meyer and Erik Carr
Center for Computational Toxicology & Exposure, U.S. Environmental Protection Agency
http://www.orcid.org/0000-0002-2668-4821
May 2023: American Oil Chemists Society meeting, Denver, CO.
The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA
Disclaimer
• The views expressed in this presentation are those of the
author and do not necessarily reflect the views or policies of
the U.S. EPA
• This presentation is on a proof-of-concept tool in development
– NOT yet publicly available
1
Building a Methods Database
• Simple Vision: I want to find the best method(s) associated with a
chemical and/or class of chemicals
• Answer the question “I cannot find a method for my chemical” - HELP
• The Approach:
– Aggregate MS method documents (and adjust the definition of “what is a useful method”)
– Extract chemistry (mostly CASRN and Names)
– Map CASRN and Names to structures
– Deliver a proof-of-concept application to search a database by names, CASRNs, InChIKeys
and ultimately structure
2
Most people start with Google
• People have many places to search for methods, and
there is no one integration hub, except for search engines
• Search engines can return so many hits – then you filter
by analytes, matrix, analytical methodology, so many
synonyms and abbreviations for so many chemicals
3
Synonyms, Abbreviations and Chemicals
4
• CAS Numbers, Names
and Abbreviations can
limit what’s possible…
Might this be a better view?
5
Might this be a better view?
6
When methods are mapped to chemistry…
• The advantages of mapping chemicals directly to methods
– When chemicals are mapped it opens access to many other tools
– Chemical structures allow for QSAR modeling
7
…and what if we could then profile toxicity?
8
…and what if we could then profile toxicity?
9
…or simply harvest data from the
CompTox Chemicals Dashboard
10
What data would you like???
11
Introducing AMOS
Analytical Methods and Spectra Database
• Three types of data in the database:
– Methods (regulatory, lab manuals and SOPs, publications, tech notes)
– Spectra (from public domain and our own laboratories)
– Monographs (harvested from SWGDRUG and other sites)
• Some methods have associated spectra
• Some data are just externally linked
• Currently contains around 135,000 spectra, 600,000
external links, 650 monographs, and ~2000 methods
• ALL data are growing in number
12
Where are there methods?
• Agency-based methods
– EPA
– USGS
– USDA
– CDC
– FDA
• Vendor application notes – Thermo, Waters, Agilent, Sciex,
Shimadzu, LECO, ….
• Peer-reviewed articles
• Laboratory Documents – lab manuals, SOPs
13
A view of the methods list
14
Filtering the list for interests…
• Look for pesticides studied in water, by GC/MS, after 1990
15
Where are there methods?
• 900 method documents from the EPA harvested
16
Many Scanned Documents!!!
17
• Methods generally developed by
the agrochemical companies
• Include parents plus degradation
products
• Lots of scanned, old, documents
but the historical records are still
of significant use
• Electronic document forms of old
documents still of benefit
Embedding the old Method PDFs
18
Embedding New Method PDFs
19
When Methods are OPEN Access
20
When Methods are PubMed OPEN Access
21
Proprietary Methods for INTERNAL Access
22
If there is no method for your chemical
• Use “Chemical Similarity Searching” so that you can find
chemicals that are similar in structure space
• Use the “Tanimoto Similarity Search
23
Searching for a chemical – CASRN, Name
Direct structure searching coming
24
When Methods are Not Enough
• EPA is highly active in the field of non-targeted analysis
• We have been applying lots of cheminformatics approaches
25
Building a spectrum library to search against
26
Linking to actual spectra
27
Linking to actual spectra
28
There are errors EVERYWHERE: 110-75-8
It can be challenging
9/365 chemicals…
There are MANY errors in public
spectral databases
31
There are MANY errors in public
spectral databases
32
There are MANY errors in public
spectral databases
33
There are MANY errors in public
spectral databases
34
• Chemical structure representations would ideally
be standardized…consider tautomeric forms
• Not all substances are explicit and can be
ambiguous representations
My first time at AOCS…what did I learn?
• Some of your chemicals of interest: TAGs, PAHs, Toxins
of different types
• Methods for PAHs, Aflatoxins and Microsystins, and tri-
acyl glycerols all extracted and added to database
35
My first time at AOCS…what did I learn?
• Index Mark Collison’s methods 
36
My first time at AOCS…what did I learn?
37
General Comments
• It is possible we can extract chemicals from AOCS methods
and structure enable the method, WITHOUT making the
method itself publicly accessible – like ASTM and ISO
• Our focus right now is on mass spectrometry methods but
can support many other methods
38
Conclusions
• We have built (I think) the first chemical structure indexed
database of “methods”
• Methods are not just “approved methods” but also
standard operating procedures, application notes, lab
manuals, regulatory methods etc.
• Integrating methods to experimental spectral data will
serve our non-targeted analysis efforts
39
If you want to help…
• Send information regarding analytical methods and method
articles to williams.antony@epa.gov
• If anyone is interested in a live demo I can do one in the break
40

More Related Content

What's hot

Machine Learning Platformization & AutoML: Adopting ML at Scale in the Enterp...
Machine Learning Platformization & AutoML: Adopting ML at Scale in the Enterp...Machine Learning Platformization & AutoML: Adopting ML at Scale in the Enterp...
Machine Learning Platformization & AutoML: Adopting ML at Scale in the Enterp...Ed Fernandez
 
Semantic Web - Ontologies
Semantic Web - OntologiesSemantic Web - Ontologies
Semantic Web - OntologiesSerge Linckels
 
StackStorm DevOps Automation Webinar
StackStorm DevOps Automation WebinarStackStorm DevOps Automation Webinar
StackStorm DevOps Automation WebinarStackStorm
 
The Materials Project: A Community Data Resource for Accelerating New Materia...
The Materials Project: A Community Data Resource for Accelerating New Materia...The Materials Project: A Community Data Resource for Accelerating New Materia...
The Materials Project: A Community Data Resource for Accelerating New Materia...Anubhav Jain
 
The Case for Graphs in Supply Chains
The Case for Graphs in Supply ChainsThe Case for Graphs in Supply Chains
The Case for Graphs in Supply ChainsNeo4j
 
3. Relationships Matter: Using Connected Data for Better Machine Learning
3. Relationships Matter: Using Connected Data for Better Machine Learning3. Relationships Matter: Using Connected Data for Better Machine Learning
3. Relationships Matter: Using Connected Data for Better Machine LearningNeo4j
 
Machine Learning In Production
Machine Learning In ProductionMachine Learning In Production
Machine Learning In ProductionSamir Bessalah
 
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Sri Ambati
 
Automatic machine learning (AutoML) 101
Automatic machine learning (AutoML) 101Automatic machine learning (AutoML) 101
Automatic machine learning (AutoML) 101QuantUniversity
 
Introduction to Neo4j for the Emirates & Bahrain
Introduction to Neo4j for the Emirates & BahrainIntroduction to Neo4j for the Emirates & Bahrain
Introduction to Neo4j for the Emirates & BahrainNeo4j
 
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...Intel® Software
 
Building a Knowledge Graph using NLP and Ontologies
Building a Knowledge Graph using NLP and OntologiesBuilding a Knowledge Graph using NLP and Ontologies
Building a Knowledge Graph using NLP and OntologiesNeo4j
 
Strata - Scaling Jupyter with Jupyter Enterprise Gateway
Strata - Scaling Jupyter with Jupyter Enterprise GatewayStrata - Scaling Jupyter with Jupyter Enterprise Gateway
Strata - Scaling Jupyter with Jupyter Enterprise GatewayLuciano Resende
 
Experimentation to Industrialization: Implementing MLOps
Experimentation to Industrialization: Implementing MLOpsExperimentation to Industrialization: Implementing MLOps
Experimentation to Industrialization: Implementing MLOpsDatabricks
 
Overview of cheminformatics
Overview of cheminformaticsOverview of cheminformatics
Overview of cheminformaticsBenjamin Bucior
 
Pedagogin digitystä 26.1.23
Pedagogin digitystä 26.1.23Pedagogin digitystä 26.1.23
Pedagogin digitystä 26.1.23Matleena Laakso
 
Introduction to Neo4j
Introduction to Neo4jIntroduction to Neo4j
Introduction to Neo4jNeo4j
 
Building Applications with a Graph Database
Building Applications with a Graph DatabaseBuilding Applications with a Graph Database
Building Applications with a Graph DatabaseTobias Lindaaker
 
Cheminformatics, concept by kk sahu sir
Cheminformatics, concept by kk sahu sirCheminformatics, concept by kk sahu sir
Cheminformatics, concept by kk sahu sirKAUSHAL SAHU
 

What's hot (20)

Machine Learning Platformization & AutoML: Adopting ML at Scale in the Enterp...
Machine Learning Platformization & AutoML: Adopting ML at Scale in the Enterp...Machine Learning Platformization & AutoML: Adopting ML at Scale in the Enterp...
Machine Learning Platformization & AutoML: Adopting ML at Scale in the Enterp...
 
Semantic Web - Ontologies
Semantic Web - OntologiesSemantic Web - Ontologies
Semantic Web - Ontologies
 
StackStorm DevOps Automation Webinar
StackStorm DevOps Automation WebinarStackStorm DevOps Automation Webinar
StackStorm DevOps Automation Webinar
 
The Materials Project: A Community Data Resource for Accelerating New Materia...
The Materials Project: A Community Data Resource for Accelerating New Materia...The Materials Project: A Community Data Resource for Accelerating New Materia...
The Materials Project: A Community Data Resource for Accelerating New Materia...
 
The Case for Graphs in Supply Chains
The Case for Graphs in Supply ChainsThe Case for Graphs in Supply Chains
The Case for Graphs in Supply Chains
 
3. Relationships Matter: Using Connected Data for Better Machine Learning
3. Relationships Matter: Using Connected Data for Better Machine Learning3. Relationships Matter: Using Connected Data for Better Machine Learning
3. Relationships Matter: Using Connected Data for Better Machine Learning
 
Machine Learning In Production
Machine Learning In ProductionMachine Learning In Production
Machine Learning In Production
 
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
Design Patterns for Machine Learning in Production - Sergei Izrailev, Chief D...
 
Automatic machine learning (AutoML) 101
Automatic machine learning (AutoML) 101Automatic machine learning (AutoML) 101
Automatic machine learning (AutoML) 101
 
Introduction to Neo4j for the Emirates & Bahrain
Introduction to Neo4j for the Emirates & BahrainIntroduction to Neo4j for the Emirates & Bahrain
Introduction to Neo4j for the Emirates & Bahrain
 
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
Software AI Accelerators: The Next Frontier | Software for AI Optimization Su...
 
Building a Knowledge Graph using NLP and Ontologies
Building a Knowledge Graph using NLP and OntologiesBuilding a Knowledge Graph using NLP and Ontologies
Building a Knowledge Graph using NLP and Ontologies
 
Strata - Scaling Jupyter with Jupyter Enterprise Gateway
Strata - Scaling Jupyter with Jupyter Enterprise GatewayStrata - Scaling Jupyter with Jupyter Enterprise Gateway
Strata - Scaling Jupyter with Jupyter Enterprise Gateway
 
Experimentation to Industrialization: Implementing MLOps
Experimentation to Industrialization: Implementing MLOpsExperimentation to Industrialization: Implementing MLOps
Experimentation to Industrialization: Implementing MLOps
 
Overview of cheminformatics
Overview of cheminformaticsOverview of cheminformatics
Overview of cheminformatics
 
Pedagogin digitystä 26.1.23
Pedagogin digitystä 26.1.23Pedagogin digitystä 26.1.23
Pedagogin digitystä 26.1.23
 
Introduction to Neo4j
Introduction to Neo4jIntroduction to Neo4j
Introduction to Neo4j
 
Building Applications with a Graph Database
Building Applications with a Graph DatabaseBuilding Applications with a Graph Database
Building Applications with a Graph Database
 
Understanding MLOps
Understanding MLOpsUnderstanding MLOps
Understanding MLOps
 
Cheminformatics, concept by kk sahu sir
Cheminformatics, concept by kk sahu sirCheminformatics, concept by kk sahu sir
Cheminformatics, concept by kk sahu sir
 

Similar to Using Cheminformatics Approaches to Develop a Structure Searchable Database of Analytical Methods

Similar to Using Cheminformatics Approaches to Develop a Structure Searchable Database of Analytical Methods (20)

Applying Cheminformatics to Develop a Structure Searchable Database of Analyt...
Applying Cheminformatics to Develop a Structure Searchable Database of Analyt...Applying Cheminformatics to Develop a Structure Searchable Database of Analyt...
Applying Cheminformatics to Develop a Structure Searchable Database of Analyt...
 
Data delivery from the US-EPA Center for Computational Toxicology and Exposur...
Data delivery from the US-EPA Center for Computational Toxicology and Exposur...Data delivery from the US-EPA Center for Computational Toxicology and Exposur...
Data delivery from the US-EPA Center for Computational Toxicology and Exposur...
 
Cheminformatics tools and chemistry data underpinning mass spectrometry analy...
Cheminformatics tools and chemistry data underpinning mass spectrometry analy...Cheminformatics tools and chemistry data underpinning mass spectrometry analy...
Cheminformatics tools and chemistry data underpinning mass spectrometry analy...
 
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
Accessing Environmental Chemistry Data via Data Dashboards and Applications t...
 
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
 
Accessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data Dashboards Accessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data Dashboards
 
US-EPA Cheminformatics Support for Delivering Data Related to Chemicals of E...
US-EPA Cheminformatics Support for Delivering Data Related to Chemicals of E...US-EPA Cheminformatics Support for Delivering Data Related to Chemicals of E...
US-EPA Cheminformatics Support for Delivering Data Related to Chemicals of E...
 
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted AnalysisThe US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
The US-EPA CompTox Chemicals Dashboard to support Non-Targeted Analysis
 
AMOS: the EPA database of analytical methods and open mass spectral database ...
AMOS: the EPA database of analytical methods and open mass spectral database ...AMOS: the EPA database of analytical methods and open mass spectral database ...
AMOS: the EPA database of analytical methods and open mass spectral database ...
 
Cheminformatics Support for MS Supporting Exposomics
Cheminformatics Support for MS Supporting ExposomicsCheminformatics Support for MS Supporting Exposomics
Cheminformatics Support for MS Supporting Exposomics
 
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
TRIANGLE AREA MASS SPECTOMETRY MEETING: Structure Identification Approaches U...
 
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
Applications of the US EPA’s CompTox Chemistry Dashboard to support structure...
 
Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...Non-targeted analysis supported by data and cheminformatics delivered via the...
Non-targeted analysis supported by data and cheminformatics delivered via the...
 
Structure identification approaches using the EPA CompTox Chemicals Dashboard...
Structure identification approaches using the EPA CompTox Chemicals Dashboard...Structure identification approaches using the EPA CompTox Chemicals Dashboard...
Structure identification approaches using the EPA CompTox Chemicals Dashboard...
 
Accessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data DashboardsAccessing Environmental Chemistry Data via Data Dashboards
Accessing Environmental Chemistry Data via Data Dashboards
 
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
 
Integrating Mass Spectrometry Non-Targeted Analysis and Computational Chemis...
Integrating Mass Spectrometry  Non-Targeted Analysis and Computational Chemis...Integrating Mass Spectrometry  Non-Targeted Analysis and Computational Chemis...
Integrating Mass Spectrometry Non-Targeted Analysis and Computational Chemis...
 
Accessing information for Per- & Polyfluoroalkyl Substances using the US EPA ...
Accessing information for Per- & Polyfluoroalkyl Substances using the US EPA ...Accessing information for Per- & Polyfluoroalkyl Substances using the US EPA ...
Accessing information for Per- & Polyfluoroalkyl Substances using the US EPA ...
 
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
Structure identification by Mass Spectrometry Non-Targeted Analysis using the...
 
Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...Structure Identification Using High Resolution Mass Spectrometry Data and the...
Structure Identification Using High Resolution Mass Spectrometry Data and the...
 

Recently uploaded

High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑Damini Dixit
 
IDENTIFICATION OF THE LIVING- forensic medicine
IDENTIFICATION OF THE LIVING- forensic medicineIDENTIFICATION OF THE LIVING- forensic medicine
IDENTIFICATION OF THE LIVING- forensic medicinesherlingomez2
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)Areesha Ahmad
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptxAlMamun560346
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)Areesha Ahmad
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Silpa
 
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLkantirani197
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learninglevieagacer
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and ClassificationsAreesha Ahmad
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfSumit Kumar yadav
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learninglevieagacer
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Monika Rani
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)Areesha Ahmad
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksSérgio Sacani
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Bookingroncy bisnoi
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxRizalinePalanog2
 
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)Joonhun Lee
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .Poonam Aher Patil
 

Recently uploaded (20)

High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
High Profile 🔝 8250077686 📞 Call Girls Service in GTB Nagar🍑
 
IDENTIFICATION OF THE LIVING- forensic medicine
IDENTIFICATION OF THE LIVING- forensic medicineIDENTIFICATION OF THE LIVING- forensic medicine
IDENTIFICATION OF THE LIVING- forensic medicine
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
 
GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)GBSN - Microbiology (Unit 3)
GBSN - Microbiology (Unit 3)
 
Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.Proteomics: types, protein profiling steps etc.
Proteomics: types, protein profiling steps etc.
 
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Formation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disksFormation of low mass protostars and their circumstellar disks
Formation of low mass protostars and their circumstellar disks
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance BookingCall Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Alandi Call Me 7737669865 Budget Friendly No Advance Booking
 
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptxSCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
SCIENCE-4-QUARTER4-WEEK-4-PPT-1 (1).pptx
 
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
Feature-aligned N-BEATS with Sinkhorn divergence (ICLR '24)
 
Factory Acceptance Test( FAT).pptx .
Factory Acceptance Test( FAT).pptx       .Factory Acceptance Test( FAT).pptx       .
Factory Acceptance Test( FAT).pptx .
 

Using Cheminformatics Approaches to Develop a Structure Searchable Database of Analytical Methods

  • 1. Using Cheminformatics Approaches to Develop a Structure Searchable Database of Analytical Methods Antony Williams, Greg Janesch, Sakuntala Sivasupramaniam, Brian Meyer and Erik Carr Center for Computational Toxicology & Exposure, U.S. Environmental Protection Agency http://www.orcid.org/0000-0002-2668-4821 May 2023: American Oil Chemists Society meeting, Denver, CO. The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA
  • 2. Disclaimer • The views expressed in this presentation are those of the author and do not necessarily reflect the views or policies of the U.S. EPA • This presentation is on a proof-of-concept tool in development – NOT yet publicly available 1
  • 3. Building a Methods Database • Simple Vision: I want to find the best method(s) associated with a chemical and/or class of chemicals • Answer the question “I cannot find a method for my chemical” - HELP • The Approach: – Aggregate MS method documents (and adjust the definition of “what is a useful method”) – Extract chemistry (mostly CASRN and Names) – Map CASRN and Names to structures – Deliver a proof-of-concept application to search a database by names, CASRNs, InChIKeys and ultimately structure 2
  • 4. Most people start with Google • People have many places to search for methods, and there is no one integration hub, except for search engines • Search engines can return so many hits – then you filter by analytes, matrix, analytical methodology, so many synonyms and abbreviations for so many chemicals 3
  • 5. Synonyms, Abbreviations and Chemicals 4 • CAS Numbers, Names and Abbreviations can limit what’s possible…
  • 6. Might this be a better view? 5
  • 7. Might this be a better view? 6
  • 8. When methods are mapped to chemistry… • The advantages of mapping chemicals directly to methods – When chemicals are mapped it opens access to many other tools – Chemical structures allow for QSAR modeling 7
  • 9. …and what if we could then profile toxicity? 8
  • 10. …and what if we could then profile toxicity? 9
  • 11. …or simply harvest data from the CompTox Chemicals Dashboard 10
  • 12. What data would you like??? 11
  • 13. Introducing AMOS Analytical Methods and Spectra Database • Three types of data in the database: – Methods (regulatory, lab manuals and SOPs, publications, tech notes) – Spectra (from public domain and our own laboratories) – Monographs (harvested from SWGDRUG and other sites) • Some methods have associated spectra • Some data are just externally linked • Currently contains around 135,000 spectra, 600,000 external links, 650 monographs, and ~2000 methods • ALL data are growing in number 12
  • 14. Where are there methods? • Agency-based methods – EPA – USGS – USDA – CDC – FDA • Vendor application notes – Thermo, Waters, Agilent, Sciex, Shimadzu, LECO, …. • Peer-reviewed articles • Laboratory Documents – lab manuals, SOPs 13
  • 15. A view of the methods list 14
  • 16. Filtering the list for interests… • Look for pesticides studied in water, by GC/MS, after 1990 15
  • 17. Where are there methods? • 900 method documents from the EPA harvested 16
  • 18. Many Scanned Documents!!! 17 • Methods generally developed by the agrochemical companies • Include parents plus degradation products • Lots of scanned, old, documents but the historical records are still of significant use • Electronic document forms of old documents still of benefit
  • 19. Embedding the old Method PDFs 18
  • 21. When Methods are OPEN Access 20
  • 22. When Methods are PubMed OPEN Access 21
  • 23. Proprietary Methods for INTERNAL Access 22
  • 24. If there is no method for your chemical • Use “Chemical Similarity Searching” so that you can find chemicals that are similar in structure space • Use the “Tanimoto Similarity Search 23
  • 25. Searching for a chemical – CASRN, Name Direct structure searching coming 24
  • 26. When Methods are Not Enough • EPA is highly active in the field of non-targeted analysis • We have been applying lots of cheminformatics approaches 25
  • 27. Building a spectrum library to search against 26
  • 28. Linking to actual spectra 27
  • 29. Linking to actual spectra 28
  • 30. There are errors EVERYWHERE: 110-75-8
  • 31. It can be challenging 9/365 chemicals…
  • 32. There are MANY errors in public spectral databases 31
  • 33. There are MANY errors in public spectral databases 32
  • 34. There are MANY errors in public spectral databases 33
  • 35. There are MANY errors in public spectral databases 34 • Chemical structure representations would ideally be standardized…consider tautomeric forms • Not all substances are explicit and can be ambiguous representations
  • 36. My first time at AOCS…what did I learn? • Some of your chemicals of interest: TAGs, PAHs, Toxins of different types • Methods for PAHs, Aflatoxins and Microsystins, and tri- acyl glycerols all extracted and added to database 35
  • 37. My first time at AOCS…what did I learn? • Index Mark Collison’s methods  36
  • 38. My first time at AOCS…what did I learn? 37
  • 39. General Comments • It is possible we can extract chemicals from AOCS methods and structure enable the method, WITHOUT making the method itself publicly accessible – like ASTM and ISO • Our focus right now is on mass spectrometry methods but can support many other methods 38
  • 40. Conclusions • We have built (I think) the first chemical structure indexed database of “methods” • Methods are not just “approved methods” but also standard operating procedures, application notes, lab manuals, regulatory methods etc. • Integrating methods to experimental spectral data will serve our non-targeted analysis efforts 39
  • 41. If you want to help… • Send information regarding analytical methods and method articles to williams.antony@epa.gov • If anyone is interested in a live demo I can do one in the break 40