SlideShare a Scribd company logo
1 of 28
Download to read offline
FAIRSpectra
Enabling the FAIRification of
Spectroscopy and Spectrometry
Alex Henderson
University of Manchester
Office for Open Research
https://fairspectra.net
https://alexhenderson.info
Thanks…
• For financial support
• University of Manchester’s Office for Open Research
• SurfaceSpectra Ltd.
• For in-kind support (free exhibition space)
• UK Surface Analysis Users Forum (UKSAF)
• SIMS Europe
• SpringSciX 2024
• 101st IUVSTA Workshop(The International Union for Vacuum Science, Technique and Applications)
• Zulip (free upgrade)
SIMS Europe
Office for Open Research
What is FAIR?
The FAIR Guiding Principles
• Findable
• Accessible
• Interoperable
• Reusable https://www.go-fair.org/
What are Spectroscopy
and Spectrometry?
• Instrumentation-based chemical analysis
• mass spectrometry (MALDI, SIMS, DESI)
• UV-vis / infrared / Raman spectroscopies
• NMR
• X-ray diffraction
• …
• Variants of each technique have own requirements
• Need to consider combination of techniques → data fusion
Modalities
• Single spectrum
• Collections of spectra
• Spectral maps
• Multispectral images
• Hyperspectral images
• 3D images
Modalities
• Single spectrum
• Collections of spectra
• Spectral maps
• Multispectral images
• Hyperspectral images
• 3D images
Typical infrared spectrum of tissue. Courtesy Peter Gardner @ Manchester
Modalities
• Single spectrum
• Collections of spectra
• Spectral maps
• Multispectral images
• Hyperspectral images
• 3D images
X-ray photoelectron spectra. Range of carbon functionalities in polymers. SurfaceSpectra Ltd.
Modalities
• Single spectrum
• Collections of spectra
• Spectral maps
• Multispectral images
• Hyperspectral images
• 3D images
Secondary ion mass spectrometry spectra from a range of urinary-tract infection bacteria. John Fletcher @ Manchester now Gothenburg
Modalities
• Single spectrum
• Collections of spectra
• Spectral maps
• Multispectral images
• Hyperspectral images
• 3D images
Co-located infrared and Raman spectroscopy spectra of neuroglioma cells. Photothermal Inc.
Modalities
• Single spectrum
• Collections of spectra
• Spectral maps
• Multispectral images
• Hyperspectral images
• 3D images
Secondary ion mass spectrometry image of arsenic, sulfur and phosphate abundance in rice. Katie Moore @ Manchester
What is hyperspectral imaging?
• Operates more like a camera, with multiple image elements
• 128 × 128 pixels, liquid nitrogen cooled
• Mosaic these ‘pictures’ to cover large areas
Courtesy of Peter Gardner @ Manchester
Modalities
• Single spectrum
• Collections of spectra
• Spectral maps
• Multispectral images
• Hyperspectral images
• 3D images
Infrared spectroscopy hyperspectral image of prostate cancer tissue. False coloured Random Forests classification of spectra.Peter Gardner @ Manchester
H&E optical FTIR
Epithelium
Smooth Muscle
Lymphocytes
Blood
Concretion
Fibrous Stroma
ECM
Modalities
• Single spectrum
• Collections of spectra
• Spectral maps
• Multispectral images
• Hyperspectral images
• 3D images
Secondary ion mass spectrometry 3D image depth profile, topography corrected, false coloured green=lipid (DPPC), R=nucleic acid (Adenine). John Fletcher & Alex Henderson @ Manchester
Modalities
• Single spectrum
• Collections of spectra
• Spectral maps
• Multispectral images
• Hyperspectral images
• 3D images
Secondary ion mass spectrometry 3D depth profile image showing cholesterol distribution in surface of frog oocyte. John Fletcher @ Manchester, now Gothenburg
Modalities
• Single spectrum
• Collections of spectra
• Spectral maps
• Multispectral images
• Hyperspectral images
• 3D images
X-ray photoelectron spectroscopy image of nitrogen and specific carbon functionality in caffeine tablet. Alex Henderson @ Manchester with Kratos Analytical.
What are the issues?
Academia
Funders require ‘data’ to be deposited in (open) repositories
But…
• No dedicated repositories
• Metadata terms are patchy
• Instrument data in proprietary file formats
• Many software packages not compatible with open formats
Researchers willing to share, but don’t know how
What are the issues?
Commercial activity needs to be considered
Barriers
• FAIR often confused with Open
• In-house processes considered good enough
• Worry about certain metadata usage giving secrets away
Benefits
• Easier to share data in-house, between labs and (overseas) sites
• FAIR practises lead to better records retention
• Acquisitions and mergers become more straightforward
• Third-party (open source) software becomes easily accessible
• Incoming staff already familiar with systems
Instrument manufacturer buy-in is vital
Moving forward
https://xkcd.com/2116/
What is FAIRSpectra?
Community driven initiative
Focus on hyperspectral imaging techniques
• File formats for hyperspectral imaging
• No standards exist right now
• Software tools to support these
• Metadata requirements
• Education and training
• Raising awareness
What is FAIRSpectra?
https://fairspectra.net https://fairspectra.zulipchat.com https://github.com/FAIRSpectra
Which metadata are required?
• Sampling method
• Storage conditions
• Chemical modifications
• Physical state
• Pre-treatment
• …
Sample
• Experiment plan
• Substrate material
• Mounting method
• Region analysed
• Instrument params
• …
Experiment
• Artifact removal
• Pre-processing
• Algorithm choice
• Hyperparameters
• Validation method
• …
Analysis
Downstream reporting
Upstream sample provenance
Where do we start?
• Sampling method
• Storage conditions
• Chemical modifications
• Physical state
• Pre-treatment
• …
Sample
• Experiment plan
• Substrate material
• Mounting method
• Region analysed
• Instrument params
• …
Experiment
• Artifact removal
• Pre-processing
• Algorithm choice
• Hyperparameters
• Validation method
• …
Analysis
Downstream reporting
Upstream sample provenance
Born-digital metadata
in data files
Limited/common options
Where do we start?
• Sampling method
• Storage conditions
• Chemical modifications
• Physical state
• Pre-treatment
• …
Sample
• Experiment plan
• Substrate material
• Mounting method
• Region analysed
• Instrument params
• …
Experiment
• Artifact removal
• Pre-processing
• Algorithm choice
• Hyperparameters
• Validation method
• …
Analysis
Downstream reporting
Upstream sample provenance
Many workflows have
common steps
Default hyperparameters
Where do we start?
• Sampling method
• Storage conditions
• Chemical modifications
• Physical state
• Pre-treatment
• …
Sample
• Experiment plan
• Substrate material
• Mounting method
• Region analysed
• Instrument params
• …
Experiment
• Artifact removal
• Pre-processing
• Algorithm choice
• Hyperparameters
• Validation method
• …
Analysis
Downstream reporting
Upstream sample provenance
Samples so varied makes
this very difficult
Where do we start?
• Sampling method
• Storage conditions
• Chemical modifications
• Physical state
• Pre-treatment
• …
Sample
• Experiment plan
• Substrate material
• Mounting method
• Region analysed
• Instrument params
• …
Experiment
• Artifact removal
• Pre-processing
• Algorithm choice
• Hyperparameters
• Validation method
• …
Analysis
Downstream reporting
Upstream sample provenance
Workflows also include repeated steps
with poorly defined break points
Exhibition booths at 3 conferences/workshops
• More planned, another 3 before Christmas
Survey
• Positives
• Everyone wanted to see something done, not sure about how
• Barriers
• People have difficulty sharing
• Poor documentation
• Proprietary file formats – loss of information
• Raw data versus processed data – large file size
• Gazumping / IP & prior art / confidentiality
• Time consuming
Where are we now – 6 months in?
Data file formats
• Looking to re-purpose strategies from astronomy, climate science, and microscopy
• Some file convertors ready for testing
• Suitable test files an issue – everyone’s a critic
Two instrument vendors interested in getting involved
• Need to be careful not to go too fast
• Only one shot at changing instrument software
Discussions with journals just beginning
• Need minimum reporting requirement
• Need to convince referees this is important
Developing connections with BioFAIR, ELIXIR and ISO
Where are we now – 6 months in?
Summary
The researchers are willing, but their resources are weak
• Few solutions currently exist
• Metadata terms missing
• Proprietary file formats are a barrier
• Instrument vendor buy-in required
• Lack of awareness persists
But…
• Some low-hanging fruit
• Opportunity to make an impact
• Even Closed FAIR can still have benefits to industry
There’s lots to do, but FAIRSpectra is just getting started!
https://fairspectra.net

More Related Content

Similar to FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry

oreChem: Planning and Enacting Chemistry on the Semantic Web
oreChem: Planning and Enacting Chemistry on the Semantic WeboreChem: Planning and Enacting Chemistry on the Semantic Web
oreChem: Planning and Enacting Chemistry on the Semantic Web
Mark Borkum
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Carole Goble
 
Dealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data onlineDealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data online
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Dealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data onlineDealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data online
Ken Karapetyan
 
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
Spark Summit
 
Big Process for Big Data @ PNNL, May 2013
Big Process for Big Data @ PNNL, May 2013Big Process for Big Data @ PNNL, May 2013
Big Process for Big Data @ PNNL, May 2013
Ian Foster
 

Similar to FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry (20)

Working with Instrument Data (GlobusWorld Tour - UMich)
Working with Instrument Data (GlobusWorld Tour - UMich)Working with Instrument Data (GlobusWorld Tour - UMich)
Working with Instrument Data (GlobusWorld Tour - UMich)
 
oreChem: Planning and Enacting Chemistry on the Semantic Web
oreChem: Planning and Enacting Chemistry on the Semantic WeboreChem: Planning and Enacting Chemistry on the Semantic Web
oreChem: Planning and Enacting Chemistry on the Semantic Web
 
Our dire need to mandate data standards and expectations for scientific publi...
Our dire need to mandate data standards and expectations for scientific publi...Our dire need to mandate data standards and expectations for scientific publi...
Our dire need to mandate data standards and expectations for scientific publi...
 
Machine Learning Application Development
Machine Learning Application DevelopmentMachine Learning Application Development
Machine Learning Application Development
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
 
Research data zone: veilige en geoptimaliseerde netwerkomgeving voor onderzoe...
Research data zone: veilige en geoptimaliseerde netwerkomgeving voor onderzoe...Research data zone: veilige en geoptimaliseerde netwerkomgeving voor onderzoe...
Research data zone: veilige en geoptimaliseerde netwerkomgeving voor onderzoe...
 
ChemSpider – disseminating data and enabling an abundance of chemistry platforms
ChemSpider – disseminating data and enabling an abundance of chemistry platformsChemSpider – disseminating data and enabling an abundance of chemistry platforms
ChemSpider – disseminating data and enabling an abundance of chemistry platforms
 
Research Cyberinfrastructure at UCSD - David Minor - RDAP12
Research Cyberinfrastructure at UCSD - David Minor - RDAP12Research Cyberinfrastructure at UCSD - David Minor - RDAP12
Research Cyberinfrastructure at UCSD - David Minor - RDAP12
 
Tejasvi 13
Tejasvi 13Tejasvi 13
Tejasvi 13
 
Assay Development and Drug Repurposing Core
Assay Development and Drug Repurposing CoreAssay Development and Drug Repurposing Core
Assay Development and Drug Repurposing Core
 
Marrying ACDLabs technologies to eScience Projects at the Royal Society of C...
Marrying ACDLabs technologies to eScience Projects at the  Royal Society of C...Marrying ACDLabs technologies to eScience Projects at the  Royal Society of C...
Marrying ACDLabs technologies to eScience Projects at the Royal Society of C...
 
Dealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data onlineDealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data online
 
Dealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data onlineDealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data online
 
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
Escaping Flatland: Interactive High-Dimensional Data Analysis in Drug Discove...
 
2nd Microscopy Congress: Public archiving of bio-imaging data - perspectives,...
2nd Microscopy Congress: Public archiving of bio-imaging data - perspectives,...2nd Microscopy Congress: Public archiving of bio-imaging data - perspectives,...
2nd Microscopy Congress: Public archiving of bio-imaging data - perspectives,...
 
Digitised collections: Toward a digital strategy for for the NHM, London
Digitised collections: Toward a digital strategy for for the NHM, LondonDigitised collections: Toward a digital strategy for for the NHM, London
Digitised collections: Toward a digital strategy for for the NHM, London
 
Data sharing as part of the research ecosystem
Data sharing as part of the research ecosystemData sharing as part of the research ecosystem
Data sharing as part of the research ecosystem
 
Big Process for Big Data @ PNNL, May 2013
Big Process for Big Data @ PNNL, May 2013Big Process for Big Data @ PNNL, May 2013
Big Process for Big Data @ PNNL, May 2013
 
Workshop 4: Open Science & Open Data for Librarians/Ina Smith
Workshop 4: Open Science & Open Data for Librarians/Ina SmithWorkshop 4: Open Science & Open Data for Librarians/Ina Smith
Workshop 4: Open Science & Open Data for Librarians/Ina Smith
 
Graham Pryor
Graham PryorGraham Pryor
Graham Pryor
 

More from Alex Henderson

The Class Imbalance Problem: AdaBoost to the Rescue?
The Class Imbalance Problem: AdaBoost to the Rescue?The Class Imbalance Problem: AdaBoost to the Rescue?
The Class Imbalance Problem: AdaBoost to the Rescue?
Alex Henderson
 
What's mine is yours (and vice versa) Data sharing in vibrational spectroscopy
What's mine is yours (and vice versa) Data sharing in vibrational spectroscopyWhat's mine is yours (and vice versa) Data sharing in vibrational spectroscopy
What's mine is yours (and vice versa) Data sharing in vibrational spectroscopy
Alex Henderson
 

More from Alex Henderson (12)

Hyperspectral Data Issues
Hyperspectral Data IssuesHyperspectral Data Issues
Hyperspectral Data Issues
 
The Class Imbalance Problem: AdaBoost to the Rescue?
The Class Imbalance Problem: AdaBoost to the Rescue?The Class Imbalance Problem: AdaBoost to the Rescue?
The Class Imbalance Problem: AdaBoost to the Rescue?
 
Getting started with chemometric classification
Getting started with chemometric classificationGetting started with chemometric classification
Getting started with chemometric classification
 
Too good to be true? How validate your data
Too good to be true? How validate your dataToo good to be true? How validate your data
Too good to be true? How validate your data
 
2020 Vision (Dubious Design Decisions)
2020 Vision (Dubious Design Decisions)2020 Vision (Dubious Design Decisions)
2020 Vision (Dubious Design Decisions)
 
To bag, or to boost? A question of balance
To bag, or to boost? A question of balanceTo bag, or to boost? A question of balance
To bag, or to boost? A question of balance
 
Digging into Data: Analysis and Visualisation in 3D
Digging into Data: Analysis and Visualisation in 3DDigging into Data: Analysis and Visualisation in 3D
Digging into Data: Analysis and Visualisation in 3D
 
Rise of the Machines: The Use of Machine Learning in SIMS Data Analysis
Rise of the Machines: The Use of Machine Learning in SIMS Data AnalysisRise of the Machines: The Use of Machine Learning in SIMS Data Analysis
Rise of the Machines: The Use of Machine Learning in SIMS Data Analysis
 
What's mine is yours (and vice versa) Data sharing in vibrational spectroscopy
What's mine is yours (and vice versa) Data sharing in vibrational spectroscopyWhat's mine is yours (and vice versa) Data sharing in vibrational spectroscopy
What's mine is yours (and vice versa) Data sharing in vibrational spectroscopy
 
How to validate your model
How to validate your modelHow to validate your model
How to validate your model
 
Interpretation of Static SIMS Spectra
Interpretation of Static SIMS SpectraInterpretation of Static SIMS Spectra
Interpretation of Static SIMS Spectra
 
Secondary Ion Mass Spectrometry
Secondary Ion Mass SpectrometrySecondary Ion Mass Spectrometry
Secondary Ion Mass Spectrometry
 

Recently uploaded

Tuberculosis (TB)-Notes.pdf microbiology notes
Tuberculosis (TB)-Notes.pdf microbiology notesTuberculosis (TB)-Notes.pdf microbiology notes
Tuberculosis (TB)-Notes.pdf microbiology notes
jyothisaisri
 
Continuum emission from within the plunging region of black hole discs
Continuum emission from within the plunging region of black hole discsContinuum emission from within the plunging region of black hole discs
Continuum emission from within the plunging region of black hole discs
Sérgio Sacani
 
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 

Recently uploaded (20)

EU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdfEU START PROJECT. START-Newsletter_Issue_4.pdf
EU START PROJECT. START-Newsletter_Issue_4.pdf
 
MODERN PHYSICS_REPORTING_QUANTA_.....pdf
MODERN PHYSICS_REPORTING_QUANTA_.....pdfMODERN PHYSICS_REPORTING_QUANTA_.....pdf
MODERN PHYSICS_REPORTING_QUANTA_.....pdf
 
GBSN - Biochemistry (Unit 8) Enzymology
GBSN - Biochemistry (Unit 8) EnzymologyGBSN - Biochemistry (Unit 8) Enzymology
GBSN - Biochemistry (Unit 8) Enzymology
 
Mining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptxMining Activity and Investment Opportunity in Myanmar.pptx
Mining Activity and Investment Opportunity in Myanmar.pptx
 
Tuberculosis (TB)-Notes.pdf microbiology notes
Tuberculosis (TB)-Notes.pdf microbiology notesTuberculosis (TB)-Notes.pdf microbiology notes
Tuberculosis (TB)-Notes.pdf microbiology notes
 
NuGOweek 2024 programme final FLYER short.pdf
NuGOweek 2024 programme final FLYER short.pdfNuGOweek 2024 programme final FLYER short.pdf
NuGOweek 2024 programme final FLYER short.pdf
 
Manganese‐RichSandstonesasanIndicatorofAncientOxic LakeWaterConditionsinGale...
Manganese‐RichSandstonesasanIndicatorofAncientOxic  LakeWaterConditionsinGale...Manganese‐RichSandstonesasanIndicatorofAncientOxic  LakeWaterConditionsinGale...
Manganese‐RichSandstonesasanIndicatorofAncientOxic LakeWaterConditionsinGale...
 
THE GENERAL PROPERTIES OF PROTEOBACTERIA AND ITS TYPES
THE GENERAL PROPERTIES OF PROTEOBACTERIA AND ITS TYPESTHE GENERAL PROPERTIES OF PROTEOBACTERIA AND ITS TYPES
THE GENERAL PROPERTIES OF PROTEOBACTERIA AND ITS TYPES
 
Continuum emission from within the plunging region of black hole discs
Continuum emission from within the plunging region of black hole discsContinuum emission from within the plunging region of black hole discs
Continuum emission from within the plunging region of black hole discs
 
Lubrication System in forced feed system
Lubrication System in forced feed systemLubrication System in forced feed system
Lubrication System in forced feed system
 
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
 
family therapy psychotherapy types .pdf
family therapy psychotherapy types  .pdffamily therapy psychotherapy types  .pdf
family therapy psychotherapy types .pdf
 
VILLAGE ATTACHMENT For rural agriculture PPT.pptx
VILLAGE ATTACHMENT For rural agriculture  PPT.pptxVILLAGE ATTACHMENT For rural agriculture  PPT.pptx
VILLAGE ATTACHMENT For rural agriculture PPT.pptx
 
Biochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptx
Biochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptxBiochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptx
Biochemistry and Biomolecules - Science - 9th Grade by Slidesgo.pptx
 
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...
 
Information science research with large language models: between science and ...
Information science research with large language models: between science and ...Information science research with large language models: between science and ...
Information science research with large language models: between science and ...
 
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
Molecular and Cellular Mechanism of Action of Hormones such as Growth Hormone...
 
Triploidy ...............................pptx
Triploidy ...............................pptxTriploidy ...............................pptx
Triploidy ...............................pptx
 
Heads-Up Multitasker: CHI 2024 Presentation.pdf
Heads-Up Multitasker: CHI 2024 Presentation.pdfHeads-Up Multitasker: CHI 2024 Presentation.pdf
Heads-Up Multitasker: CHI 2024 Presentation.pdf
 
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
 

FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry

  • 1. FAIRSpectra Enabling the FAIRification of Spectroscopy and Spectrometry Alex Henderson University of Manchester Office for Open Research https://fairspectra.net https://alexhenderson.info
  • 2. Thanks… • For financial support • University of Manchester’s Office for Open Research • SurfaceSpectra Ltd. • For in-kind support (free exhibition space) • UK Surface Analysis Users Forum (UKSAF) • SIMS Europe • SpringSciX 2024 • 101st IUVSTA Workshop(The International Union for Vacuum Science, Technique and Applications) • Zulip (free upgrade) SIMS Europe Office for Open Research
  • 3. What is FAIR? The FAIR Guiding Principles • Findable • Accessible • Interoperable • Reusable https://www.go-fair.org/
  • 4. What are Spectroscopy and Spectrometry? • Instrumentation-based chemical analysis • mass spectrometry (MALDI, SIMS, DESI) • UV-vis / infrared / Raman spectroscopies • NMR • X-ray diffraction • … • Variants of each technique have own requirements • Need to consider combination of techniques → data fusion
  • 5. Modalities • Single spectrum • Collections of spectra • Spectral maps • Multispectral images • Hyperspectral images • 3D images
  • 6. Modalities • Single spectrum • Collections of spectra • Spectral maps • Multispectral images • Hyperspectral images • 3D images Typical infrared spectrum of tissue. Courtesy Peter Gardner @ Manchester
  • 7. Modalities • Single spectrum • Collections of spectra • Spectral maps • Multispectral images • Hyperspectral images • 3D images X-ray photoelectron spectra. Range of carbon functionalities in polymers. SurfaceSpectra Ltd.
  • 8. Modalities • Single spectrum • Collections of spectra • Spectral maps • Multispectral images • Hyperspectral images • 3D images Secondary ion mass spectrometry spectra from a range of urinary-tract infection bacteria. John Fletcher @ Manchester now Gothenburg
  • 9. Modalities • Single spectrum • Collections of spectra • Spectral maps • Multispectral images • Hyperspectral images • 3D images Co-located infrared and Raman spectroscopy spectra of neuroglioma cells. Photothermal Inc.
  • 10. Modalities • Single spectrum • Collections of spectra • Spectral maps • Multispectral images • Hyperspectral images • 3D images Secondary ion mass spectrometry image of arsenic, sulfur and phosphate abundance in rice. Katie Moore @ Manchester
  • 11. What is hyperspectral imaging? • Operates more like a camera, with multiple image elements • 128 × 128 pixels, liquid nitrogen cooled • Mosaic these ‘pictures’ to cover large areas Courtesy of Peter Gardner @ Manchester
  • 12. Modalities • Single spectrum • Collections of spectra • Spectral maps • Multispectral images • Hyperspectral images • 3D images Infrared spectroscopy hyperspectral image of prostate cancer tissue. False coloured Random Forests classification of spectra.Peter Gardner @ Manchester H&E optical FTIR Epithelium Smooth Muscle Lymphocytes Blood Concretion Fibrous Stroma ECM
  • 13. Modalities • Single spectrum • Collections of spectra • Spectral maps • Multispectral images • Hyperspectral images • 3D images Secondary ion mass spectrometry 3D image depth profile, topography corrected, false coloured green=lipid (DPPC), R=nucleic acid (Adenine). John Fletcher & Alex Henderson @ Manchester
  • 14. Modalities • Single spectrum • Collections of spectra • Spectral maps • Multispectral images • Hyperspectral images • 3D images Secondary ion mass spectrometry 3D depth profile image showing cholesterol distribution in surface of frog oocyte. John Fletcher @ Manchester, now Gothenburg
  • 15. Modalities • Single spectrum • Collections of spectra • Spectral maps • Multispectral images • Hyperspectral images • 3D images X-ray photoelectron spectroscopy image of nitrogen and specific carbon functionality in caffeine tablet. Alex Henderson @ Manchester with Kratos Analytical.
  • 16. What are the issues? Academia Funders require ‘data’ to be deposited in (open) repositories But… • No dedicated repositories • Metadata terms are patchy • Instrument data in proprietary file formats • Many software packages not compatible with open formats Researchers willing to share, but don’t know how
  • 17. What are the issues? Commercial activity needs to be considered Barriers • FAIR often confused with Open • In-house processes considered good enough • Worry about certain metadata usage giving secrets away Benefits • Easier to share data in-house, between labs and (overseas) sites • FAIR practises lead to better records retention • Acquisitions and mergers become more straightforward • Third-party (open source) software becomes easily accessible • Incoming staff already familiar with systems Instrument manufacturer buy-in is vital
  • 19. What is FAIRSpectra? Community driven initiative Focus on hyperspectral imaging techniques • File formats for hyperspectral imaging • No standards exist right now • Software tools to support these • Metadata requirements • Education and training • Raising awareness
  • 20. What is FAIRSpectra? https://fairspectra.net https://fairspectra.zulipchat.com https://github.com/FAIRSpectra
  • 21. Which metadata are required? • Sampling method • Storage conditions • Chemical modifications • Physical state • Pre-treatment • … Sample • Experiment plan • Substrate material • Mounting method • Region analysed • Instrument params • … Experiment • Artifact removal • Pre-processing • Algorithm choice • Hyperparameters • Validation method • … Analysis Downstream reporting Upstream sample provenance
  • 22. Where do we start? • Sampling method • Storage conditions • Chemical modifications • Physical state • Pre-treatment • … Sample • Experiment plan • Substrate material • Mounting method • Region analysed • Instrument params • … Experiment • Artifact removal • Pre-processing • Algorithm choice • Hyperparameters • Validation method • … Analysis Downstream reporting Upstream sample provenance Born-digital metadata in data files Limited/common options
  • 23. Where do we start? • Sampling method • Storage conditions • Chemical modifications • Physical state • Pre-treatment • … Sample • Experiment plan • Substrate material • Mounting method • Region analysed • Instrument params • … Experiment • Artifact removal • Pre-processing • Algorithm choice • Hyperparameters • Validation method • … Analysis Downstream reporting Upstream sample provenance Many workflows have common steps Default hyperparameters
  • 24. Where do we start? • Sampling method • Storage conditions • Chemical modifications • Physical state • Pre-treatment • … Sample • Experiment plan • Substrate material • Mounting method • Region analysed • Instrument params • … Experiment • Artifact removal • Pre-processing • Algorithm choice • Hyperparameters • Validation method • … Analysis Downstream reporting Upstream sample provenance Samples so varied makes this very difficult
  • 25. Where do we start? • Sampling method • Storage conditions • Chemical modifications • Physical state • Pre-treatment • … Sample • Experiment plan • Substrate material • Mounting method • Region analysed • Instrument params • … Experiment • Artifact removal • Pre-processing • Algorithm choice • Hyperparameters • Validation method • … Analysis Downstream reporting Upstream sample provenance Workflows also include repeated steps with poorly defined break points
  • 26. Exhibition booths at 3 conferences/workshops • More planned, another 3 before Christmas Survey • Positives • Everyone wanted to see something done, not sure about how • Barriers • People have difficulty sharing • Poor documentation • Proprietary file formats – loss of information • Raw data versus processed data – large file size • Gazumping / IP & prior art / confidentiality • Time consuming Where are we now – 6 months in?
  • 27. Data file formats • Looking to re-purpose strategies from astronomy, climate science, and microscopy • Some file convertors ready for testing • Suitable test files an issue – everyone’s a critic Two instrument vendors interested in getting involved • Need to be careful not to go too fast • Only one shot at changing instrument software Discussions with journals just beginning • Need minimum reporting requirement • Need to convince referees this is important Developing connections with BioFAIR, ELIXIR and ISO Where are we now – 6 months in?
  • 28. Summary The researchers are willing, but their resources are weak • Few solutions currently exist • Metadata terms missing • Proprietary file formats are a barrier • Instrument vendor buy-in required • Lack of awareness persists But… • Some low-hanging fruit • Opportunity to make an impact • Even Closed FAIR can still have benefits to industry There’s lots to do, but FAIRSpectra is just getting started! https://fairspectra.net