ESA UNCLASSIFIED - For ESA Official Use Only
Solving Large-scale Data Challenges with ESA Datalabs
Pablo Gómez
Data Science Section, SCI-SAS
24/11/2023
ESA ESAC
Background – Data Science Section
• Part of the Data Science and Archives Division
• Focused on science data exploitation
• Works across different missions; interdisciplinary
“Big” Data – Where we are and what is coming
Euclid First Images
Gaia Data Release 3
Importance of archival data – Hubble Space Telescope
HST publications by type (Not assigned, Partly Archival, Archival, General Observer)
https://archive.stsci.edu/hst/bibliography/pubstat.html
de Marchi & Merín, presented at EAS 2023
“Big” Data – Where we are and what is coming
ESAC Science Data Center
ESA Datalabs – datalabs.esa.int (in beta mode)
ESA Datalabs main functionality
System/Core
Discovery
Pipelines
Datalabs Catalogue
Example: JWST Data Analysis Tools Notebooks
A Platform Designed to Boost Science Collaboration
Web-Based & Desktop-Based Datalabs
A Platform Designed to Boost Research Productivity
SaaS
PaaS
IaaS
System Development
IT Development
Science Development
You can start HERE!
A Platform Designed to Boost Access to Data
SCI
… …
ESA
Leveraging ESA’s Digital Ecosystem of Platforms
datalabs.esa.int, gssc.esa.int
Data Discovery Portal / Volume Catalogue
Computing & Data Colocation – Data Volume Catalogue
Datalab & Volume Integration
Pipelines Catalogue
Pipelines: Integrated Development Environment
Common Workflow Language (CWL)
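A minimal sketch of what a CWL tool description looks like, built here from Python. The slides only name CWL as the pipeline language, so everything in the example is an assumption for illustration: the tool is a plain wrapper around `echo`, and the input name `message` is hypothetical. CWL documents may be written in JSON as well as YAML, which is why the standard `json` module is enough to serialise it.

```python
import json

# Hypothetical example tool, not a Datalabs pipeline: it wraps the `echo`
# command and takes a single string input. The top-level keys follow the
# CWL CommandLineTool description; the input name "message" is made up.
echo_tool = {
    "cwlVersion": "v1.2",
    "class": "CommandLineTool",
    "baseCommand": "echo",
    "inputs": {
        "message": {
            "type": "string",
            # Place the value as the first positional argument of `echo`.
            "inputBinding": {"position": 1},
        },
    },
    "outputs": [],
}


def to_cwl_json(tool: dict) -> str:
    """Serialise a tool description to JSON text that a CWL runner can read."""
    return json.dumps(tool, indent=2)


cwl_text = to_cwl_json(echo_tool)
print(cwl_text)
```

Saved to a file, a description like this can typically be executed with a CWL runner. How such a document is registered in the Datalabs Pipelines Catalogue is not covered in the slides.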
Upcoming in 0.10.0 – Datalabs Marketplace (like App Store)
Recent Events
• Euclid Consortium meeting June 2023
• 200+ new users
• Stress test
• Lots of feedback
• Focus on user experience
• With ESA missions
• Experimental onboarding of external projects
ideas for new use-cases; UI improvements
JWST @ ESA Datalabs: baseline JWST area
JWST area @ ESA Datalabs:
• JWST calibration pipeline
• Astroquery (incl. ESA JWST module)
• pyESASky
• JDAVIZ
• astropy
• matplotlib
• …
Access to JWST NFS volume:
• JWST calibration files
• Example notebooks for eJWST
• Example notebooks from STScI
The ESA Space Science Exploitation Platform
High-level messages
• SCI data made easily available for researchers to work on
Increase Space Science Operations Efficiency
• Reusable for fast implementation of Scientific Processing Pipelines
• Reusable for fast implementation of Scientific Analysis and Visualisation Tools
Enable Collaboration and Open Science
• Share complex processing tools and data with your team
• Share your contributions with the community in SCI’s AppStore
Catalogue of interacting galaxies in HST archives
One example use case of ESA Datalabs
Harnessing the Hubble Space Telescope Archives: A Catalogue of 21,926 Interacting Galaxies
O’Ryan et al. 2023, arXiv:2303.00366
➢ Direct access to the data (opening a large FITS file takes a few seconds; 100k cutouts are created in a matter of minutes)
➢ 92 million cutouts produced (2.5 TB)
➢ Zoobot fine-tuned on a sample of mergers from CANDELS & COSMOS
➢ Predict interacting galaxies in HST archives: 21,926 interacting galaxies found with high confidence (p > 0.95)
➢ Other gems: strong lenses, protoplanetary disks
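The workflow in the bullets above (cut out a stamp around each source, score it, keep only the confident predictions) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the pipeline of O’Ryan et al.: the image is a nested Python list rather than a FITS file, and `toy_score` is a hypothetical placeholder standing in for the fine-tuned Zoobot model. Only the 0.95 threshold comes from the slide.

```python
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]
Position = Tuple[int, int]  # (row, column) of a source in pixel coordinates


def make_cutout(image: Image, row: int, col: int, size: int) -> Image:
    """Return a stamp of up to size x size pixels around (row, col), clipped at the image edges."""
    half = size // 2
    r0, r1 = max(row - half, 0), min(row - half + size, len(image))
    c0, c1 = max(col - half, 0), min(col - half + size, len(image[0]))
    return [line[c0:c1] for line in image[r0:r1]]


def toy_score(cutout: Image) -> float:
    """Placeholder classifier (NOT Zoobot): treats the brightest pixel as a fake probability."""
    return max(max(line) for line in cutout)


def select_confident(
    positions: Sequence[Position],
    image: Image,
    score: Callable[[Image], float],
    size: int = 3,
    threshold: float = 0.95,
) -> List[Position]:
    """Keep the sources whose cutout scores strictly above the confidence threshold."""
    selected = []
    for row, col in positions:
        if score(make_cutout(image, row, col, size)) > threshold:
            selected.append((row, col))
    return selected


# Toy 6 x 6 image with one bright 2 x 2 blob; pixel values are already in [0, 1].
image = [[0.0] * 6 for _ in range(6)]
for r in (2, 3):
    for c in (2, 3):
        image[r][c] = 1.0

candidates = [(2, 2), (0, 5)]  # one source on the blob, one in an empty corner
confident = select_confident(candidates, image, toy_score)
print(confident)  # [(2, 2)]
```

Keeping the scoring function as a parameter means the same selection step works unchanged when the placeholder is swapped for a real model.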
ESA Datalabs for Euclid pilot studies
• Detecting Solar System Objects
• Preserving Low-Surface Brightness
• Detecting Transients
• Cosmology Likelihood for Observables in Euclid
Perspective – A typical ML project
1 - Setup
• Tools & Frameworks
• Local folders etc.
• Getting the data
2 - Data Prep
• I/O
• Data Cleaning
• Data Labeling
3 - Models
• Training
• Inference
• Clustering
• …
Perspective – What we can build
Datalabs – Quo vadis?
Anomaly Detection
Finding interesting things
Dealing with the flood
Etseneth et al. 2023
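As a rough illustration of "finding interesting things" in a large sample, the sketch below ranks values by a robust outlier score. It is a generic technique (the modified z-score based on the median absolute deviation), not the method of the cited work, which the slides do not describe. The 0.6745 constant and the 3.5 cut-off are the conventional choices for this score, and the `brightness` values are made up.

```python
from statistics import median
from typing import List, Sequence


def modified_z_scores(values: Sequence[float]) -> List[float]:
    """Robust outlier score per value, based on the median and the median absolute deviation (MAD)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # At least half of the values are identical; this sketch then flags nothing.
        return [0.0] * len(values)
    return [0.6745 * abs(v - med) / mad for v in values]


def rank_anomalies(values: Sequence[float], threshold: float = 3.5) -> List[int]:
    """Return indices of values scoring above the threshold, most anomalous first."""
    scores = modified_z_scores(values)
    flagged = [i for i, s in enumerate(scores) if s > threshold]
    return sorted(flagged, key=lambda i: scores[i], reverse=True)


# Made-up measurements: five ordinary values and one obvious outlier.
brightness = [1.0, 1.1, 0.9, 1.05, 0.95, 8.0]
outliers = rank_anomalies(brightness)
print(outliers)  # [5]
```

Returning a ranked list rather than a yes/no flag suits the "dealing with the flood" use case: a human can inspect the top of the list and stop when the candidates become uninteresting.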
Learning with Few Labels
Get a few labels
Train a semi-supervised model
Different Downstream Tasks
• Roughly sort unlabeled data
• Find other instances
• Incremental improvements
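The "few labels" idea can be sketched with simple self-training: start from a handful of labelled examples, pseudo-label only the unlabelled points that are clearly closer to one class than to any other, and repeat. This is one basic semi-supervised scheme chosen for illustration, not the model used in the talk. The 2-D feature vectors, the class names and the `margin` parameter are all hypothetical.

```python
from math import dist
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]  # hypothetical 2-D feature vector of one object


def centroids(labeled: Dict[Point, str]) -> Dict[str, Point]:
    """Mean feature vector of each class."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for (x, y), label in labeled.items():
        total = sums.setdefault(label, [0.0, 0.0])
        total[0] += x
        total[1] += y
        counts[label] = counts.get(label, 0) + 1
    return {k: (s[0] / counts[k], s[1] / counts[k]) for k, s in sums.items()}


def self_train(
    seed: Dict[Point, str],
    unlabeled: Sequence[Point],
    margin: float = 1.0,
    rounds: int = 5,
) -> Tuple[Dict[Point, str], List[Point]]:
    """Pseudo-label points that are at least `margin` closer to the best class centroid
    than to the runner-up. Needs at least two classes in the seed labels."""
    labeled = dict(seed)
    pool = list(unlabeled)
    for _ in range(rounds):
        cents = centroids(labeled)
        undecided = []
        for p in pool:
            ranked = sorted(cents, key=lambda k: dist(p, cents[k]))
            best, second = ranked[0], ranked[1]
            if dist(p, cents[second]) - dist(p, cents[best]) >= margin:
                labeled[p] = best
            else:
                undecided.append(p)
        if len(undecided) == len(pool):
            break  # nothing new was labelled in this round
        pool = undecided
    return labeled, pool


seed_labels = {(0.0, 0.0): "isolated", (10.0, 10.0): "merger"}
unlabeled_points = [(1.0, 0.5), (9.0, 9.5), (5.0, 5.0)]
final_labels, still_unlabeled = self_train(seed_labels, unlabeled_points)
print(final_labels)
print(still_unlabeled)
```

In this toy run the two points near a seed example receive its label, while the point halfway between the classes stays unlabelled. That roughly corresponds to sorting the unlabeled data first and leaving the ambiguous cases for a later, improved round.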
Standardized ML Data Preprocessing
Thanks!
Questions?

 
Pests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdfPests of castor_Binomics_Identification_Dr.UPR.pdf
Pests of castor_Binomics_Identification_Dr.UPR.pdf
 
FREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by naFREE NURSING BUNDLE FOR NURSES.PDF by na
FREE NURSING BUNDLE FOR NURSES.PDF by na
 

Pablo Gomez - Solving Large-scale Challenges with ESA Datalabs

  • 1. Solving Large-scale Data Challenges with ESA Datalabs. Pablo Gómez, Data Science Section, SCI-SAS, ESA ESAC, 24/11/2023. ESA UNCLASSIFIED - For ESA Official Use Only
  • 2. Background – Data Science Section
      • Part of the Data Science and Archives Division
      • Focused on science data exploitation
      • Works with different missions and is interdisciplinary
  • 4. “Big” Data – Where we are and what is coming: Euclid First Images
  • 5. “Big” Data – Where we are and what is coming: Gaia Data Release 3
  • 6. Importance of archival data – Hubble Space Telescope. HST publications by type (Not assigned, Partly Archival, Archival, General Observer). Source: https://archive.stsci.edu/hst/bibliography/pubstat.html; de Marchi & Merín, presented at EAS 2023
  • 7. “Big” Data – Where we are and what is coming: ESAC Science Data Center
  • 8. ESA Datalabs – datalabs.esa.int (in beta mode)
  • 9. ESA Datalabs main functionality: System/Core, Discovery, Pipelines
  • 11. Example: JWST Data Analysis Tools Notebooks
  • 12. A Platform Designed to Boost Science Collaboration
  • 14. A Platform Designed to Boost Research Productivity. Layers: SaaS, PaaS, IaaS. Roles: System Development, IT Development, Science Development. "You can start HERE!"
  • 15. A Platform Designed to Boost Access to Data (SCI … … ESA)
  • 16. Leveraging on ESA’s Digital Ecosystem of Platforms: datalabs.esa.int, gssc.esa.int
  • 17. Data Discovery Portal / Volume Catalogue
  • 18. Computing & Data Colocation – Data Volume Catalog
  • 19. Datalab & Volume Integration
  • 22. Pipelines: Integrated Development Environment (Common Workflow Language, CWL)
  • 23. Upcoming in 0.10.0 – Datalabs Marketplace (like App Store)
  • 24. Recent Events
      • Euclid Consortium meeting, June 2023
      • 200+ new users
      • Stress test
      • Lots of feedback
      • Focus on user experience
      • With ESA missions
      • Experimental onboarding of external projects
      • Ideas for new use-cases; UI improvements
  • 25. JWST @ ESA Datalabs: baseline JWST area
      JWST area @ ESA Datalabs:
        • JWST calibration pipeline
        • Astroquery (incl. ESA JWST module)
        • pyESASky
        • JDAVIZ
        • astropy
        • matplotlib
        • …
      Access to JWST NFS volume:
        • JWST calibration files
        • Example notebooks for eJWST
        • Example notebooks from STScI
  • 26. The ESA Space Science Exploitation Platform: high-level messages
      Increase Space Science Operations Efficiency:
        • SCI data made easily available for researchers to work on
        • Reusable for fast implementation of Scientific Processing Pipelines
        • Reusable for fast implementation of Scientific Analysis and Visualisation Tools
      Enable Collaboration and Open Science:
        • Share complex processing tools and data with your team
        • Share your contributions with the community in SCI's AppStore
  • 27. Catalogue of interacting galaxies in HST archives (one example use case of ESA Datalabs)
      "Harnessing the Hubble Space Telescope Archives: A Catalogue of 21,926 Interacting Galaxies", O’Ryan et al. 2023, arXiv:2303.00366
        ➢ Direct access to the data (opening a large FITS file takes a few seconds; 100k cutouts are created on the order of minutes)
        ➢ 92 million cutouts produced (2.5 TB)
        ➢ Uses Zoobot fine-tuned on a sample of mergers from CANDELS & COSMOS
        ➢ Predicts interacting galaxies in the HST archives: 21,926 interacting galaxies found with high confidence (p > 0.95)
        ➢ Other gems: strong lenses, protoplanetary disks
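The two mechanical steps on this slide, cutting fixed-size stamps around sources and keeping only predictions above the confidence threshold, can be sketched as follows. This is a minimal illustration in plain Python, not the pipeline from O’Ryan et al.: the toy image, the source IDs, the scores and the helper names `make_cutout` and `select_confident` are all assumptions made for the example. Only the p > 0.95 cut comes from the slide. In practice one would more likely work on FITS arrays with a tool such as astropy's `Cutout2D`.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]


def make_cutout(image: Image, row: int, col: int, size: int) -> Image:
    """Return a stamp of up to size x size pixels centred on (row, col).

    The stamp is clipped at the image edges, so it can be smaller than
    size x size near a border. `size` is expected to be odd.
    """
    half = size // 2
    r0, r1 = max(0, row - half), min(len(image), row + half + 1)
    c0, c1 = max(0, col - half), min(len(image[0]), col + half + 1)
    return [line[c0:c1] for line in image[r0:r1]]


def select_confident(
    scores: Sequence[Tuple[str, float]], threshold: float = 0.95
) -> List[str]:
    """Keep the source IDs whose predicted probability exceeds the threshold."""
    return [source_id for source_id, p in scores if p > threshold]


# Toy 6x6 "image" whose pixel value encodes its position (10 * row + col).
image = [[10.0 * r + c for c in range(6)] for r in range(6)]
stamp = make_cutout(image, row=2, col=3, size=3)

# Hypothetical classifier outputs: (source id, probability of "interacting").
scores = [("src-a", 0.99), ("src-b", 0.95), ("src-c", 0.40), ("src-d", 0.97)]
candidates = select_confident(scores)
```

The comparison is strict, so a source scoring exactly 0.95 is not selected, matching the "p > 0.95" wording on the slide.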
  • 28. ESA Datalabs for Euclid pilot studies: Detecting Solar System Objects; Preserving Low-Surface Brightness; Detecting Transients; Cosmology Likelihood for Observables in Euclid
  • 29. Perspective – A typical ML project
      1 - Setup: tools & frameworks; local folders etc.; getting the data
  • 31. Perspective – A typical ML project
      1 - Setup: tools & frameworks; local folders etc.; getting the data
      2 - Data Prep: I/O; data cleaning; data labeling
      Image labels: Gaia Data Release 3; Bing
  • 32. Perspective – A typical ML project
      1 - Setup: tools & frameworks; local folders etc.; getting the data
      2 - Data Prep: I/O; data cleaning; data labeling
      3 - Models: training; inference; clustering; …
  • 34. Perspective – What we can build
  • 35. Datalabs – Quo vadis?
      Anomaly Detection: finding interesting things; dealing with the flood (Etseneth et al. 2023)
  • 36. Datalabs – Quo vadis?
      Anomaly Detection: finding interesting things; dealing with the flood (Etseneth et al. 2023)
      Learning with Few Labels: get a few labels; train a semi-supervised model; different downstream tasks
        • Roughly sort unlabeled data
        • Find other instances
        • Incremental improvements
  • 37. Datalabs – Quo vadis?
      Anomaly Detection: finding interesting things; dealing with the flood (Etseneth et al. 2023)
      Learning with Few Labels: get a few labels; train a semi-supervised model; different downstream tasks
        • Roughly sort unlabeled data
        • Find other instances
        • Incremental improvements
      Standardized ML Data Preprocessing
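The "Learning with Few Labels" idea (get a few labels, train a semi-supervised model, use it to roughly sort unlabeled data) can be illustrated with a small self-training loop. This is only a sketch under stated assumptions, not the method planned for Datalabs. It uses a nearest-centroid classifier on made-up 2-D features, the class names `ordinary` and `interesting` are hypothetical, and the confidence margin of 1.0 is arbitrary. A real project would more likely use an existing implementation such as scikit-learn's `SelfTrainingClassifier`.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def centroids(labeled: Dict[str, List[Point]]) -> Dict[str, Point]:
    """Mean feature vector of each class."""
    return {
        label: (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))
        for label, pts in labeled.items()
    }


def predict(point: Point, cents: Dict[str, Point]) -> Tuple[str, float]:
    """Return (nearest class, margin).

    The margin is the gap between the distances to the two nearest
    centroids, so at least two classes are required.
    """
    dists = sorted((math.dist(point, c), label) for label, c in cents.items())
    return dists[0][1], dists[1][0] - dists[0][0]


def self_train(
    labeled: Dict[str, List[Point]],
    unlabeled: List[Point],
    min_margin: float = 1.0,
    rounds: int = 5,
) -> Tuple[Dict[str, List[Point]], List[Point]]:
    """Repeatedly pseudo-label confident points and refit the centroids."""
    labeled = {label: list(pts) for label, pts in labeled.items()}  # do not mutate input
    pool = list(unlabeled)
    for _ in range(rounds):
        cents = centroids(labeled)
        confident, rest = [], []
        for p in pool:
            label, margin = predict(p, cents)
            (confident if margin >= min_margin else rest).append((p, label))
        if not confident:
            break  # nothing new to learn from
        for p, label in confident:
            labeled[label].append(p)
        pool = [p for p, _ in rest]
    return labeled, pool


# One hand-labeled example per (hypothetical) class, plus five unlabeled points.
seed = {"ordinary": [(0.0, 0.0)], "interesting": [(10.0, 10.0)]}
unlabeled = [(1.0, 0.5), (9.0, 9.5), (0.5, 1.5), (5.0, 5.0), (8.5, 10.5)]
grown, leftover = self_train(seed, unlabeled)
```

Points that stay in `leftover` are the ambiguous ones the model cannot place with enough margin. They are natural candidates for the next round of manual labeling, which is one way to read the "incremental improvements" item on the slide.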