Single-cell Data on Polly
Polly by Elucidata
Elucidata’s data harmonization platform, Polly, delivers the highest-quality
single-cell data to fit diverse analysis methods & pipelines. All
datasets are Polly Verified, i.e., harmonized with a configurable, granular
& transparent curation process.
Streamlined Journey to Improving Quality of Single-cell Data
Data at Source: Tabular file (MTX, CSV), Txt File
● 50% Missing annotations
● <2% Harmonized
● Different access nuances
● Formats vary across datasets & samples
Polly Harmonization
● Processing
● Metadata Harmonization
● Cell Annotation
● Quality Assurance
Data on Polly
● <1% Missing annotations
● 100% Harmonized
● 4X New fields added
● Consistent H5AD format
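As a rough illustration of what the metadata harmonization step involves, the sketch below maps free-text source labels onto a small controlled vocabulary and measures how many annotation slots remain empty. The vocabulary, the field names and the `harmonize` and `missing_fraction` helpers are hypothetical and are not Polly's actual curation process or ontology.

```python
# Hypothetical controlled vocabulary: free-text source label -> harmonized term.
# These mappings are illustrative only, not Polly's actual ontology.
SYNONYMS = {
    "tissue": {"colon": "colon", "large intestine": "colon", "pbmc": "blood"},
    "disease": {
        "uc": "ulcerative colitis",
        "ulcerative colitis": "ulcerative colitis",
        "healthy": "normal",
        "control": "normal",
    },
}


def harmonize(record):
    """Return a new record with every known field mapped to a controlled term.

    Missing or unrecognized values become None so that gaps stay visible.
    """
    out = {}
    for field, vocab in SYNONYMS.items():
        raw = record.get(field)
        key = raw.strip().lower() if isinstance(raw, str) else None
        out[field] = vocab.get(key)
    return out


def missing_fraction(records):
    """Fraction of (record, field) slots that are still unannotated."""
    slots = [value for rec in records for value in rec.values()]
    return sum(v is None for v in slots) / len(slots) if slots else 0.0


# Toy source metadata with inconsistent spelling and one missing field.
source = [
    {"tissue": "Large Intestine", "disease": "UC"},
    {"tissue": "colon ", "disease": "Control"},
    {"tissue": "PBMC"},  # disease not reported at source
]
harmonized = [harmonize(r) for r in source]
```

Tracking the missing fraction before and after curation is one simple way to quantify figures like the "missing annotations" percentages quoted on this slide.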
Single-Cell Data Options on Polly
Raw Counts
● What is it? Raw unfiltered counts extracted from the source, cleaned and metadata annotated
● Useful for: Re-processing and annotating data with in-house pipelines
● Output file(s): Unfiltered Raw Counts with 30 metadata fields (H5AD)
Polly Processed Counts
● What is it? Harmonized single-cell data, consistently processed & cell type annotated using a validated Polly Pipeline
● Useful for: Making data comparable & interoperable for large-scale comparative analyses
● Output file(s): Polly Processed Counts with cell type annotations and 32 other fields (H5AD); Raw Counts with 30 fields (H5AD)
Author Processed Counts
● What is it? Single-cell data that is processed & cell type annotated using author-provided parameters
● Useful for: Replicating a published study of interest
● Output file(s): Author Processed Counts with cell type annotations & 32 other fields (H5AD); Raw Counts with 30 fields (H5AD)
Why Access Single-Cell Data on Polly?
Data You Can Trust
● ~50 QA checks performed on all data/metadata to ensure quality and provenance.
Complete Transparency
● Learn how each dataset was processed and annotated with comprehensive QA reports.
Customizable Harmonization
● Request custom metadata fields or cell type annotation with your own markers.
Flexible Ways to Consume Data
● Work with Polly’s data on tools and environments of your choice. No download restrictions applied!
How We Deliver: Data Concierge
Data Audits
● Experts identify datasets relevant to your research on/off Polly
● Requirement gathering for curation & processing of found data
Store in your Atlas
● Domain specific repository of Analysis-Ready data
● All datasets are QC-ed, Custom Curated & Polly Verified
Exploration and Analysis
● Explore on Polly via CellxGene
● Download data with Polly’s APIs or GUI, explore on tools of choice
● Customized solutions as service: GSEA, Knowledge Graphs, ML
Classifiers and Dashboards for analysis & visualization
Target Identification & Validation with Curated Single-cell Data: Case Study
About the Customer
The customer is an early-stage therapeutics startup based in Boston that is developing biologics for inflammatory and
autoimmune diseases. The company was looking to identify potential targets for these indications.
Objective
Find and integrate single-cell datasets specific to inflammatory diseases from public sources.
Perform meta-analysis to arrive at fibroblast-specific gene targets for further exploration.
Finding Relevant Datasets
How Was the Data Processed?
● Data at Source: mtx, csv, tsv, h5ad, Seurat, h5
● Unfiltered Raw Counts: H5AD files with HUGO symbols, QC metrics, curated metadata fields
● Filtering & Normalization: Consistent filtering criteria, normalization & batch effect correction
● Cell Type Annotation: Marker list from publications to derive cell annotations
● Store on Atlas: H5AD with curated metadata and consistently annotated cells
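The filtering, normalization and marker-based annotation steps on this slide can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the case study: the count threshold, the 10,000-count scaling target, the toy cells and the two-marker lists are all hypothetical, and batch effect correction is not shown.

```python
import math


def filter_cells(counts, min_total=3):
    """Keep cells whose total count reaches min_total (hypothetical threshold)."""
    return {
        cell: genes
        for cell, genes in counts.items()
        if sum(genes.values()) >= min_total
    }


def normalize(counts, target=10_000):
    """Scale each cell to `target` total counts, then apply log1p."""
    out = {}
    for cell, genes in counts.items():
        total = sum(genes.values())
        if total == 0:
            continue  # nothing to scale; such cells should be filtered first
        out[cell] = {g: math.log1p(c / total * target) for g, c in genes.items()}
    return out


def annotate(expr, markers):
    """Label each cell with the type whose markers have the highest mean expression."""
    labels = {}
    for cell, genes in expr.items():
        scores = {
            ctype: sum(genes.get(g, 0.0) for g in gene_list) / len(gene_list)
            for ctype, gene_list in markers.items()
        }
        labels[cell] = max(scores, key=scores.get)
    return labels


# Illustrative marker lists; a real list would come from publications.
MARKERS = {"Fibroblasts": ["COL1A1", "DCN"], "T cells": ["CD3E", "CD2"]}

# Toy raw counts: cell -> {gene: count}.
raw = {
    "cell_1": {"COL1A1": 40, "DCN": 25, "CD3E": 1},
    "cell_2": {"CD3E": 30, "CD2": 12, "COL1A1": 2},
    "cell_3": {"DCN": 1},  # too few counts, removed by the filter
}
labels = annotate(normalize(filter_cells(raw)), MARKERS)
```

Applying the same thresholds and marker lists to every dataset is what makes the resulting annotations comparable across studies.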
Meta-Analysis for Target Identification and Validation
● 13 datasets identified and 3 datasets merged
● Differential expression analysis of merged data to get top 250 DEGs
● Refine results to top 20 genes with RF model and point biserial scores
● Examine expression and narrow down to 10 genes
● Review literature and perform pathway analysis to arrive at 5 targets
[Figure: UMAP (UMAP 1 vs. UMAP 2) of integrated cell types (B cells, T cells, Myeloid cells, Plasma, Stem or Enterocyte cells, Mast cells, Vascular cells, Fibroblasts); expression of Gene 1 in diseased vs. normal fibroblasts and other cells]
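The point biserial score mentioned in the refinement step measures how strongly a gene's expression tracks a binary label such as diseased versus normal. The sketch below computes it from the standard formula and ranks genes by absolute score. The gene names, expression values and `rank_genes` helper are hypothetical, and the RF model step is not shown.

```python
import math


def point_biserial(values, labels):
    """Point-biserial correlation between a continuous variable and a 0/1 label.

    r_pb = (M1 - M0) / s_n * sqrt(n1 * n0 / n^2), where s_n is the
    population standard deviation of all values.
    """
    ones = [v for v, lab in zip(values, labels) if lab == 1]
    zeros = [v for v, lab in zip(values, labels) if lab == 0]
    n, n1, n0 = len(values), len(ones), len(zeros)
    if n1 == 0 or n0 == 0:
        raise ValueError("both groups need at least one observation")
    mean_all = sum(values) / n
    s_n = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    if s_n == 0:
        return 0.0  # constant expression carries no signal
    m1, m0 = sum(ones) / n1, sum(zeros) / n0
    return (m1 - m0) / s_n * math.sqrt(n1 * n0 / n ** 2)


def rank_genes(expression, labels, top_k):
    """Return the top_k genes by absolute point-biserial score."""
    scores = {gene: point_biserial(vals, labels) for gene, vals in expression.items()}
    return sorted(scores, key=lambda g: abs(scores[g]), reverse=True)[:top_k]


# Toy data: 1 = diseased fibroblast, 0 = normal fibroblast.
labels = [1, 1, 1, 0, 0, 0]
expression = {
    "GENE_A": [5, 6, 7, 1, 2, 1],  # higher in diseased cells
    "GENE_B": [3, 2, 3, 3, 2, 3],  # no difference between groups
    "GENE_C": [1, 1, 2, 4, 5, 4],  # lower in diseased cells
}
top_genes = rank_genes(expression, labels, top_k=2)
```

Ranking by absolute value keeps genes that are strongly down-regulated in disease as well as those that are up-regulated.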
Impact
Single Cell Data Curation
● 156 scRNA-Seq datasets specific to inflammatory diseases were identified and annotated with relevant metadata information
Target Identification & Validation
● Shortlisted 4 novel targets and validated 5 pre-identified targets using meta-analysis
Time Savings
● 4X acceleration in the target identification process (from 8-10 months to 2.5 months)
Reach out to us at info@elucidata.io or Book a Demo
with us to learn more.
