The Role of The Statisticians in
Personalized Medicine:
An Overview of Statistical
Methods in Bioinformatics
Setia Pramana
Teknik Fisika
Fakultas Teknik Industri
Institut Teknologi Sepuluh Nopember
Surabaya, 12 March 2014
Educational Background
• Universitas Brawijaya Malang, FMIPA, Statistics
department, 1995-1999.
• Hasselt Universiteit, Belgium, MSc in Applied Statistics
2005-2006.
• Hasselt Universiteit, Belgium, MSc in Biostatistics 2006-
2007.
• Hasselt Universiteit, Belgium, PhD Statistical
Bioinformatics, 2007-2011.
• Medical Epidemiology And Biostatistics Dept. Karolinska
Institutet, Sweden, Postdoctoral, 2011-2014
Now?
• Lecturer and Researcher at Sekolah Tinggi Ilmu
Statistik, Jakarta.
• Adjunct Faculty at Medical Epidemiology and
Biostatistics Dept, Karolinska Institutet, Stockholm.
Outline
• Personalized Medicine
• Central Dogma
• Microarray Data Analysis
• Next Generation Sequencing
• Summary
Personalized Medicine
• Drug Development:
– Takes 10-15 years
– Costs millions of USD
• Who: Pharmaceutical, biotechnology, device companies,
Universities and government research agencies
• Regulatory: The US Food and Drug Administration (FDA)
• Evaluate:
– Safety – can people take it?
– Efficacy – does it do anything in humans?
– Effectiveness – is it better or at least as good as what is
currently available?
– Do the benefits outweigh the risks?
Personalized Medicine
• Drug Development Stages:
- Drug Discovery
- Pre-clinical Development
- Clinical Development (4 phases)
• Statisticians are involved in all stages
• Stages are highly regulated
• Results are based on the majority of patients
• But patients are all different!
Patient Heterogeneity
• We’re all different in
- Physiological, demographic characteristics
- Medical history
- Genetic/genomic characteristics
• What works for a patient with one set of
characteristics might not work for another!
• “One size does not fit all”
• Use a patient’s characteristics to determine best
treatment for him/her
• Genomic information has great potential
→ Personalized medicine:
“The right treatment for the right patient at the right time”
Subgroup identification and targeted treatment
• Determine subgroups of patients who share certain
characteristics and would get better on a particular
treatment
• Discover biomarkers which can identify the subgroup
• Focus on finding and treating a subgroup
• Genotype: mutations/SNPs, gene/protein expression, epigenetics
• Phenotype: diseases, disability, etc.
• Intervention: drug regimes
• Outcome: personalized medicine
Advanced Biomedical Technologies
• High-throughput microarrays and molecular imaging
to monitor SNPs, gene and protein expressions
• Next-Generation Sequencing
Central Dogma
http://compbio.pbworks.com
Gene
• The full DNA sequence of an organism is called its
genome
• A gene is a segment of DNA that specifies the sequence of
one or more proteins.
Genomics
• The study of all the genes of a cell or tissue at:
– the DNA level (genotype), e.g., GWAS, SNPs, CNVs, etc.
– the mRNA level (transcriptomics), i.e., gene expression,
– or the protein level (proteomics).
• Functional Genomics: study the functionality of specific
genes, their relations to diseases, their associated
proteins and their participation in biological processes.
Microarray
• DNA microarrays are a biotechnology that
allows the expression of thousands of genes
to be monitored at the same time.
Applications
• Drugs with high efficacy and few or no side effects
• Disease-related genes
• Biological discovery
– new and better molecular diagnostics
– new molecular targets for therapy
– finding and refining biological pathways
• Molecular diagnosis of leukemia, breast cancer, etc.
• Appropriate treatment for a given genetic signature
• Potential new drug targets
Microarray
Overview of the process of generating high-throughput gene expression data using microarrays.
The Pipeline
• Experiment design → Lab work → Image processing
• Signal summarization (RMA, GCRMA)
• Normalization
• Data Analysis:
– Differentially Expressed genes
– Clustering
– Classification
– Etc.
• Network / pathway analysis (GSEA, etc.)
• Biological interpretations
Microarray Data Structure
Preprocessed Data
Genes   C1    C2    C3    T1    T2    T3
G8522   6.78  6.55  6.37  6.89  6.78  6.92
G8523   6.52  6.61  6.72  6.51  6.59  6.46
G8524   5.67  5.69  5.88  7.43  7.16  7.31
G8525   5.64  5.91  5.61  7.41  7.49  7.41
G8526   4.63  4.85  5.72  5.71  5.47  5.79
G8528   7.81  7.58  7.24  7.79  7.38  8.60
G8529   4.26  4.20  4.82  3.11  4.94  3.08
G8530   7.36  7.45  7.31  7.46  7.53  7.35
G8531   5.30  5.36  5.70  5.41  5.73  5.77
G8532   5.84  5.48  5.93  5.84  5.73  5.75
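As a rough illustration of how such a table feeds a gene-by-gene analysis, the sketch below computes a two-sample t statistic per gene and ranks genes by its absolute value. It assumes that C1-C3 are control arrays and T1-T3 are treated arrays, which the slide does not state explicitly; the function names are illustrative only, and the moderated tests listed later in the talk (SAM, limma) would normally be preferred with so few replicates.

```python
from math import sqrt
from statistics import mean, variance

# Preprocessed expression values copied from the table above.
# Assumption: the first triple is C1-C3 (control), the second is T1-T3 (treated).
DATA = {
    "G8522": ([6.78, 6.55, 6.37], [6.89, 6.78, 6.92]),
    "G8523": ([6.52, 6.61, 6.72], [6.51, 6.59, 6.46]),
    "G8524": ([5.67, 5.69, 5.88], [7.43, 7.16, 7.31]),
    "G8525": ([5.64, 5.91, 5.61], [7.41, 7.49, 7.41]),
    "G8526": ([4.63, 4.85, 5.72], [5.71, 5.47, 5.79]),
    "G8528": ([7.81, 7.58, 7.24], [7.79, 7.38, 8.60]),
    "G8529": ([4.26, 4.20, 4.82], [3.11, 4.94, 3.08]),
    "G8530": ([7.36, 7.45, 7.31], [7.46, 7.53, 7.35]),
    "G8531": ([5.30, 5.36, 5.70], [5.41, 5.73, 5.77]),
    "G8532": ([5.84, 5.48, 5.93], [5.84, 5.73, 5.75]),
}


def t_statistic(group_a, group_b):
    """Two-sample t statistic (Welch form) for the mean difference b - a."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_b) - mean(group_a)) / se


def rank_genes(data):
    """Return (gene, t) pairs sorted by decreasing absolute t statistic."""
    stats = [(gene, t_statistic(c, t)) for gene, (c, t) in data.items()]
    return sorted(stats, key=lambda pair: abs(pair[1]), reverse=True)


if __name__ == "__main__":
    for gene, t in rank_genes(DATA):
        print(f"{gene}  t = {t:6.2f}")
```

With these values, G8525 and G8524 come out on top, which matches what the eye sees in the table: both are clearly higher in the T columns than in the C columns.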
Challenges
• Massive data that are difficult to visualize
• Too few samples (columns), usually < 100
• Too many genes (rows), usually > 10,000
• Testing so many genes is likely to produce false positives
• For exploration, a large set of all relevant genes is
desired
• For diagnostics or identification of therapeutic
targets, the smallest set of genes is needed
• Model needs to be explainable to biologists
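The false-positive problem can be made concrete with a small worked calculation. The numbers below are illustrative assumptions (m = 10,000 genes, each tested at α = 0.05), not figures from a specific study. If V is the number of false positives and m_0 ≤ m is the number of genes that are truly not differentially expressed, then:

```latex
\mathbb{E}[V] = m_0\,\alpha \le m\,\alpha = 10{,}000 \times 0.05 = 500
```

So even if no gene were truly differentially expressed, about 500 genes would be expected to pass an unadjusted 5% test, which is why multiple testing adjustment is part of the pipeline.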
Microarray Data Analysis Types
• Gene Selection
–find genes for therapeutic targets
• Classification (Supervised)
–identify disease (biomarker study)
–predict outcome / select best treatment
• Clustering (Unsupervised)
–find new biological classes / refine existing ones
–Understanding regulatory relationship/pathway
–exploration
Gene Selection
• Modified t-test
• Significance Analysis of Microarray (SAM)
• Limma (Linear model for microarrays )
• Linear Mixed model
• Lasso (least absolute shrinkage and selection operator)
• Elastic net
• Etc.
Visualization
• Dimensionality reduction
• PCA (Principal Component Analysis)
• Biplot
• Heatmap
• Multi dimensional scaling
• Etc
Clustering
• Cluster the genes
• Cluster the arrays/conditions
• Cluster both simultaneously
• K-means
• Hierarchical
• Biclustering algorithms
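To make the K-means idea concrete, here is a minimal sketch of Lloyd's algorithm in plain Python that clusters genes by their expression profiles. The four profiles are taken from the preprocessed data table earlier in the talk; seeding the centroids with the first k points is a simplifying assumption for reproducibility, whereas real analyses typically use random restarts.

```python
from math import dist


def k_means(points, k, n_iter=100):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Expression profiles (C1, C2, C3, T1, T2, T3) from the preprocessed data table.
GENES = ["G8524", "G8530", "G8525", "G8528"]
PROFILES = [
    (5.67, 5.69, 5.88, 7.43, 7.16, 7.31),
    (7.36, 7.45, 7.31, 7.46, 7.53, 7.35),
    (5.64, 5.91, 5.61, 7.41, 7.49, 7.41),
    (7.81, 7.58, 7.24, 7.79, 7.38, 8.60),
]

if __name__ == "__main__":
    labels, _ = k_means(PROFILES, k=2)
    for gene, label in zip(GENES, labels):
        print(gene, "-> cluster", label)
```

Here G8524 and G8525 (lower in the C columns, higher in the T columns) end up together, while G8530 and G8528 (high throughout) form the other cluster. Passing the transposed matrix instead would cluster the arrays rather than the genes.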
Clustering
• Cluster or classify genes according to tumors
• Cluster tumors according to genes
Classification
• Linear Discriminant Analysis
• K-nearest neighbour
• Logistic regression
• L1-penalized logistic regression
• Neural Network
• Support vector machines
• Random forest
• etc
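As one example from the list above, a K-nearest-neighbour classifier can be sketched in a few lines. The training profiles and class labels below are made-up toy values for illustration only, not data from the talk, and `knn_predict` is a hypothetical helper name.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest samples."""
    neighbours = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy two-gene expression profiles (hypothetical values) with known classes.
TRAIN_X = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (3.0, 3.2), (3.1, 2.9), (2.9, 3.0)]
TRAIN_Y = ["A", "A", "A", "B", "B", "B"]

if __name__ == "__main__":
    print(knn_predict(TRAIN_X, TRAIN_Y, (1.0, 1.0)))
    print(knn_predict(TRAIN_X, TRAIN_Y, (3.0, 3.0)))
```

In practice the expression values would be normalized first, and k would be tuned by cross-validation.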
Classification of Malaria Subtypes
Aim: To improve understanding of host protein profiles during disease progression, especially in children.
• Identify a panel of proteins that can distinguish between different subtypes.
• Implement L1-penalized logistic regression.
Penalized Logistic Regression
• Logistic regression is a supervised method for binary or multi-class classification.
• In high-dimensional data (e.g., microarray): more variables than observations → classical logistic regression does not work.
• Other problems: correlated variables (multicollinearity) and overfitting.
• Solution: introduce a penalty for model complexity.
Penalized Logistic Regression
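Logistic model:
$$\log\frac{\pi_i}{1-\pi_i} = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij}, \qquad \pi_i = P(y_i = 1 \mid x_i)$$
Maximize the log-likelihood:
$$\ell(\beta) = \sum_{i=1}^{n} \left[ y_i \log \pi_i + (1 - y_i) \log(1 - \pi_i) \right]$$
• $L_1$-penalization (Lasso): maximize the penalized log-likelihood
$$\ell_\lambda(\beta) = \ell(\beta) - \lambda \sum_{j=1}^{p} |\beta_j|$$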
• Shrinks all regression coefficients (β) toward zero and sets some of them exactly to zero.
• Performs parameter estimation and variable selection at the same time.
• The choice of λ is crucial; it is chosen via a k-fold cross-validation procedure.
• The procedure is implemented in the R package penalized.
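The shrink-and-select behaviour described above can be illustrated with a small, self-contained sketch. This is not the algorithm used by the R package penalized; it is a plain proximal-gradient (ISTA) fit that minimizes the mean negative log-likelihood plus λ Σ|β_j|, on made-up toy data in which only the first feature carries signal. The value of λ is fixed by hand here rather than chosen by cross-validation, and all names are illustrative.

```python
import math


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, lam, step=0.1, n_iter=2000):
    """L1-penalized logistic regression via proximal gradient descent (ISTA).

    The intercept is not penalized; `lam` applies to the mean log-likelihood.
    """
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad0, grad = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            eta = intercept + sum(b * x for b, x in zip(beta, xi))
            resid = 1.0 / (1.0 + math.exp(-eta)) - yi  # fitted prob. minus label
            grad0 += resid / n
            for j in range(p):
                grad[j] += resid * xi[j] / n
        intercept -= step * grad0
        # Gradient step followed by soft-thresholding of each coefficient.
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return intercept, beta


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[-2.0, 0.5], [-1.0, -0.5], [-1.5, 0.3], [-0.5, -0.2],
     [0.5, 0.4], [1.0, -0.3], [1.5, -0.4], [2.0, 0.2]]
Y = [0, 0, 0, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    intercept, beta = fit_l1_logistic(X, Y, lam=0.4)
    print("intercept:", round(intercept, 3), "beta:", [round(b, 3) for b in beta])
```

With this fairly strong penalty the coefficient of the noise feature is exactly zero while the informative feature keeps a positive, shrunken coefficient, which is the simultaneous estimation and selection the slide refers to.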
L1 Penalized Logistic Regression
Classification of Severe Malaria Anemia vs.
Uncomplicated Malaria group
AUC: 0.86
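For readers unfamiliar with the AUC, the sketch below shows one common way to compute it: the fraction of (positive, negative) pairs in which the positive sample receives the higher predicted score, with ties counted as one half. The scores are made-up illustration values and this is not the computation behind the 0.86 reported above.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical predicted probabilities and true classes.
    print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two groups.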
Dose-response Microarray Studies
Implemented in the R packages IsoGene and IsoGeneGUI.
Gene Signature for Prostate Cancer
Next Generation Sequencing
Reading the order of bases of DNA fragments
NGS used for:
• Whole genome re-sequencing
• Metagenomics
• Cancer genomics
• Exome sequencing (targeted)
• RNA-sequencing
• ChIP-seq
• Genomic Epidemiology
• Produces massive amounts of data quickly
• The bottlenecks are storage and analysis
RNA-seq Pipeline
• Align reads to a reference genome using TopHat.
Source: Trapnell et al., 2010
Pramana et al., NBBC 2013
RNA-seq Pipeline
• Measure gene expression using Cufflinks: FPKM
(Fragments Per Kilobase of transcript per Million
mapped reads).
[Figure: a reference gene with Transcript 1 and Transcript 2, showing isoform/transcript FPKM and gene FPKM for Samples 1-3]
Source: Trapnell et al., 2013
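The FPKM unit follows directly from its name, as the sketch below shows. It applies the plain definition to a raw fragment count; Cufflinks itself estimates transcript abundances with a statistical model before reporting FPKM, so this is only a way to see what the unit means. The function name and the example numbers are illustrative assumptions.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments."""
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions)


if __name__ == "__main__":
    # Hypothetical example: 500 fragments on a 2,000 bp transcript
    # in a library of 20 million mapped fragments.
    print(fpkm(500, 2_000, 20_000_000))
```

Dividing by transcript length and library size is what makes values comparable across transcripts of different lengths and samples of different sequencing depth.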
Subtype-specific Transcripts/Isoforms
• Breast invasive carcinoma (BRCA) from the Cancer
Genome Atlas Project (TCGA).
• 329 tumor samples.
• Platform: Illumina
• Paired-end reads (length 50 bp).
• 20-100 million reads
Subtype-specific Transcripts/Isoforms
• To discover transcripts/isoforms that are significantly (high/low)
expressed only in a certain cancer subtype.
Analysis Flow
• 329 TCGA samples
– Discovery set: 179 samples
– Validation set: 150 TCGA samples plus external samples
• Classification into molecular subtypes
– Use Swedish microarray data as training data
– Based on gene-level FPKM
– Median and variance normalization
– K-nearest neighbor
– Classifier gene selection
• Subtype-specific transcripts
– Transcript-level FPKM of all genes
– For each transcript: robust contrast tests
– Multiple testing adjustment
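The slide does not say which multiple testing adjustment was used. As one common choice, the sketch below implements the Benjamini-Hochberg procedure, which controls the false discovery rate across the per-transcript tests. The p-values in the example are made up for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values for four transcripts.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A transcript would then be called subtype-specific when its adjusted p-value falls below the chosen false discovery rate, for example 0.05.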
Software?
• R is growing rapidly, especially in bioinformatics
– Statistics, data analysis, machine learning
– Free
– High Quality
– Open Source
– Extendable (you can submit and publish your own package!!)
– Can be integrated with other languages (C/C++, Java, Python)
– Large active user community
– Command-line based (a drawback)
Summary
• Statistics plays an important role in developing
personalized medicine.
• Multidisciplinary field → needs collaboration with
different experts.
• Bioinformatician is one of the sexiest jobs.
• Big Data in medicine: numerous opportunities to be
explored and discovered.
Thank you for your attention….
Survival Data Analysis for Sekolah Tinggi Ilmu Statistik Jakarta
 
“Big Data” and the Challenges for Statisticians
“Big Data” and the  Challenges for Statisticians“Big Data” and the  Challenges for Statisticians
“Big Data” and the Challenges for Statisticians
 
Getting a Scholarship, how?
Getting a Scholarship, how?Getting a Scholarship, how?
Getting a Scholarship, how?
 
Kehidupan sehari-hari dengan Personnummer atau SIN Single Identity Number
Kehidupan sehari-hari dengan Personnummer atau SIN Single Identity NumberKehidupan sehari-hari dengan Personnummer atau SIN Single Identity Number
Kehidupan sehari-hari dengan Personnummer atau SIN Single Identity Number
 
Research possibilities with the Personal Identification Number (person nummer...
Research possibilities with the Personal Identification Number (person nummer...Research possibilities with the Personal Identification Number (person nummer...
Research possibilities with the Personal Identification Number (person nummer...
 
Developing R Graphical User Interfaces
Developing R Graphical User InterfacesDeveloping R Graphical User Interfaces
Developing R Graphical User Interfaces
 
Academia vs industry
Academia vs industryAcademia vs industry
Academia vs industry
 
Gene sebuah nikmat Allah
Gene sebuah nikmat AllahGene sebuah nikmat Allah
Gene sebuah nikmat Allah
 
Model averaging in dose-response study in microarray expression
Model averaging in dose-response study in microarray expressionModel averaging in dose-response study in microarray expression
Model averaging in dose-response study in microarray expression
 
Dose-Response Modeling of Gene Expression Data in pre-clinical Microarray Exp...
Dose-Response Modeling of Gene Expression Data in pre-clinical Microarray Exp...Dose-Response Modeling of Gene Expression Data in pre-clinical Microarray Exp...
Dose-Response Modeling of Gene Expression Data in pre-clinical Microarray Exp...
 
IsoGeneGUI
IsoGeneGUIIsoGeneGUI
IsoGeneGUI
 

Recently uploaded

How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17Celine George
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSMae Pangan
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management SystemChristalin Nelson
 
Congestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationCongestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationdeepaannamalai16
 
How to Manage Engineering to Order in Odoo 17
How to Manage Engineering to Order in Odoo 17How to Manage Engineering to Order in Odoo 17
How to Manage Engineering to Order in Odoo 17Celine George
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQuiz Club NITW
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4JOYLYNSAMANIEGO
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfPrerana Jadhav
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management systemChristalin Nelson
 
Multi Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP ModuleMulti Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP ModuleCeline George
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxkarenfajardo43
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Projectjordimapav
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfVanessa Camilleri
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONHumphrey A Beña
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdfMr Bounab Samir
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parentsnavabharathschool99
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptxmary850239
 

Recently uploaded (20)

How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17
 
prashanth updated resume 2024 for Teaching Profession
prashanth updated resume 2024 for Teaching Professionprashanth updated resume 2024 for Teaching Profession
prashanth updated resume 2024 for Teaching Profession
 
Textual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHSTextual Evidence in Reading and Writing of SHS
Textual Evidence in Reading and Writing of SHS
 
Transaction Management in Database Management System
Transaction Management in Database Management SystemTransaction Management in Database Management System
Transaction Management in Database Management System
 
Congestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationCongestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentation
 
How to Manage Engineering to Order in Odoo 17
How to Manage Engineering to Order in Odoo 17How to Manage Engineering to Order in Odoo 17
How to Manage Engineering to Order in Odoo 17
 
Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"Mattingly "AI & Prompt Design: Large Language Models"
Mattingly "AI & Prompt Design: Large Language Models"
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
 
Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4Daily Lesson Plan in Mathematics Quarter 4
Daily Lesson Plan in Mathematics Quarter 4
 
Narcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdfNarcotic and Non Narcotic Analgesic..pdf
Narcotic and Non Narcotic Analgesic..pdf
 
Concurrency Control in Database Management system
Concurrency Control in Database Management systemConcurrency Control in Database Management system
Concurrency Control in Database Management system
 
Multi Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP ModuleMulti Domain Alias In the Odoo 17 ERP Module
Multi Domain Alias In the Odoo 17 ERP Module
 
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptxGrade Three -ELLNA-REVIEWER-ENGLISH.pptx
Grade Three -ELLNA-REVIEWER-ENGLISH.pptx
 
ClimART Action | eTwinning Project
ClimART Action    |    eTwinning ProjectClimART Action    |    eTwinning Project
ClimART Action | eTwinning Project
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdf
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx
 

Statistical Methods Guide Personalized Medicine

  • 1. The Role of The Statisticians in Personalized Medicine: An Overview of Statistical Methods in Bioinformatics Setia Pramana Teknik Fisika Fakultas Teknik Industri Institut Teknologi Sepuluh Nopember Surabaya, 12 March 2014 Setia Pramana 1
  • 2. Educational Background • Universitas Brawijaya Malang, FMIPA, Statistics department, 1995-1999. • Hasselt Universiteit, Belgium, MSc in Applied Statistics, 2005-2006. • Hasselt Universiteit, Belgium, MSc in Biostatistics, 2006-2007. • Hasselt Universiteit, Belgium, PhD in Statistical Bioinformatics, 2007-2011. • Medical Epidemiology and Biostatistics Dept., Karolinska Institutet, Sweden, Postdoctoral, 2011-2014.
  • 3. Now? • Lecturer and Researcher at Sekolah Tinggi Ilmu Statistik, Jakarta. • Adjunct Faculty at Medical Epidemiology and Biostatistics Dept., Karolinska Institutet, Stockholm.
  • 4. Outline • Personalized Medicine • Central Dogma • Microarray Data Analysis • Next Generation Sequencing • Summary
  • 5. Personalized Medicine • Drug Development: – Takes 10-15 years – Costs millions of USD • Who: Pharmaceutical, biotechnology and device companies, universities and government research agencies • Regulatory: The US Food and Drug Administration (FDA) • Evaluate: – Safety – can people take it? – Efficacy – does it do anything in humans? – Effectiveness – is it better than, or at least as good as, what is currently available? – Do the benefits outweigh the risks?
  • 6. Personalized Medicine • Drug Development Stages: – Drug Discovery – Pre-clinical Development – Clinical Development (4 phases) • Statisticians are involved in all stages • The stages are highly regulated • Results are based on the majority of patients • But patients are all different!
  • 8. Patients Heterogeneity • We’re all different in – Physiological and demographic characteristics – Medical history – Genetic/genomic characteristics • What works for a patient with one set of characteristics might not work for another!
  • 9. Patients Heterogeneity • “One size does not fit all” • Use a patient’s characteristics to determine the best treatment for him/her • Genomic information has great potential → Personalized medicine: “The right treatment for the right patient at the right time”
  • 10. Subgroup identification and targeted treatment • Determine subgroups of patients who share certain characteristics and would get better on a particular treatment • Discover biomarkers which can identify the subgroup • Focus on finding and treating a subgroup
  • 11. Subgroup identification and targeted treatment • [Diagram labels: Genotype, Phenotype, Intervention, Outcome; mutations/SNPs, gene/protein expression, epigenetics; diseases, disability, etc.; drug regimes; personalized medicine]
  • 12. Advanced Biomedical Technologies • High-throughput microarrays and molecular imaging to monitor SNPs and gene and protein expression • Next-Generation Sequencing
  • 15. Gene • The full DNA sequence of an organism is called its genome • A gene is a segment of DNA that specifies the sequence of one or more proteins.
  • 16. Genomics • The study of all the genes of a cell or tissue at: – the DNA level (genotype), e.g., GWAS, SNP, CNV, etc. – the mRNA level (transcriptomics), i.e., gene expression – or the protein level (proteomics). • Functional Genomics: the study of the functionality of specific genes, their relations to diseases, their associated proteins and their participation in biological processes.
  • 17. Microarray • DNA microarrays are biotechnologies which allow the expression of thousands of genes to be monitored at once.
  • 18. Applications • Drugs with high efficacy and few or no side effects • Disease-related genes • Biological discovery – new and better molecular diagnostics – new molecular targets for therapy – finding and refining biological pathways • Molecular diagnosis of leukemia, breast cancer, etc. • Appropriate treatment for a genetic signature • Potential new drug targets
  • 19. Microarray • Overview of the process of generating high-throughput gene expression data using microarrays.
  • 20. The Pipeline • Experiment design → Lab work → Image processing • Signal summarization (RMA, GCRMA) • Normalization • Data Analysis: – Differentially expressed genes – Clustering – Classification – Etc. • Network / Pathways (GSEA, etc.) • Biological interpretation
  • 22. Preprocessed Data
      Genes   C1    C2    C3    T1    T2    T3
      G8522   6.78  6.55  6.37  6.89  6.78  6.92
      G8523   6.52  6.61  6.72  6.51  6.59  6.46
      G8524   5.67  5.69  5.88  7.43  7.16  7.31
      G8525   5.64  5.91  5.61  7.41  7.49  7.41
      G8526   4.63  4.85  5.72  5.71  5.47  5.79
      G8528   7.81  7.58  7.24  7.79  7.38  8.60
      G8529   4.26  4.20  4.82  3.11  4.94  3.08
      G8530   7.36  7.45  7.31  7.46  7.53  7.35
      G8531   5.30  5.36  5.70  5.41  5.73  5.77
      G8532   5.84  5.48  5.93  5.84  5.73  5.75
  • 23. Challenges • Mega data, difficult to visualize • Too few records (columns/samples), usually < 100 • Too many rows (genes), usually > 10,000 • Testing so many genes is likely to produce false positives • For exploration, a large set of all relevant genes is desired • For diagnostics or identification of therapeutic targets, the smallest set of genes is needed • The model needs to be explainable to biologists
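The false-positive problem above is usually handled with a multiple testing adjustment. The slides do not say which adjustment was used, so the following is only a minimal sketch of one common choice, the Benjamini-Hochberg false discovery rate procedure. The function name and the four raw p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes (not taken from the slides).
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
```

Genes whose adjusted p-value falls below the chosen FDR level (for example 0.05) would be reported as differentially expressed.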
  • 24. Microarray Data Analysis Types • Gene Selection – find genes for therapeutic targets • Classification (Supervised) – identify disease (biomarker study) – predict outcome / select best treatment • Clustering (Unsupervised) – find new biological classes / refine existing ones – understand regulatory relationships/pathways – exploration
  • 25. Gene Selection • Modified t-test • Significance Analysis of Microarrays (SAM) • Limma (linear models for microarrays) • Linear mixed model • Lasso (least absolute shrinkage and selection operator) • Elastic net • Etc.
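As a rough illustration of gene selection, the sketch below ranks the ten genes from the "Preprocessed Data" slide by an ordinary Welch t-statistic (treated versus control). This is not the modified t-test, SAM or limma listed above: with only three arrays per group the per-gene variances are unstable, which is the reason moderated statistics are preferred in practice. The function and variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

# Expression values copied from the "Preprocessed Data" slide:
# three control arrays (C1-C3) followed by three treated arrays (T1-T3).
expression = {
    "G8522": [6.78, 6.55, 6.37, 6.89, 6.78, 6.92],
    "G8523": [6.52, 6.61, 6.72, 6.51, 6.59, 6.46],
    "G8524": [5.67, 5.69, 5.88, 7.43, 7.16, 7.31],
    "G8525": [5.64, 5.91, 5.61, 7.41, 7.49, 7.41],
    "G8526": [4.63, 4.85, 5.72, 5.71, 5.47, 5.79],
    "G8528": [7.81, 7.58, 7.24, 7.79, 7.38, 8.60],
    "G8529": [4.26, 4.20, 4.82, 3.11, 4.94, 3.08],
    "G8530": [7.36, 7.45, 7.31, 7.46, 7.53, 7.35],
    "G8531": [5.30, 5.36, 5.70, 5.41, 5.73, 5.77],
    "G8532": [5.84, 5.48, 5.93, 5.84, 5.73, 5.75],
}


def welch_t(control, treated):
    """Welch two-sample t-statistic (treated mean minus control mean)."""
    se = sqrt(variance(control) / len(control) + variance(treated) / len(treated))
    return (mean(treated) - mean(control)) / se


# One statistic per gene, then rank genes by absolute size of the statistic.
t_stats = {gene: welch_t(v[:3], v[3:]) for gene, v in expression.items()}
ranked = sorted(t_stats, key=lambda g: abs(t_stats[g]), reverse=True)
top_two = ranked[:2]
```

On this small table the two genes with a visibly higher treated mean, G8524 and G8525, come out on top.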
  • 26. Visualization • Dimensionality reduction • PCA (Principal Component Analysis) • Biplot • Heatmap • Multi dimensional scaling • Etc
  • 27. Clustering • Cluster the genes • Cluster the arrays/conditions • Cluster both simultaneously • K-means • Hierarchical • Biclustering algorithms
  • 28. Clustering • Cluster or Classify genes according to tumors • Cluster tumors according to genes
  • 30. Classification • Linear Discriminant Analysis • K nearest neighbour • Logistic regression • L1 Penalized Logistic regression • Neural Network • Support vector machines • Random forest • etc.
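Of the classifiers listed, k-nearest neighbour is the simplest to write down. The sketch below is a minimal version using Euclidean distance and a majority vote; the training points are the G8524 and G8525 values of the six arrays from the "Preprocessed Data" slide, while the query sample and the function name are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest samples."""
    # Sort training samples by Euclidean distance to the query.
    neighbours = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


# (G8524, G8525) expression per array, from the "Preprocessed Data" slide.
train_x = [(5.67, 5.64), (5.69, 5.91), (5.88, 5.61),
           (7.43, 7.41), (7.16, 7.49), (7.31, 7.41)]
train_y = ["control", "control", "control", "treated", "treated", "treated"]

# Hypothetical new sample to classify.
pred = knn_predict(train_x, train_y, (7.0, 7.1), k=3)
```

In a real study k and the set of classifier genes would be chosen by cross-validation rather than fixed in advance.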
  • 31. Aim: To improve understanding of host protein profiles during disease progression especially in children.
  • 32. Classification of Malaria Subtypes • Identify a panel of proteins which could distinguish between different subtypes. • Implement L1-penalized logistic regression.
  • 33. Penalized Logistic Regression • Logistic regression is a supervised method for binary or multi-class classification. • In high-dimensional data (e.g., microarray): more variables than observations → classical logistic regression does not work. • Other problems: variables are correlated (multicollinearity) and the model overfits. • Solution: introduce a penalty for complexity in the model.
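  • 34. Penalized Logistic Regression • Logistic model: $\log\frac{p_i}{1-p_i} = \beta_0 + \sum_{j} \beta_j x_{ij}$ • Maximize the log-likelihood: $\ell(\beta) = \sum_{i} \left[ y_i \log p_i + (1-y_i)\log(1-p_i) \right]$ • $L_1$-penalization (Lasso): maximize $\ell(\beta) - \lambda \sum_{j} |\beta_j|$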
  • 35. L1 Penalized Logistic Regression • Shrinks all regression coefficients (β) toward zero and sets some of them exactly to zero. • Performs parameter estimation and variable selection at the same time. • The choice of λ is crucial; it is chosen via a k-fold cross-validation procedure. • The procedure is implemented in an R package called penalized.
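To make the shrink-and-select behaviour concrete, here is a minimal sketch of L1-penalized logistic regression fitted by proximal gradient descent. It is not the algorithm of the R package penalized, and it minimises the mean negative log-likelihood plus `lam` times the sum of absolute coefficients, which matches the penalized likelihood above up to a rescaling of λ by the sample size. The toy data, the step size and the two `lam` values are assumptions; in practice λ would be chosen by k-fold cross-validation.

```python
from math import exp


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    e = exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, lam, step=0.1, n_iter=2000):
    """Proximal gradient fit; the intercept is left unpenalized."""
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Fitted probability minus observed label for every sample.
        resid = [sigmoid(intercept + sum(b * x for b, x in zip(beta, row))) - yi
                 for row, yi in zip(X, y)]
        intercept -= step * sum(resid) / n
        for j in range(p):
            grad_j = sum(r * row[j] for r, row in zip(resid, X)) / n
            # Gradient step followed by soft-thresholding (the Lasso part).
            beta[j] = soft_threshold(beta[j] - step * grad_j, step * lam)
    return intercept, beta


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
X = [(-2.0, 0.5), (-1.0, -0.5), (-1.5, 0.3), (1.0, -0.4), (2.0, 0.6), (1.5, -0.2)]
y = [0, 0, 0, 1, 1, 1]

_, beta_small = fit_l1_logistic(X, y, lam=0.05)  # mild penalty
_, beta_large = fit_l1_logistic(X, y, lam=5.0)   # penalty large enough to zero everything
```

With the large penalty every coefficient is set exactly to zero, while the mild penalty keeps a positive coefficient on the informative feature.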
  • 36. Classification of Severe Malaria Anemia vs. Uncomplicated Malaria group • AUC: 0.86
  • 38. Dose-response Microarray Studies • Implemented in the R packages IsoGene and IsoGeneGUI.
  • 40. Gene Signature for Prostate Cancer
  • 41. Gene Signature for Prostate Cancer
  • 42. Gene Signature for Prostate Cancer
  • 44. Next Generation Sequencing • Reading the order of bases of DNA fragments
  • 45. NGS used for: • Whole genome re-sequencing • Metagenomics • Cancer genomics • Exome sequencing (targeted) • RNA-sequencing • ChIP-seq • Genomic Epidemiology
  • 46. Next Generation Sequencing • Produces massive data, and fast • The problem is storage and analysis
  • 47. RNA-seq Pipeline • Align reads to a reference genome using TopHat. • [Figure label: Reference] • Pramana et al., NBBC 2013 • Source: Trapnell et al., 2010
  • 48. RNA-seq Pipeline • Measure gene expression using Cufflinks: FPKM (Fragments Per Kilobase of transcript per Million mapped reads). • [Figure labels: reference, gene, transcript 1, transcript 2; isoform/transcript FPKM and gene FPKM for samples 1-3] • Source: Trapnell et al., 2013
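The FPKM definition above is simple arithmetic once fragment counts are available. The sketch below shows only that normalisation step; Cufflinks itself assigns fragments to isoforms with a statistical model before normalising, which is not reproduced here. The function name and the example numbers are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments."""
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions)


# Hypothetical transcript: 500 fragments on a 2,000 bp transcript,
# in a library with 10 million mapped fragments.
value = fpkm(500, 2_000, 10_000_000)
```

Dividing by transcript length and library size makes values comparable across transcripts of different lengths and across samples sequenced at different depths.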
  • 50. Subtype-specific Transcripts/Isoforms • Breast invasive carcinoma (BRCA) from The Cancer Genome Atlas (TCGA) project. • 329 tumor samples. • Platform: Illumina • Paired-end reads (length 50 bp). • 20-100 million reads
  • 51. Subtype-specific Transcripts/Isoforms • To discover transcripts/isoforms that are significantly over- or under-expressed in only one cancer subtype.
  • 52. Analysis Flow • 329 TCGA samples: discovery set (179 samples); validation set (150 TCGA samples, plus external samples) • Classification into molecular subtypes: – Swedish microarray data used as training data – Based on gene-level FPKM – Median and variance normalization – K-nearest neighbor – Classifier gene selection • Subtype-specific transcripts: – Transcript-level FPKM of all genes – For each transcript: robust contrast tests – Multiple testing adjustment
  • 56. Software? • R is growing, especially in bioinformatics – Statistics, data analysis, machine learning – Free – High quality – Open source – Extendable (you can submit and publish your own package!) – Can be integrated with other languages (C/C++, Java, Python) – Large active user community – Command-based (a minus)
  • 57. Summary • Statistics plays an important role in developing personalized medicine • Multidisciplinary field → collaboration with different experts is needed. • Bioinformatician is one of the sexiest jobs • Big Data in Medicine: numerous opportunities to be explored and discovered.
  • 58. Thank you for your attention.