Maarten van Smeden, PhD
2 November 2020
Why the EPV≥10 sample size rule is rubbish
and what to use instead
M.vanSmeden@umcutrecht.nl | Twitter: @MaartenvSmeden | Why the EPV ≥ 10 rule is rubbish and what to use instead
• Statistician at Julius Center for Health Sciences and Primary Care
• Main interests (but not limited to):
• prognostic and diagnostic modeling
• measurement error
• missing data
Today’s topic:
EPV≥10 sample size rule (aka 1 in 10 rule) has been one of the leading
sample size rules in prognostic/diagnostic prediction modeling
Outline
• The EPV≥10 rule-of-thumb: where does it come from?
• Evidence the EPV≥10 rule has no rationale
• Evidence that sample size is important (even if you use the fancier methods)
• Actual sample size calculations for prediction models
Ever wondered if AD/BC gives the “best” estimate of the odds ratio?
What if I told you that AD/BC is biased?
Let’s say we have fitted a logistic regression model to a dataset, and obtain
$$\ln\left(\frac{p_i}{1 - p_i}\right) = \hat{\alpha} + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \cdots + \hat{\beta}_k X_{ki}$$
I’m very sorry, but $\hat{\beta}_1$ is a biased estimator, and $\hat{\beta}_2$ too, ….
…. actually they are all finite sample biased
Epidemiology text-books:
• Confounding bias
• Information bias
• Selection bias
… nothing about finite sample bias
Important: bias vs consistency
• Consistency ≈ as sample size increases, the estimate converges to the truth
• Unbiasedness ≈ over repeated samples of the same size, the average estimate equals the truth
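In symbols, as a sketch under assumed notation (neither symbol appears on the slides: $\theta$ is the true value and $\hat{\theta}_n$ the estimate from a sample of size $n$):

```latex
\text{Consistency: } \hat{\theta}_n \xrightarrow{\;p\;} \theta \quad \text{as } n \to \infty
\qquad\qquad
\text{Bias at sample size } n:\; \operatorname{E}\!\left[\hat{\theta}_n\right] - \theta
```

An estimator can be consistent and still have non-zero bias at every finite n.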
Log(odds) is consistent but finite sample biased
Illustration by simulation
• Simulate 4 normal covariates with equal multivariable log-odds-ratios of 2
• 1,000 simulated samples of N = 50
• Consistency: create 1,000 meta-datasets of increasing size: meta-dataset
r consists of all created datasets up to r
• Bias: calculate the difference between the estimated exposure effect and the
true value for each of the created datasets up to r
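The idea can be tried out with a much simpler setup than the one on the slide. The following is a minimal sketch in base R, assuming a single standard normal covariate with a true log odds ratio of 1 (a smaller, hypothetical value chosen to keep separation rare); the helper names simulate_study and fit_slope are made up for this illustration.

```r
# Minimal sketch: finite sample bias vs consistency of the logistic slope.
# Assumptions (not from the slides): one covariate, true log odds ratio of 1.
set.seed(2020)
beta_true <- 1      # hypothetical true log odds ratio
n_small   <- 50     # sample size per simulated study
n_studies <- 1000   # number of simulated studies

simulate_study <- function(n) {
  x <- rnorm(n)
  y <- rbinom(n, size = 1, prob = plogis(beta_true * x))
  data.frame(x = x, y = y)
}

fit_slope <- function(d) {
  # suppressWarnings(): small samples can trigger (near-)separation warnings
  fit <- suppressWarnings(glm(y ~ x, family = binomial, data = d))
  unname(coef(fit)["x"])
}

studies <- lapply(seq_len(n_studies), function(i) simulate_study(n_small))

# Bias: average the estimates from many small studies
est_small  <- vapply(studies, fit_slope, numeric(1))
mean_small <- mean(est_small)

# Consistency: pool all studies into one large dataset and fit once
pooled     <- do.call(rbind, studies)
est_pooled <- fit_slope(pooled)

cat("True log odds ratio:          ", beta_true, "\n")
cat("Mean estimate, N = 50 studies:", round(mean_small, 3), "\n")
cat("Single pooled estimate:       ", round(est_pooled, 3), "\n")
```

Under these assumptions the average of the small-study estimates is expected to sit above the true value, while the single pooled estimate should land close to it.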
Average of 400 studies with N = 50 versus 1 study with N = 20,000
With decreasing sample size: how we usually think
With decreasing sample size: but actually with odds ratios (and other ratios)
The origin of the 1 in 10 rule
“For EPV values of 10 or greater, no major problems occurred. For EPV
values less than 10, however, the regression coefficients were biased in
both positive and negative directions”
Source: Peduzzi et al. 1996
More simulation studies
Citations based on Google Scholar, Oct 30 2020
citations: 5,736
“a minimum of 10 EPV […] may be too conservative”
“substantial problems even if the number of EPV exceeds 10”
“For EPV values of 10 or greater, no major problems”
citations: 2,438
citations: 216
!?!
• Examine the reasons for substantial differences between the earlier EPV
simulation studies (simulation technicality: handling of “separation”)
• Evaluate a possible solution to reduce the finite sample bias
• Firth’s “correction” aims to reduce finite sample bias in maximum
likelihood estimates, applicable to logistic regression
• It makes clever use of the “Jeffreys prior” (from Bayesian literature) to
penalize the log-likelihood, which shrinks the estimated coefficients
• It has a nice theoretical justification, but does it work well?
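For reference, the penalized log-likelihood is usually written as below. The notation is assumed here rather than taken from the slides: $\ell$ is the ordinary log-likelihood, $I(\beta)$ the Fisher information matrix, and $X$ the design matrix.

```latex
\ell^{*}(\beta) = \ell(\beta) + \tfrac{1}{2} \ln \left| I(\beta) \right|,
\qquad
I(\beta) = X^{\top} W X, \quad W = \operatorname{diag}\{\, p_i (1 - p_i) \,\}
```

The added term is the log of the Jeffreys prior density, up to an additive constant.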
Standard vs. Firth’s correction
Averaged over 465 simulation conditions with 10,000 replications each
Firth’s correction
Difficult? No
Example R code:
> require("logistf")  # Firth-type penalized logistic regression
> logistf(Y ~ X1 + X2 + X3 + X4, firth = TRUE, data = df)
Compared to default (maxlik) logistic regression, Firth’s correction generally gives:
• Narrower confidence intervals
• Lower MSE
• Better predictions*
*requires adjustment of the intercept using flic=TRUE option in logistf
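What the correction does can be sketched in a few lines of base R. This is a minimal illustration only, not the algorithm used by logistf (which adds safeguards such as step-halving); the function name firth_logit and the toy data are hypothetical.

```r
# Minimal sketch of Firth-type penalized logistic regression (base R only).
# firth_logit() is a hypothetical helper, not part of the logistf package.
firth_logit <- function(X, y, tol = 1e-8, max_iter = 100) {
  beta <- rep(0, ncol(X))
  for (iter in seq_len(max_iter)) {
    p <- as.vector(plogis(X %*% beta))        # fitted probabilities
    w <- p * (1 - p)                          # logistic variance weights
    info_inv <- solve(crossprod(X, w * X))    # inverse Fisher information (X'WX)^-1
    h <- w * rowSums((X %*% info_inv) * X)    # leverages (diagonal of the hat matrix)
    # Firth's modified score: ordinary score plus the term h * (1/2 - p)
    score <- crossprod(X, y - p + h * (0.5 - p))
    step <- as.vector(info_inv %*% score)
    beta <- beta + step
    if (max(abs(step)) < tol) break
  }
  beta
}

# Toy data with complete separation: y = 1 exactly when x > 0
x <- c(-2, -1, -0.5, 0.5, 1, 2)
y <- c(0, 0, 0, 1, 1, 1)
X <- cbind(intercept = 1, x = x)

beta_firth <- firth_logit(X, y)
beta_ml <- suppressWarnings(coef(glm(y ~ x, family = binomial)))

print(beta_firth)  # penalized estimates stay finite
print(beta_ml)     # maximum likelihood slope drifts towards infinity
```

With separated data the ordinary maximum likelihood slope has no finite solution, whereas the penalized fit returns a moderate, finite value; this is the shrinkage the slide refers to.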
Sample size issue solved?
… not quite!
• Precision of regression coefficients
• Variable selection and functional form
• Ensure predictions are adequate
• Why would a one-solution-fits-all rule of thumb be appropriate?
• Think of sample size for a randomized clinical trial
Would it not be odd to suggest that all trials should have 100 patients in each arm?
TRIPOD Item 8. Explain how the study size was arrived at
Moons et al. Ann Intern Med 2015 (TRIPOD Explanation & Elaboration)
“Although there is a consensus on the importance
of having an adequate sample size for model
development, how to determine what counts as
‘adequate’ is not clear …”
Why is sample size important?
• We want a sample size large enough to develop a model that
provides accurate risk predictions in new individuals from the target
population
• Many (most?) models do not perform well when checked in new data
• small sample sizes
• overfitting
• lack of (internal) validation
• …
Recent example
• Reviewed 232 prediction models
• “All models were rated at high or
unclear risk of bias”
• Sample size: median 338; IQR 134 to 707
• Number of events: median 69; IQR 37 to 160
Living review, doi: 10.1136/bmj.m1328 (these numbers from a soon to appear review update)
Recent example
• External validation of 22 COVID-19-related
prognostic models
• Performance: poor to very poor
• “Admission oxygen saturation on room air and patient age are strong
predictors of deterioration and mortality among hospitalised adults with
COVID-19, respectively. None of the prognostic models evaluated here
offered incremental value for patient stratification to these univariable
predictors.”
Small sample size and overfitting
• Spurious predictor-outcome associations
• Important predictors can be missed
• Unimportant predictors can be selected
• Regression coefficients too large and uncertain
• Model doesn’t predict well in new data
• Disappointing discrimination
• Often calibration slope < 1
https://twitter.com/LesGuessing/status/997146590442799105
With small N: calibration slope often < 1
Predictions too extreme
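A small simulation makes this visible. The sketch below uses base R only, and every setting in it is hypothetical (10 candidate predictors of which 2 have a true effect, development samples of N = 100, a large validation sample); the helper names are made up for the illustration.

```r
# Minimal sketch: overfitting shows up as a calibration slope below 1.
# All settings are hypothetical: 10 candidate predictors, 2 with a true effect.
set.seed(1)
n_dev <- 100; n_val <- 5000; k <- 10; n_rep <- 50
beta_true <- c(0.5, 0.5, rep(0, k - 2))   # assumed true log odds ratios

simulate_data <- function(n) {
  X <- matrix(rnorm(n * k), nrow = n, dimnames = list(NULL, paste0("x", 1:k)))
  y <- rbinom(n, size = 1, prob = plogis(as.vector(X %*% beta_true)))
  data.frame(y = y, X)
}

calibration_slope <- function() {
  dev <- simulate_data(n_dev)              # small development sample
  val <- simulate_data(n_val)              # large independent validation sample
  fit <- suppressWarnings(glm(y ~ ., family = binomial, data = dev))
  lp  <- predict(fit, newdata = val, type = "link")   # linear predictor in new data
  # Calibration slope: coefficient of the linear predictor in the validation data
  unname(coef(glm(val$y ~ lp, family = binomial))["lp"])
}

slopes <- replicate(n_rep, calibration_slope())
cat("Mean calibration slope over", n_rep, "development samples:",
    round(mean(slopes), 2), "\n")
cat("Share of development samples with slope < 1:", mean(slopes < 1), "\n")
```

A slope below 1 means the predicted risks from the development model are too extreme: too high for high-risk and too low for low-risk individuals.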
“Modern” methods aim to circumvent overfitting
• Penalised regression: e.g. lasso, ridge regression, elastic net
• Standard regression followed by uniform (global) shrinkage
• Target calibrated predicted risks in new data: shrinkage and penalty
terms estimated using bootstrapping or cross-validation
• Sample size problem solved?
“shrinkage works on the average but may fail in the particular unique
problem on which the statistician is working.”
• Required shrinkage is hard to estimate
• Often large uncertainty about the correct value to use, especially in small datasets (!)
“We conclude that, despite improved performance on average, shrinkage often
worked poorly in individual datasets, in particular when it was most needed.
The results imply that shrinkage methods do not solve problems associated
with small sample size or low number of events per variable.”
Our proposal
• Calculate sample size that is needed to
• minimise potential overfitting
• estimate probability (risk) precisely
• Sample size formulas for
• Continuous outcomes
• Time-to-event outcomes
• Binary outcomes (focus today)
Example
• COVID-19 prognosis in hospitalized patients
• Composite outcome: “deterioration”
(in-hospital death, ventilator support, ICU)
A priori expectations
• Event fraction at least 30%
• 40 candidate predictor parameters
• C-statistic of 0.71 (conservative estimate)
-> Cox-Snell R2 of 0.24
MedRxiv Preprint (not peer reviewed): 10.1101/2020.10.09.20209957
Restricted cubic splines with 4 knots: 3 degrees of freedom
Note: the EPV rule also counts degrees of freedom of candidate predictors, not variables!
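Counting parameters rather than variables can be done directly from the design matrix. The sketch below is a minimal illustration with hypothetical variables (age, sex, ward) and uses ns() from the splines package that ships with R as a stand-in for a restricted cubic spline basis.

```r
# Minimal sketch: count candidate predictor parameters (degrees of freedom).
# The variables age, sex and ward are hypothetical examples.
library(splines)  # ships with R; ns() builds a natural (restricted) cubic spline basis

set.seed(42)
d <- data.frame(
  age  = rnorm(200, mean = 60, sd = 10),
  sex  = factor(rep(c("f", "m"), times = 100)),
  ward = factor(rep(c("A", "B", "C", "D"), times = 50))
)

# ns(age, df = 3): 2 interior + 2 boundary knots = 4 knots, 3 parameters
design <- model.matrix(~ ns(age, df = 3) + sex + ward, data = d)
n_parameters <- ncol(design) - 1   # drop the intercept column

cat("Variables: 3, candidate predictor parameters:", n_parameters, "\n")
```

Three variables contribute seven parameters here (3 for the spline, 1 for sex, 3 for the four-level ward), and it is the seven that should enter a sample size calculation.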
Calculate required sample size
Criterion 1. Shrinkage: expected heuristic shrinkage factor, S ≥ 0.9
(calibration slope, target < 10% overfitting)
Criterion 2. Optimism: Cox-Snell R2 apparent - Cox-Snell R2 validation < 0.05
(overfitting)
Criterion 3. Precision: margin of error in the overall risk estimate < 0.05 (absolute)
(precision of the estimated baseline risk)
(Criterion 4. A small margin of absolute error in the estimated risks)
Calculation
R code:
> require(pmsampsize)
> pmsampsize(type="b",rsquared=0.24,parameters=40,prevalence=0.3)
A few alternative scenarios
• rsquared=0.24,parameters=40,prevalence=0.3 -> EPV≥9.7
• rsquared=0.12,parameters=40,prevalence=0.3 -> EPV≥21.0
• rsquared=0.12,parameters=40,prevalence=0.5 -> EPV≥35.0
• rsquared=0.36,parameters=40,prevalence=0.2 -> EPV≥5
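To see which criterion drives each scenario, the calculation can be sketched in base R using closed-form expressions commonly given for criteria 1 to 3. Treat the formulas, the default values and the helper name min_sample_size as assumptions, not as the pmsampsize implementation. In particular, criterion 2 is expressed here relative to the maximum attainable Cox-Snell R2, which differs slightly from the wording of the criterion above.

```r
# Minimal sketch of three sample size criteria for a binary outcome.
# min_sample_size() is a hypothetical helper, not the pmsampsize implementation.
min_sample_size <- function(rsquared, parameters, prevalence,
                            shrinkage = 0.9, optimism = 0.05, margin = 0.05) {
  # n needed so that the expected uniform shrinkage factor equals s
  n_for_shrinkage <- function(s) {
    parameters / ((s - 1) * log(1 - rsquared / s))
  }
  # Criterion 1: expected shrinkage factor of at least `shrinkage`
  n1 <- n_for_shrinkage(shrinkage)
  # Criterion 2: small optimism in apparent R2, scaled by the maximum
  # attainable Cox-Snell R2 for this outcome prevalence (assumption)
  lnl_null <- prevalence * log(prevalence) + (1 - prevalence) * log(1 - prevalence)
  max_r2   <- 1 - exp(2 * lnl_null)
  n2 <- n_for_shrinkage(rsquared / (rsquared + optimism * max_r2))
  # Criterion 3: overall risk estimated within +/- `margin` (95% confidence)
  n3 <- (1.96 / margin)^2 * prevalence * (1 - prevalence)

  n_by_criterion <- ceiling(c(criterion1 = n1, criterion2 = n2, criterion3 = n3))
  n <- max(n_by_criterion)
  list(n_by_criterion = n_by_criterion, n = n,
       epv = n * prevalence / parameters)
}

scenarios <- list(
  c(rsquared = 0.24, parameters = 40, prevalence = 0.3),
  c(rsquared = 0.12, parameters = 40, prevalence = 0.3),
  c(rsquared = 0.12, parameters = 40, prevalence = 0.5),
  c(rsquared = 0.36, parameters = 40, prevalence = 0.2)
)
results <- lapply(scenarios, function(s) do.call(min_sample_size, as.list(s)))
for (r in results) cat("n =", r$n, " EPV =", round(r$epv, 2), "\n")
```

Under these assumptions the resulting EPV values should be close to the ones listed above, with criterion 1 binding in the first three scenarios and criterion 2 in the last.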
The sample size that meets all criteria is the MINIMUM required
• Why minimum? Other criteria may be important
e.g. missing data, clustering, variable selection
• May raise required sample size further
• Simulation based approaches
Preprint (not peer reviewed) doi: 10.21203/rs.3.rs-87100/v1
Summary
• Default logistic regression produces finite sample biased estimates
• Finite sample bias can be substantial; easily solved using Firth’s correction
• “Modern” approaches (e.g. Firth, Lasso, Ridge) do not compensate for low N
• New sample size criteria to replace the one-size-fits-all EPV≥10 rule
https://www.prognosisresearch.com/
New website by Richard Riley and Kym Snell
Work in collaboration with:
• Carl Moons
• Hans Reitsma
• Richard Riley (Keele, materials for this presentation)
• Gary Collins (Oxford)
• Ben Van Calster (Leuven)
• Ewout Steyerberg (Leiden)
• Rishi Gupta (UCL)
• Many others
Contact: M.vanSmeden@umcutrecht.nl

More Related Content

What's hot

Algorithm based medicine: old statistics wine in new machine learning bottles?
Algorithm based medicine: old statistics wine in new machine learning bottles?Algorithm based medicine: old statistics wine in new machine learning bottles?
Algorithm based medicine: old statistics wine in new machine learning bottles?Maarten van Smeden
 
Real world modified
Real world modifiedReal world modified
Real world modifiedStephen Senn
 
Machine learning versus traditional statistical modeling and medical doctors
Machine learning versus traditional statistical modeling and medical doctorsMachine learning versus traditional statistical modeling and medical doctors
Machine learning versus traditional statistical modeling and medical doctorsMaarten van Smeden
 
Development and evaluation of prediction models: pitfalls and solutions
Development and evaluation of prediction models: pitfalls and solutionsDevelopment and evaluation of prediction models: pitfalls and solutions
Development and evaluation of prediction models: pitfalls and solutionsMaarten van Smeden
 
Developing and validating statistical models for clinical prediction and prog...
Developing and validating statistical models for clinical prediction and prog...Developing and validating statistical models for clinical prediction and prog...
Developing and validating statistical models for clinical prediction and prog...Evangelos Kritsotakis
 
Clinical prediction models: development, validation and beyond
Clinical prediction models:development, validation and beyondClinical prediction models:development, validation and beyond
Clinical prediction models: development, validation and beyondMaarten van Smeden
 
Choosing Regression Models
Choosing Regression ModelsChoosing Regression Models
Choosing Regression ModelsStephen Senn
 
An introduction to Bayesian Statistics using Python
An introduction to Bayesian Statistics using PythonAn introduction to Bayesian Statistics using Python
An introduction to Bayesian Statistics using Pythonfreshdatabos
 
Dichotomania and other challenges for the collaborating biostatistician
Dichotomania and other challenges for the collaborating biostatisticianDichotomania and other challenges for the collaborating biostatistician
Dichotomania and other challenges for the collaborating biostatisticianLaure Wynants
 
The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...StephenSenn2
 
Guideline for high-quality diagnostic and prognostic applications of AI in he...
Guideline for high-quality diagnostic and prognostic applications of AI in he...Guideline for high-quality diagnostic and prognostic applications of AI in he...
Guideline for high-quality diagnostic and prognostic applications of AI in he...Maarten van Smeden
 
A gentle introduction to AI for medicine
A gentle introduction to AI for medicineA gentle introduction to AI for medicine
A gentle introduction to AI for medicineMaarten van Smeden
 
Soccer technical director performance appraisal
Soccer technical director performance appraisalSoccer technical director performance appraisal
Soccer technical director performance appraisalbethanygriffin174
 
Introduction to Uplift Modelling
Introduction to Uplift ModellingIntroduction to Uplift Modelling
Introduction to Uplift ModellingPierre Gutierrez
 
Statistics and ML 21Oct22 sel.pptx
Statistics and ML 21Oct22 sel.pptxStatistics and ML 21Oct22 sel.pptx
Statistics and ML 21Oct22 sel.pptxEwout Steyerberg
 

What's hot (20)

Algorithm based medicine: old statistics wine in new machine learning bottles?
Algorithm based medicine: old statistics wine in new machine learning bottles?Algorithm based medicine: old statistics wine in new machine learning bottles?
Algorithm based medicine: old statistics wine in new machine learning bottles?
 
Real world modified
Real world modifiedReal world modified
Real world modified
 
Machine learning versus traditional statistical modeling and medical doctors
Machine learning versus traditional statistical modeling and medical doctorsMachine learning versus traditional statistical modeling and medical doctors
Machine learning versus traditional statistical modeling and medical doctors
 
UMC Utrecht AI Methods Lab
UMC Utrecht AI Methods LabUMC Utrecht AI Methods Lab
UMC Utrecht AI Methods Lab
 
Development and evaluation of prediction models: pitfalls and solutions
Development and evaluation of prediction models: pitfalls and solutionsDevelopment and evaluation of prediction models: pitfalls and solutions
Development and evaluation of prediction models: pitfalls and solutions
 
Developing and validating statistical models for clinical prediction and prog...
Developing and validating statistical models for clinical prediction and prog...Developing and validating statistical models for clinical prediction and prog...
Developing and validating statistical models for clinical prediction and prog...
 
Predictimands
PredictimandsPredictimands
Predictimands
 
Clinical prediction models: development, validation and beyond
Clinical prediction models:development, validation and beyondClinical prediction models:development, validation and beyond
Clinical prediction models: development, validation and beyond
 
Choosing Regression Models
Choosing Regression ModelsChoosing Regression Models
Choosing Regression Models
 
An introduction to Bayesian Statistics using Python
An introduction to Bayesian Statistics using PythonAn introduction to Bayesian Statistics using Python
An introduction to Bayesian Statistics using Python
 
Dichotomania and other challenges for the collaborating biostatistician
Dichotomania and other challenges for the collaborating biostatisticianDichotomania and other challenges for the collaborating biostatistician
Dichotomania and other challenges for the collaborating biostatistician
 
P-values in crisis
P-values in crisisP-values in crisis
P-values in crisis
 
The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...The replication crisis: are P-values the problem and are Bayes factors the so...
The replication crisis: are P-values the problem and are Bayes factors the so...
 
Guideline for high-quality diagnostic and prognostic applications of AI in he...
Guideline for high-quality diagnostic and prognostic applications of AI in he...Guideline for high-quality diagnostic and prognostic applications of AI in he...
Guideline for high-quality diagnostic and prognostic applications of AI in he...
 
A gentle introduction to AI for medicine
A gentle introduction to AI for medicineA gentle introduction to AI for medicine
A gentle introduction to AI for medicine
 
Soccer technical director performance appraisal
Soccer technical director performance appraisalSoccer technical director performance appraisal
Soccer technical director performance appraisal
 
Introduction to Uplift Modelling
Introduction to Uplift ModellingIntroduction to Uplift Modelling
Introduction to Uplift Modelling
 
Statistics and ML 21Oct22 sel.pptx
Statistics and ML 21Oct22 sel.pptxStatistics and ML 21Oct22 sel.pptx
Statistics and ML 21Oct22 sel.pptx
 
Clinical prediction models
Clinical prediction modelsClinical prediction models
Clinical prediction models
 
Structural Equation Modelling (SEM) Part 3
Structural Equation Modelling (SEM) Part 3Structural Equation Modelling (SEM) Part 3
Structural Equation Modelling (SEM) Part 3
 

Similar to Why the EPV≥10 sample size rule is rubbish and what to use instead

The Seven Habits of Highly Effective Statisticians
The Seven Habits of Highly Effective StatisticiansThe Seven Habits of Highly Effective Statisticians
The Seven Habits of Highly Effective StatisticiansStephen Senn
 
Big biomedical data is a lie
Big biomedical data is a lieBig biomedical data is a lie
Big biomedical data is a liePaul Agapow
 
Statistics for UX Professionals - Jessica Cameron
Statistics for UX Professionals - Jessica CameronStatistics for UX Professionals - Jessica Cameron
Statistics for UX Professionals - Jessica CameronUser Vision
 
MLSEV Virtual. Evaluations
MLSEV Virtual. EvaluationsMLSEV Virtual. Evaluations
MLSEV Virtual. EvaluationsBigML, Inc
 

Similar to Why the EPV≥10 sample size rule is rubbish and what to use instead (6)

The Seven Habits of Highly Effective Statisticians
The Seven Habits of Highly Effective StatisticiansThe Seven Habits of Highly Effective Statisticians
The Seven Habits of Highly Effective Statisticians
 
Sti2018 jws
Sti2018 jwsSti2018 jws
Sti2018 jws
 
Big biomedical data is a lie
Big biomedical data is a lieBig biomedical data is a lie
Big biomedical data is a lie
 
Overview of statistical tests: Data handling and data quality (Part II)
Overview of statistical tests: Data handling and data quality (Part II)Overview of statistical tests: Data handling and data quality (Part II)
Overview of statistical tests: Data handling and data quality (Part II)
 
Statistics for UX Professionals - Jessica Cameron
Statistics for UX Professionals - Jessica CameronStatistics for UX Professionals - Jessica Cameron
Statistics for UX Professionals - Jessica Cameron
 
MLSEV Virtual. Evaluations
MLSEV Virtual. EvaluationsMLSEV Virtual. Evaluations
MLSEV Virtual. Evaluations
 

More from Maarten van Smeden

Rage against the machine learning 2023
Rage against the machine learning 2023Rage against the machine learning 2023
Rage against the machine learning 2023Maarten van Smeden
 
Clinical prediction models for covid-19: alarming results from a living syste...
Clinical prediction models for covid-19: alarming results from a living syste...Clinical prediction models for covid-19: alarming results from a living syste...
Clinical prediction models for covid-19: alarming results from a living syste...Maarten van Smeden
 
Prediction models for diagnosis and prognosis related to COVID-19
Prediction models for diagnosis and prognosis related to COVID-19Prediction models for diagnosis and prognosis related to COVID-19
Prediction models for diagnosis and prognosis related to COVID-19Maarten van Smeden
 
Correcting for missing data, measurement error and confounding
Correcting for missing data, measurement error and confoundingCorrecting for missing data, measurement error and confounding
Correcting for missing data, measurement error and confoundingMaarten van Smeden
 
Living systematic reviews: now and in the future
Living systematic reviews: now and in the futureLiving systematic reviews: now and in the future
Living systematic reviews: now and in the futureMaarten van Smeden
 
The statistics of the coronavirus
The statistics of the coronavirusThe statistics of the coronavirus
The statistics of the coronavirusMaarten van Smeden
 
COVID-19 related prediction models for diagnosis and prognosis - a living sys...
COVID-19 related prediction models for diagnosis and prognosis - a living sys...COVID-19 related prediction models for diagnosis and prognosis - a living sys...
COVID-19 related prediction models for diagnosis and prognosis - a living sys...Maarten van Smeden
 
ML and AI: a blessing and curse for statisticians and medical doctors
ML and AI: a blessing and curse forstatisticians and medical doctorsML and AI: a blessing and curse forstatisticians and medical doctors
ML and AI: a blessing and curse for statisticians and medical doctorsMaarten van Smeden
 
Measurement error in medical research
Measurement error in medical researchMeasurement error in medical research
Measurement error in medical researchMaarten van Smeden
 
The basics of prediction modeling
The basics of prediction modeling The basics of prediction modeling
The basics of prediction modeling Maarten van Smeden
 
The absence of a gold standard: a measurement error problem
The absence of a gold standard: a measurement error problemThe absence of a gold standard: a measurement error problem
The absence of a gold standard: a measurement error problemMaarten van Smeden
 
Anatomy of a successful science thread
Anatomy of a successful science threadAnatomy of a successful science thread
Anatomy of a successful science threadMaarten van Smeden
 
Is it causal, is it prediction or is it neither?
Is it causal, is it prediction or is it neither?Is it causal, is it prediction or is it neither?
Is it causal, is it prediction or is it neither?Maarten van Smeden
 

More from Maarten van Smeden (16)

Rage against the machine learning 2023
Rage against the machine learning 2023Rage against the machine learning 2023
Rage against the machine learning 2023
 
Associate professor lecture
Associate professor lectureAssociate professor lecture
Associate professor lecture
 
Algorithm based medicine
Algorithm based medicineAlgorithm based medicine
Algorithm based medicine
 
Clinical prediction models for covid-19: alarming results from a living syste...
Clinical prediction models for covid-19: alarming results from a living syste...Clinical prediction models for covid-19: alarming results from a living syste...
Clinical prediction models for covid-19: alarming results from a living syste...
 
Prediction models for diagnosis and prognosis related to COVID-19
Prediction models for diagnosis and prognosis related to COVID-19Prediction models for diagnosis and prognosis related to COVID-19
Prediction models for diagnosis and prognosis related to COVID-19
 
Correcting for missing data, measurement error and confounding
Correcting for missing data, measurement error and confoundingCorrecting for missing data, measurement error and confounding
Correcting for missing data, measurement error and confounding
 
Living systematic reviews: now and in the future
Living systematic reviews: now and in the futureLiving systematic reviews: now and in the future
Living systematic reviews: now and in the future
 
Voorspelmodellen en COVID-19
Voorspelmodellen en COVID-19Voorspelmodellen en COVID-19
Voorspelmodellen en COVID-19
 
The statistics of the coronavirus
The statistics of the coronavirusThe statistics of the coronavirus
The statistics of the coronavirus
 
COVID-19 related prediction models for diagnosis and prognosis - a living sys...
COVID-19 related prediction models for diagnosis and prognosis - a living sys...COVID-19 related prediction models for diagnosis and prognosis - a living sys...
COVID-19 related prediction models for diagnosis and prognosis - a living sys...
 
ML and AI: a blessing and curse for statisticians and medical doctors
ML and AI: a blessing and curse forstatisticians and medical doctorsML and AI: a blessing and curse forstatisticians and medical doctors
ML and AI: a blessing and curse for statisticians and medical doctors
 
Measurement error in medical research
Measurement error in medical researchMeasurement error in medical research
Measurement error in medical research
 
The basics of prediction modeling
The basics of prediction modeling The basics of prediction modeling
The basics of prediction modeling
 
The absence of a gold standard: a measurement error problem
The absence of a gold standard: a measurement error problemThe absence of a gold standard: a measurement error problem
The absence of a gold standard: a measurement error problem
 
Anatomy of a successful science thread
Anatomy of a successful science threadAnatomy of a successful science thread
Anatomy of a successful science thread
 
Is it causal, is it prediction or is it neither?
Is it causal, is it prediction or is it neither?Is it causal, is it prediction or is it neither?
Is it causal, is it prediction or is it neither?
 
Why the EPV≥10 sample size rule is rubbish and what to use instead

  • 1. Maarten van Smeden, PhD, 2 November 2020. Why the EPV≥10 sample size rule is rubbish and what to use instead
  • 2. M.vanSmeden@umcutrecht.nl | Twitter: @MaartenvSmeden
      • Statistician at Julius Center for Health Sciences and Primary Care
      • Main interests (but not limited to): prognostic and diagnostic modeling, measurement error, missing data
      • Today’s topic: the EPV≥10 sample size rule (aka the 1 in 10 rule) has been one of the leading sample size rules in prognostic/diagnostic prediction modeling
  • 5. Outline
      • The EPV≥10 rule-of-thumb: where does it come from?
      • Evidence the EPV≥10 rule has no rationale
      • Evidence that sample size is important (even if you use the fancier methods)
      • Actual sample size calculations for prediction models
  • 6. Ever wondered if AD/BC gives the “best” estimate of the odds ratio? What if I told you that AD/BC is biased?
  • 7. Let’s say we have fitted a logistic regression model to a dataset, and obtain $\ln\frac{p_i}{1 - p_i} = \hat{\alpha} + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \cdots + \hat{\beta}_k X_{ki}$. I’m very sorry, but $\hat{\beta}_1$ is a biased estimator, and $\hat{\beta}_2$ too, … actually they are all finite sample biased
  • 8. Epidemiology text-books:
      • Confounding bias
      • Information bias
      • Selection bias
      • … nothing about finite sample bias
  • 9. Important: bias vs consistency
      • Consistent ≈ as sample size increases, the estimate converges to the truth
      • Unbiased ≈ over repeated samples of the same size, the average estimate equals the truth
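The distinction on slide 9 can be written more formally. The notation below is an assumption introduced here, not taken from the slides: $\hat{\theta}_n$ is an estimator based on a sample of size $n$, and $\theta$ is the true value.

```latex
% Consistency: the estimate converges (in probability) to the truth as n grows
\hat{\theta}_n \xrightarrow{\;p\;} \theta \quad \text{as } n \to \infty

% Unbiasedness: at a fixed n, the average over repeated samples equals the truth
\mathbb{E}\bigl[\hat{\theta}_n\bigr] = \theta \quad \text{for every } n

% Finite sample bias: the average is off at fixed n, but the gap vanishes as n grows
\operatorname{bias}(\hat{\theta}_n) = \mathbb{E}\bigl[\hat{\theta}_n\bigr] - \theta \neq 0,
\qquad \operatorname{bias}(\hat{\theta}_n) \to 0 \text{ as } n \to \infty
```

An estimator can therefore be consistent and still biased at every finite sample size, which is the situation the next slide describes for logistic regression.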
  • 10. The log(odds) estimator is consistent but finite sample biased
  • 17. Illustration by simulation
      • Simulate 4 normal covariates with equal multivariable log-odds-ratios of 2
      • 1,000 simulation samples of N = 50
      • Consistency: create 1,000 meta-datasets of increasing size; meta-dataset r consists of all created datasets up to r
      • Bias: calculate the difference between the estimated exposure effect and the true value for each of the created datasets up to r
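The simulation on slide 17 can be sketched in base R roughly as follows. This is a minimal illustration, not the code behind the slides: the seed, the helper names (`simulate_study`, `fit_x1`) and the true log-odds ratio of 0.5 are assumptions made here. The milder effect than the slide's value is chosen to keep separation rare so that `glm` estimates stay finite.

```r
# Sketch: many small studies (bias) vs. one pooled dataset (consistency).
set.seed(2020)   # assumed seed, for reproducibility only
n_sim <- 1000    # number of simulated studies
n_obs <- 50      # sample size per study
k     <- 4       # number of normal covariates
beta  <- 0.5     # ASSUMED true log-odds ratio (not the slide's value)

# Generate one study with k independent standard normal covariates.
simulate_study <- function(n) {
  X  <- matrix(rnorm(n * k), nrow = n, dimnames = list(NULL, paste0("X", 1:k)))
  lp <- as.vector(X %*% rep(beta, k))
  data.frame(Y = rbinom(n, 1, plogis(lp)), X)
}

# Maximum likelihood logistic regression; return the coefficient of X1.
fit_x1 <- function(d) {
  fit <- suppressWarnings(glm(Y ~ X1 + X2 + X3 + X4, family = binomial, data = d))
  unname(coef(fit)["X1"])
}

studies <- lapply(seq_len(n_sim), function(i) simulate_study(n_obs))

# Bias: average of the estimates from the small studies.
small_estimates <- vapply(studies, fit_x1, numeric(1))
mean_small <- mean(small_estimates)

# Consistency: one estimate from all studies stacked into a single dataset.
pooled_estimate <- fit_x1(do.call(rbind, studies))

cat("True value:                   ", beta, "\n")
cat("Mean of 1,000 studies (N=50): ", round(mean_small, 3), "\n")
cat("One pooled study (N=50,000):  ", round(pooled_estimate, 3), "\n")
```

Under these assumptions the average of the small-study estimates should sit away from zero relative to the true value, while the pooled estimate should land close to it. Averaging many small studies does not remove the bias, but one large study does.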
  • 20. Average of 400 studies with N = 50 vs. 1 study with N = 20,000
  • 21. With decreasing sample size: how we usually think
  • 22. With decreasing sample size: but actually with odds ratios (and other ratios)
  • 23. The origin of the 1 in 10 rule: “For EPV values of 10 or greater, no major problems occurred. For EPV values less than 10, however, the regression coefficients were biased in both positive and negative directions”
  • 24. Source: Peduzzi et al. 1996 ?
  • 25. More simulation studies
      • “For EPV values of 10 or greater, no major problems”
      • “a minimum of 10 EPV […] may be too conservative”
      • “substantial problems even if the number of EPV exceeds 10”
      • Citations (based on Google Scholar, Oct 30 2020): 5,736; 2,438; 216
  • 26. More simulation studies
      • “For EPV values of 10 or greater, no major problems”
      • “a minimum of 10 EPV […] may be too conservative”
      • “substantial problems even if the number of EPV exceeds 10”
      • Citations (based on Google Scholar, Oct 30 2020): 5,736; 2,438; 216
      • !?!
  • 27. Aims
      • Examine the reasons for substantial differences between the earlier EPV simulation studies
      • Evaluate a possible solution to reduce the finite sample bias
  • 28. Aims
      • Examine the reasons for substantial differences between the earlier EPV simulation studies (simulation technicality: handling of “separation”)
      • Evaluate a possible solution to reduce the finite sample bias
  • 30. Firth’s “correction”
      • Aims to reduce finite sample bias in maximum likelihood estimates; applicable to logistic regression
      • It makes clever use of the “Jeffreys prior” (from the Bayesian literature) to penalize the log-likelihood, which shrinks the estimated coefficients
      • It has a nice theoretical justification, but does it work well?
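For reference, the penalty on slide 30 can be written out. The notation is an assumption introduced here: $\ell(\beta)$ is the ordinary log-likelihood, $I(\beta)$ the Fisher information matrix, $X$ the design matrix, and $p_i$ the fitted probability for individual $i$.

```latex
% Firth's penalized log-likelihood: ordinary log-likelihood plus the log of the Jeffreys prior
\ell^{*}(\beta) = \ell(\beta) + \tfrac{1}{2} \log \det I(\beta)

% For logistic regression the information matrix is
I(\beta) = X^{\top} W X, \qquad W = \operatorname{diag}\bigl(p_i (1 - p_i)\bigr)
```

The information is largest when the fitted probabilities are near 0.5, that is, when the coefficients are near zero. The penalty term therefore pulls the estimates toward zero, which is the shrinkage the slide refers to.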
  • 31. Standard: averaged over 465 simulation conditions with 10,000 replications each
  • 32. Standard vs. Firth’s correction: averaged over 465 simulation conditions with 10,000 replications each
  • 33. Firth’s correction. Difficult? No. Example R code:
      > require("logistf")
      > logistf(Y~X1+X2+X3+X4, firth=TRUE, data=df)
    Compared to default (maximum likelihood) logistic regression, Firth’s correction generally gives:
      • Narrower confidence intervals
      • Lower MSE
      • Better predictions*
    *requires adjustment of the intercept using the flic=TRUE option in logistf
  • 34. Sample size issue solved? … not quite!
      • Precision of regression coefficients
      • Variable selection and functional form
      • Ensure predictions are adequate
  • 35. Sample size issue solved? … not quite!
      • Precision of regression coefficients
      • Variable selection and functional form
      • Ensure predictions are adequate
      • Why would a one-size-fits-all rule-of-thumb be appropriate?
      • Think of sample size for a randomized clinical trial: it would be odd to suggest all trials should have 100 patients in each arm
  • 36. TRIPOD Item 8. Explain how the study size was arrived at. Moons et al. Ann Intern Med 2015 (TRIPOD Explanation & Elaboration): “Although there is a consensus on the importance of having an adequate sample size for model development, how to determine what counts as ‘adequate’ is not clear …”
  • 37. Why is sample size important?
      • We want to have a large enough sample size to develop a model that provides accurate risk predictions in new individuals from the target population
      • Many (most?) models do not perform well when checked in new data: small sample sizes, overfitting, lack of (internal) validation, …
  • 38. Recent example
      • Reviewed 232 prediction models
      • “All models were rated at high or unclear risk of bias”
      • Sample size: median 338; IQR 134 to 707
      • Number of events: median 69; IQR 37 to 160
      • Living review, doi: 10.1136/bmj.m1328 (these numbers are from a soon-to-appear review update)
  • 39. Recent example
      • External validation of 22 COVID-19 related prognostic models
      • Performance: poor to very poor
      • “Admission oxygen saturation on room air and patient age are strong predictors of deterioration and mortality among hospitalised adults with COVID-19, respectively. None of the prognostic models evaluated here offered incremental value for patient stratification to these univariable predictors.”
  • 40. Small sample size and overfitting
      • Spurious predictor-outcome associations
      • Important predictors can be missed
      • Unimportant predictors can be selected
      • Regression coefficients too large and uncertain
      • Model doesn’t predict well in new data
      • Disappointing discrimination
      • Often calibration slope < 1
      • https://twitter.com/LesGuessing/status/997146590442799105
  • 41. With small N: calibration slope often < 1 (predictions too extreme)
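The claim on slide 41 can be checked with a small base R simulation. This is a sketch under assumptions chosen here, not the analysis behind the slides: 10 candidate predictors of which 3 have a true log-odds ratio of 0.5, development samples of N = 100 versus N = 2,000, and a validation sample of 5,000. The calibration slope is estimated by regressing the validation outcome on the model's linear predictor.

```r
# Sketch: calibration slope in new data for models developed on small vs. large samples.
set.seed(41)                            # assumed seed
k    <- 10                              # ASSUMED number of candidate predictors
beta <- c(rep(0.5, 3), rep(0, k - 3))   # ASSUMED: 3 true predictors, 7 noise predictors

# Generate a dataset of size n from the assumed data-generating model.
gen_data <- function(n) {
  X <- matrix(rnorm(n * k), nrow = n, dimnames = list(NULL, paste0("X", 1:k)))
  data.frame(Y = rbinom(n, 1, plogis(as.vector(X %*% beta))), X)
}

validation <- gen_data(5000)            # stand-in for "new individuals"

# Develop a model on n_dev individuals, then estimate its calibration slope
# in the validation data.
calibration_slope <- function(n_dev) {
  fit <- suppressWarnings(glm(Y ~ ., family = binomial, data = gen_data(n_dev)))
  lp  <- predict(fit, newdata = validation, type = "link")
  unname(coef(glm(validation$Y ~ lp, family = binomial))["lp"])
}

slopes_small <- replicate(100, calibration_slope(100))
slopes_large <- replicate(100, calibration_slope(2000))

cat("Mean calibration slope, N = 100:  ", round(mean(slopes_small), 2), "\n")
cat("Mean calibration slope, N = 2000: ", round(mean(slopes_large), 2), "\n")
```

A slope below 1 means the predicted risks are too extreme: high predictions are too high and low predictions too low. In this setup the small-sample slopes should fall well below 1 on average, and the large-sample slopes close to 1.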
  • 42. “Modern” methods aim to circumvent overfitting
      • Penalised regression: e.g. lasso, ridge regression, elastic net
      • Standard regression followed by uniform (global) shrinkage
      • Target calibrated predicted risks in new data: shrinkage and penalty terms estimated using bootstrapping or cross-validation
      • Sample size problem solved?
  • 43. “shrinkage works on the average but may fail in the particular unique problem on which the statistician is working.”
      • Required shrinkage is hard to estimate
      • Often large uncertainty about the correct value to use, especially in small datasets (!)
  • 44. “We conclude that, despite improved performance on average, shrinkage often worked poorly in individual datasets, in particular when it was most needed. The results imply that shrinkage methods do not solve problems associated with small sample size or low number of events per variable.”
  • 47. Our proposal
      • Calculate the sample size that is needed to minimise potential overfitting and to estimate probability (risk) precisely
      • Sample size formulas for continuous outcomes, time-to-event outcomes, and binary outcomes (focus today)
  • 48. Example
      • COVID-19 prognosis in hospitalized patients
      • Composite outcome: “deterioration” (in-hospital death, ventilator support, ICU)
      • A priori expectations: event fraction at least 30%; 40 candidate predictor parameters; C-statistic of 0.71 (conservative estimate) -> Cox-Snell R² of 0.24
      • MedRxiv preprint (not peer reviewed): 10.1101/2020.10.09.20209957
  • 49. Restricted cubic splines with 4 knots: 3 degrees of freedom. Note: the EPV rule also counts the degrees of freedom (parameters) of candidate predictors, not variables!
  • 50. Calculate required sample size
      • Criterion 1. Shrinkage: expected heuristic shrinkage factor, S ≥ 0.9 (calibration slope, target < 10% overfitting)
      • Criterion 2. Optimism: apparent Cox-Snell R² minus validation Cox-Snell R² < 0.05 (overfitting)
      • Criterion 3. A small margin of error in the overall risk estimate: < 0.05 absolute error (precision of the estimated baseline risk)
      • (Criterion 4. A small margin of absolute error in the estimated risks)
  • 51. Calculation. R code:
      > require(pmsampsize)
      > pmsampsize(type="b",rsquared=0.24,parameters=40,prevalence=0.3)
  • 52. A few alternative scenarios
      • rsquared=0.24,parameters=40,prevalence=0.3 -> EPV≥9.7
      • rsquared=0.12,parameters=40,prevalence=0.3 -> EPV≥21.0
      • rsquared=0.12,parameters=40,prevalence=0.5 -> EPV≥35.0
      • rsquared=0.36,parameters=40,prevalence=0.2 -> EPV≥5
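To show where the EPV values on slide 52 come from, here is a base R sketch of criteria 1 to 3 for a binary outcome. It assumes the closed-form expressions published by Riley and colleagues for these criteria, as I understand them; the function name `min_sample_size` and its argument defaults are hypothetical. The pmsampsize package remains the reference implementation and should be used for real calculations.

```r
# Sketch (ASSUMED closed forms for criteria 1-3; not the pmsampsize source code).
min_sample_size <- function(rsquared, parameters, prevalence,
                            shrinkage = 0.9, optimism = 0.05, margin = 0.05) {
  # Criterion 1: expected shrinkage factor of at least `shrinkage`.
  n_shrinkage <- parameters / ((shrinkage - 1) * log(1 - rsquared / shrinkage))

  # Criterion 2: optimism in Cox-Snell R-squared below `optimism`, expressed
  # relative to the maximum R-squared attainable at this outcome prevalence.
  loglik_null  <- prevalence * log(prevalence) + (1 - prevalence) * log(1 - prevalence)
  max_rsquared <- 1 - exp(2 * loglik_null)
  s_optimism   <- rsquared / (rsquared + optimism * max_rsquared)
  n_optimism   <- parameters / ((s_optimism - 1) * log(1 - rsquared / s_optimism))

  # Criterion 3: overall risk estimated within +/- `margin` (95% confidence).
  n_risk <- (1.96 / margin)^2 * prevalence * (1 - prevalence)

  # The largest of the three is the minimum required sample size.
  n <- ceiling(max(n_shrinkage, n_optimism, n_risk))
  c(n = n, events = n * prevalence, epv = n * prevalence / parameters)
}

# The four scenarios listed on slide 52.
scenarios <- rbind(
  min_sample_size(rsquared = 0.24, parameters = 40, prevalence = 0.3),
  min_sample_size(rsquared = 0.12, parameters = 40, prevalence = 0.3),
  min_sample_size(rsquared = 0.12, parameters = 40, prevalence = 0.5),
  min_sample_size(rsquared = 0.36, parameters = 40, prevalence = 0.2)
)
print(scenarios)
```

In this sketch the binding criterion differs by scenario. The shrinkage criterion drives the first three scenarios, and the optimism criterion drives the last. The same number of candidate parameters therefore leads to very different required EPV values, depending on the anticipated R² and outcome prevalence.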
  • 53. The sample size that meets all criteria is the MINIMUM required
      • Why minimum? Other criteria may be important, e.g. missing data, clustering, variable selection
      • These may raise the required sample size further
      • Simulation based approaches. Preprint (not peer reviewed) doi: 10.21203/rs.3.rs-87100/v1
  • 54. Summary
      • Default logistic regression produces finite sample biased estimates
      • Finite sample bias can be substantial; easily solved using Firth’s correction
      • “Modern” approaches (e.g. Firth, Lasso, Ridge) do not compensate for low N
      • New sample size criteria to replace the one-size-fits-all EPV≥10 rule
  • 55. https://www.prognosisresearch.com/ New website by Richard Riley and Kym Snell
  • 56. Work in collaboration with:
      • Carl Moons
      • Hans Reitsma
      • Richard Riley (Keele, materials for this presentation)
      • Gary Collins (Oxford)
      • Ben Van Calster (Leuven)
      • Ewout Steyerberg (Leiden)
      • Rishi Gupta (UCL)
      • Many others
    Contact: M.vanSmeden@umcutrecht.nl