The Rothamsted School
The analysis of designed experiments and the legacy of Fisher, Yates and Nelder
Stephen Senn
Stephen Senn 2022
Outline
Part I (Not so technical)
• The roots of modern statistics
• Small data
• Careful design of experiments
• Some examples of problems with
judging causality from associations
in the health care field
• Two different objectives of clinical
trials
Part II (More technical)
• Design
• The Rothamsted (Genstat)
approach
• Some statistical issues
• Conclusion
Part I
Less technical matter to do with history of statistics and basic ‘philosophical’
considerations
John Nelder & Michael Healy
William Sealy Gosset
1876-1937
• Born Canterbury 1876
• Educated Winchester and Oxford
• First in mathematical moderations 1897
and first in degree in Chemistry 1899
• Starts with Guinness in 1899 in Dublin
• Autumn 1906-spring 1907 with Karl
Pearson at UCL
• 1908 publishes ‘The probable error of a
mean’
• First method available to judge
‘significance’ in small samples
Ronald Aylmer Fisher
1890-1962
• Most influential statistician ever
• Also major figure in evolutionary
biology
• Educated Harrow and Cambridge
• Statistician at Rothamsted agricultural
station 1919-1933
• Developed theory of small sample
inference and many modern concepts
• Likelihood, variance, sufficiency, ANOVA
• Developed theory of experimental
design
• Blocking, Randomisation, Replication,
Small data challenges
Situation | Problem | Solution
Sample size small | Too few data to estimate variance adequately | Develop small sample test (Student)
Experimental material not homogeneous | Dealing with variability | Blocking and randomisation (Fisher)
Limited time (1) | How to study more than one thing | Complex treatment structure, factorial experiments (Fisher, Yates)
Limited time (2) | How to study very many factors | Fractional factorials (Yates)
Experimental material varies at different levels | Some treatments can be varied at lowest level but not all | General balance approach to analysis (Nelder)
Characteristics of development of statistics in
the first half of the 20th century
• Numerical work was arduous and long
• Human computers
• Desk calculators
• Careful thought as to how to perform a calculation paid dividends
• Much development of inferential theory for small samples
• Design of experiments became a new subject in its own right developed by
statisticians
• Orthogonality
• Made calculation easier (e.g. decomposition of variance terms in ANOVA)
• Increased efficiency
• Randomisation
• “Guaranteed” properties of statistical analysis
• Dealt with hidden confounders
• Factorial experimentation
• Efficient way to study multiple influences
The Rothamsted School
RA Fisher (1890-1962): variance, ANOVA, randomisation, design, significance tests
Frank Yates (1902-1994): factorials, recovering inter-block information
John Nelder (1924-2010): general balance, computing, Genstat®
and Frank Anscombe, David Finney, Rosemary Bailey, Roger Payne etc
General Balance
• An idea of John Nelder’s
• Two papers in the Proceedings of the Royal Society, 1965 concerning
“The analysis of randomized experiments with orthogonal block
structure”
• Block structure and the null analysis of variance
• Treatment structure and the general analysis of variance
Basic Idea
• Splits an experiment into two radically different components
• The block structure, which describes the way that the experimental units are
organised
• The way that variation amongst units can be described
• Null ANOVA – an idea of Anscombe’s
• The treatment structure, which reflects the way that treatments are
combined for the scientific purpose of the experiment
Design Driven Modelling
• Together with a third piece of information, the design matrix, these
determine the analysis of variance
• Note that because both block and treatment structures can be hierarchical,
such a design matrix is not, on its own, sufficient to derive an ANOVA
• But together with John’s block and treatment structure it is
• For designs exhibiting general balance
• This approach is incorporated in Genstat®
An Example
• Incomplete blocks cross-over
design comparing three
treatments
• Placebo
• Formoterol 12 μg
• Formoterol 24 μg
• Patients treated in two periods
only
• 24 patients randomised to one
of six sequences
• Four per sequence
Patients per sequence and treatment
Sequence   Placebo   F12   F24
PF12          4       4     -
F12P          4       4     -
PF24          4       -     4
F24P          4       -     4
F12F24        -       4     4
F24F12        -       4     4
Skeleton Analysis of Variance
BLOCK Sequence/Patient
TREATMENT Treatment
ANOVA
Analysis of variance
Source of variation                   d.f.
Sequence stratum
  Treatment                              2
  Residual                               3
Sequence.Patient stratum                18
Sequence.Patient.*Units* stratum
  Treatment                              2
  Residual                              22
Total                                   47
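The degrees of freedom in the skeleton analysis of variance above can be checked numerically. The following is a minimal Python sketch, not Genstat's algorithm: it builds an orthogonal projector for each stratum of Sequence/Patient and counts how many treatment degrees of freedom fall in each. The function names `projector` and `skeleton_anova` are illustrative assumptions.

```python
import numpy as np

def projector(codes):
    """Orthogonal projector onto the space spanned by a factor's indicator columns."""
    codes = np.asarray(codes)
    indicators = (codes[:, None] == np.unique(codes)[None, :]).astype(float)
    return indicators @ np.linalg.pinv(indicators)

def skeleton_anova():
    # Layout from the slides: six two-period sequences, four patients each.
    sequences = [("P", "F12"), ("F12", "P"), ("P", "F24"),
                 ("F24", "P"), ("F12", "F24"), ("F24", "F12")]
    seq_id, pat_id, treatment = [], [], []
    for s, seq in enumerate(sequences):
        for rep in range(4):
            for trt in seq:  # one row per patient-period (unit)
                seq_id.append(s)
                pat_id.append(4 * s + rep)
                treatment.append(trt)

    n = len(treatment)  # 24 patients x 2 periods = 48 units
    trt = np.asarray(treatment)
    x = (trt[:, None] == np.unique(trt)[None, :]).astype(float)

    grand = np.full((n, n), 1.0 / n)
    p_seq, p_pat = projector(seq_id), projector(pat_id)
    strata = {
        "Sequence": p_seq - grand,
        "Sequence.Patient": p_pat - p_seq,
        "Sequence.Patient.*Units*": np.eye(n) - p_pat,
    }
    table = {}
    for name, q in strata.items():
        stratum_df = int(round(np.trace(q)))
        # Explicit tolerance: projections that should vanish are only about 1e-16.
        trt_df = int(np.linalg.matrix_rank(q @ x, tol=1e-8))
        table[name] = {"treatment": trt_df, "residual": stratum_df - trt_df}
    return table

for stratum, row in skeleton_anova().items():
    print(stratum, row)
```

The Sequence.Patient stratum carries no treatment information because every patient in a given sequence receives the same pair of treatments, so all 18 degrees of freedom there are residual.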
Causal versus predictive inference
• Clinical trials can be used to try and answer a number of very
different questions
• Two examples are
• Did the treatment have an effect in these patients?
• A causal purpose
• What will the effect be in future patients?
• A predictive purpose
• Unfortunately, in practice, an answer is produced without stating
what the question was
• Given certain assumptions these questions can be answered using the
same analysis but the assumptions are strong and rarely stated
Two models
Predictive
• The population is taken to be ‘patients in
general’
• Of course this really means future
patients
• They are the ones to whom the
treatment will be applied
• We treat the patients in the trial as an
appropriate selection from this population
• This does not require them to be typical
but it does require additivity of the
treatment effect
Causal
• We take the patients as fixed
• We want to know what the effect was for
them
• Unfortunately there are missing
counterfactuals
• What would have happened to control
patients given intervention and vice-versa
• The population is the population of all
possible allocations to the patients studied
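One way to make the causal model concrete is a randomisation test, in which the patients are fixed and the reference set is every allocation that could have been drawn. The sketch below is an illustration under stated assumptions: the eight FEV1 values are hypothetical, the function name `randomisation_p_value` is invented here, and the test assumes the sharp null hypothesis that treatment changes no patient's outcome.

```python
from itertools import combinations

def randomisation_p_value(outcomes, treated):
    """Two-sided randomisation test for a difference in means.

    outcomes: observed values for the patients studied (taken as fixed).
    treated:  indices of the patients who actually received the treatment.
    """
    n, k = len(outcomes), len(treated)

    def mean_difference(group):
        t = [outcomes[i] for i in group]
        c = [outcomes[i] for i in range(n) if i not in group]
        return sum(t) / len(t) - sum(c) / len(c)

    observed = abs(mean_difference(set(treated)))
    # Under the sharp null, outcomes stay the same whichever allocation is drawn,
    # so the statistic can be recomputed for every possible allocation.
    allocations = list(combinations(range(n), k))
    extreme = sum(abs(mean_difference(set(a))) >= observed - 1e-12
                  for a in allocations)
    return extreme / len(allocations), len(allocations)

# Hypothetical FEV1 values (mL) for eight patients; the first four were treated.
fev1 = [2510, 2380, 2620, 2450, 2190, 2300, 2150, 2260]
p, n_alloc = randomisation_p_value(fev1, {0, 1, 2, 3})
print(f"{n_alloc} possible allocations, p = {p:.4f}")
```

Nothing here refers to future patients: the probability statement is about the allocation mechanism alone, which is the sense in which the population is the set of possible allocations.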
Coverage probabilities for two questions
[Figure: coverage probabilities in 60 trials; panels: Predictive, Causal]
Part II
Technical matters to do with design and inference
Trial in asthma
Basic situation
• Two beta-agonists compared
• Zephyr(Z) and Mistral(M)
• Block structure has several levels
• Different designs will be investigated
• Cluster
• Parallel group
• Cross-over Trial
• Each design will be blocked at a different
level
• NB Each design will collect
6 x 4 x 2 x 7 = 336 measurements of Forced
Expiratory Volume in one second (FEV1)
Block structure
Level          Number within higher level   Total number
Centre                     6                       6
Patient                    4                      24
Episodes                   2                      48
Measurements               7                     336
Block structure
• Patients are nested within centres
• Episodes are nested within patients
• Measurements are nested within
episodes
• Centres/Patients/Episodes/Measurements
[Figure: block structure diagram; measurements not shown]
Possible designs
• Cluster randomised
• In each centre all the patients either receive Zephyr (Z) or Mistral (M) in both
episodes
• Three centres are chosen at random to receive Z and three to receive M
• Parallel group trial
• In each centre half the patients receive Z and half M in both episodes
• Two patients per centre are randomly chosen to receive Z and two to receive
M
• Cross-over trial
• Each patient receives M in one episode and Z in the other
• The order of allocation, ZM or MZ is random
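The three allocation schemes just described can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `allocate`, the seed, and the dictionary layout are assumptions, while the counts (6 centres, 4 patients per centre, 2 episodes per patient) come from the block-structure table.

```python
import random

CENTRES, PATIENTS, EPISODES = 6, 4, 2  # counts from the block-structure table

def allocate(design, seed=0):
    """Return {(centre, patient, episode): 'Z' or 'M'} for one randomised design."""
    rng = random.Random(seed)
    plan = {}
    if design == "cluster":
        # Randomise at centre level: three centres get Z, three get M.
        z_centres = set(rng.sample(range(CENTRES), CENTRES // 2))
    for c in range(CENTRES):
        if design == "parallel":
            # Randomise at patient level within each centre: two Z, two M.
            z_patients = set(rng.sample(range(PATIENTS), PATIENTS // 2))
        for p in range(PATIENTS):
            if design == "cluster":
                arms = ["Z" if c in z_centres else "M"] * EPISODES
            elif design == "parallel":
                arms = ["Z" if p in z_patients else "M"] * EPISODES
            elif design == "crossover":
                # Randomise at episode level: order ZM or MZ for each patient.
                arms = rng.choice([["Z", "M"], ["M", "Z"]])
            else:
                raise ValueError(f"unknown design: {design}")
            for e in range(EPISODES):
                plan[(c, p, e)] = arms[e]
    return plan

for design in ("cluster", "parallel", "crossover"):
    plan = allocate(design)
    n_z = sum(v == "Z" for v in plan.values())
    print(design, n_z, "episodes on Z out of", len(plan))
```

Every design gives Z to half of the 48 episodes; what differs is the level of the block structure at which the random choice is made.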
Null (skeleton) analysis of variance with Genstat®
Code Output
BLOCKSTRUCTURE Centre/Patient/Episode/Measurement
ANOVA
Full (skeleton) analysis of variance with Genstat®
Additional Code Output
TREATMENTSTRUCTURE Design[]
ANOVA
(Here Design[] is a pointer with values corresponding
to each of the three designs.)
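The Genstat output itself is not reproduced in this transcript. As a rough guide to what the skeleton analyses contain, the Python sketch below counts degrees of freedom for the nested structure Centre/Patient/Episode/Measurement and places the single Z-versus-M degree of freedom in the stratum at which each design randomises. It is a counting argument under the assumption of a purely nested, balanced structure, not a reimplementation of Genstat; the names `null_anova`, `full_anova` and `RANDOMISED_AT` are illustrative.

```python
LEVELS = [("Centre", 6), ("Patient", 4), ("Episode", 2), ("Measurement", 7)]

def null_anova(levels=LEVELS):
    """Degrees of freedom per stratum for a purely nested block structure."""
    rows, name, units_above = [], "", 1
    for label, count in levels:
        name = label if not name else f"{name}.{label}"
        units_here = units_above * count
        # Units at this level minus units at the level above.
        rows.append((name, units_here - units_above))
        units_above = units_here
    return rows

# Assumption: treatment is estimated in the stratum where it was randomised.
RANDOMISED_AT = {
    "cluster": "Centre",
    "parallel": "Centre.Patient",
    "crossover": "Centre.Patient.Episode",
}

def full_anova(design):
    """Skeleton ANOVA with one treatment degree of freedom placed by design."""
    rows = []
    for stratum, df in null_anova():
        if stratum == RANDOMISED_AT[design]:
            rows.append((stratum, {"Treatment": 1, "Residual": df - 1}))
        else:
            rows.append((stratum, {"Residual": df}))
    return rows

for design in RANDOMISED_AT:
    print(design)
    for stratum, terms in full_anova(design):
        print("  ", stratum, terms)
```

The residual degrees of freedom available for judging the treatment effect are therefore 4 for the cluster design, 17 for the parallel group design and 23 for the cross-over design.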
The bottom line
• The approach recognises that things vary
• Centres, patients, episodes
• It does not require everything to be balanced
• Things that can be eliminated will be eliminated by design
• Cross-over trial eliminates patients and centres
• Parallel group trial eliminates centres
• Cluster randomised eliminates none of these
• The measure of uncertainty produced by the analysis will reflect what
cannot be eliminated
• This requires matching the analysis to the design
• Note that Genstat® deals with this formally and automatically. Other
packages do not.
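What "eliminated by design" means for the precision of the treatment estimate can be sketched under a simple nested random-effects model, with independent variance components for centres, patients, episodes and measurements. This model and the numerical variance components below are assumptions made for illustration; only the counts (3 centres, 12 patients, 24 episodes and 168 measurements per arm) follow from the designs described.

```python
def variance_of_contrast(design, var_centre, var_patient, var_episode, var_measure):
    """Variance of the estimated Z-minus-M mean difference for each design."""
    # Every design has 24 episodes and 168 measurements per treatment arm.
    within_patient = var_episode / 24 + var_measure / 168
    if design == "crossover":
        extra = 0.0                       # centres and patients cancel
    elif design == "parallel":
        extra = var_patient / 12          # centres cancel, 12 patients per arm
    elif design == "cluster":
        extra = var_centre / 3 + var_patient / 12   # 3 centres per arm
    else:
        raise ValueError(f"unknown design: {design}")
    return 2 * (extra + within_patient)

# Hypothetical variance components (mL squared), chosen only for illustration.
components = dict(var_centre=100.0**2, var_patient=200.0**2,
                  var_episode=80.0**2, var_measure=60.0**2)
for design in ("cluster", "parallel", "crossover"):
    se = variance_of_contrast(design, **components) ** 0.5
    print(f"{design:10s} standard error = {se:6.1f} mL")
```

An analysis that ignored the block structure and treated all 336 measurements as independent would report the same standard error for all three designs, which is the mismatch between analysis and design that the bullet points warn against.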
To call in the statistician after
the experiment is done may be
no more than asking him to
perform a post-mortem
examination: he may be able
to say what the experiment
died of
RA Fisher
The Shocking Truth
• The validity of conventional analysis of randomised trials does not
depend on covariate balance
• It is valid because they are not perfectly balanced
• An allowance is already made for things being unbalanced
• If they were balanced the standard analysis would be wrong
• Like an insurance broker forbidding you to travel abroad in the policy but
calculating your premiums on the assumption that you will
• This accounts for unobserved covariates. What happens when they
are observed?
Game of Chance
• Two dice are rolled
– Red die
– Black die
• You have to call correctly the probability of a total score of 10
• Three variants
– Game 1: you call the probability and the dice are rolled together
– Game 2: the red die is rolled first, you are shown the score and then must call the probability
– Game 3: the red die is rolled first, you are not shown the score and then must call the probability
Total Score when Rolling Two Dice
Variant 1: Three of 36 equally likely results give a 10. The probability is 3/36 = 1/12.
Variant 2: If the red die score is 1, 2 or 3, the probability of a total of 10 is 0.
If the red die score is 4, 5 or 6, the probability of a total of 10 is 1/6.
Variant 3: The probability = (½ × 0) + (½ × 1/6) = 1/12
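The three probabilities can be verified by enumeration. The following Python sketch uses exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

TARGET = 10
rolls = list(product(range(1, 7), repeat=2))  # (red, black): 36 equally likely pairs

# Game 1: nothing has been rolled yet, so use the joint distribution of both dice.
game1 = Fraction(sum(r + b == TARGET for r, b in rolls), len(rolls))

# Game 2: the red score is shown, so condition on it.
game2 = {
    red: Fraction(sum(red + b == TARGET for b in range(1, 7)), 6)
    for red in range(1, 7)
}

# Game 3: the red die has been rolled but not shown, so average over its distribution.
game3 = sum(Fraction(1, 6) * p for p in game2.values())

print("Game 1:", game1)
print("Game 2:", {red: str(p) for red, p in game2.items()})
print("Game 3:", game3)
```

Game 3 gives the same answer as Game 1 because averaging the conditional probabilities over the unseen red score recovers the marginal probability.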
The morals
Dice games
• You can’t treat game 2 like game 1
• You must condition on the information
received
• You must use the actual data from the red die
• You can treat game 3 like game 1
• You can use the distribution in probability
that the red die has
Inference in general
• You can’t use the random behavior of
a system to justify ignoring
information that arises from the
system
• That would be to treat game 2 like game 1
• You can use the random behavior of
the system to justify ignoring that
which has not been seen
• You are entitled to treat game 3 like game 1
The difference between
mathematical and applied
statistics is that the former is full
of lemmas whereas the latter is
full of dilemmas
What does the Rothamsted approach do?
• Matches the allocation procedure to the analysis. You can either
regard this as meaning
• The randomisation you carried out guides the analysis
• The analysis you intend guides the randomisation
• Or both
• Either way, the idea is to avoid inconsistency
• Regarding something as being very important at the allocation stage but not
at the analysis stage is inconsistent
• Permits you not only to take account of things seen but also to make
an appropriate allowance for things unseen
• Die analogy is that it makes sure that the game is a fair one
A simulating example
• I am going to simulate 200 clinical trials
• Trials are of a bronchodilator against placebo.
• Simple randomisation of 50 patients to each arm
• I shall have values at outcome and values at baseline
• Forced expiratory volume in one second (FEV1) in mL
• Parameter settings
• True mean under placebo 2200 mL
• Under bronchodilator 2500 mL
• Treatment effect is 300 mL
• SD at outcome and baseline is 150 mL
• Correlation is 0.7
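A simulation along these lines can be sketched in Python with NumPy. This is not the code behind the slides: the seed, the function name `simulate_trials`, the bivariate Normal model and the baseline mean of 2200 mL in both arms are assumptions; the remaining settings are the ones listed above.

```python
import numpy as np

def simulate_trials(n_trials=200, n_per_arm=50, mu_placebo=2200.0, effect=300.0,
                    sd=150.0, rho=0.7, baseline_mean=2200.0, seed=2022):
    """Return unadjusted and baseline-adjusted (ANCOVA) estimates for each trial."""
    rng = np.random.default_rng(seed)
    cov = sd**2 * np.array([[1.0, rho], [rho, 1.0]])
    # Patients are exchangeable, so a fixed 50/50 labelling stands in for randomisation.
    treat = np.repeat([0.0, 1.0], n_per_arm)
    unadjusted = np.empty(n_trials)
    adjusted = np.empty(n_trials)
    for i in range(n_trials):
        noise = rng.multivariate_normal([0.0, 0.0], cov, size=2 * n_per_arm)
        baseline = baseline_mean + noise[:, 0]
        outcome = mu_placebo + effect * treat + noise[:, 1]
        # Game 1 analogue: ignore baseline, compare arm means.
        unadjusted[i] = outcome[treat == 1].mean() - outcome[treat == 0].mean()
        # Game 2 analogue: condition on baseline by fitting parallel lines.
        design = np.column_stack([np.ones_like(treat), treat, baseline])
        coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
        adjusted[i] = coef[1]
    return unadjusted, adjusted

unadjusted, adjusted = simulate_trials()
print("SD of unadjusted estimates:", round(float(unadjusted.std(ddof=1)), 1))
print("SD of adjusted estimates:  ", round(float(adjusted.std(ddof=1)), 1))
```

Under these settings the theoretical standard error of the unadjusted difference is 150 × √(2/50) = 30 mL, and adjustment for a baseline with correlation 0.7 shrinks it by roughly √(1 − 0.7²) ≈ 0.71.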
Point estimates and confidence intervals
Baseline values not available (like game 1)
Point estimates and 95% confidence intervals
Baseline values available (Game 2)
We tend to believe “the truth is in
there”, but sometimes it isn’t and
the danger is we will find it
anyway
How analysis of covariance works
• This shows ANCOVA applied to
sample 170 of the 200 simulated
• There is an imbalance at
baseline
• I have adjusted for this by fitting
two parallel lines
• The difference between the two
estimates shows how an outcome
value would change for a given
baseline value if treatments
were switched
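Fitting two parallel lines amounts to correcting the raw difference in outcome means for the baseline imbalance, using the common slope. The Python sketch below checks this identity on one simulated trial; the data are generated afresh with an arbitrary seed (they are not the deck's sample 170), and the function name `ancova_decomposition` is an illustrative assumption.

```python
import numpy as np

def ancova_decomposition(baseline, outcome, treat):
    """Fit two parallel lines and relate the adjusted effect to the raw difference."""
    baseline = np.asarray(baseline, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    treat = np.asarray(treat, dtype=float)
    design = np.column_stack([np.ones_like(treat), treat, baseline])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    raw = outcome[treat == 1].mean() - outcome[treat == 0].mean()
    imbalance = baseline[treat == 1].mean() - baseline[treat == 0].mean()
    return {"adjusted": coef[1], "slope": coef[2], "raw": raw, "imbalance": imbalance}

# One hypothetical trial using the parameter settings of the simulation slide.
rng = np.random.default_rng(1)
treat = np.repeat([0.0, 1.0], 50)
baseline = 2200 + 150 * rng.standard_normal(100)
outcome = (2200 + 300 * treat + 0.7 * (baseline - 2200)
           + 150 * np.sqrt(1 - 0.7**2) * rng.standard_normal(100))

fit = ancova_decomposition(baseline, outcome, treat)
corrected = fit["raw"] - fit["slope"] * fit["imbalance"]
print("adjusted estimate:           ", round(float(fit["adjusted"]), 2))
print("raw minus slope x imbalance: ", round(float(corrected), 2))
```

The two printed values agree because the least-squares fit passes through each arm's mean point, so the vertical gap between the lines equals the raw difference minus the slope times the baseline imbalance.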
Lessons for big data
• We tend to treat observational data-sets as if they were badly
randomised parallel group trials but cluster-randomised trials might
be a better analogy
• True standard errors may be much bigger than estimated ones
• See Cox, Kartsonaki & Keogh (2018) and Xiao-Li Meng (2018)
• Design matters
• Beware of dreams in which mathematics triumphs over biology
• You can be rich in data but poor in information
Data Filtering Some Examples
Finding
• Oscar winners lived longer than actors who
didn’t win an Oscar
• A 20-year follow-up study of women in an
English village found higher survival amongst
smokers than non-smokers
• Transplant receivers on highest doses of
cyclosporine had higher probability of graft
rejection than on lower doses
• Left-handers observed to die younger on
average than right-handers
• Obese infarct survivors have better prognosis
than non-obese
Possible Explanation
• The longer you live the greater your
chance of winning
• The smokers were from more recent
generations. They were much younger
than non-smokers
• The anticipated transplant rejection was
the cause of the dose being increased
• In an earlier era left-handers were forced
to become right-handers
• There are two kinds of infarct: very
serious which is independent of weight
and less serious linked to obesity.
Morals
• What you don’t see can be important
• Where you have not been able to run trials, biases
can be very important
• For some purposes just piling on data does not really
help
• What helps are
• Careful design
• Thinking!
A big data analyst is an expert at reaching
misleading conclusions with huge data sets,
whereas a statistician can do the same with
small ones
References
D. R. Cox, C. Kartsonaki and R. H. Keogh (2018) Big data: Some statistical issues. Stat Probab Lett, 111-115.
X.-L. Meng (2018) Statistical paradises and paradoxes in big data (I): Law of large populations, big data paradox, and the 2016 US presidential election. The Annals of Applied Statistics, 685-726.
S. J. Senn (2013) Seven myths of randomisation in clinical trials. Statistics in Medicine, 1439-1450.
S. Senn (2013) A Brief Note Regarding Randomization. Perspectives in Biology and Medicine, 452-453.
S. J. Senn (2019) The well-adjusted statistician. Applied Clinical Trials, June 18. https://www.appliedclinicaltrialsonline.com/view/well-adjusted-statistician-analysis-covariance-explained
S. Senn (2019) John Ashworth Nelder. 8 October 1924 – 7 August 2010: The Royal Society Publishing.
A number of blogs on my blog site are also relevant: http://www.senns.uk/Blogs.html

(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
Schema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdfSchema on read is obsolete. Welcome metaprogramming..pdf
Schema on read is obsolete. Welcome metaprogramming..pdf
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
Log Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptxLog Analysis using OSSEC sasoasasasas.pptx
Log Analysis using OSSEC sasoasasasas.pptx
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
代办国外大学文凭《原版美国UCLA文凭证书》加州大学洛杉矶分校毕业证制作成绩单修改
 
Digi Khata Problem along complete plan.pptx
Digi Khata Problem along complete plan.pptxDigi Khata Problem along complete plan.pptx
Digi Khata Problem along complete plan.pptx
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 

The Rothamsted School & The analysis of designed experiments

  • 1. The Rothamsted School The analysis of designed experiments and the legacy of Fisher, Yates and Nelder Stephen Senn Stephen Senn 2022
  • 2. Outline Part I (Not so technical) • The roots of modern statistics • Small data • Careful design of experiments • Some examples of problems with judging causality from associations in the health care field • Two different objectives of clinical trials Part II (More technical) • Design • The Rothamsted (Genstat) approach • Some statistical issues • Conclusion
  • 3. Part I Less technical matter to do with history of statistics and basic ‘philosophical’ considerations
  • 4. John Nelder & Michael Healy
  • 5. William Sealy Gosset 1876-1937 • Born Canterbury 1876 • Educated Winchester and Oxford • First in mathematical moderations 1897 and first in degree in Chemistry 1899 • Starts with Guinness in 1899 in Dublin • Autumn 1906-spring 1907 with Karl Pearson at UCL • 1908 publishes ‘The probable error of a mean’ • First method available to judge ‘significance’ in small samples
  • 6. Ronald Aylmer Fisher 1890-1962 • Most influential statistician ever • Also major figure in evolutionary biology • Educated Harrow and Cambridge • Statistician at Rothamsted agricultural station 1919-1933 • Developed theory of small sample inference and many modern concepts • Likelihood, variance, sufficiency, ANOVA • Developed theory of experimental design • Blocking, Randomisation, Replication,
  • 7. Small data challenges
      Situation | Problem | Solution
      Sample size small | Too few data to estimate variance adequately | Develop small sample test (Student)
      Experimental material not homogenous | Dealing with variability | Blocking and randomisation (Fisher)
      Limited time (1) | How to study more than one thing | Complex treatment structure: factorial experiments (Fisher, Yates)
      Limited time (2) | How to study very many factors | Fractional factorials (Yates)
      Experimental material varies at different levels | Some treatments can be varied at lowest level but not all | General balance approach to analysis (Nelder)
  • 8. Characteristics of development of statistics in the first half of the 20th century • Numerical work was arduous and long • Human computers • Desk calculators • Careful thought as to how to perform a calculation paid dividends • Much development of inferential theory for small samples • Design of experiments became a new subject in its own right developed by statisticians • Orthogonality • Made calculation easier (e.g. decomposition of variance terms in ANOVA) • Increased efficiency • Randomisation • “Guaranteed” properties of statistical analysis • Dealt with hidden confounders • Factorial experimentation • Efficient way to study multiple influences
  • 9. The Rothamsted School • RA Fisher (1890-1962): variance, ANOVA, randomisation, design, significance tests • Frank Yates (1902-1994): factorials, recovering inter-block information • John Nelder (1924-2010): general balance, computing, Genstat® • and Frank Anscombe, David Finney, Rosemary Bailey, Roger Payne etc.
  • 10. General Balance • An idea of John Nelder’s • Two papers in the Proceedings of the Royal Society, 1965 concerning “The analysis of randomized experiments with orthogonal block structure” • Block structure and the null analysis of variance • Treatment structure and the general analysis of variance
  • 11. Basic Idea • Splits an experiment into two radically different components • The block structure, which describes the way that the experimental units are organised • The way that variation amongst units can be described • Null ANOVA – an idea of Anscombe’s • The treatment structure, which reflects the way that treatments are combined for the scientific purpose of the experiment
  • 12. Design Driven Modelling • Together with a third piece of information, the design matrix, these determine the analysis of variance • Note that because both block and treatment structures can be hierarchical, such a design matrix is not, on its own, sufficient to derive an ANOVA • But together with John’s block and treatment structure it is • For designs exhibiting general balance • This approach is incorporated in Genstat®
  • 13. An Example • Incomplete blocks cross-over design comparing three treatments • Placebo • Formoterol 12 µg • Formoterol 24 µg • Patients treated in two periods only • 24 patients randomised to one of six sequences • Four per sequence
      Patients per sequence and treatment
      Sequence | Placebo | F12 | F24
      PF12     | 4 | 4 | -
      F12P     | 4 | 4 | -
      PF24     | 4 | - | 4
      F24P     | 4 | - | 4
      F12F24   | - | 4 | 4
      F24F12   | - | 4 | 4
  • 14. Skeleton Analysis of Variance • Code: BLOCK Sequence/Patient, TREATMENT Treatment, ANOVA
      Analysis of variance
      Source of variation | d.f.
      Sequence stratum
        Treatment | 2
        Residual | 3
      Sequence.Patient stratum | 18
      Sequence.Patient.*Units* stratum
        Treatment | 2
        Residual | 22
      Total | 47
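The degrees of freedom in the skeleton analysis of variance for the formoterol example can be checked by simple counting. The following is a minimal Python sketch, not Genstat's algorithm: the helper name `nested_strata_df` is hypothetical, and it assumes a purely nested block structure (6 sequences, 4 patients per sequence, 2 periods per patient) with the 2 treatment degrees of freedom appearing in both the Sequence stratum and the within-patient stratum, as the slide's table shows.

```python
def nested_strata_df(levels):
    """Degrees of freedom per stratum for a purely nested block structure.

    levels: list of (name, number within the next higher level), outermost first.
    Each stratum gets (units at this level) - (units at the level above).
    """
    strata = {}
    units_above = 1
    path = []
    for name, number_within in levels:
        path.append(name)
        units_here = units_above * number_within
        strata[".".join(path)] = units_here - units_above
        units_above = units_here
    return strata


# Block structure Sequence/Patient, with two periods (units) per patient.
strata = nested_strata_df([("Sequence", 6), ("Patient", 4), ("Units", 2)])

treatment_df = 3 - 1  # three treatments: placebo, F12, F24

# Treatment information appears between sequences and within patients.
residual_sequence = strata["Sequence"] - treatment_df
residual_units = strata["Sequence.Patient.Units"] - treatment_df
total_df = sum(strata.values())

print(strata)
print(residual_sequence, residual_units, total_df)
```

The counts agree with the table on the slide: 3 residual degrees of freedom between sequences, 18 between patients within sequences, 22 residual within patients, and 47 in total.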
  • 15. Causal versus predictive inference • Clinical trials can be used to try and answer a number of very different questions • Two examples are • Did the treatment have an effect in these patients? • A causal purpose • What will the effect be in future patients? • A predictive purpose • Unfortunately, in practice, an answer is produced without stating what the question was • Given certain assumptions these questions can be answered using the same analysis but the assumptions are strong and rarely stated
  • 16. Two models Predictive • The population is taken to be ‘patients in general’ • Of course this really means future patients • They are the ones to whom the treatment will be applied • We treat the patients in the trial as an appropriate selection from this population • This does not require them to be typical but it does require additivity of the treatment effect Causal • We take the patients as fixed • We want to know what the effect was for them • Unfortunately there are missing counterfactuals • What would have happened to control patients given intervention and vice-versa • The population is the population of all possible allocations to the patients studied
  • 17. Coverage probabilities for two questions [Figure: two panels, Predictive and Causal; 60 trials]
  • 18. Part II Technical matters to do with design and inference
  • 19. Trial in asthma Basic situation • Two beta-agonists compared • Zephyr (Z) and Mistral (M) • Block structure has several levels • Different designs will be investigated • Cluster • Parallel group • Cross-over trial • Each design will be blocked at a different level • NB Each design will collect 6 x 4 x 2 x 7 = 336 measurements of forced expiratory volume in one second (FEV1)
      Block structure
      Level | Number within higher level | Total number
      Centre | 6 | 6
      Patient | 4 | 24
      Episodes | 2 | 48
      Measurements | 7 | 336
  • 20. Block structure • Patients are nested within centres • Episodes are nested within patients • Measurements are nested within episodes • Centres/Patients/Episodes/Measurements [Figure: nesting diagram; measurements not shown]
  • 21. Possible designs • Cluster randomised • In each centre all the patients either receive Zephyr (Z) or Mistral (M) in both episodes • Three centres are chosen at random to receive Z and three to receive M • Parallel group trial • In each centre half the patients receive Z and half M in both episodes • Two patients per centre are randomly chosen to receive Z and two to receive M • Cross-over trial • For each patient the patient receives M in one episode and Z in another • The order of allocation, ZM or MZ is random
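The three allocation schemes differ only in the level of the block structure at which randomisation takes place. A minimal Python sketch of that idea follows, using the 6 centres and 4 patients per centre from the slides; the function names and the representation of an allocation as a mapping from (centre, patient) to (episode 1 treatment, episode 2 treatment) are illustrative assumptions, not part of the original talk.

```python
import random

CENTRES = 6
PATIENTS_PER_CENTRE = 4


def cluster_randomised(rng):
    """Randomise at centre level: three whole centres get Z, three get M."""
    labels = ["Z"] * 3 + ["M"] * 3
    rng.shuffle(labels)
    return {(c, p): (labels[c], labels[c])
            for c in range(CENTRES) for p in range(PATIENTS_PER_CENTRE)}


def parallel_group(rng):
    """Randomise at patient level: two patients per centre get Z, two get M."""
    allocation = {}
    for c in range(CENTRES):
        labels = ["Z", "Z", "M", "M"]
        rng.shuffle(labels)
        for p in range(PATIENTS_PER_CENTRE):
            allocation[(c, p)] = (labels[p], labels[p])
    return allocation


def cross_over(rng):
    """Randomise at episode level: each patient gets ZM or MZ at random."""
    return {(c, p): rng.choice([("Z", "M"), ("M", "Z")])
            for c in range(CENTRES) for p in range(PATIENTS_PER_CENTRE)}


rng = random.Random(2022)  # fixed seed only so the sketch is reproducible
for design in (cluster_randomised, parallel_group, cross_over):
    allocation = design(rng)
    print(design.__name__, [allocation[(0, p)] for p in range(PATIENTS_PER_CENTRE)])
```

Printing the first centre for each design makes the contrast visible: one treatment throughout under cluster randomisation, two patients on each treatment in the parallel group trial, and both treatments for every patient in the cross-over.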
  • 25. Null (skeleton) analysis of variance with Genstat® • Code: BLOCKSTRUCTURE Centre/Patient/Episode/Measurement, then ANOVA • Output: [shown on slide, not reproduced in transcript]
  • 26. Full (skeleton) analysis of variance with Genstat® • Additional code: TREATMENTSTRUCTURE Design[], then ANOVA (here Design[] is a pointer with values corresponding to each of the three designs) • Output: [shown on slide, not reproduced in transcript]
  • 27. The bottom line • The approach recognises that things vary • Centres, patients, episodes • It does not require everything to be balanced • Things that can be eliminated will be eliminated by design • Cross-over trial eliminates patients and centres • Parallel group trial eliminates centres • Cluster randomised eliminates none of these • The measure of uncertainty produced by the analysis will reflect what cannot be eliminated • This requires matching the analysis to the design • Note that Genstat® deals with this formally and automatically. Other packages do not.
  • 28. “To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of.” RA Fisher
  • 29. The Shocking Truth • The validity of conventional analysis of randomised trials does not depend on covariate balance • It is valid because they are not perfectly balanced • An allowance is already made for things being unbalanced • If they were balanced the standard analysis would be wrong • Like an insurance broker forbidding you to travel abroad in the policy but calculating your premiums on the assumption that you will • This accounts for unobserved covariates. What happens when they are observed?
  • 30. Game of Chance • Two dice are rolled – Red die – Black die • You have to call correctly the probability of a total score of 10 • Three variants – Game 1: you call the probability and the dice are rolled together – Game 2: the red die is rolled first, you are shown the score and then must call the probability – Game 3: the red die is rolled first, you are not shown the score and then must call the probability
  • 31. Total Score when Rolling Two Dice • Variant 1: Three of 36 equally likely results give a 10. The probability is 3/36 = 1/12.
  • 32. Total Score when Rolling Two Dice • Variant 2: If the red die score is 1, 2 or 3, the probability of a total of 10 is 0. If the red die score is 4, 5 or 6, the probability of a total of 10 is 1/6. • Variant 3: The probability = (½ x 0) + (½ x 1/6) = 1/12
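The three probabilities can be confirmed by enumerating the 36 equally likely outcomes. The sketch below uses exact fractions from the Python standard library; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
outcomes = list(product(faces, faces))  # (red, black) pairs, 36 in all

# Game 1: both dice rolled together, nothing seen in advance.
p_game1 = Fraction(sum(1 for red, black in outcomes if red + black == 10),
                   len(outcomes))

# Game 2: the red score is shown, so the call must condition on it.
p_game2 = {red: Fraction(sum(1 for black in faces if red + black == 10), 6)
           for red in faces}

# Game 3: the red die has been rolled but not shown, so average over its
# distribution. This recovers the Game 1 answer.
p_game3 = sum(Fraction(1, 6) * p for p in p_game2.values())

print(p_game1)                                   # unconditional probability
print({red: str(p) for red, p in p_game2.items()})  # conditional on red
print(p_game3)                                   # marginal over unseen red
```

The conditional probabilities in Game 2 are 0 for a red score of 1 to 3 and 1/6 for a red score of 4 to 6, and averaging them gives 1/12, which is why Game 3 can be treated like Game 1 but Game 2 cannot.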
  • 33. The morals Dice games • You can’t treat game 2 like game 1 • You must condition on the information received • You must use the actual data from the red die • You can treat game 3 like game 1 • You can use the distribution in probability that the red die has Inference in general • You can’t use the random behavior of a system to justify ignoring information that arises from the system • That would be to treat game 2 like game 1 • You can use the random behavior of the system to justify ignoring that which has not been seen • You are entitled to treat game 3 like game 1
  • 34. The difference between mathematical and applied statistics is that the former is full of lemmas whereas the latter is full of dilemmas
  • 35. What does the Rothamsted approach do? • Matches the allocation procedure to the analysis. You can either regard this as meaning • The randomisation you carried out guides the analysis • The analysis you intend guides the randomisation • Or both • Either way, the idea is to avoid inconsistency • Regarding something as being very important at the allocation stage but not at the analysis stage is inconsistent • Permits you not only to take account of things seen but also to make an appropriate allowance for things unseen • Die analogy is that it makes sure that the game is a fair one
  • 36. A simulating example • I am going to simulate 200 clinical trials • Trials are of a bronchodilator against placebo. • Simple randomisation of 50 patients to each arm • I shall have values at outcome and values at baseline • Forced expiratory volume in one second (FEV1) in mL • Parameter settings • True mean under placebo 2200 mL • Under bronchodilator 2500 mL • Treatment effect is 300 mL • SD at outcome and baseline is 150 mL • Correlation is 0.7
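A simulation along these lines can be sketched with the Python standard library alone. This is not the code used for the talk: the seed, the function names, and the assumption that the baseline mean is 2200 mL in both arms are mine, and only the unadjusted difference in outcome means (the analysis available when baselines are not seen) is computed.

```python
import math
import random
import statistics


def simulate_trial(rng, n_per_arm=50, mean_placebo=2200.0, effect=300.0,
                   sd=150.0, rho=0.7, baseline_mean=2200.0):
    """Return (treated, baseline, outcome) triples for one trial.

    Baseline and outcome are bivariate Normal with common SD and
    correlation rho. baseline_mean is an assumption; the slide does not state it.
    """
    patients = []
    for treated in (0, 1):
        for _ in range(n_per_arm):
            z1 = rng.gauss(0.0, 1.0)
            z2 = rng.gauss(0.0, 1.0)
            baseline = baseline_mean + sd * z1
            outcome = (mean_placebo + effect * treated
                       + sd * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))
            patients.append((treated, baseline, outcome))
    return patients


def unadjusted_estimate(patients):
    """Difference in mean outcome, active minus placebo, ignoring baselines."""
    active = [y for t, _, y in patients if t == 1]
    placebo = [y for t, _, y in patients if t == 0]
    return statistics.fmean(active) - statistics.fmean(placebo)


rng = random.Random(2022)  # arbitrary seed, for reproducibility only
estimates = [unadjusted_estimate(simulate_trial(rng)) for _ in range(200)]

# Standard error of a difference of two means of 50 values each with SD 150.
theoretical_se = math.sqrt(2 * 150.0 ** 2 / 50)

print(round(statistics.fmean(estimates), 1))
print(round(statistics.stdev(estimates), 1), theoretical_se)
```

Across the 200 simulated trials the estimates should scatter around the true effect of 300 mL with a spread close to the theoretical standard error of 30 mL.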
  • 37. Point estimates and confidence intervals: baseline values not available (like Game 1) [Figure]
  • 38. Point estimates and 95% confidence intervals: baseline values available (like Game 2) [Figure]
  • 39. We tend to believe “the truth is in there”, but sometimes it isn’t and the danger is we will find it anyway
  • 40. How analysis of covariance works • This shows ANCOVA applied to sample 170 of the 200 simulated • There is an imbalance at baseline • I have adjusted for this by fitting two parallel lines • The difference between the two estimates shows how an outcome value would change for a given baseline value if treatments were switched
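Fitting two parallel lines amounts to estimating a common within-group slope of outcome on baseline and using it to correct the raw difference in outcome means for the baseline imbalance. The sketch below shows that arithmetic on a tiny made-up data set (not sample 170 from the talk), constructed so that outcome = 1000 + 0.5 x baseline, plus 300 under treatment; the function name `ancova_effect` is hypothetical.

```python
from statistics import fmean


def ancova_effect(control, treated):
    """Parallel-lines ANCOVA for two groups of (baseline, outcome) pairs.

    Returns (unadjusted difference, pooled within-group slope,
    baseline-adjusted difference).
    """
    def centred_sums(pairs):
        xbar = fmean([x for x, _ in pairs])
        ybar = fmean([y for _, y in pairs])
        sxy = sum((x - xbar) * (y - ybar) for x, y in pairs)
        sxx = sum((x - xbar) ** 2 for x, _ in pairs)
        return xbar, ybar, sxy, sxx

    xc, yc, sxy_c, sxx_c = centred_sums(control)
    xt, yt, sxy_t, sxx_t = centred_sums(treated)
    slope = (sxy_c + sxy_t) / (sxx_c + sxx_t)      # common slope of both lines
    unadjusted = yt - yc                           # raw difference in outcome
    adjusted = unadjusted - slope * (xt - xc)      # remove baseline imbalance
    return unadjusted, slope, adjusted


# Illustrative values in mL: the treated group starts 200 mL higher at baseline.
control = [(2000, 2000), (2200, 2100), (2400, 2200)]
treated = [(2200, 2400), (2400, 2500), (2600, 2600)]

unadjusted, slope, adjusted = ancova_effect(control, treated)
print(unadjusted, slope, adjusted)
```

Here the raw difference of 400 mL overstates the effect because of the 200 mL baseline imbalance; with a common slope of 0.5 the adjusted difference is 300 mL, the vertical distance between the two parallel lines.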
  • 41. Lessons for big data • We tend to treat observational data-sets as if they were badly randomised parallel group trials but cluster-randomised trials might be a better analogy • True standard errors may be much bigger than estimated ones • See Cox, Kartsonaki & Keogh (2018) and Xiao-Li Meng (2018) • Design matters • Beware of dreams in which mathematics triumphs over biology • You can be rich in data but poor in information
  • 42. Data Filtering: Some Examples
      Finding | Possible explanation
      Oscar winners lived longer than actors who didn’t win an Oscar | The longer you live the greater your chance of winning
      A 20 year follow-up study of women in an English village found higher survival amongst smokers than non-smokers | The smokers were from more recent generations. They were much younger than non-smokers
      Transplant receivers on highest doses of cyclosporine had higher probability of graft rejection than on lower doses | The anticipated transplant rejection was the cause of the dose being increased
      Left-handers observed to die younger on average than right-handers | In an earlier era left-handers were forced to become right-handers
      Obese infarct survivors have better prognosis than non-obese | There are two kinds of infarct: very serious, which is independent of weight, and less serious, linked to obesity
  • 43. Morals • What you don’t see can be important • Where you have not been able to run trials, biases can be very important • For some purposes just piling on data does not really help • What helps are • Careful design • Thinking!
  • 44. A big data analyst is an expert at reaching misleading conclusions with huge data sets, whereas a statistician can do the same with small ones
  • 45. References
      D. R. Cox, C. Kartsonaki and R. H. Keogh (2018) Big data: Some statistical issues. Stat Probab Lett, 111-115.
      X.-L. Meng (2018) Statistical paradises and paradoxes in big data (I): Law of large populations, big data paradox, and the 2016 US presidential election. The Annals of Applied Statistics, 685-726.
      S. J. Senn (2013) Seven myths of randomisation in clinical trials. Statistics in Medicine, 1439-1450.
      S. Senn (2013) A Brief Note Regarding Randomization. Perspectives in Biology and Medicine, 452-453.
      S. J. Senn (2019) The well-adjusted statistician. Applied Clinical Trials, June 18. https://www.appliedclinicaltrialsonline.com/view/well-adjusted-statistician-analysis-covariance-explained
      S. Senn (2019) John Ashworth Nelder. 8 October 1924 – 7 August 2010: The Royal Society Publishing.
      A number of blogs on my blog site are also relevant: http://www.senns.uk/Blogs.html