3. Introduction
Any study is vulnerable to two types of error
• Random error
Due to chance
Increasing the sample size can reduce it
• Systematic error (also called bias)
Consistent, repeatable error arising from flawed design
Attributable to an identifiable cause, not to chance
Often cannot be corrected by statistical analysis
4. Example
• Checking the BP of 10,000 people against a known population mean (e.g., 130 mmHg)
(Figure: distribution of readings showing the population mean, random error as scatter around it, and systematic error as a shift away from it)
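The distinction can be illustrated with a short simulation (a sketch, not from the slides; the 8 mmHg cuff offset is an invented illustration): increasing the sample size shrinks random error around the 130 mmHg population mean, but leaves a systematic error untouched.

```python
import random

random.seed(0)
TRUE_MEAN = 130.0  # known population mean BP (mmHg), from the example

def mean_bp(n, bias=0.0, noise_sd=10.0):
    """Simulate n BP readings: true mean + random noise + systematic bias."""
    readings = [TRUE_MEAN + bias + random.gauss(0, noise_sd) for _ in range(n)]
    return sum(readings) / n

# Random error shrinks as n grows ...
small = mean_bp(100)
large = mean_bp(10_000)
# ... but a systematic error (e.g., a cuff reading 8 mmHg high) does not.
biased = mean_bp(10_000, bias=8.0)
print(round(small, 1), round(large, 1), round(biased, 1))
```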
5. What is Bias?
• Any systematic error in the design, conduct or
analysis of a study that results in a mistaken
estimate of the outcome variable
• Any trend in the collection, analysis, interpretation, publication or
review of data, that can lead to conclusions that are systematically
different from the truth (John M Last 2011)
7. SELECTION BIAS
• Error introduced when the study population
does not represent the target population
• Can be introduced during
Design, due to
inappropriate definition of the eligible population
an inaccurate sampling frame
uneven diagnostic procedures
Implementation
8. Selection bias due to inappropriate definition of
eligible population
• Healthcare access bias
• Neyman bias
• Spectrum bias
• Healthy worker effect
• Berkson’s bias
• Exclusion bias
9. Healthcare access bias
Patients admitted to an institution do not represent the cases
originating in the community
Popularity bias
Centripetal bias
Referral filter bias
Diagnostic/treatment access bias
10. Neyman bias
• Also called prevalence-incidence bias or selective survival bias
• Occurs in both cross-sectional and case-control studies
• A gap in time occurs between exposure & selection of participants
• In studies of diseases that are quickly fatal, transient or
subclinical
• Introduced as a result of selective survival among prevalent
cases
11. Example:
• A case-control study investigating pneumonia that only
enrolls cases and controls admitted to a hospital
• Those with pneumonia who died prior to admission will not be
included in the sample
• The selected sample will, therefore, include moderately severe
cases, but not fatal cases
12. Spectrum bias
• In the assessment of validity of a diagnostic test
• Bias is produced when researchers include only “clear” or
“definite” cases
• E.g., In a study investigating the ability of MR imaging to detect
cirrhosis, if only advanced clinical cases are included the
sensitivity will be overestimated
13. Healthy worker effect
• Lower mortality observed in the employed population when
compared with the general population
• Any excess risk associated with an occupation will tend to be
underestimated by a comparison with the general population
14. Berkson’s bias
• Arises when the study population is selected from a specific
subpopulation, such as hospitalized patients
• Individuals in the hospital population are more likely to have both
the exposure & the disease
• Can lead to spurious associations between exposure and
disease
15. • Sackett, 1979: analysed data from 257 hospitalized individuals
• Detected association between locomotor & respiratory disease
(OR 4.06)
• Repeated analysis in 2783 individuals from general population,
no association (OR 1.06)
• Original analysis of hospitalized individuals was biased because
both diseases caused individuals to be hospitalized
• By looking only within the stratum of hospitalized individuals,
observed distorted association
16. Exclusion bias
• Controls with conditions related to the exposure are excluded,
whereas cases with these diseases as comorbidities are kept
• E.g., Reserpine and breast cancer: controls with cardiovascular
disease were excluded but this criterion was not applied to cases
• This yielded a spurious association between reserpine and
breast cancer
17. Selection bias due to lack of accuracy of sampling
frame
Non-random sampling bias
This selection procedure can yield a nonrepresentative sample
in which a parameter estimate differs from that existing in the
target population
18. Selection bias due to uneven diagnostic
procedures in the target population
Diagnostic suspicion bias
Unmasking (detection signal) bias
Mimicry bias
19. Diagnostic suspicion bias
• Suspicion of a condition can influence how quickly people are
investigated, which can affect rates of diagnosis
• Diagnostic test accuracy studies that include selected patients
because they are more likely to have the condition based on
clinical suspicion typically overestimate the accuracy of the test
20. Unmasking (detection signal) bias
• Some exposures cause people to be given a diagnosis earlier,
and these might not be causes of the disease
• If a medication can cause vaginal bleeding:
people with this symptom go to the doctor sooner
they receive earlier or more intensive examination
investigations diagnose a cancer that was already present
it may appear that the medication caused the cancer
21. Mimicry bias
• When a condition mimics the disease of interest, its presence can
lead to false conclusions about the causes of the disease
• E.g., Sackett 1979 – oral contraceptive & hepatitis
22. Selection bias during study implementation
Losses/withdrawals to follow up
Non-response bias
Healthy volunteer effect
23. Withdrawal/Lost to follow-up (Attrition bias)
• Losses/withdrawals are uneven across the exposure and
outcome categories
• E.g., a trial to evaluate the effectiveness of a new medication
100 each in treatment & control group
30 drop out in the treatment group, 10 in the control group
If dropouts in the treatment group leave because of more severe
side effects, the true adverse effects will be underestimated
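The dropout scenario above can be put into numbers (a sketch; the adverse-event counts below are invented for illustration):

```python
# Trial numbers from the slide; adverse-event counts are hypothetical.
n_treat = 100
dropouts = 30               # dropouts in the treatment group

adverse_in_dropouts = 30    # assume all 30 left because of side effects
adverse_in_completers = 10  # events seen among the 70 who stayed

true_rate = (adverse_in_dropouts + adverse_in_completers) / n_treat
observed_rate = adverse_in_completers / (n_treat - dropouts)

print(true_rate)                 # 0.4
print(round(observed_rate, 2))   # 0.14: the adverse effects look rarer
```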
24. Non-response bias
• Non-responders from a sample differ in a meaningful way to
responders
• E.g., those with poorer health tend to avoid taking part in health
surveys and those who do take part report better health status
and behaviours (healthy volunteer effect)
25. INFORMATION BIAS
• Occurs during data collection
• Flaw in measuring exposure or outcome
variable that results in different quality
(accuracy) of information
• Three main types
Misclassification bias
Ecological fallacy
Regression to the mean
26. Misclassification bias
• Individuals are assigned to a different category than the one
they should be in
• Can lead to incorrect associations between assigned categories
and outcomes of interest
• Two types:
1. Differential or non-random
2. Non-differential or random
27. Differential / Non-random misclassification bias
Recall bias
• Person with disease/outcome tend to
recall exposure better
• Differential memory for the exposure
in the cases relative to the controls
• More likely to misclassify the
exposure in the controls than in the
cases
(Diagram: case-control study of a birth defect — exposure is
ascertained retrospectively in cases and in controls)
28. Surveillance bias
• More testing in the exposed group leads to more detection
• The non-exposed group may be misclassified as having less disease
• Also called detection bias
(Diagram: cohort study — smokers vs non-smokers followed for
emphysema; the exposed are examined more often)
29. Non-differential / random misclassification bias
• Exposure and disease equally misclassified
• Impact: dilution of effect, estimates become closer to null
(Diagrams: case-control — exposure equally misclassified in cases
and controls; cohort — emphysema equally misclassified in exposed
and non-exposed)
30. Effect of non-differential misclassification bias
Correct classification:
                      Heart attack
                      Yes     No     Total
High fat diet   Yes   250     100     350
                No    450     900    1350
RR = (250/350) / (450/1350) = 0.71 / 0.33 ≈ 2.14

Suppose 20% of the unexposed (No) row is non-differentially
misclassified as exposed (No → Yes):
                      Heart attack
                      Yes     No     Total
High fat diet   Yes   340     280     620
                No    360     720    1080
RR = (340/620) / (360/1080) = 0.55 / 0.33 ≈ 1.65
The estimate is diluted toward the null (2.14 → 1.65)
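The dilution can be checked with a few lines of Python (a sketch; the 2×2 helper below is ours, not from the slides):

```python
def rr(a, b, c, d):
    """Risk ratio from a 2x2 table: exposed risk a/(a+b) over unexposed risk c/(c+d)."""
    return (a / (a + b)) / (c / (c + d))

# Correctly classified table
rr_true = rr(250, 100, 450, 900)

# Move 20% of the unexposed row (both disease columns) into the exposed row
a = 250 + 0.2 * 450   # 340
b = 100 + 0.2 * 900   # 280
c = 0.8 * 450         # 360
d = 0.8 * 900         # 720
rr_obs = rr(a, b, c, d)

print(round(rr_true, 2))  # 2.14
print(round(rr_obs, 2))   # 1.65: pulled toward the null
```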
31. Other biases producing misclassification
• Observer/Interviewer bias
Systematic difference between a true value and the value
observed due to observer variation
• Reporting bias
Social desirability bias
32. Ecological fallacy
• Results obtained from an ecological (group-level) analysis are
used to make inferences at the individual level
• E.g., higher prevalence of disease does not necessarily imply
that individuals have higher risk
• E.g., Boys score better in maths than girls is a group
generalisation
33. Regression to mean
• Variables that are initially extreme tend to move closer to the
average on subsequent measurements
• E.g., evaluating the effectiveness of a new BP medication
Initial readings: high BP (the selection criterion)
Subsequent measurements: lower BP even without treatment
The drug's effectiveness is overestimated if regression to the
mean is not considered
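A small simulation makes the point (a sketch; the true BP of 140 mmHg, the 150 mmHg enrolment cut-off and the 8 mmHg measurement noise are invented): people are enrolled on a high first reading and re-measured with no treatment at all.

```python
import random

random.seed(1)

TRUE_BP = 140.0   # everyone's stable true BP; no drug is ever given

def reading(sd=8.0):
    """One clinic measurement = true BP + random fluctuation."""
    return TRUE_BP + random.gauss(0, sd)

first = [reading() for _ in range(1000)]

# Enrol only those whose first reading looked "high" ...
high_first = [r for r in first if r >= 150]

# ... and measure the enrolled group a second time.
second = [reading() for _ in high_first]

mean_first = sum(high_first) / len(high_first)
mean_second = sum(second) / len(second)
print(round(mean_first, 1), round(mean_second, 1))  # second mean falls back toward 140
```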
34. Other information biases
Hawthorne effect
Lead-time bias
Protopathic bias
Temporal ambiguity
Will Rogers phenomenon
Verification bias
35. Hawthorne effect
• People behave differently because they know they are being
watched
• E.g., A survey of smoking by watching people during work
breaks might lead to observing much lower smoking rates than is
genuinely representative of the population under study
36. Lead time bias
• Survival time appears longer in screen-detected people because
diagnosis is moved earlier, even if death is not delayed
38. Protopathic bias
• Occurs when a treatment given for early manifestations of a
not-yet-diagnosed disease appears to cause the disease
• E.g., patients may take NSAIDs to relieve pain prior to the date
of diagnosis of the condition
• This may bias results: reverse causality (the disease prompting
the drug) is misinterpreted as the drug causing the disease
39. Will Rogers phenomenon
• Improvement in diagnostic tests refines disease staging in
diseases such as cancer
• This produces a stage migration from early to more advanced
stages and an apparent higher survival
• This bias is relevant when comparing cancer survival rates
across time or even among centres with different diagnostic
capabilities
40. Verification bias
• Occurs when there is a difference in testing strategy between
groups of individuals
• E.g., D-dimer testing for diagnosing pulmonary embolism
• positive D-dimer: ventilation–perfusion scans
• negative D-dimer: routine clinical follow up
• Patients with asymptomatic pulmonary embolism but negative
D-dimer results may not have been diagnosed by routine follow up
42. CONFOUNDING: from the Latin confundere, to mix together
“Confounding is confusion, or mixing, of effects;
the effect of the exposure is mixed together with
the effect of another variable, leading to bias”
(Rothman, 2002)
44. CRITERIA
• It must be associated with both the exposure and
the outcome
• It must be independently capable of causing the outcome
• It must not lie in the causal pathway
• It must be distributed unequally among
the groups being compared
45. EFFECTS OF CONFOUNDING
• An apparent association despite no real association
• An apparent absence of association despite a real existing
association
• May cause an overestimate of the true association (positive
confounding) or an underestimate of the association (negative
confounding)
46. IDENTIFYING CONFOUNDING
• Compare the estimated measure of association before and after
adjusting for confounding
• Determine whether a potential confounding variable is
associated with the exposure and also with the outcome
• Perform formal hypothesis tests (e.g., a chi-square test of the
variable's association with exposure and outcome)
47. RESIDUAL CONFOUNDING
• Distortion that remains after
controlling for confounding in the
design and/or analysis of a
study
(Diagram: coffee drinking → heart health, adjusted for age, gender
and smoking; physical activity remains an uncontrolled confounder)
48. • Unknown confounders or data on
these factors were not collected
• Control for confounding was not tight or
narrow enough (e.g., age categories too broad)
• Many errors in the classification of
subjects with respect to confounding
variables
49. CONFOUNDING BY INDICATION
• Distortion of the association between exposure and outcome,
caused by the presence of an indication for the exposure
(Diagram: antidepressant drug → infertility, confounded by
depression, the indication for the drug)
53. RESTRICTION
• Include only study participants from a
single category of the confounder, thereby
eliminating its confounding effect
• Limitations
Reduces sample size
Residual confounding
Limits generalizability
54. MATCHING
• Pair each exposed subject with an
unexposed subject that shares the same
characteristic regarding the variable we
want to control for
• Limitations
Time consuming
Limits sample size
56. STRATIFICATION
Stratify by age (the confounder): <50 years vs ≥50 years
Estimate and compare the relationship between exposure
and outcome in both strata and also with the crude estimate
57. Physical activity & CVD, stratified by age
All ages:
             CVD   No CVD   Total
Active        48      800     848
Not active    69      625     694
Crude RR = (48/848) / (69/694) = 0.57

<50 yrs:
             CVD   No CVD   Total
Active        25      600     625
Not active    11      225     236
RR(<50 yr) = 0.86

≥50 yrs:
             CVD   No CVD   Total
Active        23      200     223
Not active    58      400     458
RR(≥50 yr) = 0.81
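The stratified arithmetic can be reproduced directly (a sketch; the helper name is ours):

```python
def rr(events_exp, n_exp, events_unexp, n_unexp):
    """Risk ratio: risk in the exposed over risk in the unexposed."""
    return (events_exp / n_exp) / (events_unexp / n_unexp)

crude = rr(48, 848, 69, 694)    # all ages
young = rr(25, 625, 11, 236)    # <50 yrs
old = rr(23, 223, 58, 458)      # >=50 yrs

print(round(crude, 2))  # 0.57
print(round(young, 2))  # 0.86
print(round(old, 2))    # 0.81
# The stratum-specific RRs agree with each other but differ from the
# crude RR, so the crude estimate is confounded by age.
```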
60. INTERACTION (EFFECT MODIFICATION)
• "When the incidence rate of disease in the presence of two
or more risk factors differs from the incidence rate expected
to result from their individual effects" (MacMahon)
• The association between exposure and outcome is different
at different levels of a 3rd variable (the effect modifier)
62. • Effect can be
Synergism
Antagonism
• To detect it, stratified analysis is used
The stratum-specific estimates differ from each other
63. Confounding vs Interaction
• Confounding: distortion of the association between an exposure
and outcome by a 3rd variable; the variables are not dependent
on each other; the effect needs to be removed
• Interaction: the effect of one explanatory variable on the
outcome depends on the level of another variable; the variables
are dependent on each other; the effect needs to be reported
65. MEDIATION
• A mediator shows the connection between two
variables; it explains the process by which
the two variables relate
• Conditions
The independent variable must cause
or predict the mediator
The mediator must influence the
dependent variable
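The two conditions can be illustrated with a simulated mediation chain (a sketch with made-up coefficients, not an example from the slides):

```python
import random

random.seed(2)

def slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    return num / den

# X -> M -> Y with invented coefficients 2.0 and 1.5
x = [random.gauss(0, 1) for _ in range(5000)]
m = [2.0 * xi + random.gauss(0, 1) for xi in x]   # X causes/predicts the mediator
y = [1.5 * mi + random.gauss(0, 1) for mi in m]   # the mediator influences Y

print(round(slope(x, m), 1))  # ~2.0: condition 1, X predicts M
print(round(slope(m, y), 1))  # ~1.5: condition 2, M influences Y
print(round(slope(x, y), 1))  # ~3.0: total effect transmitted through M
```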