Few basic Epidemiology terminologies

Epidemiology Terminologies
Dr Venkatesh Karthikeyan
03/10/2021
www.drvenkateshkarthikeyan.com

• Accuracy
• Repeatability
• Validity
• Sensitivity
• Specificity
• AUC ROC curve
• Kappa statistics
• Correlation

Basic requirements of measurements
1. Validity
2. Reliability
3. Accuracy
4. Sensitivity
5. Specificity

Accuracy
• It refers to how closely measured values agree with the
“true” values.

Repeatability
• Sometimes called reliability, precision or reproducibility
• The test must give consistent results when repeated more than once
on the same individual or material
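The difference between accuracy and repeatability can be illustrated with a minimal sketch. The readings, the "true" value and the helper names `bias` and `spread` are all hypothetical, chosen only for illustration: the mean error stands in for accuracy and the standard deviation of repeated readings stands in for repeatability.

```python
import statistics

def bias(measurements, true_value):
    """Mean error: how far the average reading is from the true value (accuracy)."""
    return statistics.mean(measurements) - true_value

def spread(measurements):
    """Sample standard deviation of repeated readings (repeatability)."""
    return statistics.stdev(measurements)

# Hypothetical repeated readings of a quantity whose true value is 100.0
true_value = 100.0
device_a = [99.8, 100.1, 100.2, 99.9, 100.0]    # accurate and repeatable
device_b = [104.9, 105.1, 105.0, 104.8, 105.2]  # repeatable but not accurate

for name, readings in [("A", device_a), ("B", device_b)]:
    print(f"Device {name}: bias = {bias(readings, true_value):+.2f}, "
          f"spread = {spread(readings):.2f}")
```

Device B gives consistent readings (small spread) that are all shifted away from the true value, so a test can be repeatable without being accurate.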

Validity
• Validity refers to the extent to which a test accurately measures what it
purports to measure.
• Validity expresses the ability of a test to separate or distinguish those
who have the disease from those who do not.
• E.g., glycosuria is a useful screening test for diabetes, but the glucose
tolerance test is a more valid or accurate test.

Components of validity
• Sensitivity
  • Ability of a test to correctly identify all those who have the disease,
    i.e., the ability to identify “true positives”
• Specificity
  • Ability of a test to correctly identify those who do not have the disease,
    i.e., the ability to identify “true negatives”

Components of validity: sensitivity and specificity

Screening test results by disease status:

Test result | Diseased           | Not diseased       | Total
Positive    | a (True positive)  | b (False positive) | a+b
Negative    | c (False negative) | d (True negative)  | c+d
Total       | a+c                | b+d                | a+b+c+d

• Sensitivity = a/(a+c) × 100
• Specificity = d/(b+d) × 100
• Positive predictive value = a/(a+b) × 100
• Negative predictive value = d/(c+d) × 100
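As a minimal sketch, the four formulas can be applied to a 2×2 screening table as follows. The cell counts are hypothetical and the function name `screening_metrics` is only illustrative; a, b, c and d follow the table's notation.

```python
def screening_metrics(a, b, c, d):
    """Compute the four measures (as percentages) from the 2x2 table cells.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    return {
        "sensitivity": a / (a + c) * 100,
        "specificity": d / (b + d) * 100,
        "positive predictive value": a / (a + b) * 100,
        "negative predictive value": d / (c + d) * 100,
    }

# Hypothetical counts: 50 diseased and 200 non-diseased people screened
metrics = screening_metrics(a=40, b=20, c=10, d=180)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```

With these counts the test detects 40 of the 50 diseased people (sensitivity 80%) and correctly clears 180 of the 200 non-diseased people (specificity 90%).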

AUC – ROC curve
• The Receiver Operating Characteristic (ROC) curve is an evaluation
metric for binary classification problems.
• It is a probability curve that plots the true positive rate (TPR, i.e.,
sensitivity) against the false positive rate (FPR, i.e., 1 − specificity)
at various threshold values.
• The Area Under the Curve (AUC) is the measure of the ability of a
classifier to distinguish between classes and is used as a summary of
the ROC curve.
• The higher the AUC, the better the performance of the model at
distinguishing between the positive and negative classes.

• When AUC = 1, the classifier perfectly distinguishes between all the
Positive and the Negative class points. If, however, the AUC is 0, the
classifier is predicting all Negatives as Positives and all Positives as
Negatives.

• When 0.5 < AUC < 1, the classifier is more likely than not to
distinguish the positive class values from the negative class values.
The AUC equals the probability that a randomly chosen positive case
receives a higher score than a randomly chosen negative case, so a
value above 0.5 means the classifier ranks cases better than chance.

• When AUC = 0.5, the classifier is not able to distinguish between
Positive and Negative class points: it is effectively predicting a random
class or a constant class for all the data points.

• So, the higher the AUC value for a classifier, the better its ability to
distinguish between positive and negative classes.
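The following is a minimal sketch of how the ROC curve and its AUC can be computed by hand, assuming a small set of hypothetical labels (1 = diseased, 0 = not diseased) and test scores. The function names are illustrative, not taken from any library. The area is computed two ways: by the trapezoid rule over the (FPR, TPR) points, and as the proportion of positive-negative pairs in which the positive case has the higher score.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs obtained by lowering the threshold step by step."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        # Everything scoring at or above the threshold is called positive
        tp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoid rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

def auc_rank(labels, scores):
    """Share of positive-negative pairs where the positive scores higher (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical data: three diseased and three non-diseased subjects
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

print("ROC points:", roc_points(labels, scores))
print("AUC (trapezoid):", round(auc_trapezoid(roc_points(labels, scores)), 3))
print("AUC (pairwise ranking):", round(auc_rank(labels, scores), 3))
```

Both calculations give the same value on this data, which illustrates why the AUC can be read as the chance that a diseased subject scores higher than a non-diseased one.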

Kappa statistics
• The kappa statistic is frequently used to test interrater reliability.
• The importance of rater reliability lies in the fact that it represents the
extent to which the data collected in the study are correct
representations of the variables measured.
• Measurement of the extent to which data collectors (raters) assign the
same score to the same variable is called interrater reliability.

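• Although a variety of methods exist to measure interrater reliability,
it was traditionally measured as percent agreement, calculated as the
number of scores on which the raters agree divided by the total
number of scores.
• Percent agreement does not account for agreement that occurs by
chance; kappa corrects for this.
• Kappa can range from −1 to +1: +1 indicates perfect agreement, 0
indicates the agreement expected by chance, and negative values
indicate agreement worse than chance.
• Cohen’s suggested interpretation (0.41–0.60 moderate, 0.61–0.80
substantial, 0.81–1.00 almost perfect agreement) may be too lenient for
health-related studies because it implies that a score as low as 0.41
might be acceptable.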
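A minimal sketch of the calculation for two raters, using the standard formula kappa = (Po − Pe) / (1 − Pe), where Po is the observed agreement and Pe is the agreement expected by chance. The ratings are hypothetical and the function names are illustrative only.

```python
from collections import Counter

def percent_agreement(rater1, rater2):
    """Observed agreement: share of items given the same score by both raters."""
    return sum(x == y for x, y in zip(rater1, rater2)) / len(rater1)

def cohens_kappa(rater1, rater2):
    """Cohen's kappa for two raters scoring the same items.

    Note: undefined (division by zero) when chance agreement is exactly 1,
    e.g. when both raters give every item the same single category.
    """
    n = len(rater1)
    p_o = percent_agreement(rater1, rater2)
    c1, c2 = Counter(rater1), Counter(rater2)
    # Chance agreement: product of the raters' marginal proportions, summed over categories
    p_e = sum(c1[k] * c2[k] for k in c1.keys() | c2.keys()) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical ratings of 10 patients by two data collectors
rater_a = ["pos", "pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg"]
rater_b = ["pos", "pos", "pos", "pos", "neg", "neg", "neg", "neg", "neg", "pos"]

print("Percent agreement:", percent_agreement(rater_a, rater_b))
print("Cohen's kappa:", round(cohens_kappa(rater_a, rater_b), 2))
```

Here the raters agree on 8 of 10 patients, but half of that agreement would be expected by chance alone, so kappa is noticeably lower than the percent agreement.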

Correlation
• Correlation is a statistical method used to assess a possible linear
association between two continuous variables.
• In everyday usage, the word correlation refers to an association,
connection, or any form of relationship, link or correspondence.
• Correlation is measured by a statistic called the correlation coefficient,
which represents the strength of the putative linear association
between the variables in question.

• A correlation coefficient of zero indicates that no linear relationship
exists between two continuous variables, and a correlation coefficient
of −1 or +1 indicates a perfect linear relationship.
• The correlation coefficient can take any value between −1 and +1; its
sign gives the direction of the relationship.
• The stronger the correlation, the closer the correlation coefficient
comes to ±1.

• Correlation coefficients are used to assess the strength and direction
of the linear relationships between pairs of variables.
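• When both variables are normally distributed, use Pearson's correlation
coefficient; otherwise, use Spearman's rank correlation coefficient, which
is computed on the ranks of the values and assesses a monotonic rather
than a strictly linear relationship.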
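A minimal sketch of both coefficients in plain Python, assuming small hypothetical data. Spearman's coefficient is computed here as Pearson's coefficient applied to the ranks of the values (tied values share their average rank); the function names are illustrative.

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values receive the mean of the positions they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rank correlation: Pearson's coefficient on the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical data: y always increases with x, but not along a straight line
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

print("Pearson: ", round(pearson(x, y), 3))
print("Spearman:", round(spearman(x, y), 3))
```

Because y rises with x in a curved rather than straight-line fashion, Spearman's coefficient reaches +1 while Pearson's stays slightly below it.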

• Correlation coefficients do not communicate information about
whether one variable moves in response to another.
• There is no attempt to establish one variable as dependent and the
other as independent.
• Thus, relationships identified using correlation coefficients should be
interpreted for what they are: associations, not causal relationships.

References:
• https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3900052/
• https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3576830/
