DATA COLLECTION AND
PRESENTATION
BY DR. POONAM NARANG
P.G 1ST YEAR
DEPT. OF PUBLIC HEALTH
DENTISTRY
DATA PRESENTATION
CONTENTS
• OBJECTIVES
• INTRODUCTION AND DEFINITION
• CLASSIFICATION OF DATA
• METHODS OF COLLECTION OF DATA
• DATA PRESENTATION
• CONCLUSION
• REFERENCES
LEARNING OBJECTIVES
1. To know about data.
2. To enumerate various types of data
3. To know about scales of measurement
4. To enumerate the methods for collection of data
5. To know about various methods of data presentation
INTRODUCTION
Data, the plural of datum, are facts expressed in numerical terms.
In statistical language, a datum is also called a variable, as it is a character, characteristic or quality that
varies. Data do not convey any meaning by themselves. Hence, they have to be worked upon using a set
of statistical tools to convert them into meaningful information.
DATA → STATISTICS → INFORMATION
Characteristics of Data Collection
Data collection can be characterized by several important characteristics that help to ensure the
quality and accuracy of the data gathered. These characteristics include:
•Validity
•Reliability
•Objectivity
•Precision
•Timeliness
•Ethical considerations
ADVANTAGES OF DATA
•Better decision-making
•Improved understanding
•Evaluation of interventions
•Identifying trends and patterns
•Validation of theories
•Improved quality
Limitations of Data Collection
While data collection has several advantages, it also has some limitations that must be considered.
These limitations include:
•Bias
•Sampling bias.
•Cost
•Limited scope
•Ethical considerations
•Data quality issues
CLASSIFICATION OF DATA
A. Based on nature of variable
• Qualitative Data
• Quantitative Data
• Discrete Data
• Continuous Data
B. Based on sources
• Primary Data
• Secondary Data
C. According to highest level which it fits
• Nominal Data
• Ordinal Data
• Interval Data
• Ratio Data
D. Based on presentation
• Grouped Data
• Ungrouped Data
•Qualitative data -
Classification of data according to qualitative characteristics such as sex, honesty,
intelligence, literacy, colour, religion, marital status etc.
Fig-01
•Quantitative data -
Classification of data according to quantitative characteristics such as age,
weight, height, marks etc.
Fig-02
•Discrete data -
Classification of data which takes exact numerical values (whole numbers).
Eg: Number of children in a family, shoe size
Fig-03
•Continuous data -
Classification of data which takes any numerical value within a certain range.
Eg: Weight of a one-month-old baby girl is given as 3.8 kg, but the exact weight could be
anywhere between 3.2 and 5.4 kg
Fig-04
• Primary data- Data which are directly collected by the researcher/investigator.
• Secondary data- Data which are not directly collected by the researcher/investigator.
• Primary Quantitative Data: Questionnaires, structured interviews
• Secondary Quantitative Data: Official statistics
• Primary Qualitative Data: Participant observation, unstructured interviews
• Secondary Qualitative Data: Letters, articles, newspapers
•Grouped data- Data which are presented in groups. Eg: Age: 20-25 (12 persons), 25-30 (8 persons)…..
•Ungrouped data- Data which are presented individually.
Eg: Age: 28 years, 27 years, 23 years, 25 years, 26 years.....
Another classification, according to the highest level which the data fit:-
→ Nominal - Lowest level; only names are meaningful. For ex- in a classroom a student can be Hindu,
Muslim, Christian, etc., so each student belongs to one category.
→ Ordinal - Adds an order to the names. For ex- post-surgical pain can be classified by its severity: 0
means no pain, 1 means mild pain, 2 means moderate pain, 3 means severe pain.
→ Interval - Adds meaningful differences, but has no true zero. For ex- Knoop hardness number for composites.
→ Ratio - Adds a true zero (starting point) so that ratios are meaningful. For ex- height, weight,
length, etc., where one can say "twice the weight".
Main sources for collection of data
A. Experiments
B. Surveys
C. Records
A. Experiments- Experiments are performed in the laboratories of various branches of medical science, such as physiology, biochemistry,
pharmacology and clinical pathology, or in the hospital ward or in the community.
B. Surveys- Surveys are carried out for epidemiological studies in the field by trained teams.
They are specially applied to generate data needed for specific purposes and comprise primary data. Surveys are used:
• To find the incidence or prevalence of health conditions or diseases in a community, e.g. incidence of malaria,
prevalence of leprosy
• To identify risk factors associated with disease occurrence
• In operational research, such as assessment of the existing conditions of a programme, health service or
facility
• To evaluate new strategies for prevention and control of health problems
Surveys provide useful information, such as:
A). Changing trends in health statistics, morbidity, mortality, health practices etc.
B). Feedback to modify policy and systems and to redefine objectives
C). Timely warning of public health hazards
C. Records- Records are maintained as a routine in registers or books over a long period of time, for various purposes such as vital
statistics (births, marriages and deaths) or illness in hospitals.
Records provide ready-made data for routine and continuous information, which may be used for
research as secondary data.
There are various methods of data collection:-
• Experiments
• Surveys
• Observation method
• Interview method
• Questionnaire method
• Schedule method
• Other methods include warranty cards, pantry audits, distributor audits, consumer panels,
use of mechanical devices, projective techniques, depth interviews and content analysis.
Data can be collected through either primary sources or secondary sources.
Primary Sources- Here the data are obtained by the investigator himself. This is first-hand
information.
1. Observation method- This is the method most frequently used in practice. Observation is said to be a
scientific tool and a means of data collection for the researcher.(9)
Types of Observation Methods
• Structured Observation
• Unstructured Observation
• Controlled Observation
• Uncontrolled Observation
• Participant Observation
• Non-participant Observation
• Disguised Observation
2. Health interview survey- It is an invaluable method of measuring subjective phenomena, such as
perceived morbidity, disability and impairments; opinions, beliefs and attitudes; and some behavioural
characteristics.
• Direct personal investigation
• Indirect oral investigation
• Easy to conduct in urban areas
• Little use in developing countries
•Structured interviews: The questions are predetermined in both topic and order.
•Semi-structured interviews: A few questions are predetermined, but other questions aren’t planned.
•Unstructured interviews: None of the questions are predetermined.
•Focused interview: Attention is focused on a given experience of the respondent.
3. Questionnaire Method- Standard method of data collection in clinical, epidemiological, psychosocial
and demographic research. It is used for measuring subjective phenomena.
WHAT IS A QUESTIONNAIRE?
"A document containing a set of questions logically related to the problem under
study."
◦If the questions are filled in by the respondents, it is called a 'Questionnaire'.
◦If filled in by enumerators, it is called a 'Schedule'.
STRUCTURED QUESTIONNAIRES
Questionnaires in which there are definite, concrete and pre-determined questions. The questions are
presented with exactly the same wording and in the same order to all respondents.
The form of the question may be either closed (i.e., of the type 'yes' or 'no') or open (i.e., inviting free
response).
UNSTRUCTURED QUESTIONNAIRES
The interviewer is provided with a general guide on the type of information to be obtained. Question formulation
is the interviewer's own responsibility, and replies are taken down in the respondent's own words.
The 2 types of scales most commonly used are the Likert and Guttman scales.
LIKERT SCALE: (Summative)
◦Commonly used to quantify attitudes and behaviour.
◦Respondents are asked to select the response that best represents the rank or degree of their answer.
◦Eg: a respondent may be asked to indicate whether he strongly agrees, agrees, neither agrees nor
disagrees, disagrees, or strongly disagrees with the statement.
GUTTMAN SCALE: (Cumulative)
◦Contains a series of statements that express increasing intensity of a characteristic.
◦The respondent is asked to agree or disagree with each statement.
◦The respondent's score is the total number of items with which he agrees or disagrees.
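The scoring of the two scales can be sketched roughly as follows. The 1 to 5 coding of the Likert responses, the choice to count agreements for the Guttman score, and the function names are assumptions made for illustration; they are not taken from the slides.

```python
# Illustrative scoring of Likert (summative) and Guttman (cumulative) scales.
# The 1-5 coding and the function names are assumptions made for this sketch.

LIKERT_CODES = {
    "strongly disagree": 1,
    "disagree": 2,
    "neither": 3,
    "agree": 4,
    "strongly agree": 5,
}

def likert_score(responses):
    """Summative score: add up the numeric codes of all item responses."""
    return sum(LIKERT_CODES[r] for r in responses)

def guttman_score(agreements):
    """Cumulative score: count the statements the respondent agrees with."""
    return sum(1 for agreed in agreements if agreed)

likert = likert_score(["agree", "strongly agree", "neither"])
guttman = guttman_score([True, True, False, False])
print(likert, guttman)
```

In a well-behaved Guttman scale the agreements come first and the disagreements last, because each statement expresses a higher intensity than the one before it.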
QUESTIONNAIRE
ADVANTAGES
◦Simple
◦Economical
◦Standardisation
◦Anonymity
DISADVANTAGES
◦Can be used only when the respondent is educated and cooperative
◦Usually increases the rate of non-response
◦Inflexibility
◦Time consuming
Types of Questions
OPEN ENDED (i.e., inviting free responses)
CLOSED ENDED (i.e., of the type 'yes' or 'no')
HOW TO CONSTRUCT A QUESTIONNAIRE
The researcher should note the following three main aspects of a questionnaire:
General form
Question sequence
Type of questions:
A) Direct Question
B) Indirect Question
C) Open Form Questionnaire
D) Closed Form Questionnaire
E) Dichotomous Questions
F) Multiple Choice Questions (MCQ)
5. SCHEDULE METHOD
◦A schedule is a structured set of questions on a given topic which are asked by the interviewer or
investigator personally.
◦It is like a questionnaire, but is filled in by enumerators who are specially appointed for the purpose.
Questionnaire vs schedule
Questionnaire:
• Generally sent through mail, with no further assistance from the sender
• Cheaper method
• Non-response is high
• Identity of respondent is unknown
• Very slow method
• No personal contact
Schedule:
• Generally filled in by an enumerator or research worker
• Costly, requires field workers
• Non-response is low
• Identity of respondent is known
• Information is collected well in time
• Direct personal contact
Other Methods of Data Collection
•Warranty Cards: They are also called feedback cards. They are usually a postal size card with
some questions along with a request to the consumers to fill and return them.
•Distributor or Store Audit: This is performed by distributors or manufacturers through their
sales representatives, commonly to study seasonal purchasing patterns.
•Pantry Audit: It is applied to estimate consumption of a basket of goods at the consumer level.
•Consumer Panel: It is an extension of the pantry audit in which consumers are approached on a regular basis.
•Use of Mechanical Devices: Eye camera, pupillometric camera, psychogalvanometer, motion
picture camera
SECONDARY SOURCES
Secondary data are data that are already available, i.e., data which have already been
collected and analyzed by someone else. When the researcher utilizes secondary data, he has to look into
the various sources from which he can obtain them.
Published data
◦books, magazines and newspapers
◦reports prepared by research scholars and universities
◦historical documents
Unpublished data
◦diaries, letters, unpublished biographies and autobiographies
1) Published sources
a. Reports and official publications of
i. International bodies such as World Health Organization
ii. Central and state governments such as Census data
iii. Reports of committees and commissions appointed by government
b. Semi-official publications of various local bodies such as municipal corporations.
c. Publications of autonomous and private institutes such as
• Trade and professional bodies.
• Financial and economic journals.
• Annual reports of companies and corporations.
• Publications brought out by various autonomous research institutes and scholars.
2) Unpublished sources:- There are various unpublished data sources, such as records
maintained by various government and private agencies and studies conducted by research
institutions, scholars etc., like dissertations of medical students of a health university.
Factors to be considered before using secondary data
Reliability of data - Who collected the data, when, by which methods, etc.
Suitability of data - The object, scope and nature of the original inquiry should be studied; if
the original study had a different objective, the data are not suitable for the current study.
Adequacy of data - If the level of accuracy is insufficient or the area covered differs, the data are not
adequate for the study.
Selection of proper Method for collection of Data
1. Nature, scope and object of inquiry
2. Availability of funds
3. Time factor
4. Precision required
SOURCES FOR COLLECTION OF DATA:
1) Census:- In India, a census has been taken every 10 years since the first census of 1881. It
is defined as "the total process of collecting, compiling and publishing
demographic, economic and social data pertaining to all persons in a country or
delimited territory at a specified time or times". The last census was held in March
2011. The data on age, sex, income and other basic information obtained in the census
provide a base for planning, action and research in the field of medicine as well as other
sectors.
2) Registration of vital events:- In India, registration of births, deaths and marriages
is mandatory by law. This forms the foundation of health and vital statistics.
3) Sample Registration System (SRS):- It is a dual record system, consisting of
continuous enumeration of births and deaths by an enumerator and an independent
survey every 6 months by an investigator-supervisor. Due to complete coverage of
our country by SRS, we are able to get more reliable information on birth and death
rates, age-specific fertility, mortality rates and infant mortality.
4) Notification of diseases:- It is a valuable source of morbidity data, such as incidence,
prevalence and distribution of certain specified diseases which are notifiable. The diseases to be
notified differ between countries, as well as between states in the same country.
Cholera, plague and yellow fever are internationally notifiable diseases.
5) Hospital Records:- These form a basic and primary source of information about diseases
prevalent in the community, because in India registration of vital events is faulty
and notification of infectious diseases is far from adequate.
A serious limitation of hospital data is that they represent only those individuals who seek
medical care, and we do not know the denominator due to lack of precise boundaries of the
catchment area of a hospital. Still, they give useful information regarding time, place and
person distribution of various diseases.
6) Epidemiological Surveillance:- Special surveillance activities are conducted for
diseases like malaria and AIDS in our country. This provides considerable morbidity and
mortality data for the specific diseases. E.g. Sentinel surveillance data.
7) Surveys:- Population surveys supplement routinely collected statistics. The term "health survey"
is used for surveys relating to any aspect of health: morbidity, mortality, nutritional status etc. When
the main emphasis is on disease in the community, the survey is labelled a "morbidity survey". These
surveys can be conducted for evaluating the health status of a population, for investigating
factors affecting health and disease, or for improving the administration of health services. These
surveys can be cross-sectional or longitudinal; descriptive or analytic or both. Methods used for
data collection in surveys include health interview, health examination, study of health records and
mailed questionnaires. E.g. NFHS data.
8) Research Findings:- In various departments of medical colleges and hospitals, experiments are
performed for investigations and research. Similarly, in biomedical institutions and pharmaceutical
industries a lot of research activity is conducted with specific objectives. These data are useful for
planning and implementation of health activities in general. E.g. Dissertations, research papers.
DATA PRESENTATION
The objective of classification of data is to make the data simple, concise, meaningful and
interesting and helpful in further analysis.
DATA COLLECTED FROM VARIOUS EXPERIMENTS → COMPILATION AND CLASSIFICATION → PRESENTATION
Principles of presentation of data
• Data should be arranged in such a way that it will arouse interest in reader.
• The data should be made sufficiently concise without losing important details.
• The data should be presented in simple form to enable the reader to form quick impressions and to
draw some conclusions, directly or indirectly.
• Should facilitate further statistical analysis.
• It should define the problem and suggest its solution.
METHODS OF PRESENTATION OF DATA:
The main methods of presenting frequencies of a variable or data:-
1. Textual
2. Tabulation
3. Charts and Diagrams
TEXTUAL PRESENTATION OF DATA
In textual presentation, data are described within the text. When the quantity of data is not too large
this form of presentation is more suitable. Look at the following cases:
Case 1
In a bandh call given on 08 September 2005 protesting the hike in prices of petrol and diesel, 5 petrol
pumps were found open and 17 were closed whereas 2 schools were closed and remaining 9 schools
were found open in a town of Bihar.
Case 2
Census of India 2001 reported that Indian population had risen to 102 crore of which only 49 crore were
females against 53 crore males. Seventy-four crore people resided in rural India and only 28 crore lived in
towns or cities. While there were 62 crore non-worker population against 40 crore workers in the entire
country. Urban population had an even higher share of non-workers (19 crore) against workers (9 crore) as
compared to the rural population where there were 31 crore workers out of a 74 crore population...
In both cases the data have been presented only in the text. A serious drawback of this method of
presentation is that one has to go through the complete text for comprehension. But it is
also true that this method often enables one to emphasise certain points of the presentation.
Tabulation :-
It is the first step before the data are used for analysis or interpretation.
In the process of tabulation the following types of classification are encountered.
• Geographical i.e area wise
• Chronological i.e on the basis of time
• Qualitative i.e. according to attribute
• Quantitative i.e. in terms of magnitude
MEANING OF VARIOUS TERMS
• Grouped Frequency Distribution: a frequency distribution in which several values are grouped
into one class.
• Class limits: Separate one class in a grouped frequency distribution from another. The limits
can actually appear in the data, and there are gaps between the upper limit of one class and the lower limit
of the next.
• Class boundaries: Separate one class in a grouped frequency distribution from another. The
boundaries have one more decimal place than the raw data and therefore do not appear in the data.
There is no gap between the upper boundary of one class and the lower boundary of the next class. The
lower class boundary is found by subtracting U/2 from the corresponding lower class limit, and the
upper class boundary is found by adding U/2 to the corresponding upper class limit, where U is the unit
of measurement, i.e. the gap between the upper limit of one class and the lower limit of the next.
• Class width: the difference between the upper and lower class boundaries of any class. It is also
the difference between the lower limits of any two consecutive classes, or the difference between any
two consecutive class marks.
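A small sketch of these definitions, assuming whole-number data (so U = 1) and an example class of 70 to 79; the function name and the example values are illustrative assumptions.

```python
# Class limits -> class boundaries, class width and class mark.
# The class 70-79 and the unit U = 1 are assumed example values.

def class_summary(lower_limit, upper_limit, unit):
    """Return (lower boundary, upper boundary, width, mark) for one class."""
    lower_boundary = lower_limit - unit / 2   # subtract U/2 from the lower limit
    upper_boundary = upper_limit + unit / 2   # add U/2 to the upper limit
    width = upper_boundary - lower_boundary   # difference between the boundaries
    mark = (lower_limit + upper_limit) / 2    # mid point of the class
    return lower_boundary, upper_boundary, width, mark

print(class_summary(70, 79, 1))
```

For the class 70 to 79 this gives boundaries of 69.5 and 79.5, so the next class (80 to 89) starts exactly where this one ends.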
• Class mark (Mid point): the average of the lower and upper class limits, or the average of
the lower and upper class boundaries.
• Cumulative frequency: the number of observations less than/more than or equal to a specific
value.
• Relative frequency (rf): the frequency divided by the total frequency.
• Relative cumulative frequency (rcf): the cumulative frequency divided by the total
frequency.
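These frequency terms can be illustrated with a short sketch. The three class frequencies used here are made-up example values, and the variable names are assumptions of this sketch.

```python
from itertools import accumulate

# Assumed example: frequencies of three consecutive classes.
freqs = [4, 6, 10]
total = sum(freqs)

# Relative frequency: frequency divided by the total frequency.
rf = [f / total for f in freqs]

# "Less than" cumulative frequency: running total from the lowest class.
cf_less = list(accumulate(freqs))

# "More than" cumulative frequency: running total from the highest class.
cf_more = list(accumulate(reversed(freqs)))[::-1]

# Relative cumulative frequency: cumulative frequency divided by the total.
rcf = [c / total for c in cf_less]

print(rf, cf_less, cf_more, rcf)
```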
Classification and tabulation are not two distinct processes; they go together, and
classification is the first step in tabulation.
A) Tabulation :
It is usually the first step in presentation and analysis of data. A table can be simple or
complex depending upon the number of measurements of a single set or multiple sets of
items. Let us take an example to understand tabulation. The numbers of deaths due to neonatal
tetanus in 97 districts of India in one year are given below :-
70, 71, 72, 79, 84, 92,
141, 70, 73, 73, 77, 79,
84, 93, 146, 74, 75, 77,
84, 70, 72, 88, 88, 141,
75, 77, 82, 93, 109, 134,
147, 79, 79, 87, 95, 107,
125, 140, 148, 76, 78, 83,
106, 124, 135, 141, 71, 70,
79, 87, 97, 117, 140, 150,
160, 73, 78, 82, 103, 116,
135, 148, 160, 160, 160, 160,
150, 88, 72, 76, 78, 84,
73, 80, 98, 113, 137, 157,
74, 75, 81, 88, 99, 102,
113, 158, 84, 73, 76, 78,
82, 101, 113, 158, 88, 75,
108
• It is obvious that we can understand very little from the figures. A better way can be to arrange the figures
in ascending or descending order, i.e. from 70 to 160, but still the bulk of the data remains.
• A simpler method of reducing the bulk of the data is the tally mark method. In this method a vertical bar (I) is put
against the concerned number each time it occurs. So if 70 occurs four times we represent it by IIII. For the fifth
observation, instead of a vertical bar we put a cross tally (/) across the first four tallies. Thus we get sets
of five each. This representation of the data is known as a frequency distribution.
• The number of neonatal deaths is called the variable (x), and the number of districts recording that number of deaths is
known as the frequency (f) of the variable. The term 'frequency' is derived from 'how frequently' a value of the variable
occurs.
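The tally mark method can be sketched in a few lines. The data are the 97 district counts listed above; the `tally` helper and its text rendering (`IIII/` for a crossed set of five) are assumptions made for this illustration.

```python
from collections import Counter

# Neonatal tetanus deaths in 97 districts, as listed above.
deaths = [
    70, 71, 72, 79, 84, 92,
    141, 70, 73, 73, 77, 79,
    84, 93, 146, 74, 75, 77,
    84, 70, 72, 88, 88, 141,
    75, 77, 82, 93, 109, 134,
    147, 79, 79, 87, 95, 107,
    125, 140, 148, 76, 78, 83,
    106, 124, 135, 141, 71, 70,
    79, 87, 97, 117, 140, 150,
    160, 73, 78, 82, 103, 116,
    135, 148, 160, 160, 160, 160,
    150, 88, 72, 76, 78, 84,
    73, 80, 98, 113, 137, 157,
    74, 75, 81, 88, 99, 102,
    113, 158, 84, 73, 76, 78,
    82, 101, 113, 158, 88, 75,
    108,
]

def tally(count):
    """Render a count as tally marks; 'IIII/' stands for a crossed set of five."""
    return ("IIII/ " * (count // 5) + "I" * (count % 5)).strip()

frequency = Counter(deaths)        # variable x -> frequency f

for x in sorted(frequency)[:5]:    # first few rows of the frequency distribution
    print(f"{x:>3}  {tally(frequency[x]):<10} {frequency[x]}")
```

`Counter` does the counting that the tally marks do by hand: each distinct value of x becomes a key, and its frequency f is the stored count.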
Fig.05 example
In this example the frequency of 73 neonatal deaths is 5, whereas the frequency of 158 deaths is 3. Though this
method reduces the data to some extent, it still cannot be called the best method.
In such a case, to condense the data further, the observed range of the variable can be divided into a suitable number of class
intervals, and the number of observations in each class is recorded. Such a table (Fig. 06), showing the distribution of
frequencies in the different classes, is called a frequency distribution table, and the manner in which the class
frequencies are distributed over the class intervals is called the grouped frequency distribution of the
variable.
The merits of a frequency distribution table are that,
• It shows at a glance how many individual observations are in
a group, and where the main concentration lies.
• It also shows the range, and the shape of the distribution.
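A grouped frequency distribution for the same 97 district counts might be built as below. The class width of 10 and the starting value of 70 are assumptions made for this sketch; the intervals actually used in Fig. 06 may differ.

```python
from collections import Counter

# Neonatal tetanus deaths in 97 districts, as listed earlier.
deaths = [
    70, 71, 72, 79, 84, 92,
    141, 70, 73, 73, 77, 79,
    84, 93, 146, 74, 75, 77,
    84, 70, 72, 88, 88, 141,
    75, 77, 82, 93, 109, 134,
    147, 79, 79, 87, 95, 107,
    125, 140, 148, 76, 78, 83,
    106, 124, 135, 141, 71, 70,
    79, 87, 97, 117, 140, 150,
    160, 73, 78, 82, 103, 116,
    135, 148, 160, 160, 160, 160,
    150, 88, 72, 76, 78, 84,
    73, 80, 98, 113, 137, 157,
    74, 75, 81, 88, 99, 102,
    113, 158, 84, 73, 76, 78,
    82, 101, 113, 158, 88, 75,
    108,
]

START = 70   # lower limit of the first class (assumed)
WIDTH = 10   # class width (assumed)

# Map every observation to the index of its class interval and count.
grouped = Counter((x - START) // WIDTH for x in deaths)

for k in sorted(grouped):
    lower = START + k * WIDTH
    print(f"{lower}-{lower + WIDTH - 1}: {grouped[k]}")
```

Because every observation falls into exactly one class, the class frequencies add up to the total of 97 districts.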
Fig.06
Rules and guidelines for tabular presentation -
• A number should be assigned to the table (Table No.).
• A title should be given to the table, it should be concise and self explanatory.
• Contents of the table should be defined clearly.
• Subtitles should be properly mentioned with columns and rows
• Class intervals in columns and rows should be neither too narrow nor too wide. They should also
be mutually exclusive and non-overlapping.
• The unit of measurement must be mentioned clearly wherever necessary.
• The number of classes should be neither too large nor too small. There can be 10 to 20 classes. The following formula
can be used to find the approximate number of classes, K:
  K = 1 + 3.322 log10(N), where N is the total frequency.
• Footnotes should be given wherever necessary, providing additional information, source or explanatory notes.
• Any short forms/symbols, if used, should be explained in the footnote.
• No cell should be left blank in the body of the table.
• There should be logical arrangement of data in the table.
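As a worked example of the formula K = 1 + 3.322 log10(N), the sketch below evaluates it for N = 97, the number of districts in the neonatal tetanus example. Rounding up to a whole number of classes is an assumption of this sketch; the formula only gives an approximate guide.

```python
import math

def approx_classes(n):
    """Approximate number of classes K = 1 + 3.322 * log10(N)."""
    return 1 + 3.322 * math.log10(n)

k = approx_classes(97)
print(round(k, 2), math.ceil(k))   # K is about 7.6, i.e. roughly 8 classes
```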
Fig.07. PARTS OF TABLE
Fig.08
Below are given examples of tables.
Fig.09
Fig.10
1. Classification by Space (geographical) :-
• Data are classified by location of occurrence.
• Arranging the categories in alphabetical order of the terms defining them, or in the order of
their geographical location, may be found suitable in many cases.
Fig-11
2. Chronological i.e. On the basis of time :-
• In this case data are classified by time of occurrence of the observations
• Arrangement of categories is almost always in chronological order
Fig-12
3. Classification by attribute :-
• When the data represent observations made on a qualitative characteristic, the classification
is made according to this quality.
• Alphabetical arrangement of categories may be suitable for general purpose table.
• In the case of special purpose table arrangement may be made in the order of importance of these
categories.
Fig-13
4. Classification by the size of observations :-
• When the data represent observations of some characteristic on a numerical scale, classification is
made on the basis of the individual observations.
• The range of observations is suitably divided into smaller divisions called class intervals.
• The numerical scale adopted may be either discrete or continuous.
Fig-14
Advantages of tabular presentation
• It is a convenient and sufficient form for presenting the statistical information.
• It summarises the information and displays important features of it.
• Unnecessary repetitions that may appear in texts are avoided.
• Comparison between localities, age groups etc. can be made easily.
• Errors and omissions in the information can be easily detected.
• Reference to any details of the data is facilitated.
B) Presentation by Graphs and Diagrams:-
After class wise or group wise tabulation, the frequencies of a characteristic can be presented by two
kinds of drawings: Graphs and diagrams.
They may be shown either by lines and dots or by figures.
The drawings are meant for non-statistically-minded people who want to study the relative
values or frequencies of persons or events.
For statistically minded persons, they are for quick eye readings.
Diagrams and graphs are extremely useful because:-
• They are attractive to the eyes.
• Give a bird's-eye view of the entire data.
• Have a lasting impression on the mind of the layman.
• Facilitate comparison of data.
Demerits of Diagrams:
Simplicity vs. Details: Diagrams often prioritize simplicity over details and accuracy.
Loss of Original Data: The simplicity in charts and diagrams may lead to the loss of
crucial details from the original data.
Need for Original Data: In-depth studies may require referring back to the original data.
Guidelines for Graphs, Figures, and Pictures:
Clear Titles: Ensure all graphs, figures, and pictures have clearly stated and informative
titles.
Labeling: Clearly label all classes and keys for better understanding.
Unit of Measurement: Include the appropriate unit of measurement for clarity.
Presentation of quantitative, continuous or measured data is through graphs.
The common graphs in use are:
Histogram
Frequency polygon
Frequency curve
Line chart or graph
Cumulative frequency diagram
Scatter or dot diagram
Bland–Altman plot
Forest plot
Presentation of qualitative, discrete or counted data is through diagrams.
The common diagrams in use are:
Bar diagram
Pie or sector diagram
Venn diagram
Pictogram or picture diagram
Map diagram or spot map.
Histogram
It is a graphical presentation of a frequency distribution, used for presenting any quantitative data.
Variable characters of the different groups are indicated on the horizontal line (X-axis), called the abscissa,
while the frequency, i.e. the number of observations, is marked on the vertical line (Y-axis), called the ordinate.
The frequency of each group or class interval forms a column or rectangle; such a diagram is called a 'histogram'.
It is a bar diagram without gaps between the bars.
The histogram is constructed as follows:
• On the X axis, the size of the observation is marked.
• Starting from 0 the limit of each class interval is marked, the width corresponding to the width of
the class interval in the frequency distribution.
• On the Y axis the frequencies are marked.
• A rectangle is drawn above each class interval with height proportional to the frequency of that
interval.
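The construction steps above can be imitated with a plain-text sketch in which each class interval gets one bar whose length is proportional to its frequency. The bars are drawn sideways here, whereas the slides describe upright rectangles; the class labels and frequencies are assumed example values, and `text_histogram` is a hypothetical helper.

```python
# Text rendering of a histogram: one bar per class interval, with bar
# length proportional to the class frequency (bars are drawn sideways).
# The class labels and frequencies below are assumed example values.

def text_histogram(table, symbol="#"):
    """Return one line per class: label, bar and frequency."""
    lines = []
    for label, freq in table:
        lines.append(f"{label:>7} | {symbol * freq} ({freq})")
    return lines

example = [("70-79", 35), ("80-89", 18), ("90-99", 7), ("100-109", 7)]

for line in text_histogram(example):
    print(line)
```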
Advantages of Histogram:
Easy to understand
Disadvantages of Histogram:
Only one histogram can be drawn on a graph at a time.
More time consuming to construct than a frequency polygon.
Assessing the relationship between two variables
The forms of data presentation that have been
described up to this point illustrated the distribution
of a given variable, whether categorical or numerical.
In addition, it is possible to present the relationship
between two variables of interest, either categorical or
numerical.
The relationship between categorical variables
may be investigated using a contingency table, which
has the purpose of analyzing the association between
two or more variables. The lines of this type of table
usually display the exposure variable (independent
variable), and the columns, the outcome variable
(dependent variable). For example, in order to study
the effect of sun exposure (exposure variable) on the
development of skin cancer (outcome variable), it is
TABLE 3: Weight distribution among 18-year-old males (n = 2,194). Pelotas, Brazil, 2010.[12]
Weight at 18 years of age (in kg)    Absolute frequency (n)    Relative frequency (%)
40.5 to 59.9                         554                       25.25
60.0 to 65.8                         543                       24.75
65.9 to 74.6                         551                       25.11
74.7 to 147.8                        546                       24.89
Total                                2,194                     100.00
FIGURE 4: Weight distribution at 18 years of age among youngsters from the city of Pelotas (n = 2,194). Pelotas, Brazil, 2010. [Figure: X axis, weight at 18 years of age (0 to 140); Y axis, percentage (0 to 40)]
Fig-15
Example:
Fig-16
Frequency polygon:
1. The most commonly used graphic device to illustrate statistical distribution.
2. Used to represent frequency distribution of quantitative data.
3. Useful to compare 2 or more frequency distributions.
• A frequency polygon is a variation of a histogram, in
which the bars are replaced by lines connecting the
midpoints of the tops of the bars.
• Advocates of the frequency polygon argue that the
purpose of a histogram is to show the shape of the
data distribution and removing the bars makes the
shape clearer and smoother.
Fig-17
Construction of frequency polygon:
• The variable is taken along the X axis and frequencies along the Y axis.
• Class frequencies are plotted against the class mid-values, and these points are then joined by
straight lines, which gives the frequency polygon.
• Total area under the frequency polygon represents the total frequency.
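The points of a frequency polygon can be computed as in the sketch below: each class is reduced to its mid-value, which is paired with the class frequency. The class limits and frequencies are assumed example values, and `polygon_points` is a hypothetical helper; joining the points with straight lines is left to the plotting tool.

```python
# Points of a frequency polygon: (class mid-value, class frequency).
# The class limits and frequencies below are assumed example values.

def polygon_points(class_limits, freqs):
    """Pair the mid-value of every class with its frequency."""
    mids = [(lower + upper) / 2 for lower, upper in class_limits]
    return list(zip(mids, freqs))

limits = [(70, 79), (80, 89), (90, 99)]
freqs = [35, 18, 7]

points = polygon_points(limits, freqs)
print(points)
```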
Advantages of frequency polygon:
• It is very easy to construct and very easy to interpret.
• It is useful in portraying more than two distributions on the same graph paper with different
colours. So it is very useful to compare 2 or more than 2 distributions.
Frequency curve:-
When the number of observations is very large and the class intervals are very much reduced, the
frequency polygon tends to lose its angulation and forms a smooth curve known as the frequency
curve.
• The variable is taken along the X axis and frequency along the Y axis.
• Frequencies are plotted against the class mid-values and then, these points are joined by a smooth
curve.
• The curve so obtained is the frequency curve.
• Total area under the frequency curve represents total frequency.
Fig-18
60
Line diagram:
• This diagram is useful to study changes in the value of a variable over time.
• It is the simplest type of diagram.
• Time, such as hours, days, weeks, months or years, is represented on the X axis.
• The value of the quantity being studied is represented along the Y axis.
Fig-19
61
MTPs during 2002 to 2022
Cumulative frequency diagram or Ogive
• An ogive is a graph of the cumulative relative frequency distribution.
• To draw this, an ordinary frequency distribution table for quantitative data has to be converted
into a cumulative frequency table.
• The cumulative frequency of a class interval is the total number of persons from the lowest value of the
characteristic up to the highest value of the class under consideration. It is obtained by adding the
frequencies of the previous classes to the frequency of the class in question.
• Thus the cumulative frequency of each category is the sum of the frequencies of that category and all
preceding categories.
• Cumulative frequencies are plotted against the group limits of the variable.
• These points are joined by a smooth freehand curve to get a cumulative frequency diagram or
ogive.
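The conversion from an ordinary frequency table to a cumulative one can be sketched in Python as below. The class limits and frequencies are an assumption for illustration: they come from grouping the Group II weights of the stem and leaf example into classes of width 10, and each cumulative frequency is paired with the upper limit of its class.

```python
from itertools import accumulate

# Upper class limits and class frequencies (Group II weights, classes 50-59 ... 80-89)
upper_limits = [59, 69, 79, 89]
frequencies = [6, 5, 2, 2]

# Running total: each entry adds the current class to all preceding classes
cumulative = list(accumulate(frequencies))
total = cumulative[-1]

# Points to plot for the ogive: (upper class limit, cumulative frequency)
ogive_points = list(zip(upper_limits, cumulative))

# The same values expressed as cumulative relative frequencies in percent
relative_percent = [round(100 * c / total, 2) for c in cumulative]

print(ogive_points)
print(relative_percent)
```

The last cumulative frequency always equals the total number of observations, which is a quick check on the table.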
62
Example:
Fig-20
63
Distribution of weights in 156 individuals
Fig-21
Scatter diagram or dot diagram:
• It is a graphic presentation of data.
• It is used to show the nature of correlation between 2 variables.
Also called a correlation diagram, it is useful to represent the relationship between two
numeric measurements, each observation being represented by a point corresponding to its value
on each axis.
If the data points make a straight line going from the origin out to high x- and y-values, then the
variables are said to have a positive correlation. If the line goes from a high value on the y-axis
down to a high value on the x-axis, the variables have a negative correlation. If no trend is
shown, the variables are said to have no correlation.[10]
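The direction of the trend seen in a scatter diagram can also be summarised numerically. The sketch below uses the Pearson correlation coefficient, which is a swapped-in numeric summary and not something the slides prescribe; the function names and the data are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)

def direction(r):
    """Describe the sign of a correlation coefficient."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"

# Hypothetical measurements
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # points rise from low x to high x
falling = [10, 8, 6, 4, 2]   # points fall from high y to high x

print(direction(pearson_r(x, rising)))
print(direction(pearson_r(x, falling)))
```

A coefficient close to zero suggests little linear trend, but the scatter diagram should still be inspected, since the coefficient only measures straight-line association.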
Fig-22
64
BLAND–ALTMAN PLOT
A Bland–Altman plot (difference plot) is a method of data plotting used in analyzing the agreement
between two different assays. In the Bland–Altman plot, the differences between the two methods
are plotted against the averages of the two methods. Alternatively, the differences can be plotted
against one of the two methods, if that method is a reference method.

Patient Nr. | Potassium level (mEq/L), venous blood gas analysis | Potassium level (mEq/L), blood electrolyte levels | Mean potassium level (mEq/L) | Difference between potassium levels (mEq/L)
1  | 4.5 | 4.7 | 4.6  | 0.2
2  | 3.8 | 4.2 | 4.0  | 0.4
3  | 5.1 | 5.1 | 5.1  | 0.0
4  | 4.9 | 5.3 | 5.1  | 0.4
5  | 3.9 | 4.0 | 3.95 | 0.1
6  | 4.0 | 3.8 | 3.9  | -0.2
7  | 4.1 | 4.0 | 4.05 | -0.1
8  | 4.3 | 4.0 | 4.15 | -0.3
9  | 5.3 | 5.3 | 5.3  | 0.0
10 | 5.2 | 5.1 | 5.15 | -0.1
11 | 3.9 | 4.0 | 3.95 | 0.1
12 | 4.1 | 4.4 | 4.25 | 0.3
13 | 4.0 | 4.2 | 4.1  | 0.2
14 | 5.3 | 5.1 | 5.2  | -0.2
15 | 5.5 | 5.3 | 5.4  | -0.2
16 | 4.4 | 4.2 | 4.3  | -0.2
17 | 4.9 | 5.0 | 4.95 | 0.1
18 | 3.7 | 3.9 | 3.8  | 0.2
19 | 3.9 | 3.7 | 3.8  | -0.2
20 | 4.8 | 4.7 | 4.75 | -0.1
21 | 5.5 | 5.2 | 5.35 | -0.3
22 | 3.7 | 3.8 | 3.75 | 0.1
23 | 3.7 | 3.9 | 3.8  | 0.2
24 | 4.8 | 4.2 | 4.5  | -0.6
25 | 5.1 | 5.6 | 5.35 | 0.5
Dataset for potassium levels in venous blood gases and blood electrolyte work-up.
65
For our dataset, the mean difference (mean bias) was found to be 0.012 with an SD of 0.260. A scatterplot
should be drawn to show the dispersion of the data, with the average on the X-axis and the difference on the Y-axis. The
limits of agreement (LOA) can be drawn manually if the statistical software does not display them automatically. In our
data set, the upper limit can be calculated as mean + 1.96 × SD (0.012 + 1.96 × 0.260 = 0.522) and the
lower limit as mean − 1.96 × SD (0.012 − 1.96 × 0.260 = −0.498). An appropriate
statement for the manuscript is the following: The Bland-Altman plot showed the mean bias ± SD
between the first and second potassium levels as 0.012 ± 0.260 mEq/L, and the limits of agreement were
−0.498 and 0.522.[13]
Fig-23
Agreement between two potassium measurements (Bland-Altman plot).
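The mean bias, SD and limits of agreement quoted above can be recomputed from the 25 pairs of potassium values in the dataset. The following Python sketch is one way to do it, assuming the difference is taken as the blood electrolyte value minus the venous blood gas value (which matches the signs in the table) and that the SD is the sample standard deviation; the function name bland_altman is hypothetical.

```python
import statistics

# Potassium levels (mEq/L) for patients 1 to 25, copied from the dataset
venous = [4.5, 3.8, 5.1, 4.9, 3.9, 4.0, 4.1, 4.3, 5.3, 5.2, 3.9, 4.1, 4.0,
          5.3, 5.5, 4.4, 4.9, 3.7, 3.9, 4.8, 5.5, 3.7, 3.7, 4.8, 5.1]
electrolyte = [4.7, 4.2, 5.1, 5.3, 4.0, 3.8, 4.0, 4.0, 5.3, 5.1, 4.0, 4.4, 4.2,
               5.1, 5.3, 4.2, 5.0, 3.9, 3.7, 4.7, 5.2, 3.8, 3.9, 4.2, 5.6]

def bland_altman(first, second):
    """Return per-patient means and differences, the mean bias, its SD and the 95% limits of agreement."""
    diffs = [b - a for a, b in zip(first, second)]        # Y-axis values
    means = [(a + b) / 2 for a, b in zip(first, second)]  # X-axis values
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)                          # sample SD (n - 1 denominator)
    lower = bias - 1.96 * sd
    upper = bias + 1.96 * sd
    return means, diffs, bias, sd, lower, upper

means, diffs, bias, sd, lower, upper = bland_altman(venous, electrolyte)
print(f"bias={bias:.3f} SD={sd:.3f} LOA=({lower:.3f}, {upper:.3f})")
```

Plotting means on the X-axis against diffs on the Y-axis, with horizontal lines at bias, lower and upper, gives the Bland-Altman plot.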
66
FOREST PLOT
A forest plot, also known as a blobbogram, is a graphical display of estimated results from a
number of scientific studies addressing the same question, along with the overall results. It is a
graphical representation of a meta-analysis. It is usually accompanied by a table listing references
(author and date) of the studies with their estimated result included in the meta-analysis.[10]
Fig-24
67
Dot plot with error bars (y-axis: factors f1 to f5; x-axis: odds ratio (95% CI), 0.0 to 2.5)
An example of a dot plot with an error bar. For each level
of factors (y-axis), corresponding odds ratio (OR) and 95% CIs are
presented using dots and accompanying horizontal error bar. The
dotted line indicates the reference value of 1. The estimated OR
would not be different from 1.0 statistically if its error bar crossed this
reference line.
Table 6. Estimated OR and 95% CI of Logistic Regression Model
Factor   OR (95% CI)          P value
F1       1.24 (1.12, 1.38)*   < 0.001
F2       1.76 (1.26, 2.51)*   0.001
F3       1.10 (0.80, 1.50)    0.557
F4       1.00 (0.98, 1.02)    0.810
F5       1.09 (0.99, 1.20)    0.083
OR: odds ratio. *Two-sided P < 0.05.
Survival analysis
Survival analysis is a statistical method that can be applied to
mortality data and various types of longitudinal data. There are
various methods, from the nonparametric Kaplan-Meier method
to more complex methods involving different parametric models.
Kaplan-Meier survival analysis and Cox regression models are
widely used in the medical field. Survival analysis results usually
accompany the survival curve, which can increase the reader's
understanding of the results through visualization.
Estimated OR and 95% CI of Logistic Regression Model
Fig-25
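The reading rule for the dot plot, that an OR is not statistically different from 1.0 when its error bar crosses the reference line, can be checked directly against the values in Table 6. The sketch below is illustrative only; the function name ci_excludes_one is hypothetical.

```python
# Odds ratio with lower and upper 95% CI limits, copied from Table 6
table6 = {
    "F1": (1.24, 1.12, 1.38),
    "F2": (1.76, 1.26, 2.51),
    "F3": (1.10, 0.80, 1.50),
    "F4": (1.00, 0.98, 1.02),
    "F5": (1.09, 0.99, 1.20),
}

def ci_excludes_one(lower, upper, reference=1.0):
    """True when the interval lies entirely on one side of the reference value."""
    return lower > reference or upper < reference

significant = [factor for factor, (_, lower, upper) in table6.items()
               if ci_excludes_one(lower, upper)]
print(significant)
```

Only F1 and F2 have intervals that stay clear of 1, which agrees with the factors marked with an asterisk in the table.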
68
Bar diagram
1. This diagram is used to represent qualitative data.
2. It represents only one variable.
3. The width of the bars remains the same and only the length varies according to the frequency in
each category.
There are 3 types of bar diagram:
• Simple bar
• Multiple bar or compound bar
• Component bar, also called proportional bar or stacked bar
69
Simple bar:
The limitation of this method is that it can represent only one classification and hence cannot be
used for comparison.
Fig-26
70
Mortality due to various cases
Fig-27 Cases of gastroenteritis in a hospital in 2022
Multiple bar or compound bar:
Here two or more bars are grouped together. In Fig-28, the population of a country is shown with three
bars, showing the populations of Hindus, Muslims and others over two censuses. Fig-29 shows the
sex-wise and standard-wise distribution of students passing from a school.
Fig-28 Population of a country as per the religion
Fig-29 % of students passing in school
71
Component bar diagram:
• This diagram is used to represent qualitative data.
• It is used when both the number of cases in the major groups and the number in their subgroups are to be
represented simultaneously.
Fig-30
72
Expenditures on various items in two communities
Fig-31 Proportion of energy obtained from various food stuffs
by rich and poor community
Pie diagram:
• These are popularly used to show percentage breakdowns for qualitative data.
• It is so called because the entire graph looks like a pie and its components represent slices cut
from a pie.
• A circle is divided into different sectors corresponding to the frequencies of the distribution.
• Some knowledge of circles and degrees is necessary.
• The total angle at the center of the circle is 360 degrees and
it represents the total frequency.
• After the angles are calculated, the sectors are drawn in the
circle and shaded with different shades or
colors, and an index is provided for the shades or colors.
• It cannot be used to represent 2 or more data sets.
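Since 360 degrees represent the total frequency, the angle of each sector is its frequency divided by the total, multiplied by 360. A minimal Python sketch follows; the function name sector_angles and the expenditure percentages are hypothetical and serve only to illustrate the arithmetic.

```python
def sector_angles(frequencies):
    """Return the central angle in degrees for each category of a pie diagram."""
    total = sum(frequencies.values())
    return {label: 360 * value / total for label, value in frequencies.items()}

# Hypothetical pattern of expenditure (percent of total spending)
expenditure = {"Food": 50, "Housing": 25, "Clothing": 15, "Other": 10}

angles = sector_angles(expenditure)
for label, angle in angles.items():
    print(f"{label:10s} {angle:6.1f} degrees")
```

The angles always add up to 360 degrees, which is a useful check before drawing the sectors.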
73
Fig-32 pattern of expenditure in an urban
community
Segment labels in the figure: RMW, blue wrap, clear wrap, plastics, cardboard. Weights shown in the figure: 29,344 g, 2,102 g, 2,838 g, 2,388 g, 1,564 g.
Pie chart. Total weight of each component from the three operations. RMW: regulated medical waste (Adapted from Korean J
Anesthesiol 2017; 70: 100-4).[11]
74
Fig-33
75
Venn Diagram
• It shows the degrees of overlap and exclusivity for two or more characteristics or factors
within a sample or population (in which case each characteristic is represented by a
whole circle) or for a characteristic or factor among two or more samples or populations
(in which case each sample or population is represented by a whole circle).
• The sizes of the circles (or other symbols) need not be equal and may represent the
relative size for each factor or population.
Fig -34 No of covid cases as per reporting agency
Pictogram
• Display of data through pictograms was initiated by Dr Otto Neurath in 1923.
• Data are displayed by the pictures of the items to which the data pertain.
• A single picture represents a fixed number of units.
• They are the least satisfactory type of diagram.
• They are also inaccurate.
Fig-35
76
Map diagram or spot map or cartograms:
1. These maps are used to show the geographical distribution of frequencies of a characteristic such as the
infant mortality rate (IMR), maternal mortality ratio (MMR), etc.
Estimated Infant Mortality Rate-2015
Fig-29
77
Other types of presentation of data
STEM AND LEAF
• It is mainly used for the presentation of quantitative data.
• It is used to study the shape of the distribution.
• It can be used to compare two or more distributions.
• It is useful for smaller data sets.
• Each value is split into two parts, a stem (the leading digit or digits) and a leaf (the last digit).
Consider this example of two groups of patients with hypertension having weights as given below:
Group I: 50, 51, 60, 62, 63, 65, 68, 74, 78, 82, 83, 84, 85
Group II: 51, 52, 53, 54, 56, 58, 61, 63, 65, 67, 68, 71, 72, 80, 85
We can present this in tabular form as below:
78
Fig-30
Class intervals are represented by stems. For group I, the class intervals 50 to 59, 60 to 69, 70 to 79 and 80 to 89
are represented by stems 5, 6, 7 and 8 respectively. The weights 51 and 68 are therefore represented by leaf 1 on
stem 5 and leaf 8 on stem 6 respectively.
The stem and leaf plot for group I data can be shown as below:
The stem and leaf plot for group I and group II data can be shown as below:
79
Fig-31
Fig-32
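The stem and leaf construction described above can also be produced programmatically. The sketch below is a minimal Python illustration using the Group I and Group II weights given earlier; the function name stem_and_leaf is hypothetical, and it assumes two-digit values so that the tens digit is the stem and the units digit is the leaf.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group each value's last digit (leaf) under its leading digit(s) (stem)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

group_1 = [50, 51, 60, 62, 63, 65, 68, 74, 78, 82, 83, 84, 85]
group_2 = [51, 52, 53, 54, 56, 58, 61, 63, 65, 67, 68, 71, 72, 80, 85]

for name, group in (("Group I", group_1), ("Group II", group_2)):
    print(name)
    for stem, leaves in stem_and_leaf(group).items():
        print(f"  {stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

The length of each row of leaves shows the frequency of its class interval, so the printed plot doubles as a sideways histogram.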
Box and whisker plot :
It is a representation of the quartiles (25%, 50% and 75%) and the range of a continuous and ordered
data set. The y-axis can be arithmetic or logarithmic. Box plots can be used to compare different
distributions of data values.
Steps for drawing box and whisker plots:
• Determine from the given data set the smallest and largest values and Q1, Q2 and Q3, i.e. the first, second and third
quartiles respectively.
• Mark the scale on the X or Y axis.
• Draw a box (that is, a rectangle of any convenient width and of length Q3 − Q1) with ends
through the points for the first and third quartiles.
• Draw a vertical line through the box at the median point (Q2).
• Draw the whiskers (lines) from each end of the box to the smallest and largest values.
80
Box plots characterize a sample using the minimum, the 25th, 50th and 75th percentiles, and the maximum
values. The interquartile range (IQR = Q3 − Q1, where Q1 is the first quartile or 25th percentile and
Q3 is the third quartile or 75th percentile) covers the central 50% of the data. Quartiles are
insensitive to outliers and preserve information about the center and spread (variation). If a data point
is below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR, it is viewed as being too far from the central values
(median); such points are called outliers.
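The quantities needed for a box and whisker plot, and the 1.5 × IQR rule for outliers, can be computed as in the Python sketch below. The sample data and the function name box_plot_summary are hypothetical. Quartiles can be defined in several ways; this sketch assumes the "inclusive" method of the standard library, so other software may report slightly different Q1 and Q3 for the same data.

```python
import statistics

def box_plot_summary(values):
    """Five-number summary, IQR and outliers by the 1.5 x IQR rule."""
    data = sorted(values)
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "min": data[0], "q1": q1, "median": q2, "q3": q3, "max": data[-1],
        "iqr": iqr, "outliers": outliers,
    }

# Hypothetical sample with one extreme value
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
summary = box_plot_summary(sample)
print(summary)
```

Here the fences are Q1 − 1.5 × IQR = −3.5 and Q3 + 1.5 × IQR = 14.5, so only the value 50 is flagged as an outlier.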
An example of a box-whisker plot. Estimated median (Q1, Q3)
[min:max] from the sample data is 1.1 (0.8, 1.3) [0.1:2.1]. This graph
includes explanations of the components of the box-whisker plot.
These are not necessary for the general purpose of publication. A
significance marker can be added, though it was not used in this
graph. If a significance marker is added, it should be located on the
shoulder or alongside the whisker. If markers are located over the
mid-top of the whiskers, these could be interpreted as outliers if no
detailed explanation is provided. The limits of the whiskers can be
varied depending on the purpose.
Fig-33
81
Fig-33 Box & whisker plot showing the distribution of height of boys in two classes A & B
82
Types of Charts Depending on the Method of Analysis of the Data
Analysis     | Subgroup           | Number of variables                      | Type
Comparison   | Among items        | Two per item                             | Variable width column chart
Comparison   | Among items        | One per item                             | Bar/column chart
Comparison   | Over time          | Many periods                             | Circular area/line chart
Comparison   | Over time          | Few periods                              | Column/line chart
Relationship | -                  | Two                                      | Scatter chart
Relationship | -                  | Three                                    | Bubble chart
Distribution | -                  | Single                                   | Column/line histogram
Distribution | -                  | Two                                      | Scatter chart
Distribution | -                  | Three                                    | Three-dimensional area chart
Composition  | Changing over time | Only relative differences matter         | Stacked 100% column chart
Composition  | Changing over time | Relative and absolute differences matter | Stacked column chart
Composition  | Static             | Simple share of total                    | Pie chart
Composition  | Static             | Accumulation                             | Waterfall chart
Composition  | Static             | Components of components                 | Stacked 100% column chart with subcomponents
83
CONCLUSION
In conclusion, we have covered the basics of data collection, from defining data types to
exploring measurement scales. We discussed and outlined various sources for data
collection. Text, tables, and graphs are effective communication media that present and
convey data and information. They aid readers in understanding the content of research,
sustain their interest, and effectively present large quantities of complex information. As
journal editors and reviewers will scan through these presentations before reading the
entire text, their importance cannot be disregarded. For this reason, authors must pay as
close attention to selecting appropriate methods of data presentation as when they were
collecting data of good quality and analyzing them. In addition, having a well-established
understanding of different methods of data presentation and their appropriate
use will enable one to develop the ability to recognize and interpret inappropriately
presented data or data presented in such a way that it deceives readers' eyes.
84
REFERENCES
1. Kim JS, Dailey RJ. Biostatistics for Oral Healthcare. Blackwell Publishing Company; 2008.
2. Kothari CR. Research Methodology: Methods and Techniques. 4th ed. New Age International Private Ltd Publishers; 2019 (reprint 2021).
3. Khanal AB. Mahajan's Methods in Biostatistics for Medical Students and Research Workers. 9th ed. New Delhi, India: Jaypee Brothers Medical; 2015.
4. Dixit JV. Principles and Practice of Biostatistics. 8th ed. Bhanot.
5. Rao TB. Methods of Biostatistics. 3rd ed. Hyderabad: Paras Medical Publisher; 2010.
6. Marya CM. A Textbook of Public Health Dentistry. 1st ed. New Delhi: Jaypee Brothers Medical Publishers; 2011.
7. Mazhar SA, Anjum R, Anwar AI, Khan AA. Methods of Data Collection: A Fundamental Tool of Research. J Integ Comm Health. 2021;10(1):6-10.
8. Researchgate.net. [cited 2023 Dec 18]. Available from: https://www.researchgate.net/publication/325846997_METHODS_OF_DATA_COLLECTIONenrichId=rgreqf6733eb7ba5b1666d4b32342979ead09XXX&enrichSource=Y292ZXJQYWdlOzMyNTg0Njk5NztBUzo2NDE0NjI5MDc3MTU1ODVAMTUyOTk0ODA4MzU4Ng%3D%3D&el=1_x_2&_esc=publicationCoverPdf
9. Bhandari P. Data collection [Internet]. Scribbr. 2020 [cited 2023 Dec 19]. Available from: https://www.scribbr.com/methodology/data-collection/
10. Mishra P, Pandey CM, Singh U, Gupta A. Scales of measurement and presentation of statistical data. Ann Card Anaesth. 2018;21:419-22.
11. Shinn HK, Hwang Y, Kim BG, Yang C, Na W, Song JH, et al. Segregation for reduction of regulated medical waste in the operating room: a case report. Korean J Anesthesiol. 2017;70:100-4.
12. Duquia RP, Bastos JL, Bonamigo RR, González-Chica DA, Martínez-Mesa J. Presenting data in tables and charts. An Bras Dermatol. 2014;89(2):280-5.
13. Doğan NÖ. Bland-Altman analysis: A paradigm to understand correlation and agreement. Turkish Journal of Emergency Medicine. 2018;18(4):139-141.
87
THANK YOU
88

 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptxHistory Class XII Ch. 3 Kinship, Caste and Class (1).pptx
History Class XII Ch. 3 Kinship, Caste and Class (1).pptx
 

DATA COLLECTION AND PRESENTATION IN PUBLIC HEALTH DENTISTRY

  • 1. DATA COLLECTION AND PRESENTATION BY DR. POONAM NARANG P.G 1ST YEAR DEPT. OF PUBLIC HEALTH DENTISTRY DATA PRESENTATION
  • 2. CONTENTS • OBJECTIVES • INTRODUCTION AND DEFINITION • CLASSIFICATION OF DATA • METHODS OF COLLECTION OF DATA • DATA PRESENTATION • CONCLUSION • REFERENCES 02
  • 3. LEARNING OBJECTIVES 1. To know about data. 2. To enumerate various types of data 3. To know about scales of measurement 4. To enumerate the methods for collection of data 5. To know about various methods of data presentation 03
  • 4. INTRODUCTION Data, the plural of datum, are facts expressed in numerical terms. In statistical language, a datum is also called a variable, as it is a character, characteristic or quality that varies. Data do not convey any meaning by themselves. Hence, they have to be worked upon using a set of statistical tools to convert them into meaningful information. DATA → STATISTICS → INFORMATION 04
  • 5. Characteristics of Data Collection Several characteristics help to ensure the quality and accuracy of the data gathered. These include: •Validity •Reliability •Objectivity •Precision •Timeliness •Ethical considerations 05
  • 6. ADVANTAGES OF DATA •Better decision-making •Improved understanding •Evaluation of interventions •Identifying trends and patterns •Validation of theories •Improved quality 06
  • 7. Limitations of Data Collection While data collection has several advantages, it also has some limitations that must be considered. These limitations include: •Bias •Sampling bias •Cost •Limited scope •Ethical considerations •Data quality issues 07
  • 8. CLASSIFICATION OF DATA A. Based on nature of variable • Qualitative Data • Quantitative Data • Discrete Data • Continuous Data B. Based on sources • Primary Data • Secondary Data C. According to the highest level which it fits • Nominal Data • Ordinal Data • Interval Data • Ratio Data D. Based on presentation • Grouped Data • Ungrouped Data 08
  • 9. •Qualitative data - Classification of data according to qualitative characteristics such as sex, honesty, intelligence, literacy, colour, religion, marital status, etc. Fig-01 09
  • 10. •Quantitative data - Classification of data according to quantitative characteristics such as age, weight, height, marks etc. 10 Fig-02
  • 11. •Discrete data - Classification of data which take exact numerical values (whole numbers). E.g., number of children in a family, shoe size. 11 Fig-03
  • 12. •Continuous data - Classification of data which can take any numerical value within a certain range. E.g., the weight of a one-month-old baby girl is given as 3.8 kg, but the exact weight could be between 3.2 and 5.4 kg. 12 Fig-04
  • 13. • Primary data - Data which are directly collected by the researcher/investigator. • Secondary data - Data which are not directly collected by the researcher/investigator. • Primary Quantitative Data: questionnaires, structured interviews • Secondary Quantitative Data: official statistics • Primary Qualitative Data: participant observation, unstructured interviews • Secondary Qualitative Data: letters, articles, newspapers 13
  • 14. •Grouped data - Data which are presented in groups. E.g., Age: 20-25 (12 persons), 25-30 (8 persons)... •Ungrouped data - Data which are presented individually. E.g., Age: 28 years, 27 years, 23 years, 25 years, 26 years... Another classification, according to the highest level which the data fit: → Nominal - Lowest level; only names are meaningful. For example, a student in a classroom can be Hindu, Muslim, Christian, etc., so each student belongs to one category. → Ordinal - Adds an order to the names. For example, post-surgical pain can be classified by its severity: 0 means no pain, 1 means mild pain, 2 means moderate pain, 3 means severe pain. → Interval - Adds meaningful differences but has no true zero. For example, Knoop hardness number for composites. → Ratio - Adds a true zero or starting point, so that ratios are meaningful. For example, height, weight, length, etc.; one can say 'twice the weight'. 14
  • 15. Main sources for collection of data A. Experiments B. Surveys C. Records A. Experiments - Experiments are performed in the laboratories of various branches of medical sciences, such as physiology, biochemistry, pharmacology and clinical pathology, or in the hospital ward or in the community. B. Surveys - Surveys are carried out for epidemiological studies in the field by trained teams. They are specially applied to generate data needed for specific purposes and comprise primary data. Records provide ready-made data for routine and continuous information, which may be used for research as secondary data. Surveys are used: • To find the incidence or prevalence of health or disease states in a community, e.g., incidence of malaria, prevalence of leprosy • To identify risk factors associated with disease occurrence • In operational research, such as assessment of existing conditions of a programme, health service or facility • To evaluate new strategies for prevention and control of health problems 15
  • 16. Surveys provide useful information such as: A) Changing trends in health statistics: morbidity, mortality, health practices, etc. B) Feedback to modify policy and the system, and to redefine objectives C) Timely warning of public health hazards. C. Records - Records are maintained as a routine in registers or books over a long period of time, for various purposes such as vital statistics (births, marriages and deaths) or for illness in hospitals. There are various methods of data collection: Experiments, Surveys, Observation method, Interview method, Questionnaire method, Schedule method • Other methods include warranty cards, pantry audits, distributor audits, consumer panels, use of mechanical devices, projective techniques, depth interviews and content analysis. 16
  • 17. Data can be collected through either primary or secondary sources. Primary Sources - Here the data are obtained by the investigator himself. This is first-hand information. 1. Observation method - This is the method most frequently used in practice. Observation is said to be a scientific tool and a means of data collection for the researcher.(9) Types of Observation Methods • Structured Observation • Unstructured Observation • Controlled Observation • Uncontrolled Observation • Participant Observation • Non-participant Observation • Disguised Observation 17
  • 18. 2. Health interview survey - It is an invaluable method of measuring subjective phenomena, such as perceived morbidity, disability and impairments; opinions, beliefs and attributes; and some behavioural characteristics. • Direct personal investigation • Indirect oral investigation • Easy to conduct in urban areas • Of little use in developing countries 18
  • 19. •Structured interviews: The questions are predetermined in both topic and order. •Semi-structured interviews: A few questions are predetermined, but other questions are not planned. •Unstructured interviews: None of the questions are predetermined. •Focussed interview: Focuses attention on a given experience of the respondent. 3. Questionnaire Method - Standard method of data collection in clinical, epidemiological, psychosocial and demographic research. It is used for measuring subjective phenomena. 19
  • 20. WHAT IS A QUESTIONNAIRE? "A document containing set of questions logically related to the problem under study." ◦If the questions are filled in by respondents, it is called a 'Questionnaire'. ◦If filled in by enumerators, it is called a 'Schedule'. STRUCTURED QUESTIONNAIRES Questionnaires in which there are definite, concrete and pre-determined questions. The questions are presented with exactly the same wording and in the same order to all respondents. The form of the question may be either closed (i.e., of the type 'yes' or 'no') or open (i.e., inviting free response). UNSTRUCTURED QUESTIONNAIRES The interviewer is provided with a general guide on the type of information to be obtained. Question formulation is his own responsibility, and replies are taken down in the respondent's own words. 20
  • 21. The 2 types of scales most commonly used are the Likert and Guttman scales. LIKERT SCALE: (Summative) ◦Commonly used to quantify attitudes and behaviour. ◦Respondents are asked to select a response that best represents the rank or degree of their answer. ◦E.g., a respondent may be asked to indicate whether he strongly agrees, agrees, neither agrees nor disagrees, disagrees, or strongly disagrees with the statement. GUTTMAN SCALE: (Cumulative) ◦Contains a series of statements that express increasing intensity of a characteristic. ◦The respondent is asked to agree or disagree with each statement. ◦The respondent's score is the total number of items with which he agrees or disagrees. 21
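Because a Likert scale is summative, a respondent's score can be obtained by adding up the coded responses. The sketch below assumes a hypothetical 1-to-5 coding (strongly disagree = 1 up to strongly agree = 5); the slides do not prescribe numeric codes, and the answers shown are invented for illustration.

```python
# Hypothetical numeric coding for a five-point Likert item
# (an assumption for illustration, not taken from the slides).
LIKERT_CODES = {
    "strongly disagree": 1,
    "disagree": 2,
    "neither": 3,
    "agree": 4,
    "strongly agree": 5,
}


def likert_score(responses):
    """Return the summative score: the sum of the coded responses."""
    return sum(LIKERT_CODES[r.lower()] for r in responses)


# Invented answers of one respondent to four statements.
answers = ["agree", "strongly agree", "neither", "disagree"]
total = likert_score(answers)
print(total)
```

Negatively worded statements would normally be reverse-coded before summing; that step is left out here to keep the sketch short.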
  • 22. ADVANTAGES ◦Simple ◦Economical ◦Standardisation ◦Anonymity DISADVANTAGES ◦Used only when respondent is educated and cooperating ◦Usually increases rate of non-responses ◦Inflexibility ◦Time consuming Types of Questions OPEN ENDED (i.e., inviting free responses) CLOSED ENDED (i.e., of the type ‘yes’ or ‘no’) QUESTIONNAIRE 22
  • 23. HOW TO CONSTRUCT A QUESTIONNAIRE The researcher should note the following with regard to the three main aspects of a questionnaire: General form, Question sequence. Determine the type of questions: A) Direct Question B) Indirect Question C) Open Form Questionnaire D) Closed Form Questionnaire E) Dichotomous Questions F) Multiple Choice Questions (MCQ) 23
  • 24. 5. SCHEDULE METHOD ◦A schedule is a structured set of questions on a given topic which are asked by the interviewer or investigator personally. ◦It is like a questionnaire but is filled in by enumerators who are specially appointed for the purpose. Questionnaire vs schedule - Questionnaire: • Generally sent through mail, with no further assistance from the sender • Cheaper method • Non-response is high • Identity of respondent is unknown • Very slow method • No personal contact. Schedule: • Generally filled in by an enumerator or research worker • Costly, requires field workers • Non-response is low • Identity of respondent is known • Information is collected well in time • Direct personal contact 24
  • 25. Other Methods of Data Collection •Warranty Cards: They are also called feedback cards. They are usually postal-size cards with some questions, along with a request to the consumers to fill them in and return them. •Distributor or Store Audit: This is performed by distributors or manufacturers through their sales representatives, commonly to assess seasonal purchasing patterns. •Pantry Audit: It is applied to estimate consumption of a basket of goods at the consumer level. •Consumer Panel: It is an extension of the pantry audit. The panel is approached on a regular basis. •Use of Mechanical Devices: Eye camera, pupillometric camera, psychogalvanometer, motion picture camera 25
  • 26. SECONDARY SOURCES Secondary data are data that are already available, i.e., data which have already been collected and analyzed by someone else. A researcher who uses secondary data has to look into the various sources from which they can be obtained. Published data ◦books, magazines and newspapers ◦reports prepared by research scholars, universities ◦historical documents Unpublished data ◦diaries, letters, unpublished biographies and autobiographies 26
  • 27. 1) Published sources a. Reports and official publications of i. International bodies such as the World Health Organization ii. Central and state governments, such as Census data iii. Reports of committees and commissions appointed by government b. Semi-official publications of various local bodies such as municipal corporations. c. Publications of autonomous and private institutes such as • Trade and professional bodies. • Financial and economic journals. • Annual reports of companies and corporations. • Publications brought out by various autonomous research institutes and scholars. 2) Unpublished sources: There are various unpublished data sources, such as records maintained by various government and private agencies and studies conducted by research institutions, scholars, etc., like dissertations of medical students of a health university. 27
  • 28. Factors to be considered before using secondary data Reliability of data - Who collected the data, when, by which methods, etc. Suitability of data - The object, scope and nature of the original inquiry should be studied; if the original study had a different objective, the data are not suitable for the current study. Adequacy of data - If the level of accuracy is inadequate or the area covered differs, the data are not adequate for the study. 28
  • 29. Selection of proper method for collection of data 1. Nature, scope and object of inquiry 2. Availability of funds 3. Time factor 4. Precision required 29
  • 30. SOURCES FOR COLLECTION OF DATA: 1) Census: In India, a census has been taken every 10 years since the first census of 1881. It is defined as "the total process of collecting, compiling and publishing demographic, economic and social data pertaining to all persons in a country or delimited territory at a specified time or times". The last census was held in March 2011. The data on age, sex, income and other basic information obtained in the census provide a base for planning, action and research in the field of medicine as well as other sectors. 2) Registration of vital events: In India, registration of births, deaths and marriages is mandatory by law. This forms the foundation of health and vital statistics. 3) Sample Registration System (SRS): It is a dual record system, consisting of continuous enumeration of births and deaths by an enumerator and an independent survey every 6 months by an investigator-supervisor. Due to complete coverage of our country by SRS, we are able to get more reliable information on birth and death rates, age-specific fertility, mortality rates and infant mortality. 30
  • 31. 4) Notification of diseases: It is a valuable source of morbidity data such as incidence, prevalence and distribution of certain specified diseases which are notifiable. Diseases to be notified differ between countries as well as between states in the same country. Cholera, plague and yellow fever are internationally notifiable diseases. 5) Hospital Records: These form a basic and primary source of information about diseases prevalent in the community, because in India registration of vital events is faulty and notification of infectious diseases is far from adequate. A serious limitation of hospital data is that they represent only those individuals who seek medical care, and we do not know the denominator due to the lack of precise boundaries of the catchment area of a hospital. Still, they give useful information regarding the time, place and person distribution of various diseases. 6) Epidemiological Surveillance: Special surveillance activities are conducted for diseases like malaria and AIDS in our country. This provides considerable morbidity and mortality data for the specific diseases. E.g., sentinel surveillance data. 31
  • 32. 7) Surveys: Population surveys supplement routinely collected statistics. The term "health survey" is used for surveys relating to any aspect of health: morbidity, mortality, nutritional status, etc. When the main emphasis is on disease in the community, the survey is labelled a "morbidity survey". These surveys can be conducted for evaluating the health status of a population, for investigation of factors affecting health and disease, or for improving administration of health services. These surveys can be cross-sectional or longitudinal; descriptive or analytic or both. Methods used for data collection in surveys include health interview, health examination, study of health records and mailed questionnaires. E.g., NFHS data. 8) Research Findings: In various departments of medical colleges and hospitals, experiments are performed for investigations and research. Similarly, in biomedical institutions and pharmaceutical industries, a lot of research activities are conducted with specific objectives. These data are useful for planning and implementation of health activities in general. E.g., dissertations, research papers. 32
  • 33. DATA PRESENTATION The objective of classification of data is to make the data simple, concise, meaningful, interesting and helpful in further analysis. DATA COLLECTED FROM VARIOUS EXPERIMENTS → COMPILATION AND CLASSIFICATION → PRESENTATION 33
  • 34. Principles of presentation of data • Data should be arranged in such a way that they arouse interest in the reader. • The data should be made sufficiently concise without losing important details. • The data should be presented in a simple form to enable the reader to form quick impressions and to draw some conclusions, directly or indirectly. • The presentation should facilitate further statistical analysis. • It should define the problem and suggest its solution. 34
  • 35. METHODS OF PRESENTATION OF DATA: The main methods of presenting frequencies of a variable or data: 1. Textual 2. Tabulation 3. Charts and Diagrams TEXTUAL PRESENTATION OF DATA In textual presentation, data are described within the text. When the quantity of data is not too large, this form of presentation is more suitable. Look at the following cases: Case 1 In a bandh call given on 08 September 2005 protesting the hike in prices of petrol and diesel, 5 petrol pumps were found open and 17 were closed, whereas 2 schools were closed and the remaining 9 schools were found open in a town of Bihar. 35
  • 36. Case 2 Census of India 2001 reported that the Indian population had risen to 102 crore, of which only 49 crore were females against 53 crore males. Seventy-four crore people resided in rural India and only 28 crore lived in towns or cities. There were 62 crore non-workers against 40 crore workers in the entire country. The urban population had an even higher share of non-workers (19 crore) against workers (9 crore) as compared to the rural population, where there were 31 crore workers out of a 74 crore population... In both cases the data have been presented only in the text. A serious drawback of this method of presentation is that one has to go through the complete text for comprehension. But it is also true that this method often enables one to emphasise certain points of the presentation. Tabulation: It is the first step before the data are used for analysis or interpretation. In the process of tabulation the following types of classification are encountered: • Geographical, i.e. area-wise • Chronological, i.e. on the basis of time • Qualitative, i.e. according to attribute • Quantitative, i.e. in terms of magnitude 36
  • 37. MEANING OF VARIOUS TERMS • Grouped frequency distribution: a frequency distribution in which several values are grouped into one class. • Class limits: separate one class in a grouped frequency distribution from another. The limits can actually appear in the data, and there are gaps between the upper limit of one class and the lower limit of the next. • Class boundaries: separate one class in a grouped frequency distribution from another. The boundaries have one more decimal place than the raw data and therefore do not appear in the data. There is no gap between the upper boundary of one class and the lower boundary of the next class. The lower class boundary is found by subtracting U/2 from the corresponding lower class limit, and the upper class boundary is found by adding U/2 to the corresponding upper class limit, where U is the gap between the upper limit of one class and the lower limit of the next. • Class width: the difference between the upper and lower class boundaries of any class. It is also the difference between the lower limits of any two consecutive classes, or the difference between any two consecutive class marks. 37
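The relation between class limits, class boundaries, class width and class mark can be checked with a few lines of code. This is a minimal sketch; the classes 70-79 and 80-89 are hypothetical and are assumed to be recorded in whole numbers, so that U = 1.

```python
def class_boundaries(lower_limit, upper_limit, unit):
    """Boundaries lie U/2 below the lower limit and U/2 above the upper limit."""
    return lower_limit - unit / 2, upper_limit + unit / 2


def class_mark(lower_limit, upper_limit):
    """Mid point of a class: the average of its lower and upper limits."""
    return (lower_limit + upper_limit) / 2


# Hypothetical class 70-79; the next class starts at 80, so U = 80 - 79 = 1.
lower_boundary, upper_boundary = class_boundaries(70, 79, unit=1)
width = upper_boundary - lower_boundary
mark = class_mark(70, 79)
print(lower_boundary, upper_boundary, width, mark)
```

The upper boundary of 70-79 and the lower boundary of 80-89 coincide at 79.5, which is why the boundaries leave no gap between consecutive classes.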
  • 38. • Class mark (mid point): the average of the lower and upper class limits, or the average of the upper and lower class boundaries. • Cumulative frequency: the number of observations less than/more than or equal to a specific value. • Relative frequency (rf): the frequency divided by the total frequency. • Relative cumulative frequency (rcf): the cumulative frequency divided by the total frequency. Classification and tabulation are not two distinct processes; they go together, and classification is the first step in tabulation. 38
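The cumulative, relative and relative cumulative frequencies defined above can be computed in one pass over the class frequencies. The sketch below uses invented class frequencies purely for illustration and takes the "less than" form of the cumulative frequency.

```python
from itertools import accumulate

# Hypothetical classes and frequencies, for illustration only.
classes = ["70-79", "80-89", "90-99", "100-109"]
freqs = [8, 6, 4, 2]

total = sum(freqs)                    # total frequency
cum = list(accumulate(freqs))         # "less than" cumulative frequency
rel = [f / total for f in freqs]      # relative frequency (rf)
rel_cum = [c / total for c in cum]    # relative cumulative frequency (rcf)

for row in zip(classes, freqs, cum, rel, rel_cum):
    print(*row)
```

The last cumulative frequency always equals the total frequency, so the last relative cumulative frequency is 1.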
  • 39. A) Tabulation: It is usually the first step in presentation and analysis of data. A table can be simple or complex depending upon the number of measurements of a single set or multiple sets of items. Let us take an example to understand tabulation. The numbers of deaths due to neonatal tetanus in 97 districts of India in one year are given below: 70, 71, 72, 79, 84, 92, 141, 70, 73, 73, 77, 79, 84, 93, 146, 74, 75, 77, 84, 70, 72, 88, 88, 141, 75, 77, 82, 93, 109, 134, 147, 79, 79, 87, 95, 107, 125, 140, 148, 76, 78, 83, 106, 124, 135, 141, 71, 70, 79, 87, 97, 117, 140, 150, 160, 73, 78, 82, 103, 116, 135, 148, 160, 160, 160, 160, 150, 88, 72, 76, 78, 84, 73, 80, 98, 113, 137, 157, 74, 75, 81, 88, 99, 102, 113, 158, 84, 73, 76, 78, 82, 101, 113, 158, 88, 75, 108 39
  • 40. • It is obvious that we can understand very little from these figures. A better way is to arrange the figures in ascending or descending order, i.e., from 70 to 160, but the bulk of the data still remains. • A simpler method of reducing the bulk of data is the tally mark method. In this method a vertical bar (I) is put against the concerned number each time it occurs. So if 70 occurs four times, we represent it by IIII. For the fifth observation, instead of a vertical bar we put a cross tally (/) across the first four tallies. Thus we get sets of five each. This representation of the data is known as a frequency distribution. • The number of neonatal deaths is called the variable (x), and the number of districts against each number of neonatal deaths is known as the frequency (f) of the variable. The term 'frequency' is derived from 'how frequently' a value of the variable occurs. Fig.05 example 40
  • 41. In this example the frequency of 73 neonatal deaths is 5, whereas the frequency of 158 deaths is 2. Though this method reduces the data to some extent, it still cannot be called the best method. To condense the data further, the observed range of the variable can be divided into a suitable number of class intervals, and the number of observations in each class recorded. Such a table (Fig. 06), showing the distribution of frequencies in the different classes, is called a frequency distribution table, and the manner in which the class frequencies are distributed over the class intervals is called the grouped frequency distribution of the variable. The merits of a frequency distribution table are that: • It shows at a glance how many individual observations are in a group, and where the main concentration lies. • It also shows the range and the shape of the distribution. 41 Fig.06
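The tally and the grouped frequency distribution described above can be reproduced programmatically. The sketch below uses the 97 district values listed on slide 39; the class width of 10 (classes 70-79, 80-89, and so on) is an assumption made for illustration and is not necessarily the grouping used in Fig. 06.

```python
from collections import Counter

# Neonatal tetanus deaths in 97 districts, as listed on slide 39.
deaths = [
    70, 71, 72, 79, 84, 92, 141,
    70, 73, 73, 77, 79, 84, 93, 146,
    74, 75, 77, 84, 70, 72, 88, 88, 141,
    75, 77, 82, 93, 109, 134, 147,
    79, 79, 87, 95, 107, 125, 140, 148,
    76, 78, 83, 106, 124, 135, 141,
    71, 70, 79, 87, 97, 117, 140, 150, 160,
    73, 78, 82, 103, 116, 135, 148,
    160, 160, 160, 160, 150, 88,
    72, 76, 78, 84, 73, 80, 98, 113, 137, 157,
    74, 75, 81, 88, 99, 102, 113, 158,
    84, 73, 76, 78, 82, 101, 113, 158,
    88, 75, 108,
]

# Ungrouped frequency distribution: what the tally marks count.
frequency = Counter(deaths)

# Grouped frequency distribution with an assumed class width of 10.
width = 10
grouped = Counter((x // width) * width for x in deaths)

for lower in sorted(grouped):
    print(f"{lower}-{lower + width - 1}: {grouped[lower]}")
```

Each key of `grouped` is the lower limit of a class, so the printed rows form a frequency distribution table with mutually exclusive classes.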
  • 42. Rules and guidelines for tabular presentation - • A number should be assigned to the table (Table No.). • A title should be given to the table; it should be concise and self-explanatory. • Contents of the table should be defined clearly. • Subtitles should be properly mentioned with columns and rows. • Group intervals (classes) in columns and rows should be neither too narrow nor too wide. They should also be mutually exclusive and non-overlapping. • The unit of measurement must be mentioned clearly wherever necessary. • The number of classes should be neither too large nor too small. There can be 10 to 20 classes. The following formula can be used to find the approximate number of classes, K: • K = 1 + 3.322 log10(N), where N is the total frequency. • Footnotes should be given whenever necessary, providing additional information, source or explanatory notes. • Any short forms or symbols used should be explained in the footnote. • No space should be left blank in the body of the table. • There should be a logical arrangement of data in the table. 42
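The formula for the approximate number of classes is easy to evaluate. The sketch below applies it to N = 97, the number of districts in the neonatal tetanus example; rounding the result up to a whole number of classes is an assumption, since the slide does not say how to round.

```python
import math


def approximate_classes(n):
    """Approximate number of classes K = 1 + 3.322 * log10(N), rounded up."""
    return math.ceil(1 + 3.322 * math.log10(n))


k = approximate_classes(97)
print(k)
```

The result is only a starting point; the final number of classes is usually adjusted so that the class limits are convenient round numbers.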
  • 43. 43 Fig.07. PARTS OF TABLE Fig.08
  • 44. 44 Below are given examples of tables. Fig.09
  • 46. 1. Classification by space (geographical): • Data are classified by location of occurrence. • Arranging the categories in alphabetical order of the terms defining them, or in the order of their geographical location, is suitable in many cases. Fig-11 46
  • 47. 2. Chronological i.e. On the basis of time :- • In this case data are classified by time of occurrence of the observations • Arrangement of categories is almost always in chronological order Fig-12 47
  • 48. 3. Classification by attribute: • When the data represent observations made on a qualitative characteristic, the classification is made according to this quality. • Alphabetical arrangement of categories may be suitable for a general-purpose table. • In the case of a special-purpose table, the arrangement may be made in the order of importance of these categories. Fig-13 48
  • 49. 4. Classification by the size of observations: • When the data represent observations of some characteristic on a numerical scale, classification is made on the basis of the individual observations. • The range of observations is suitably divided into smaller divisions called class intervals. • The numerical scale adopted may be either discrete or continuous. Fig-14 49
  • 50. Advantages of tabular presentation • It is a convenient and sufficient form for presenting statistical information. • It summarises the information and displays its important features. • Unnecessary repetitions that may appear in texts are avoided. • Comparison between localities, age groups, etc. can be made easily. • Errors and omissions in the information can be easily detected. • Reference to any details of the data is facilitated. 50
  • 51. B) Presentation by Graphs and Diagrams: After class-wise or group-wise tabulation, the frequencies of a characteristic can be presented by two kinds of drawings: graphs and diagrams. They may be shown either by lines and dots or by figures. The drawings are meant for non-statistically-minded people who want to study the relative values or frequencies of persons or events. For statistically-minded persons, they are for quick eye readings. Diagrams and graphs are extremely useful because: • They are attractive to the eyes. • They give a bird's-eye view of the entire data. • They have a lasting impression on the mind of the layman. • They facilitate comparison of data. 51
  • 52. Demerits of Diagrams: Simplicity vs. Details: Diagrams often prioritize simplicity over details and accuracy. Loss of Original Data: The simplicity in charts and diagrams may lead to the loss of crucial details from the original data. Need for Original Data: In-depth studies may require referring back to the original data. Guidelines for Graphs, Figures, and Pictures: Clear Titles: Ensure all graphs, figures, and pictures have clearly stated and informative titles. Labeling: Clearly label all classes and keys for better understanding. Unit of Measurement: Include the appropriate unit of measurement for clarity. 52
  • 53. Presentation of quantitative, continuous or measured data is through graphs. The common graphs in use are: Histogram Frequency polygon Frequency curve Line chart or graph Cumulative frequency diagram Scatter or dot diagram Bland–Altman plot Forest plot Presentation of qualitative, discrete or counted data is through diagrams. The common diagrams in use are: Bar diagram Pie or sector diagram Venn diagram Pictogram or picture diagram Map diagram or spot map. 53
  • 54. Histogram It is a graphical presentation of frequency distribution. Variable characters of the different groups are indicated on the horizontal line (X-axis) called abscissa while frequency, i.e. number of observations is marked on the vertical line (Y-axis) called ordinate. Frequency of each group will form a column or rectangle. Such a diagram is called 'histogram' and is made use of in presenting any quantitative data. It is a bar diagram without gap between bars. If we draw frequencies of each group or class intervals in the form of columns or rectangles such a diagram is called histogram. It represents a frequency distribution. 54
  • 55. The histogram is constructed as follows: • On the X axis, the size of the observation is marked. • Starting from 0, the limits of each class interval are marked, the width corresponding to the width of the class interval in the frequency distribution. • On the Y axis, the frequencies are marked. • A rectangle is drawn above each class interval with height proportional to the frequency of that interval (for class intervals of equal width). Advantages of histogram: • Easy to understand. Disadvantages of histogram: • Only one histogram can be placed on a graph at a time. • More time-consuming to construct than a frequency polygon.
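The construction steps above can be sketched in a few lines of Python. This is only an illustration: the weights, the starting value of 50, the class width of 10 and the function name `class_frequencies` are assumptions chosen for the example, and the "rectangles" are printed as rows of `#` characters rather than drawn.

```python
def class_frequencies(values, start, width, n_classes):
    """Count how many observations fall in each class interval [lower, lower + width)."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_classes:  # ignore values outside the marked scale
            counts[index] += 1
    return counts


# Illustrative weights in kg (assumed data, not from a real study)
weights = [50, 51, 60, 62, 63, 65, 68, 74, 78, 82, 83, 84, 85]
frequencies = class_frequencies(weights, start=50, width=10, n_classes=4)

# One text "rectangle" per class interval, its length proportional to the frequency
for i, f in enumerate(frequencies):
    lower = 50 + 10 * i
    print(f"{lower}-{lower + 9}: {'#' * f} ({f})")
```

Because all class intervals here have the same width, the length of each row is proportional to the class frequency, which is the property the last construction step relies on.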
  • 56. Assessing the relationship between two variables: The forms of data presentation that have been described up to this point illustrated the distribution of a given variable, whether categorical or numerical. In addition, it is possible to present the relationship between two variables of interest, either categorical or numerical. The relationship between categorical variables may be investigated using a contingency table, which has the purpose of analyzing the association between two or more variables. The lines of this type of table usually display the exposure variable (independent variable), and the columns, the outcome variable (dependent variable). For example, in order to study the effect of sun exposure (exposure variable) on the development of skin cancer (outcome variable), it is ...
    Table 3: Weight distribution among 18-year-old males (n = 2,194). Pelotas, Brazil, 2010.[12]
    Weight at 18 years of age (in kg) | Absolute frequency (n) | Relative frequency (%)
    40.5 to 59.9 | 554 | 25.25
    60.0 to 65.8 | 543 | 24.75
    65.9 to 74.6 | 551 | 25.11
    74.7 to 147.8 | 546 | 24.89
    Total | 2,194 | 100.00
    Fig-15 (Figure 4 in [12]): Weight distribution at 18 years of age among youngsters from the city of Pelotas (n = 2,194), Brazil, 2010. Axis label: Percentage.
  • 58. Frequency polygon: 1. The most commonly used graphic device to illustrate a statistical distribution. 2. Used to represent the frequency distribution of quantitative data. 3. Useful to compare 2 or more frequency distributions. • A frequency polygon is a variation of a histogram, in which the bars are replaced by lines connecting the midpoints of the tops of the bars. • Advocates of the frequency polygon argue that the purpose of a histogram is to show the shape of the data distribution and removing the bars makes the shape clearer and smoother. Fig-17
  • 59. Construction of frequency polygon: • The variable is taken along the X axis and the frequencies along the Y axis. • Class frequencies are plotted against the class mid-values and these points are then joined by straight lines, which gives the frequency polygon. • The total area under the frequency polygon represents the total frequency. Advantages of frequency polygon: • It is very easy to construct and very easy to interpret. • Several distributions can be portrayed on the same graph paper in different colours, so it is very useful for comparing 2 or more distributions.
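As a small sketch of the plotting step, the class mid-values can be computed and paired with the class frequencies; joining consecutive points with straight lines then gives the polygon. The class limits, frequencies and the helper name `polygon_points` are assumptions made for this example.

```python
def polygon_points(class_limits, frequencies):
    """Return (mid-value, frequency) pairs, one per class interval."""
    return [((lower + upper) / 2, f)
            for (lower, upper), f in zip(class_limits, frequencies)]


# Assumed class intervals (kg) and their frequencies
class_limits = [(50, 60), (60, 70), (70, 80), (80, 90)]
frequencies = [2, 5, 2, 4]

points = polygon_points(class_limits, frequencies)
for mid_value, frequency in points:
    print(f"mid-value {mid_value}: frequency {frequency}")
```

The resulting points could be passed to any plotting tool that draws straight line segments between consecutive points.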
  • 60. Frequency curve: When the number of observations is very large and the class intervals are very much reduced, the frequency polygon tends to lose its angulation and forms a smooth curve known as the frequency curve. • The variable is taken along the X axis and the frequency along the Y axis. • Frequencies are plotted against the class mid-values and these points are then joined by a smooth curve. • The curve so obtained is the frequency curve. • The total area under the frequency curve represents the total frequency. Fig-18
  • 61. Line diagram: • This diagram is useful to study changes in the values of a variable over time. • It is the simplest type of diagram. • On the X axis, time such as hours, days, weeks, months or years is represented. • The value of the quantity at each time point is represented along the Y axis. Fig-19: MTPs during 2002 to 2022
  • 62. Cumulative frequency diagram or Ogive • An ogive is a graph of the cumulative relative frequency distribution. • To draw this, an ordinary frequency distribution table for quantitative data has to be converted into a cumulative frequency table. • The cumulative frequency of a class interval is the total number of persons from the lowest value of the characteristic up to the highest value of the class under consideration. It is obtained by adding the frequencies of the previous classes to that of the class in question. • Cumulative frequencies are plotted opposite the group limits of the variable. • These points are joined by a smooth freehand curve to get a cumulative frequency diagram or ogive.
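The conversion from an ordinary frequency table to a cumulative one can be sketched as a running total. The frequencies and upper class limits below are assumed values for illustration only.

```python
from itertools import accumulate

# Assumed frequency table: upper class limits (kg) and class frequencies
upper_limits = [60, 70, 80, 90]
frequencies = [2, 5, 2, 4]

# Running total: each entry adds the frequencies of all previous classes
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency, as a proportion of the total
total = cumulative[-1]
cumulative_relative = [c / total for c in cumulative]

# Points of the ogive: cumulative frequency plotted against the upper class limit
ogive_points = list(zip(upper_limits, cumulative))
print(ogive_points)
```

The last cumulative frequency always equals the total number of observations, which is a quick check that the table was converted correctly.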
  • 63. Example: Fig-20. Distribution of weights in 156 individuals. Fig-21
  • 64. Scatter diagram or dot diagram: • It is a graphic presentation of data. • It is used to show the nature of the correlation between 2 variables. Also called a correlation diagram, it is useful to represent the relationship between two numeric measurements, each observation being represented by a point corresponding to its value on each axis. If the data points make a straight line going from the origin out to high x- and y-values, the variables are said to have a positive correlation. If the line goes from a high value on the y-axis down to a high value on the x-axis, the variables have a negative correlation. If no trend is shown, there is said to be no correlation.[10] Fig-22
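The direction of the trend seen in a scatter diagram can be summarised numerically. The sketch below uses Pearson's correlation coefficient, which is one common choice and is not named on the slide; the data, the function names and the cut-off of 0.1 used to call a result "no correlation" are all assumptions for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def describe(r, threshold=0.1):
    """Label the direction of the trend; the threshold is an arbitrary assumption."""
    if r > threshold:
        return "positive correlation"
    if r < -threshold:
        return "negative correlation"
    return "no correlation"


xs = [1, 2, 3, 4]
print(describe(pearson_r(xs, [2, 4, 6, 8])))    # points rise from left to right
print(describe(pearson_r(xs, [8, 6, 4, 2])))    # points fall from left to right
print(describe(pearson_r(xs, [1, -1, -1, 1])))  # no linear trend
```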
  • 65. BLAND–ALTMAN PLOT: A Bland–Altman plot (difference plot) is a method of data plotting used in analyzing the agreement between two different assays. In the Bland–Altman plot, the differences between the two methods are plotted against the averages of the two methods. Alternatively, we can choose to plot the differences between the two methods against one of the two methods, if this is a reference method.
    Dataset for potassium levels in venous blood gases and blood electrolyte work-up (all values in mEq/L):
    Patient Nr. | Potassium level (venous blood gas analysis) | Potassium level (blood electrolyte levels) | Mean potassium level | Difference between potassium levels
    1 | 4.5 | 4.7 | 4.6 | 0.2
    2 | 3.8 | 4.2 | 4.0 | 0.4
    3 | 5.1 | 5.1 | 5.1 | 0.0
    4 | 4.9 | 5.3 | 5.1 | 0.4
    5 | 3.9 | 4.0 | 3.95 | 0.1
    6 | 4.0 | 3.8 | 3.9 | -0.2
    7 | 4.1 | 4.0 | 4.05 | -0.1
    8 | 4.3 | 4.0 | 4.15 | -0.3
    9 | 5.3 | 5.3 | 5.3 | 0.0
    10 | 5.2 | 5.1 | 5.15 | -0.1
    11 | 3.9 | 4.0 | 3.95 | 0.1
    12 | 4.1 | 4.4 | 4.25 | 0.3
    13 | 4.0 | 4.2 | 4.1 | 0.2
    14 | 5.3 | 5.1 | 5.2 | -0.2
    15 | 5.5 | 5.3 | 5.4 | -0.2
    16 | 4.4 | 4.2 | 4.3 | -0.2
    17 | 4.9 | 5.0 | 4.95 | 0.1
    18 | 3.7 | 3.9 | 3.8 | 0.2
    19 | 3.9 | 3.7 | 3.8 | -0.2
    20 | 4.8 | 4.7 | 4.75 | -0.1
    21 | 5.5 | 5.2 | 5.35 | -0.3
    22 | 3.7 | 3.8 | 3.75 | 0.1
    23 | 3.7 | 3.9 | 3.8 | 0.2
    24 | 4.8 | 4.2 | 4.5 | -0.6
    25 | 5.1 | 5.6 | 5.35 | 0.5
  • 66. For our dataset, the mean difference (mean bias) was found to be 0.012 with an SD of 0.260. A scatterplot should be drawn to understand the dispersion of the variables, with the average on the X-axis and the difference on the Y-axis. The limits of agreement (LOA) can be drawn manually if the statistical software does not show them automatically. In our data set, the upper limit can be calculated as mean + 1.96 x SD (0.012 + 1.96 x 0.260 = 0.522) and the lower limit as mean – 1.96 x SD (0.012 – 1.96 x 0.260 = –0.498). An appropriate statement for the manuscript would be: The Bland–Altman plot showed the mean bias ± SD between the first and second potassium levels as 0.012 ± 0.260 mEq/L, and the limits of agreement were –0.498 and 0.522.[13] Fig-22: Agreement between two potassium measurements (Bland–Altman plot).
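The arithmetic above can be checked with a short Python sketch that uses the 25 pairs of potassium values from the dataset. The variable names are illustrative. The difference is taken as the electrolyte value minus the venous blood gas value, which matches the signs in the dataset, and the SD is assumed to be the sample standard deviation (n – 1 in the denominator).

```python
from statistics import mean, stdev

# Potassium levels (mEq/L) for patients 1-25, copied from the dataset
venous_blood_gas = [4.5, 3.8, 5.1, 4.9, 3.9, 4.0, 4.1, 4.3, 5.3, 5.2, 3.9, 4.1, 4.0,
                    5.3, 5.5, 4.4, 4.9, 3.7, 3.9, 4.8, 5.5, 3.7, 3.7, 4.8, 5.1]
electrolyte = [4.7, 4.2, 5.1, 5.3, 4.0, 3.8, 4.0, 4.0, 5.3, 5.1, 4.0, 4.4, 4.2,
               5.1, 5.3, 4.2, 5.0, 3.9, 3.7, 4.7, 5.2, 3.8, 3.9, 4.2, 5.6]

# X-axis of the plot: average of the two methods; Y-axis: difference between them
averages = [(a + b) / 2 for a, b in zip(venous_blood_gas, electrolyte)]
differences = [b - a for a, b in zip(venous_blood_gas, electrolyte)]

bias = mean(differences)   # mean difference (mean bias)
sd = stdev(differences)    # sample standard deviation of the differences
upper_loa = bias + 1.96 * sd
lower_loa = bias - 1.96 * sd

print(f"mean bias = {bias:.3f}, SD = {sd:.3f}")
print(f"limits of agreement: {lower_loa:.3f} to {upper_loa:.3f}")
```

Plotting `differences` against `averages`, with horizontal lines at `bias`, `lower_loa` and `upper_loa`, would give the Bland–Altman plot itself.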
  • 67. FOREST PLOT: A forest plot, also known as a blobbogram, is a graphical display of estimated results from a number of scientific studies addressing the same question, along with the overall results. It is a graphical representation of a meta-analysis. It is usually accompanied by a table listing the references (author and date) of the studies included in the meta-analysis, with their estimated results.[10] Fig-24
  • 68. Fig-25: Estimated OR and 95% CI of logistic regression model. An example of a dot plot with an error bar. For each level of factors (y-axis: f1 to f5), the corresponding odds ratio (OR) and 95% CI (x-axis) are presented using dots and accompanying horizontal error bars. The dotted line indicates the reference value of 1. The estimated OR would not be statistically different from 1.0 if its error bar crossed this reference line.
    Table 6. Estimated OR and 95% CI of Logistic Regression Model
    Factor | OR (95% CI) | P value
    F1 | 1.24 (1.12, 1.38)* | < 0.001
    F2 | 1.76 (1.26, 2.51)* | 0.001
    F3 | 1.10 (0.80, 1.50) | 0.557
    F4 | 1.00 (0.98, 1.02) | 0.810
    F5 | 1.09 (0.99, 1.20) | 0.083
    OR: odds ratio. *Two-sided P < 0.05.
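The reading rule in the caption (an error bar that crosses the reference line of 1 means the OR is not statistically different from 1.0) can be applied to the Table 6 values with a short sketch. The dictionary layout and the function name `ci_excludes` are assumptions made for this example.

```python
# Table 6 values: factor -> (odds ratio, lower 95% CI limit, upper 95% CI limit)
table_6 = {
    "F1": (1.24, 1.12, 1.38),
    "F2": (1.76, 1.26, 2.51),
    "F3": (1.10, 0.80, 1.50),
    "F4": (1.00, 0.98, 1.02),
    "F5": (1.09, 0.99, 1.20),
}


def ci_excludes(lower, upper, reference=1.0):
    """True when the interval does not cross the reference line."""
    return not (lower <= reference <= upper)


significant = [factor for factor, (odds_ratio, lower, upper) in table_6.items()
               if ci_excludes(lower, upper)]
print(significant)
```

The factors returned are the same ones marked with an asterisk in Table 6.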
  • 69. Bar diagram: 1. This diagram is used to represent qualitative data. 2. It represents only one variable. 3. The width of the bars remains the same and only the length varies according to the frequency in each category. There are 3 types of bar diagram: • simple bar • multiple bar or compound bar • component bar, also called proportional bar or stacked bar.
  • 70. Simple bar: The limitation of this method is that it can represent only one classification and hence cannot be used for comparison. Fig-26: Mortality due to various causes. Fig-27: Cases of gastroenteritis in a hospital in 2022
  • 71. Multiple bar or compound bar: Here two or more bars are grouped together. In Fig-28, the population of a country is shown with three bars, showing the populations of Hindus, Muslims and others, over two censuses. Fig-29 shows the sex-wise and standard-wise distribution of students passing from a school. Fig-28: Population of a country as per religion. Fig-29: % of students passing in school
  • 72. Component bar diagram: • This diagram is used to represent qualitative data. • It is used when it is desired to represent the number of cases in the major groups as well as in the subgroups simultaneously. Fig-30: Expenditures on various items in two communities. Fig-31: Proportion of energy obtained from various foodstuffs by rich and poor communities
  • 73. Pie diagram: • These are popularly used to show percentage breakdowns for qualitative data. • It is so called because the entire graph looks like a pie and its components represent slices cut from a pie. • A circle is divided into different sectors corresponding to the frequencies of the distribution. • Some knowledge of circles and degrees is necessary. • The total angle at the center of the circle is 360 degrees and it represents the total frequency. • After the angles have been calculated, the segments are drawn in the circle and shaded with different shades or colors, and an index is provided for the shades or colors. • It cannot be used to represent 2 or more data sets. Fig-32: Pattern of expenditure in an urban community
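The angle calculation mentioned above follows from the fact that 360 degrees represents the total frequency, so each sector receives frequency × 360 / total degrees. A minimal sketch, with hypothetical expenditure categories and percentages that are not taken from Fig-32:

```python
def sector_angles(frequencies):
    """Angle in degrees for each category: frequency * 360 / total frequency."""
    total = sum(frequencies.values())
    return {category: value * 360 / total for category, value in frequencies.items()}


# Hypothetical pattern of expenditure (percent of total), for illustration only
expenditure = {"Food": 50, "Housing": 25, "Clothing": 15, "Other": 10}

angles = sector_angles(expenditure)
for category, angle in angles.items():
    print(f"{category}: {angle:.0f} degrees")
```

The angles always add up to 360 degrees, which is a useful check before the sectors are drawn.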
  • 74. Fig-33: Pie chart. Total weight of each component from the three operations. RMW: regulated medical waste (Adapted from Korean J Anesthesiol 2017; 70: 100-4).[11] Legend: RMW, blue wrap, clear wrap, plastics, cardboard. Slice labels: 29,344 g; 2,102 g; 2,838 g; 2,388 g; 1,564 g.
  • 75. Venn diagram: • It shows the degrees of overlap and exclusivity for two or more characteristics or factors within a sample or population (in which case each characteristic is represented by a whole circle) or for a characteristic or factor among two or more samples or populations (in which case each sample or population is represented by a whole circle). • The sizes of the circles (or other symbols) need not be equal and may represent the relative size of each factor or population. Fig-34: Number of COVID cases as per reporting agency
  • 76. Pictogram: • Display of data through pictograms was initiated by Dr Otto Neurath in 1923. • Data are displayed by pictures of the items to which the data pertain. • A single picture represents a fixed number. • They are the least satisfactory type of diagram. • They are also inaccurate. Fig-35
  • 77. Map diagram or spot map or cartogram: 1. These maps are used to show the geographical distribution of the frequencies of a characteristic such as IMR, MMR, etc. Fig-29: Estimated Infant Mortality Rate, 2015
  • 78. Other types of presentation of data. Stem and leaf plot: • It is mainly used for the presentation of quantitative data. • It is used to study the shape of the distribution. • It can be used to compare two or more distributions. • It is useful for smaller data sets. • Each value is displayed by two whole digits, one for the stem and one for the leaf. Consider this example of two groups of patients with hypertension having weights as given below: Group I: 50, 51, 60, 62, 63, 65, 68, 74, 78, 82, 83, 84, 85. Group II: 51, 52, 53, 54, 56, 58, 61, 63, 65, 67, 68, 71, 72, 80, 85. We can present this in tabular form as below: Fig-30
  • 79. Class intervals are represented by stems. For group I, the class intervals 50 to 59, 60 to 69, 70 to 79 and 80 to 89 are represented by stems 5, 6, 7 and 8 respectively. The weights of 51 and 68 are represented by leaf 1 on stem 5 and leaf 8 on stem 6 respectively. The stem and leaf plot for the group I data can be shown as below: Fig-31. The stem and leaf plot for the group I and group II data can be shown as below: Fig-32
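A stem and leaf plot for the Group I weights can be built with a few lines of Python: the tens digit is the stem and the units digit is the leaf. The function name `stem_and_leaf` is an assumption for this sketch, which only handles two-digit whole numbers like those in the example.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group each value's units digit (leaf) under its tens digit (stem)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)


# Group I weights from the example
group_1 = [50, 51, 60, 62, 63, 65, 68, 74, 78, 82, 83, 84, 85]

for stem, leaves in stem_and_leaf(group_1).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Each printed row corresponds to one class interval, and the length of the row shows the frequency of that class, which is how the plot conveys the shape of the distribution.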
  • 80. Box and whisker plot: It is a representation of the quartiles (25%, 50% and 75%) and the range of a continuous and ordered data set. The y-axis can be arithmetic or logarithmic. Box plots can be used to compare different distributions of data values. Steps for drawing box and whisker plots: • Determine from the given data set the smallest and largest values and Q1, Q2 and Q3, i.e. the first, second and third quartiles respectively. • Mark the scale on the X or Y axis. • Draw a box (a rectangle of any convenient width and of length Q3 − Q1) with its ends at the points for the first and third quartiles. • Draw a line across the box at the median point (Q2). • Draw the whiskers (lines) from each end of the box to the smallest and largest values.
  • 81. Box plots characterize a sample using the minimum, the 25th, 50th and 75th percentiles, and the maximum value. The interquartile range (IQR = Q3 − Q1, where Q1 is the first quartile or 25th percentile and Q3 is the third quartile or 75th percentile) covers the central 50% of the data. Quartiles are insensitive to outliers and preserve information about the center and spread (variation). If a data point is below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR, it is viewed as being too far from the central values (median) and is called an outlier. Fig-33: An example of a box-whisker plot. Estimated median (Q1, Q3) [min:max] from the sample data is 1.1 (0.8, 1.3) [0.1:2.1]. This graph includes explanations of the components of the box-whisker plot. These are not necessary for the general purpose of publication. A significance marker can be added, though it was not used in this graph. If a significance marker is added, it should be located on the shoulder or alongside the whisker. If markers are located over the mid-top of the whiskers, these could be interpreted as outliers if no detailed explanation is provided. The limits of the whiskers can be varied depending on the purpose.
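The quantities needed for a box and whisker plot, and the 1.5 × IQR rule for outliers, can be sketched with Python's standard library. Several conventions exist for computing quartiles and they can give slightly different values; this sketch assumes the "inclusive" method of `statistics.quantiles` (Python 3.8 or later). The data are the Group II weights from the stem and leaf example, and the extra value of 120 is made up to show how an outlier is flagged.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Five-number summary, IQR and outliers by the 1.5 x IQR rule."""
    data = sorted(values)
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {"min": data[0], "q1": q1, "median": q2, "q3": q3,
            "max": data[-1], "iqr": iqr, "outliers": outliers}


# Group II weights from the stem and leaf example
group_2 = [51, 52, 53, 54, 56, 58, 61, 63, 65, 67, 68, 71, 72, 80, 85]
summary = box_plot_summary(group_2)
print(summary)

# Same data with one made-up extreme value added
with_extreme = box_plot_summary(group_2 + [120])
print(with_extreme["outliers"])
```

When outliers are present, a common variation is to end the whiskers at the most extreme values inside the fences and plot the outliers as separate points, in line with the remark that the limits of the whiskers can be varied depending on the purpose.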
  • 82. Fig-33: Box and whisker plot showing the distribution of height of boys in two classes A and B
  • 83. Types of Charts Depending on the Method of Analysis of the Data
    Analysis | Subgroup | Number of variables | Type
    Comparison | Among items | Two per item | Variable width column chart
    Comparison | Among items | One per item | Bar/column chart
    Comparison | Over time | Many periods | Circular area/line chart
    Comparison | Over time | Few periods | Column/line chart
    Relationship | | Two | Scatter chart
    Relationship | | Three | Bubble chart
    Distribution | | Single | Column/line histogram
    Distribution | | Two | Scatter chart
    Distribution | | Three | Three-dimensional area chart
    Comparison | Changing over time | Only relative differences matter | Stacked 100% column chart
    Comparison | Changing over time | Relative and absolute differences matter | Stacked column chart
    Comparison | Static | Simple share of total | Pie chart
    Comparison | Static | Accumulation | Waterfall chart
    Comparison | Static | Components of components | Stacked 100% column chart with subcomponents
  • 84. CONCLUSION: In conclusion, we have covered the basics of data collection, from defining data types to exploring measurement scales. We discussed and outlined various sources for data collection. Text, tables, and graphs are effective communication media that present and convey data and information. They aid readers in understanding the content of research, sustain their interest, and effectively present large quantities of complex information. As journal editors and reviewers will scan through these presentations before reading the entire text, their importance cannot be disregarded. For this reason, authors must pay as close attention to selecting appropriate methods of data presentation as they did to collecting data of good quality and analyzing them. In addition, a well-established understanding of the different methods of data presentation and their appropriate use will enable one to recognize and interpret inappropriately presented data, or data presented in a way that deceives readers' eyes.
  • 85. REFERENCES
    1. Kim JS, Dailey RJ. Biostatistics for Oral Healthcare. Blackwell Publishing Company; 2008.
    2. Kothari CR. Research Methodology: Methods and Techniques. 4th ed. New Age International Private Ltd Publishers; 2019. Reprint 2021.
    3. Khanal AB. Mahajan's Methods in Biostatistics for Medical Students and Research Workers. 9th ed. New Delhi, India: Jaypee Brothers Medical; 2015.
    4. Dixit JV. Principles and Practice of Biostatistics. 8th ed. Bhanot
    5. Rao TB. Methods of Biostatistics. 3rd ed. Hyderabad: Paras Medical Publisher; 2010.
    6. Marya CM. A Textbook of Public Health Dentistry. 1st ed. New Delhi: Jaypee Brothers Medical Publishers; 2011.
    7. Mazhar SA, Anjum R, Anwar AI, Khan AA. Methods of Data Collection: A Fundamental Tool of Research. J Integ Comm Health. 2021;10(1):6-10.
    8. Researchgate.net. [cited 2023 Dec 18]. Available from: https://www.researchgate.net/publication/325846997_METHODS_OF_DATA_COLLECTIONenrichId=rgreqf6733eb7ba5b1666d4b32342979ead09XXX&enrichSource=Y292ZXJQYWdlOzMyNTg0Njk5NztBUzo2NDE0NjI5MDc3MTU1ODVAMTUyOTk0ODA4MzU4Ng%3D%3D&el=1_x_2&_esc=publicationCoverPdf
    9. Bhandari P. Data collection [Internet]. Scribbr. 2020 [cited 2023 Dec 19]. Available from: https://www.scribbr.com/methodology/data-collection/
  • 86. REFERENCES (continued)
    10. Mishra P, Pandey CM, Singh U, Gupta A. Scales of measurement and presentation of statistical data. Ann Card Anaesth. 2018;21:419-22.
    11. Shinn HK, Hwang Y, Kim BG, Yang C, Na W, Song JH, et al. Segregation for reduction of regulated medical waste in the operating room: a case report. Korean J Anesthesiol. 2017;70:100-4.
    12. Duquia RP, Bastos JL, Bonamigo RR, González-Chica DA, Martínez-Mesa J. Presenting data in tables and charts. An Bras Dermatol. 2014;89(2):280-5.
    13. Doğan NÖ. Bland-Altman analysis: A paradigm to understand correlation and agreement. Turkish Journal of Emergency Medicine. 2018;18(4):139-141.