1
Introduction to SPSS
Data types and SPSS
data entry and analysis
2
In this session
 What does SPSS look like?
 Types of data (revision)
 Data Entry in SPSS
 Simple charts in SPSS
 Summary statistics
 Contingency tables and crosstabulations
 Scatterplots and correlations
 Tests of differences of means
3
SPSS
4
Aspects of SPSS
 Menus - especially Analyze and Graphs
 Spreadsheet view of data
 Rows are cases (people, respondents etc.)
 Columns are Variables
 Variable view of data
 Shows detail of each variable type
5
Questionnaire Data Coding
6
In SPSS
 We change ticks etc. on a questionnaire into
numbers
 One number for each variable for each case
 How we do this depends on the type of
variable/data
7
Types of data
 Nominal
 Ranked
 Scales/measures
 Mixed types
 Text answers (open ended questions)
8
Nominal (categorical)
 order is arbitrary
 e.g. sex, country of birth, personality type, yes or no.
 Use numeric in SPSS and give value labels.
(e.g. 1=Female, 2=Male, 99=Missing)
(e.g. 1=Yes, 2=No, 99=Missing)
(e.g. 1=UK, 2=Ireland, 3=Pakistan, 4=India, 5=other,
99=Missing)
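As a rough illustration of how numeric codes and value labels fit together, here is a minimal Python sketch. The codes and labels are the ones on the slide; the `responses` list and the function names are made up for the example.

```python
# Value labels from the slide; 99 is the code used for missing answers.
MISSING = 99
SEX_LABELS = {1: "Female", 2: "Male", MISSING: "Missing"}

def label_values(codes, labels):
    """Translate numeric codes into their value labels."""
    return [labels.get(code, "Unlabelled") for code in codes]

def valid_only(codes, missing=MISSING):
    """Drop the missing-value code before analysis."""
    return [code for code in codes if code != missing]

responses = [1, 2, 1, 99, 1]  # made-up answers for five cases
print(label_values(responses, SEX_LABELS))
print(len(valid_only(responses)), "valid cases")
```

The numbers are what gets stored and analysed; the labels are only for display.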
9
Ranks or Ordinal
 in order, 1st, 2nd, 3rd etc.
 e.g. status, social class
 Use numeric in SPSS with value labels
 E.g. 1=Working class, 2=Middle class, 3=Upper
class
 E.g. Class of degree, 1=First, 2=Upper second,
3=Lower second, 4=Third, 5=Ordinary,
99=Missing
10
Measures, scales
1. Interval - equal units
 e.g. IQ
2. Ratio - equal units, zero on scale
 e.g. height, income, family size, age
 Makes sense to say one value is twice another
 Use numeric (or comma, dot or scientific) in
SPSS
 E.g. family size, 1, 2, 3, 4 etc.
 E.g. income per year, 25000, 14500, 18650 etc.
11
Mixed type
 Categorised data
 Actually ranked, but used to identify
categories or groups
 e.g. age groups
 = ratio data put into groups
 Use numeric in SPSS and use value
labels.
 E.g. Age group, 1=‘Under 18’, 2=‘18-24’, 3=‘25-
34’, 4=‘35-44’, 5=‘45-54’, 6=‘55 or greater’
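Putting ratio data into groups can be sketched as below. The age bands and codes are those on the slide; the function name `age_group_code` and the sample ages are assumptions for illustration only.

```python
# Age bands from the slide: (code, label, lowest age, highest age).
# None means the band has no upper limit.
AGE_GROUPS = [
    (1, "Under 18", 0, 17),
    (2, "18-24", 18, 24),
    (3, "25-34", 25, 34),
    (4, "35-44", 35, 44),
    (5, "45-54", 45, 54),
    (6, "55 or greater", 55, None),
]

def age_group_code(age):
    """Return the group code for an age given in whole years."""
    for code, _label, low, high in AGE_GROUPS:
        if age >= low and (high is None or age <= high):
            return code
    raise ValueError(f"age out of range: {age}")

ages = [17, 18, 24, 41, 70]  # made-up ages
print([age_group_code(a) for a in ages])
```

Keeping the exact age in the data file and deriving the group afterwards preserves detail, which matches the later advice that detail can be reduced but not recovered.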
12
Text answers
 E.g. answers to open-ended questions
 Either enter text as given (Use String in SPSS)
 Or
 Code or classify answers into one of a small number
of types. (Use numeric/nominal in SPSS)
Quantifying Data
 Before we can do any kind of analysis, we
need to quantify our data
 “Quantification” is the process of converting
data to a numeric format
 Convert social science data into a “machine-
readable” form, a form that can be read &
manipulated by computer programs
Quantifying Data
Some transformations are simple:
 Assign numeric representations to nominal
or ordinal variables:
 Turning male into “1” and female into “2”
 Assigning “3” to Very Interested, “2” to
Somewhat Interested, “1” to Not Interested
 Assign numeric values to continuous
variables:
 Turning “born in 1973” into an age of “35”
Developing Code Categories
Some data are more challenging. Open-ended
responses must be coded.
 Two basic approaches:
 Begin with a coding scheme derived from the
research purpose.
 Generate codes from the data.
Coding Quantitative Data
 Goal – reduce a wide variety of information to
a more limited set of variable attributes:
 “What is your occupation?”
 Use pre-established scheme: Professional,
Managerial, Clerical, Semi-skilled, etc.
 Create a scheme after reviewing the data
 Assign value to each category in the scheme:
Professional = 1, Managerial = 2, etc.
 Classify the response: “Secretary” is “clerical” and is
coded as “3”
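The classify-then-code step could look something like this in Python. Only the three codes stated on the slide are used; the job-title lookup (apart from "secretary") and the function name are hypothetical.

```python
# Category codes given on the slide.
CATEGORY_CODES = {"Professional": 1, "Managerial": 2, "Clerical": 3}

# Hypothetical lookup from raw answers to categories. Only "secretary"
# comes from the slide; the other titles are made up for illustration.
TITLE_TO_CATEGORY = {
    "secretary": "Clerical",
    "doctor": "Professional",
    "office manager": "Managerial",
}

def code_occupation(answer):
    """Return the numeric code for a free-text answer, or None if uncoded."""
    category = TITLE_TO_CATEGORY.get(answer.strip().lower())
    if category is None:
        return None  # needs a human coder or a new category
    return CATEGORY_CODES[category]

print(code_occupation("Secretary"))
print(code_occupation("Astronaut"))
```

Answers that return `None` are the ones that show whether the scheme needs another category.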
Coding Quantitative Data
 Points to remember:
 If the data are coded to maintain a good amount
of detail, they can always be combined (reduced)
later
 However, if you start off with too little detail, you
can’t get it back
 If you’re using a survey / questionnaire, it’s a
good idea to do your coding on the form so that it
can be entered properly (i.e. create a “codebook”)
Codebook Construction
Purposes:
 Primary guide used in the coding process.
 Should note the value assigned to each variable
attribute (response)
 Guide for locating variables and interpreting
codes in the data file during analysis.
 If you’re doing your own input, this will also
guide data set construction
19
Data Entry in SPSS
 Video by Andy Field
 https://www.youtube.com/watch?v=b163iBByycw&index=1&list=PL25257A24840423AE
20
SPSS Variable View
21
Data Entry into SPSS
There are 2 ways to enter data into SPSS:
1. Enter directly into SPSS by typing in Data View
2. Enter into other software such as Excel, then
import into SPSS
Let’s start with the second option, using data in Excel.
22
Data from Hell
23
Data from Heaven
24
Importing data from Excel spreadsheet into SPSS.
In SPSS, go to:
File, Open, Data
Select Type of file (for example, Excel) you want to open
Select File name you want to open
25
Exporting data from SPSS to Excel.
In SPSS, go to:
File, Save As
Select the Type of file (for example, Excel) you want to save as
Give the File name you want to save to
26
Frequency counts
 Used with categorical and ranked variables
 e.g. gender of students taking Health and
Illness option
Sex of student
               Frequency  Percent  Valid Percent  Cumulative Percent
Valid  Female         25     73.5           73.5                73.5
       Male            9     26.5           26.5               100.0
       Total          34    100.0          100.0
27
e.g. Number of GCSEs passed by students taking
Health and Illness option
Number of GCSEs
               Frequency  Percent  Valid Percent  Cumulative Percent
Valid  0               1      2.9            2.9                 2.9
       1               1      2.9            2.9                 5.9
       2               4     11.8           11.8                17.6
       3               6     17.6           17.6                35.3
       4               4     11.8           11.8                47.1
       5               2      5.9            5.9                52.9
       6               6     17.6           17.6                70.6
       7               3      8.8            8.8                79.4
       8               2      5.9            5.9                85.3
       9               3      8.8            8.8                94.1
       13              1      2.9            2.9                97.1
       14              1      2.9            2.9               100.0
       Total          34    100.0          100.0
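To see where the Percent and Cumulative Percent columns come from, here is a small Python sketch that rebuilds the Number of GCSEs table. The counts are taken from the slide; the function name `frequency_table` is an assumption, and this is only an approximation of what SPSS does.

```python
from collections import Counter

# Frequency counts from the slide: number of GCSEs -> number of students.
counts_from_slide = {0: 1, 1: 1, 2: 4, 3: 6, 4: 4, 5: 2,
                     6: 6, 7: 3, 8: 2, 9: 3, 13: 1, 14: 1}

# Expand the counts back into one value per student (34 cases).
gcses = [value for value, n in counts_from_slide.items() for _ in range(n)]

def frequency_table(values):
    """Return rows of (value, frequency, percent, cumulative percent)."""
    counts = Counter(values)
    total = len(values)
    rows, running = [], 0
    for value in sorted(counts):
        running += counts[value]
        rows.append((value,
                     counts[value],
                     round(100 * counts[value] / total, 1),
                     round(100 * running / total, 1)))
    return rows

for row in frequency_table(gcses):
    print(row)
```

With no missing values, Percent and Valid Percent are the same, which is why the slide shows identical columns.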
28
Central Tendency
 Mean
 = average value
 sum of all the values divided by the number of values
 Mode
 = the most frequent value in a distribution
 (N.B. it is possible to have 2 or more modes, e.g. bimodal
distribution)
 Median
 = the half-way value, or the value that divides the ordered
distribution in the middle
 The middle score when scores are ordered
 N.B. need to put values into order first
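The three measures can be checked quickly with Python's standard `statistics` module. The five scores are the ones used on the variance slide; the second list is made up to show a bimodal case.

```python
import statistics

scores = [1, 2, 3, 3, 4]  # the five scores from the variance slide

print("mean:", statistics.mean(scores))         # sum of values / number of values
print("median:", statistics.median(scores))     # middle value once sorted
print("mode(s):", statistics.multimode(scores)) # most frequent value(s)

# A made-up distribution with two modes (bimodal).
print("mode(s):", statistics.multimode([1, 1, 2, 3, 3]))
```

`statistics.median` sorts the values itself, so the "put values into order first" step is handled for you.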
29
Dispersion and variability
 Quartiles
 The three values that split the sorted data into
four equal parts.
 Second Quartile = median.
 Lower quartile = median of lower half of the data
 Upper quartile = median of upper half of the data
 Need to order the individuals first
 One quarter of the individuals fall in each of the four parts;
the inter-quartile range (lower to upper quartile) contains the middle half
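The "median of each half" rule can be sketched as follows. This is one common convention (for an odd number of cases the middle value is left out of both halves); statistical packages use slightly different rules, so their quartiles may not match exactly. The function name and data are made up.

```python
import statistics

def quartiles(values):
    """Return (lower quartile, median, upper quartile).

    Uses the median-of-halves rule and needs at least two values.
    """
    ordered = sorted(values)          # order the individuals first
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # With an odd number of cases, skip the middle value itself.
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return (statistics.median(lower_half),
            statistics.median(ordered),
            statistics.median(upper_half))

q1, q2, q3 = quartiles([7, 1, 5, 3, 2, 8, 6, 4])  # made-up data
print(q1, q2, q3)
print("inter-quartile range:", q3 - q1)
```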
30
Used on Box Plot
Statistics: Age
N Valid 34, Missing 0
Mean 24.03
Median 21.00
[Box plot: Age of Health and Illness students, with the upper quartile, median and lower quartile labelled]
31
Variance
 Average of the squared deviations from the mean
 5.20 is the Sum of Squares
 This depends on the number of individuals, so we divide by n (5)
 Gives 1.04, which is the variance
Score  Mean  Deviation  Squared Deviation
1      2.6   -1.6       2.56
2      2.6   -0.6       0.36
3      2.6    0.4       0.16
3      2.6    0.4       0.16
4      2.6    1.4       1.96
Total                   5.20
32
Standard Deviation
 The variance has one problem: it is
measured in units squared.
 This isn’t a very meaningful metric so we take
the square root value.
 This is the Standard Deviation
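The variance and standard deviation arithmetic from the previous two slides can be reproduced in a few lines of Python. The scores are those on the variance slide. The slides divide by n; SPSS, like most packages, divides by n - 1 when it reports a sample variance, so that version is shown too for comparison.

```python
import math

scores = [1, 2, 3, 3, 4]  # scores from the variance slide
n = len(scores)
mean = sum(scores) / n

# Sum of squared deviations from the mean (the "Sum of Squares").
sum_of_squares = sum((x - mean) ** 2 for x in scores)

variance_n = sum_of_squares / n              # as on the slide
variance_n_minus_1 = sum_of_squares / (n - 1)  # sample variance

print("sum of squares:", round(sum_of_squares, 2))
print("variance (divide by n):", round(variance_n, 2))
print("variance (divide by n - 1):", round(variance_n_minus_1, 2))
print("standard deviation (from n):", round(math.sqrt(variance_n), 2))
```

Taking the square root puts the result back into the original units, which is the point of the standard deviation.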
33
Using SPSS
 ‘Analyze > Descriptive Statistics > Explore’ menu.
 Gives mean, median, SD, variance, min,
max, range, skew and kurtosis.
 Can also produce stem and leaf, and
histogram.
34
Charts in SPSS
 Use ‘Chart Builder’ from the ‘Graphs’ menu or the
Legacy Dialogs submenu
 And/or double click chart to edit it.
 E.g. double click to edit bars (e.g. to change
from colour to fill pattern).
 Do this in SPSS first before cut and paste to
Word
 Label the chart (in SPSS or in Word)
35
Stem and leaf plots
 e.g. age of students taking Health and Illness
option
 good at showing
 distribution of data
 outliers
 range
36
Stem and leaf plots e.g.
Age Stem-and-Leaf Plot
Frequency Stem & Leaf
6.00 1 . 999999
17.00 2 . 00000000001111134
5.00 2 . 55678
3.00 3 . 123
1.00 3 . 5
2.00 Extremes (>=36)
Stem width: 10
Each leaf: 1 case(s)
37
Box Plot
Statistics: Age
N Valid 34, Missing 0
Mean 24.03
Median 21.00
38
Box Plot
Fill colour
changed.
N.B. numbers refer
to case numbers.
39
Histograms and bar charts
 Length/height of bar indicates frequency
40
Histogram
Fill pattern suitable
for black and white
printing
41
Changing the bin size
Bin size made
smaller to show
more bars
42
Pie chart
 angle of segment indicates proportion of the
whole
Pie Chart
Shadow and one
slice moved out for
emphasis
Analysing relationships
 Contingency tables or crosstabulations
 Compares nominal/categorical variables
 But can include ordinal variables
 N.B. table contains counts (= frequency data)
 One variable on horizontal axis
 One variable on vertical axis
 Row and column total counts known as marginals
Example
 In the Health and
Illness class, are
women more
likely to be under
21 than men?
Crosstabulations
 e.g.
 Use column and row percentages to look for
relationships
SPSS output
Chi-square (χ²)
Crosstabulations and the Chi-square test can be
used to look for a relationship between
two variables:
 When the variables are categorical so the
data are nominal (or frequency).
 For example, if we wanted to look at the
relationship between gender and age group.
 There are several different types of Chi-square
(χ²) test; we will be using the 2 x 2 Chi-square
2x2 Chi-square results in
SPSS
Another example
 The Bank employees data
Bank Employees
Chi-Square tests
Chi-Square analysis on SPSS
 http://www.youtube.com/watch?v=Ahs8jS5mJKk 4m15s
 http://www.youtube.com/watch?v=IRCzOD27NQU
 From 6m:30s to 9m:50s
 http://www.youtube.com/watch?v=532QXt1PM-Q&feature=plcp&context=C3ba91a4UDOEgsToPDskJ-ABupdp-Yfvuf4j4fJGzV 12m30s
Low values in cells
 Get SPSS to output expected values
 Look where these are <5
 Consider recoding to combine cols or rows
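Expected values depend only on the row and column totals (the marginals), so they can be worked out before looking at the cells. The sketch below uses the sex totals (25 female, 9 male) and age-group totals (16 under 21, 18 aged 21 and over) reported elsewhere in these slides, assuming they describe the same 34 students. The function name is made up.

```python
def expected_counts(row_totals, col_totals):
    """Expected cell counts under independence: row total * column total / N."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must add up to the same N")
    return [[r * c / n for c in col_totals] for r in row_totals]

row_labels, row_totals = ["Female", "Male"], [25, 9]
col_labels, col_totals = ["Under 21", "21 and over"], [16, 18]

expected = expected_counts(row_totals, col_totals)
for label, row in zip(row_labels, expected):
    for col_label, value in zip(col_labels, row):
        flag = "  <-- below 5" if value < 5 else ""
        print(f"{label:7s} {col_label:12s} {value:6.2f}{flag}")
```

With only 9 men in the class, both male cells have expected values below 5, which is the situation this slide warns about.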
Tabulating questionnaire
responses
 Categorical survey data often “collapsed” for purposes of data
analysis
Original category  Frequency  Collapsed category  Frequency
White British            284  White                     304
White Irish                7
Other White               13
Indian                    40  South Asian               105
Pakistani                 32
Bangladeshi               33
Chinese                   16  Chinese                    16
Black British             30  Black                      44
Afro-Caribbean            12
African                    2
An analysis on a sample of 2 (e.g. Black African) would not have been very meaningful!
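Collapsing categories amounts to a recode followed by adding up the frequencies. A minimal Python sketch using the figures from the table above; the dictionary and function names are assumptions.

```python
# Original categories and frequencies from the table.
original = {
    "White British": 284, "White Irish": 7, "Other White": 13,
    "Indian": 40, "Pakistani": 32, "Bangladeshi": 33,
    "Chinese": 16,
    "Black British": 30, "Afro-Caribbean": 12, "African": 2,
}

# Recode map: original category -> collapsed category.
recode = {
    "White British": "White", "White Irish": "White", "Other White": "White",
    "Indian": "South Asian", "Pakistani": "South Asian",
    "Bangladeshi": "South Asian",
    "Chinese": "Chinese",
    "Black British": "Black", "Afro-Caribbean": "Black", "African": "Black",
}

def collapse(frequencies, recode_map):
    """Add up the frequencies of all categories mapped to the same group."""
    collapsed = {}
    for category, count in frequencies.items():
        group = recode_map[category]
        collapsed[group] = collapsed.get(group, 0) + count
    return collapsed

print(collapse(original, recode))
```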
Recoding variables
 http://www.youtube.com/watch?v=uzQ_522F2SM&feature=related
 Ignore t-test for now 6m11s
 http://www.youtube.com/watch?v=FUoYZ_f6Lxc
 Uses old version of SPSS, no submenu now. 6m
Scatterplots and correlations
 Looks for association between variables, e.g.
 Population size and GDP
 crime and unemployment rates
 height and weight
 Both variables must be rank, interval or
ratio (scale or ordinal in SPSS).
 Thus cannot use variables like gender,
ethnicity, town of birth, occupation.
56
57
Scatterplots
 e.g. age (in years) versus Number of GCSEs
Interpretation
 As Y increases
X increases
 Called
correlation
 Regression line
model in red
58
Correlation measures
association not causation
 The older the child the better s/he is at reading
 The less your income the greater the risk of
schizophrenia
 Height correlates with weight
 But weight does not cause height
 Height is one of the causes of weight (also body
shape, diet, fitness level etc.)
 Numbers of ice creams sold is correlated with
the rate of drowning
 Ice creams do not cause drowning (nor vice versa)
 Third variable involved – people swim more and buy
more ice creams when it’s warm
59
Scatterplot in SPSS
 Use Graph menu
 http://www.youtube.com/watch?v=74BjgPQvIEg 8m34s
 http://www.youtube.com/watch?v=blfflA-34pQ&feature=related 4m04s
 http://www.youtube.com/watch?v=UVylQoG4hZM 1m50s, ignore polynomial regression
60
Modifying the Scatterplot
 http://www.youtube.com/watch?v=803YCYA2AoQ&feature=related 4m04s
 http://www.youtube.com/watch?v=vPzvuMuVXk8&feature=related 3m40s
61
If mixed data sets
 Change point icon and/or colour to see
different subsets.
 Overall data may have no relationship but
subsets might.
 E.g. show male and female respondents.
 Use Chart builder
62
63
Correlation
 Correlation coefficient = measure of strength
of relationship, e.g. Pearson’s r
 ranges from -1 to +1: the size (0 to 1) shows the strength, the sign shows the direction
Correlations
                                      Number of GCSEs  Age
Number of GCSEs  Pearson Correlation  1                -.415*
                 Sig. (2-tailed)                       .015
                 N                    34               34
Age              Pearson Correlation  -.415*           1
                 Sig. (2-tailed)      .015
                 N                    34               34
*. Correlation is significant at the 0.05 level (2-tailed).
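For readers who want to see what lies behind the Pearson Correlation figure, here is a from-scratch Python sketch of the usual formula (covariation divided by the square root of the product of the two sums of squares). The ages and GCSE counts below are made up, not the class data, and the function name is an assumption.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r for two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 cases")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # How far the two variables vary together, and each on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if either variable has no variation.
    return sxy / math.sqrt(sxx * syy)

ages = [18, 19, 20, 22, 25, 30]   # made-up ages
gcses = [9, 8, 8, 6, 5, 3]        # made-up numbers of GCSEs
print(round(pearson_r(ages, gcses), 3))
```

A negative result, as in the SPSS table above, means that higher values on one variable go with lower values on the other.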
64
Positive correlation
 as x increases, y increases
r = 0.7
65
Negative correlation
 as x increases, y decreases
r = -0.7
66
Strong correlation (i.e. close to 1)
r = 0.9
67
Weak correlation (i.e. close to 0)
r = 0.2
Interpretation cont.
 r² is a measure of the degree of variation in
one variable accounted for by variation
in the other.
 E.g. If r=0.7 then r²=0.49 i.e. just under half
the variation is accounted for (rest
accounted for by other factors).
 If r=0.3 then r²=0.09 so 91% of the
variation is explained by other things.
68
Significance of r
 SPSS reports if r is significant at α=0.05
 N.B. this is dependent on sample size to a
large extent.
 Other things being equal, larger samples
more likely to be significant.
 Usually, size of r is more important than
its significance
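The dependence on sample size can be seen from the standard conversion of r into a t statistic, t = r√(n - 2) / √(1 - r²), with n - 2 degrees of freedom. This formula is not given in the slides; it is added here only as an illustration, using the r = -.415 reported earlier and some made-up sample sizes.

```python
import math

def t_from_r(r, n):
    """t statistic for testing whether a correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r = -0.415  # correlation between age and number of GCSEs from the slides
for n in (10, 34, 100):  # 34 is the class size; 10 and 100 are made up
    print(f"n = {n:3d}  t = {t_from_r(r, n):.2f}  df = {n - 2}")
```

The same r gives a larger t (and so a smaller p value) as n grows, which is why the size of r usually says more than its significance.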
69
Pearson’s r in SPSS
 http://www.youtube.com/watch?v=loFLqZmvfzU 6m57s
70
Parametric and non-parametric
 Some statistics rely on the variables being
investigated following a normal distribution. –
Called Parametric statistics
 Others can be used if variables are not
distributed normally – called Non-parametric
statistics.
 Pearson’s r is a parametric statistic
 Kendall’s tau and Spearman’s rho (rank
correlation) are non-parametric.
71
Assessing normality
 Produce histogram and normal plot
72
Use statistical test
 SPSS provides two formal tests for normality:
Kolmogorov-Smirnov (K-S) and Shapiro-Wilk (S-W)
 But, there is debate about KS
 Extremely sensitive to departure from normality
 May erroneously imply parametric test not
suitable – especially in small sample
 So, always use a histogram as well.
73
Often can use parametric tests
 Parametric tests (e.g. Pearson’s r) are robust
to departures from normality
 Small, non-normal samples OK
 But use non-parametric if
 Data are skewed (questionnaire data often is)
 Data are bimodal
74
Spearman’s rho
 http://www.youtube.com/watch?v=r_WQe2c-ISU From 4.14 to 4.56
 http://www.youtube.com/watch?v=POkFi5vKvI8&feature=fvwrel 6m16s
75
So far…
 Looked at relationships between nominal
variables
 Gender vs age group
 Looked at relationships between scale
variables
 Height vs. Weight
 Now combine the two
 Groups vs a scale variable
 E.g. Gender vs income
76
Reminder – IV vs DV
 IV = independent variable
 What makes a difference, causes effects, is responsible
for differences.
 DV = dependent variable
 What is affected by things, what is changed by the IV.
 Gender vs income. Gender = IV, income = DV
 So we investigate the effect of gender on income
77
Example 1
Age group vs. no. of GCSEs
 Using the Health and Illness class data
 Age group defines 2 groups
 Under 21
 21 and over
 Just two groups
 Can use independent samples t-test
 Independent because the two groups consist
of different people.
 t-test compares the means of the 2 groups.
78
79
Difference of means
 Do under 21s have more or fewer GCSEs
than 21 and overs?
 Means are different (6.44 & 4.28) but is that
significant?
Group Statistics
                 Age group    N   Mean  Std. Deviation  Std. Error Mean
Number of GCSEs  Under 21     16  6.44  3.140           .785
                 21 and over  18  4.28  2.906           .685
80
Independent Samples Test (Number of GCSEs)
Levene's Test for Equality of Variances
                             F     Sig.
Equal variances assumed      .164  .689
t-test for Equality of Means
                             t      df      Sig. (2-tailed)  Mean Difference  Std. Error Difference  95% CI Lower  95% CI Upper
Equal variances assumed      2.082  32      .045             2.160            1.037                  .047          4.272
Equal variances not assumed  2.073  30.789  .047             2.160            1.042                  .034          4.285
Levene's test: no significant difference, therefore assume equal variances
t-test: means are statistically significantly different
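The t and df values in the SPSS output can be approximately reproduced from the Group Statistics figures alone. The Python sketch below uses the standard pooled-variance and Welch formulas; the function names are made up, and because the inputs are the rounded means and standard deviations from the slide, the results can differ from SPSS in the third decimal place.

```python
import math

def pooled_t(m1, s1, n1, m2, s2, n2):
    """Independent-samples t assuming equal variances. Returns (t, df)."""
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, n1 + n2 - 2

def welch_t(m1, s1, n1, m2, s2, n2):
    """Independent-samples t not assuming equal variances. Returns (t, df)."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (m1 - m2) / se, df

# Group Statistics from the slide: mean, SD and N for each age group.
under_21 = (6.44, 3.140, 16)
over_21 = (4.28, 2.906, 18)

t, df = pooled_t(*under_21, *over_21)
print(f"equal variances assumed:     t = {t:.3f}, df = {df}")
t, df = welch_t(*under_21, *over_21)
print(f"equal variances not assumed: t = {t:.3f}, df = {df:.3f}")
```

The p values (Sig.) still have to come from SPSS or a t table; this sketch only covers the test statistic and degrees of freedom.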
Parametric vs non-parametric
 Just as in the case of correlations, there are
both kinds of tests.
 Need to check if DV is normally distributed.
 Do this visually
 Also use statistical tests
81
Tests for normality
 Kolmogorov-Smirnov and Shapiro-Wilk
 If n>50 use KS
 If n≤50 use SW
 Null hypothesis is ‘data are normally distributed’.
 So if p<0.05 then data are significantly different
from a normal distribution – use non-
parametric tests
 If p≥0.05 then no significant difference – use
parametric tests
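The decision rule on this slide can be written down as two small Python functions. The names are made up, and the p value is assumed to be read off the SPSS normality test output rather than calculated here.

```python
def choose_normality_test(n):
    """Which test to read, following the slide's rule of thumb."""
    return "Kolmogorov-Smirnov" if n > 50 else "Shapiro-Wilk"

def recommend_tests(p_value, alpha=0.05):
    """Null hypothesis: the data are normally distributed."""
    if p_value < alpha:
        return "non-parametric"  # significantly different from normal
    return "parametric"          # no significant difference from normal

n = 34  # size of the Health and Illness class
print(choose_normality_test(n))
print(recommend_tests(0.20))  # made-up p value
print(recommend_tests(0.01))  # made-up p value
```

As the earlier slides stress, this rule is a guide only, so the histogram should be checked as well.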
82
Checking normality
 Produce histogram of DV
 Tick box to undertake statistical test
 Interpret results.
83
t-test
 Identify your two groups.
 Determine what values in the data indicate
those two groups (e.g. 1=female, 2=male)
 Select Analyze: Compare Means: Independent-Samples T Test
 http://www.youtube.com/watch?v=_KHI3ScO8sc 9m40s
84
Mann-Whitney U test
 Use this when comparing two groups and the
DV is not normally distributed
 http://www.youtube.com/watch?v=7iTvv3m9d_g 3m45s
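The idea behind the U statistic is simple enough to sketch in Python: count, over every pairing of one case from each group, how often the first group's value is higher (ties count as a half). The data and function name below are made up, and the sketch gives only U itself; the p value would come from SPSS or from tables.

```python
def mann_whitney_u(group_a, group_b):
    """Return the smaller of the two U values for two independent groups."""
    u_a = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u_a += 1      # group A case beats group B case
            elif a == b:
                u_a += 0.5    # ties count as half
    u_b = len(group_a) * len(group_b) - u_a
    return min(u_a, u_b)

women = [12, 15, 18, 20, 31]  # made-up skewed scores
men = [9, 11, 14, 15]
print(mann_whitney_u(women, men))
```

Because only the ordering of values matters, one or two extreme scores have far less influence than they would on a t-test.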
85
Comparing 3 or more groups
 ANOVA = Analysis of Variance
 Analyze: Compare Means: One-way ANOVA
 http://www.youtube.com/watch?v=wFq1b3QjI1U 4m04s
Useful to get table of means (descriptives) and
means plots from ANOVA options.
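As a rough guide to what the F value in the ANOVA table represents, here is a minimal Python sketch: the variation between the group means is compared with the variation within the groups. The three groups are made-up data and the function name is an assumption.

```python
def one_way_anova_f(groups):
    """Return (F, df between, df within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # made-up scores for 3 groups
f, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f:.2f}")
```

A large F means the group means differ by more than the spread within groups would suggest; the table of means and the means plot then show which groups are responsible.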
86
ANOVA Means and F value
87
ANOVA Means Plot
88

 
KestrelPro Flyer Japan IT Week 2024 (English)
KestrelPro Flyer Japan IT Week 2024 (English)KestrelPro Flyer Japan IT Week 2024 (English)
KestrelPro Flyer Japan IT Week 2024 (English)
 
Call Girls in Gomti Nagar - 7388211116 - With room Service
Call Girls in Gomti Nagar - 7388211116  - With room ServiceCall Girls in Gomti Nagar - 7388211116  - With room Service
Call Girls in Gomti Nagar - 7388211116 - With room Service
 
The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024The CMO Survey - Highlights and Insights Report - Spring 2024
The CMO Survey - Highlights and Insights Report - Spring 2024
 
Monte Carlo simulation : Simulation using MCSM
Monte Carlo simulation : Simulation using MCSMMonte Carlo simulation : Simulation using MCSM
Monte Carlo simulation : Simulation using MCSM
 
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
Call Girls In Sikandarpur Gurgaon ❤️8860477959_Russian 100% Genuine Escorts I...
 
GD Birla and his contribution in management
GD Birla and his contribution in managementGD Birla and his contribution in management
GD Birla and his contribution in management
 
Pitch Deck Teardown: NOQX's $200k Pre-seed deck
Pitch Deck Teardown: NOQX's $200k Pre-seed deckPitch Deck Teardown: NOQX's $200k Pre-seed deck
Pitch Deck Teardown: NOQX's $200k Pre-seed deck
 
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
Call Girls In Connaught Place Delhi ❤️88604**77959_Russian 100% Genuine Escor...
 

Intro to SPSS.ppt

  • 1. 1 Introduction to SPSS Data types and SPSS data entry and analysis
  • 2. 2 In this session  What does SPSS look like?  Types of data (revision)  Data Entry in SPSS  Simple charts in SPSS  Summary statistics  Contingency tables and crosstabulations  Scatterplots and correlations  Tests of differences of means
  • 4. 4 Aspects of SPSS  Menus - Analyse and Charts esp.  Spreadsheet view of data  Rows are cases (people, respondents etc.)  Columns are Variables  Variable view of data  Shows detail of each variable type
  • 6. 6 In SPSS  We change ticks etc. on a questionnaire into numbers  One number for each variable for each case  How we do this depends on the type of variable/data
  • 7. 7 Types of data  Nominal  Ranked  Scales/measures  Mixed types  Text answers (open ended questions)
  • 8. 8 Nominal (categorical)  order is arbitrary  e.g. sex, country of birth, personality type, yes or no.  Use numeric in SPSS and give value labels. (e.g. 1=Female, 2=Male, 99=Missing) (e.g. 1=Yes, 2=No, 99=Missing) (e.g. 1=UK, 2=Ireland, 3=Pakistan, 4=India, 5=other, 99=Missing)
  • 9. 9 Ranks or Ordinal  in order, 1st, 2nd, 3rd etc.  e.g. status, social class  Use numeric in SPSS with value labels  E.g. 1=Working class, 2=Middle class, 3=Upper class  E.g. Class of degree, 1=First, 2=Upper second, 3=Lower second, 4=Third, 5=Ordinary, 99=Missing
  • 10. 10 Measures, scales 1. Interval - equal units  e.g. IQ 2. Ratio - equal units, zero on scale  e.g. height, income, family size, age  Makes sense to say one value is twice another  Use numeric (or comma, dot or scientific) in SPSS  E.g. family size, 1, 2, 3, 4 etc.  E.g. income per year, 25000, 14500, 18650 etc.
  • 11. 11 Mixed type  Categorised data  Actually ranked, but used to identify categories or groups  e.g. age groups  = ratio data put into groups  Use numeric in SPSS and use value labels.  E.g. Age group, 1=‘Under 18’, 2=‘18-24’, 3=‘25- 34’, 4=‘35-44’, 5=‘45-54’, 6=‘55 or greater’
  • 12. 12 Text answers  E.g. answers to open-ended questions  Either enter text as given (Use String in SPSS)  Or  Code or classify answers into one of a small number types. (Use numeric/nominal in SPSS)
  • 13. Quantifying Data  Before we can do any kind of analysis, we need to quantify our data  “Quantification” is the process of converting data to a numeric format  Convert social science data into a “machine- readable” form, a form that can be read & manipulated by computer programs
  • 14. Quantifying Data Some transformations are simple:  Assign numeric representations to nominal or ordinal variables:  Turning male into “1” and female into “2”  Assigning “3” to Very Interested, “2” to Somewhat Interested, “1” to Not Interested  Assign numeric values to continuous variables:  Turning born in 1973 to “35”
  • 15. Developing Code Categories Some data are more challenging. Open-ended responses must be coded.  Two basic approaches:  Begin with a coding scheme derived from the research purpose.  Generate codes from the data.
  • 16. Coding Quantitative Data  Goal – reduce a wide variety of information to a more limited set of variable attributes:  “What is your occupation?”  Use pre-established scheme: Professional, Managerial, Clerical, Semi-skilled, etc.  Create a scheme after reviewing the data  Assign value to each category in the scheme: Professional = 1, Managerial = 2, etc.  Classify the response: “Secretary” is “clerical” and is coded as “3”
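A minimal Python sketch of the coding step on slide 16, for illustration only. The codes Professional = 1, Managerial = 2 and Clerical = 3 come from the slide; the code for Semi-skilled, the `CLASSIFICATION` lookup and the sample answers are assumptions, and 99 is borrowed from the missing-value convention used earlier in the deck.

```python
# Scheme from the slide (Semi-skilled = 4 is an assumed continuation).
OCCUPATION_CODES = {"Professional": 1, "Managerial": 2, "Clerical": 3, "Semi-skilled": 4}
MISSING = 99  # missing-value code, as in the value-label examples

# Hypothetical lookup that classifies raw answers into scheme categories.
CLASSIFICATION = {
    "secretary": "Clerical",
    "doctor": "Professional",
    "office manager": "Managerial",
}

def code_response(answer):
    """Classify a free-text answer, then return its numeric code."""
    category = CLASSIFICATION.get(answer.strip().lower())
    # Blank or unclassified answers fall back to the missing code.
    return OCCUPATION_CODES.get(category, MISSING)

coded = [code_response(a) for a in ["Secretary", "Doctor", "", "Astronaut"]]
print(coded)
```

In practice an unclassified answer would usually prompt a new category in the codebook rather than being left as missing.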
  • 17. Coding Quantitative Data  Points to remember:  If the data are coded to maintain a good amount of detail, they can always be combined (reduced) later  However, if you start off with too little detail, you can’t get it back  If you’re using a survey / questionnaire, it’s a good idea to do your coding on the form so that it can be entered properly (i.e. create a “codebook”)
  • 18. Codebook Construction Purposes:  Primary guide used in the coding process.  Should note the value assigned to each variable attribute (response)  Guide for locating variables and interpreting codes in the data file during analysis.  If you’re doing your own input, this will also guide data set construction
  • 19. Data Entry in SPSS  Video by Andy Field  https://www.youtube.com/watch?v=b163iBByycw&index=1&list=PL25257A24840423AE
  • 21. 21 Data Entry into SPSS There are 2 ways to enter data into SPSS: 1. Directly enter in to SPSS by typing in Data View 2. Enter into other database software such as Excel then import into SPSS Let’s start with the second option, using data in Excel.
  • 24. Importing data from an Excel spreadsheet into SPSS. In SPSS, go to: File, Open, Data. Select the type of file (for example, Excel) you want to open. Select the name of the file you want to open.
  • 25. Exporting data from SPSS to Excel. In SPSS, go to: File, Save As. Select the type of file (for example, Excel) you want to save to. Give the name of the file you want to save to.
  • 26. Frequency counts  Used with categorical and ranked variables  e.g. gender of students taking Health and Illness option
    Sex of student
                     Frequency   Percent   Valid Percent   Cumulative Percent
    Valid   Female          25      73.5            73.5                 73.5
            Male             9      26.5            26.5                100.0
            Total           34     100.0           100.0
  • 27. e.g. Number of GCSEs passed by students taking Health and Illness option
    Number of GCSEs
                     Frequency   Percent   Valid Percent   Cumulative Percent
    Valid   0                1       2.9             2.9                  2.9
            1                1       2.9             2.9                  5.9
            2                4      11.8            11.8                 17.6
            3                6      17.6            17.6                 35.3
            4                4      11.8            11.8                 47.1
            5                2       5.9             5.9                 52.9
            6                6      17.6            17.6                 70.6
            7                3       8.8             8.8                 79.4
            8                2       5.9             5.9                 85.3
            9                3       8.8             8.8                 94.1
            13               1       2.9             2.9                 97.1
            14               1       2.9             2.9                100.0
            Total           34     100.0           100.0
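The columns of a frequency table can be reproduced with a few lines of Python. This is an illustrative sketch, not SPSS's own procedure; the function name `frequency_table` is made up here, and the data are the 25 female and 9 male students from slide 26.

```python
from collections import Counter

def frequency_table(values):
    """Return (value, frequency, percent, cumulative percent) rows."""
    counts = Counter(values)
    total = sum(counts.values())
    rows = []
    running = 0
    for value in sorted(counts):
        running += counts[value]
        percent = round(100.0 * counts[value] / total, 1)
        cumulative = round(100.0 * running / total, 1)
        rows.append((value, counts[value], percent, cumulative))
    return rows

# Slide 26: 25 female and 9 male students.
sex = ["Female"] * 25 + ["Male"] * 9
table = frequency_table(sex)
for row in table:
    print(row)
```

The cumulative percent is computed from the running count rather than by adding rounded percentages, so the final row always reaches 100.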
  • 28. 28 Central Tendency  Mean  = average value  sum of all the values divided by the number of values  Mode  = the most frequent value in a distribution  (N.B. it is possible to have 2 or more modes, e.g. bimodal distribution)  Median  = the half-way value, or the value that divides the ordered distribution in the middle  The middle score when scores are ordered  N.B. need to put values into order first
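As a quick illustration of the three measures, here is a sketch using Python's standard `statistics` module. The scores are invented for the example and are chosen so that the distribution is bimodal.

```python
import statistics

# Hypothetical scores (not from the Health and Illness data).
scores = [7, 3, 8, 3, 5, 7, 2]

mean = statistics.mean(scores)        # sum of values / number of values
median = statistics.median(scores)    # middle value once the scores are ordered
modes = statistics.multimode(scores)  # every value tied for most frequent

print(mean, median, modes)
```

`statistics.median` sorts the values itself, which is the "put values into order first" step from the slide.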
  • 29. 29 Dispersion and variability  Quartiles  The three values that split the sorted data into four equal parts.  Second Quartile = median.  Lower quartile = median of lower half of the data  Upper quartile = median of upper half of the data  Need to order the individuals first  One quarter of the individuals are in each inter- quartile range
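The "median of each half" definition of the quartiles can be sketched as follows. The helper name `quartiles` and the data are assumptions for illustration; statistical packages use several slightly different quartile conventions, so their output may differ a little from this method on small samples.

```python
import statistics

def quartiles(values):
    """Lower quartile, median and upper quartile by the median-of-halves method."""
    ordered = sorted(values)            # the data must be ordered first
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # With an odd number of values the middle value belongs to neither half.
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return (statistics.median(lower_half),
            statistics.median(ordered),
            statistics.median(upper_half))

# Hypothetical data.
q1, q2, q3 = quartiles([15, 1, 9, 3, 13, 5, 11, 7])
print(q1, q2, q3)
```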
  • 30. Used on Box Plot. Statistics for Age: N valid 34, missing 0; mean 24.03; median 21.00. [Box plot: Age of Health and Illness students, with upper quartile, median and lower quartile marked]
  • 31. Variance  Average of the squared deviations from the mean  5.20 is the Sum of Squares  This depends on number of individuals so we divide by n (5)  Gives 1.04 which is the variance
    Score   Mean   Deviation   Squared Deviation
        1    2.6        -1.6                2.56
        2    2.6        -0.6                0.36
        3    2.6         0.4                0.16
        3    2.6         0.4                0.16
        4    2.6         1.4                1.96
    Total                                   5.20
  • 32. 32 Standard Deviation  The variance has one problem: it is measured in units squared.  This isn’t a very meaningful metric so we take the square root value.  This is the Standard Deviation
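The variance and standard deviation of the five scores on slide 31 can be checked step by step. This sketch follows the slide in dividing by n; statistical software typically divides by n − 1 when the scores are a sample, so both versions are shown.

```python
import math

scores = [1, 2, 3, 3, 4]                 # the five scores from slide 31
n = len(scores)
mean = sum(scores) / n                   # 2.6

deviations = [x - mean for x in scores]
sum_of_squares = sum(d ** 2 for d in deviations)   # about 5.20

variance_n = sum_of_squares / n          # about 1.04, as on the slide
variance_sample = sum_of_squares / (n - 1)   # about 1.30, the n - 1 version

# The standard deviation is the square root, back in the original units.
sd_n = math.sqrt(variance_n)
sd_sample = math.sqrt(variance_sample)

print(round(variance_n, 2), round(sd_n, 2))
print(round(variance_sample, 2), round(sd_sample, 2))
```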
  • 33. Using SPSS  ‘Analyze > Descriptive Statistics > Explore’ menu.  Gives mean, median, SD, variance, min, max, range, skew and kurtosis.  Can also produce stem and leaf, and histogram.
  • 34. Charts in SPSS  Use ‘Chart Builder’ from the ‘Graphs’ menu or the Legacy Dialogs submenu  And/or double click chart to edit it.  E.g. double click to edit bars (e.g. to change from colour to fill pattern).  Do this in SPSS first before cut and paste to Word  Label the chart (in SPSS or in Word)
  • 35. 35 Stem and leaf plots  e.g. age of students taking Health and Illness option  good at showing  distribution of data  outliers  range
  • 36. Stem and leaf plots e.g. Age Stem-and-Leaf Plot
    Frequency    Stem &  Leaf
         6.00       1 .  999999
        17.00       2 .  00000000001111134
         5.00       2 .  55678
         3.00       3 .  123
         1.00       3 .  5
         2.00 Extremes   (>=36)
    Stem width:  10
    Each leaf:   1 case(s)
  • 38. 38 Box Plot Fill colour changed. N.B. numbers refer to case numbers.
  • 39. 39 Histograms and bar charts  Length/height of bar indicates frequency
  • 40. 40 Histogram Fill pattern suitable for black and white printing
  • 41. 41 Changing the bin size Bin size made smaller to show more bars
  • 42. 42 Pie chart  angle of segment indicates proportion of the whole
  • 43. Pie Chart Shadow and one slice moved out for emphasis
  • 44. Analysing relationships  Contingency tables or crosstabulations  Compares nominal/categorical variables  But can include ordinal variables  N.B. table contains counts (= frequency data)  One variable on horizontal axis  One variable on vertical axis  Row and column total counts known as marginals
  • 45. Example  In the Health and Illness class, are women more likely to be under 21 than men?
  • 46. Crosstabulations  e.g.  Use column and row percentages to look for relationships
  • 48. Chi-square (χ²) Cross tabulations and Chi-square are tests that can be used to look for a relationship between two variables:  When the variables are categorical so the data are nominal (or frequency).  For example, if we wanted to look at the relationship between gender and age.  There are several different types of Chi-square (χ²); we will be using the 2 x 2 Chi-square
  • 50. Another example  The Bank employees data
  • 52. Chi-Square analysis on SPSS  http://www.youtube.com/watch?v=Ahs8jS5mJKk 4m15s  http://www.youtube.com/watch?v=IRCzOD27NQU  From 6m:30s to 9m:50s  http://www.youtube.com/watch?v=532QXt1PM-Q&feature=plcp&context=C3ba91a4UDOEgsToPDskJ-ABupdp-Yfvuf4j4fJGzV 12m30s
  • 53. Low values in cells  Get SPSS to output expected values  Look where these are <5  Consider recoding to combine cols or rows
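To show what "expected values" means here, the sketch below computes them for a 2 x 2 table and flags cells below 5. The row and column totals match the class described in the deck (25 women, 9 men; 16 under 21, 18 aged 21 and over), but the four cell counts are hypothetical because the original crosstabulation is not reproduced in the text. The function names are also made up for the example.

```python
def expected_counts(observed):
    """Expected count per cell = row total x column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]

def chi_square(observed):
    """Pearson chi-square statistic (no continuity correction)."""
    expected = expected_counts(observed)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))

# Rows: Female, Male. Columns: Under 21, 21 and over. Cell counts are hypothetical.
observed = [[13, 12],
            [3, 6]]

expected = expected_counts(observed)
low_cells = [e for row in expected for e in row if e < 5]
statistic = chi_square(observed)

print([[round(e, 2) for e in row] for row in expected])
print(len(low_cells), round(statistic, 3))
```

With these counts both cells in the Male row have expected values below 5, which is the situation where combining rows or columns is worth considering.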
  • 54. Tabulating questionnaire responses  Categorical survey data often “collapsed” for purposes of data analysis
    Original category   Frequency   Collapsed category   Frequency
    White British             284   White                      304
    White Irish                 7
    Other White                13
    Indian                     40   South Asian                105
    Pakistani                  32
    Bangladeshi                33
    Chinese                    16   Chinese                     16
    Black British              30   Black                       44
    Afro-Caribbean             12
    African                     2
    An analysis on a sample of 2 (e.g. Black African) would not have been very meaningful!
  • 55. Recoding variables  http://www.youtube.com/watch?v=uzQ_522F2SM&feature=related  Ignore t-test for now 6m11s  http://www.youtube.com/watch?v=FUoYZ_f6Lxc  Uses old version of SPSS, no submenu now. 6m
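Collapsing categories amounts to mapping each original category to a broader one and adding up the counts. The sketch below uses the frequencies from the slide; the dictionary names are illustrative.

```python
# Frequencies of the original categories, taken from the slide.
original = {
    "White British": 284, "White Irish": 7, "Other White": 13,
    "Indian": 40, "Pakistani": 32, "Bangladeshi": 33,
    "Chinese": 16,
    "Black British": 30, "Afro-Caribbean": 12, "African": 2,
}

# Recode map: original category -> collapsed category.
COLLAPSE = {
    "White British": "White", "White Irish": "White", "Other White": "White",
    "Indian": "South Asian", "Pakistani": "South Asian", "Bangladeshi": "South Asian",
    "Chinese": "Chinese",
    "Black British": "Black", "Afro-Caribbean": "Black", "African": "Black",
}

collapsed = {}
for category, count in original.items():
    group = COLLAPSE[category]
    collapsed[group] = collapsed.get(group, 0) + count

print(collapsed)
```

Keeping the original variable and writing the collapsed codes to a new one preserves the detail, in line with the earlier point that detail can be reduced later but not recovered.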
  • 56. Scatterplots and correlations  Looks for association between variables, e.g.  Population size and GDP  crime and unemployment rates  height and weight  Both variables must be rank, interval or ratio (scale or ordinal in SPSS).  Thus cannot use variables like gender, ethnicity, town of birth, occupation.
  • 57. 57 Scatterplots  e.g. age (in years) versus Number of GCSEs
  • 58. Interpretation  As Y increases X increases  Called correlation  Regression line model in red
  • 59. Correlation measures association not causation  The older the child the better s/he is at reading  The less your income the greater the risk of schizophrenia  Height correlates with weight  But weight does not cause height  Height is one of the causes of weight (also body shape, diet, fitness level etc.)  Numbers of ice creams sold is correlated with the rate of drowning  Ice creams do not cause drowning (nor vice versa)  Third variable involved – people swim more and buy more ice creams when it’s warm
  • 60. Scatterplot in SPSS  Use Graph menu  http://www.youtube.com/watch?v=74BjgPQvIEg 8m34s  http://www.youtube.com/watch?v=blfflA-34pQ&feature=related 4m04s  http://www.youtube.com/watch?v=UVylQoG4hZM 1m50s, ignore polynomial regression
  • 61. Modifying the Scatterplot  http://www.youtube.com/watch?v=803YCYA2AoQ&feature=related 4m04s  http://www.youtube.com/watch?v=vPzvuMuVXk8&feature=related 3m40s
  • 62. If mixed data sets  Change point icon and/or colour to see different subsets.  Overall data may have no relationship but subsets might.  E.g. show male and female respondents.  Use Chart builder
  • 63. Correlation  Correlation coefficient = measure of strength of relationship, e.g. Pearson’s r  varies from -1 to +1: the sign gives the direction, the size (0 to 1) gives the strength
    Correlations
                                            Number of GCSEs      Age
    Number of GCSEs   Pearson Correlation                 1   -.415*
                      Sig. (2-tailed)                           .015
                      N                                  34       34
    Age               Pearson Correlation            -.415*        1
                      Sig. (2-tailed)                  .015
                      N                                  34       34
    *. Correlation is significant at the 0.05 level (2-tailed).
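For readers who want to see what lies behind the coefficient, here is a minimal sketch of Pearson's r computed from its definition. The function name and the five data points are invented for illustration; they are not the GCSE and age data.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: co-variation of x and y divided by the product of their spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical paired measurements.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 3))
```

Reversing the order of `y` against `x` flips the sign of the coefficient without changing how the formula works, which is why the sign is read as direction and the size as strength.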
  • 64. 64 Positive correlation  as x increases, y increases r = 0.7
  • 65. 65 Negative correlation  as x increases, y decreases r = -0.7
  • 66. 66 Strong correlation (i.e. close to 1) r = 0.9
  • 67. 67 Weak correlation (i.e. close to 0) r = 0.2
  • 68. Interpretation cont.  r² is a measure of degree of variation in one variable accounted for by variation in the other.  E.g. If r=0.7 then r²=0.49 i.e. just under half the variation is accounted for (rest accounted for by other factors).  If r=0.3 then r²=0.09 so 91% of the variation is explained by other things.
  • 69. Significance of r  SPSS reports if r is significant at α=0.05  N.B. this is dependent on sample size to a large extent.  Other things being equal, larger samples more likely to be significant.  Usually, size of r is more important than its significance
  • 70. Pearson’s r in SPSS  http://www.youtube.com/watch?v=loFLqZmvfzU 6m57s
  • 71. Parametric and non-parametric  Some statistics rely on the variables being investigated following a normal distribution. – Called Parametric statistics  Others can be used if variables are not distributed normally – called Non-parametric statistics.  Pearson’s r is a parametric statistic  Kendall’s tau and Spearman’s rho (rank correlation) are non-parametric.
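The arithmetic on this slide is simply squaring r. A short sketch, with a made-up helper name, applied to the slide's two examples and to the r = -.415 reported earlier for number of GCSEs and age:

```python
def variation_accounted_for(r):
    """Proportion of variation in one variable accounted for by the other (r squared)."""
    return r ** 2

for r in (0.7, 0.3, -0.415):
    shared = variation_accounted_for(r)
    print(f"r = {r:+.3f}: {shared:.1%} accounted for, {1 - shared:.1%} left to other factors")
```

Because r is squared, a negative correlation accounts for just as much variation as a positive one of the same size.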
  • 72. Assessing normality  Produce histogram and normal plot
  • 73. Use statistical test  SPSS provides two formal tests for normality: Kolmogorov-Smirnov (K-S) and Shapiro-Wilk (S-W)  But, there is debate about KS  Extremely sensitive to departure from normality  May erroneously imply parametric test not suitable – especially in small sample  So, always use a histogram as well.
  • 74. Often can use parametric tests  Parametric tests (e.g. Pearson’s r) are robust to departures from normality  Small, non-normal samples OK  But use non-parametric if  Data are skewed (questionnaire data often is)  Data are bimodal
  • 75. Spearman’s rho  http://www.youtube.com/watch?v=r_WQe2c-ISU From 4.14 to 4.56  http://www.youtube.com/watch?v=POkFi5vKvI8&feature=fvwrel 6m16s
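Spearman's rho is Pearson's r applied to the ranks of the data rather than to the raw values, which is why it does not depend on the variables being normally distributed. The sketch below is a plain-Python illustration with invented data and helper names; tied values share the average of their ranks.

```python
import math

def average_ranks(values):
    """Rank values from 1 upward; ties receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1      # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(xs, ys):
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Hypothetical data: y rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
rho = spearman_rho(x, y)
print(rho, round(pearson_r(x, y), 3))
```

Here rho is at its maximum because the ordering of the two variables agrees perfectly, while Pearson's r on the raw values is lower because the relationship is curved.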
  • 76. So far…  Looked at relationships between nominal variables  Gender vs age group  Looked at relationships between scale variables  Height vs. Weight  Now combine the two  Groups vs a scale variable  E.g. Gender vs income
  • 77. Reminder – IV vs DV  IV = independent variable  What makes a difference, causes effects, is responsible for differences.  DV = dependent variable  What is affected by things, what is changed by the IV.  Gender vs income. Gender = IV, income = DV  So we investigate the effect of gender on income
  • 78. Example 1 Age group vs. no. of GCSEs  Using the Health and Illness class data  Age group defines 2 groups  Under 21  21 and over  Just two groups  Can use independent samples t-test  Independent because the two groups consist of different people.  t-test compares the means of the 2 groups.
  • 79. Difference of means  Do under 21s have more or fewer GCSEs than 21 and overs?  Means are different (6.44 & 4.28) but is that significant?
    Group Statistics
                      Age group      N   Mean   Std. Deviation   Std. Error Mean
    Number of GCSEs   Under 21      16   6.44            3.140              .785
                      21 and over   18   4.28            2.906              .685
  • 80. Independent Samples Test (Number of GCSEs)
    Levene's Test for Equality of Variances: F = .164, Sig. = .689
    (No significant difference therefore assume equal variances)
    t-test for Equality of Means
                                      t       df   Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
    Equal variances assumed       2.082       32              .045             2.160                   1.037           .047          4.272
    Equal variances not assumed   2.073   30.789              .047             2.160                   1.042           .034          4.285
    (Means are statistically significantly different)
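The t values in the output can be approximately reproduced from the group statistics alone (means 6.44 and 4.28, standard deviations 3.140 and 2.906, group sizes 16 and 18). The sketch below uses the standard pooled-variance and Welch formulas; the function names are illustrative, and small differences from the SPSS table arise because the means are rounded to two decimals.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic, degrees of freedom and standard error, equal variances assumed."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, se

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic, degrees of freedom and standard error, equal variances not assumed."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (mean1 - mean2) / se, df, se

# Group statistics from slide 79: Under 21 vs 21 and over.
under_21 = (6.44, 3.140, 16)
over_21 = (4.28, 2.906, 18)

t_equal, df_equal, se_equal = pooled_t(*under_21, *over_21)
t_welch, df_welch, se_welch = welch_t(*under_21, *over_21)

print(round(t_equal, 2), df_equal, round(se_equal, 3))
print(round(t_welch, 2), round(df_welch, 2), round(se_welch, 3))
```

Turning t and df into the two-tailed significance value needs the t distribution, which is not in the Python standard library, so that step is left to SPSS here.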
  • 81. Parametric vs non-parametric  Just as in the case of correlations, there are both kinds of tests.  Need to check if DV is normally distributed.  Do this visually  Also use statistical tests
  • 82. Tests for normality  Kolmogorov-Smirnov and Shapiro-Wilk  If n>50 use KS  If n≤50 use SW  Null hypothesis is ‘data are normally distributed’.  So if p<0.05 then data are significantly different from a normal distribution – use non- parametric tests  If p≥0.05 then no significant difference – use parametric tests
  • 83. Checking normality  Produce histogram of DV  Tick box to undertake statistical test  Interpret results.
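The decision rule on this slide can be written down as two small functions. This is only a restatement of the rule of thumb above in code, with made-up function names; it does not run the tests themselves, and the p value is assumed to come from SPSS output.

```python
def choose_normality_test(n):
    """Rule of thumb from the slide: K-S for n > 50, Shapiro-Wilk otherwise."""
    return "Kolmogorov-Smirnov" if n > 50 else "Shapiro-Wilk"

def recommend_test_family(p_value, alpha=0.05):
    """Null hypothesis: the data are normally distributed."""
    if p_value < alpha:
        return "non-parametric"   # significantly different from normal
    return "parametric"           # no significant departure from normal

# Example: the Health and Illness class has 34 students; p = 0.03 is hypothetical.
test_name = choose_normality_test(34)
family = recommend_test_family(0.03)
print(test_name, family)
```

The histogram check recommended elsewhere in the deck still applies; the rule only covers the formal test.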
  • 84. t-test  Identify your two groups.  Determine what values in the data indicate those two groups (e.g. 1=female, 2=male)  Select Analyze: Compare Means: Independent samples t-test  http://www.youtube.com/watch?v=_KHI3ScO8sc 9m40s
  • 85. Mann-Whitney U test  Use this when comparing two groups and the DV is not normally distributed  http://www.youtube.com/watch?v=7iTvv3m9d_g 3m45s
  • 86. Comparing 3 or more groups  ANOVA = Analysis of Variance  Analyze: Compare Means: One-way ANOVA  http://www.youtube.com/watch?v=wFq1b3QjI1U 4m04s Useful to get table of means (descriptives) and means plots from ANOVA options.
  • 87. ANOVA Means and F value
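The F value in a one-way ANOVA compares the variation between group means with the variation within groups. A minimal sketch with three small invented groups (the function name is also an assumption) shows how the group means and F are obtained:

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Between-groups: how far each group mean lies from the grand mean.
    ss_between = sum(len(group) * (mean - grand_mean) ** 2
                     for group, mean in zip(groups, group_means))
    # Within-groups: how far each score lies from its own group mean.
    ss_within = sum((x - mean) ** 2
                    for group, mean in zip(groups, group_means)
                    for x in group)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical scores for three groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df_between, df_within = one_way_anova_f(groups)
print(round(f_value, 2), df_between, df_within)
```

A large F means the group means differ by more than the spread inside the groups would suggest; the significance value reported alongside it comes from the F distribution with these two degrees of freedom.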