1
Introduction to SPSS
Data types and SPSS
data entry and analysis
2
In this session
 What does SPSS look like?
 Types of data (revision)
 Data Entry in SPSS
 Simple charts in SPSS
 Summary statistics
 Contingency tables and crosstabulations
 Scatterplots and correlations
 Tests of differences of means
3
SPSS
4
Aspects of SPSS
 Menus - especially Analyze and Graphs
 Spreadsheet view of data
 Rows are cases (people, respondents etc.)
 Columns are Variables
 Variable view of data
 Shows detail of each variable type
5
Questionnaire Data Coding
6
In SPSS
 We change ticks etc. on a questionnaire into
numbers
 One number for each variable for each case
 How we do this depends on the type of
variable/data
7
Types of data
 Nominal
 Ranked
 Scales/measures
 Mixed types
 Text answers (open ended questions)
8
Nominal (categorical)
 order is arbitrary
 e.g. sex, country of birth, personality type, yes or no.
 Use numeric in SPSS and give value labels.
(e.g. 1=Female, 2=Male, 99=Missing)
(e.g. 1=Yes, 2=No, 99=Missing)
(e.g. 1=UK, 2=Ireland, 3=Pakistan, 4=India, 5=other,
99=Missing)
9
Ranks or Ordinal
 in order, 1st, 2nd, 3rd etc.
 e.g. status, social class
 Use numeric in SPSS with value labels
 E.g. 1=Working class, 2=Middle class, 3=Upper
class
 E.g. Class of degree, 1=First, 2=Upper second,
3=Lower second, 4=Third, 5=Ordinary,
99=Missing
10
Measures, scales
1. Interval - equal units, but no true zero
 e.g. IQ
2. Ratio - equal units and a true zero on the scale
 e.g. height, income, family size, age
 Makes sense to say one value is twice another
 Use numeric (or comma, dot or scientific notation) in
SPSS
 E.g. family size, 1, 2, 3, 4 etc.
 E.g. income per year, 25000, 14500, 18650 etc.
11
Mixed type
 Categorised data
 Actually ranked, but used to identify
categories or groups
 e.g. age groups
 = ratio data put into groups
 Use numeric in SPSS and use value
labels.
 E.g. Age group, 1=‘Under 18’, 2=‘18-24’, 3=‘25-
34’, 4=‘35-44’, 5=‘45-54’, 6=‘55 or greater’
12
Text answers
 E.g. answers to open-ended questions
 Either enter text as given (Use String in SPSS)
 Or
 Code or classify answers into one of a small number
of types. (Use numeric/nominal in SPSS)
Quantifying Data
 Before we can do any kind of analysis, we
need to quantify our data
 “Quantification” is the process of converting
data to a numeric format
 Convert social science data into a “machine-
readable” form, a form that can be read &
manipulated by computer programs
Quantifying Data
Some transformations are simple:
 Assign numeric representations to nominal
or ordinal variables:
 Turning male into “1” and female into “2”
 Assigning “3” to Very Interested, “2” to
Somewhat Interested, “1” to Not Interested
 Assign numeric values to continuous
variables:
 Turning “born in 1973” into an age of “35”
Developing Code Categories
Some data are more challenging. Open-ended
responses must be coded.
 Two basic approaches:
 Begin with a coding scheme derived from the
research purpose.
 Generate codes from the data.
Coding Quantitative Data
 Goal – reduce a wide variety of information to
a more limited set of variable attributes:
 “What is your occupation?”
 Use pre-established scheme: Professional,
Managerial, Clerical, Semi-skilled, etc.
 Create a scheme after reviewing the data
 Assign value to each category in the scheme:
Professional = 1, Managerial = 2, etc.
 Classify the response: “Secretary” is “clerical” and is
coded as “3”
Coding Quantitative Data
 Points to remember:
 If the data are coded to maintain a good amount
of detail, they can always be combined (reduced)
later
 However, if you start off with too little detail, you
can’t get it back
 If you’re using a survey / questionnaire, it’s a
good idea to do your coding on the form so that it
can be entered properly (i.e. create a “codebook”)
Codebook Construction
Purposes:
 Primary guide used in the coding process.
 Should note the value assigned to each variable
attribute (response)
 Guide for locating variables and interpreting
codes in the data file during analysis.
 If you’re doing your own input, this will also
guide data set construction
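A codebook can also be kept in machine-readable form. The following is a minimal sketch in Python (outside SPSS), not a prescribed method: the variable names `sex` and `degree_class`, the helper functions and the sample respondent are all hypothetical, while the codes and labels follow the examples given earlier (99 = Missing).

```python
# Hypothetical codebook: variable name -> {numeric code: value label}.
CODEBOOK = {
    "sex": {1: "Female", 2: "Male", 99: "Missing"},
    "degree_class": {
        1: "First", 2: "Upper second", 3: "Lower second",
        4: "Third", 5: "Ordinary", 99: "Missing",
    },
}

def encode(variable, answer):
    """Turn a questionnaire answer into its numeric code (99 if blank or unknown)."""
    labels_to_codes = {label.lower(): code
                       for code, label in CODEBOOK[variable].items()}
    return labels_to_codes.get(str(answer).strip().lower(), 99)

def decode(variable, code):
    """Look up the value label for a numeric code."""
    return CODEBOOK[variable][code]

# One case (respondent): one number for each variable.
raw_case = {"sex": "Female", "degree_class": "upper second"}
coded_case = {var: encode(var, ans) for var, ans in raw_case.items()}
print(coded_case)
```

Keeping the labels next to the codes in one place mirrors what the Values column of Variable View does inside SPSS.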
19
Data Entry in SPSS
 Video by Andy Field
 https://www.youtube.com/watch?v=b163iBByycw&index=1&list=PL25257A24840423AE
20
SPSS Variable View
21
Data Entry into SPSS
There are 2 ways to enter data into SPSS:
1. Enter directly into SPSS by typing in Data View
2. Enter into other database software such as
Excel then import into SPSS
Let’s start with the second option, using data in Excel.
22
Data from Hell
23
Data from Heaven
24
Importing data from Excel spreadsheet into SPSS.
In SPSS, go to:
File, Open, Data
Select Type of file (for example, Excel) you want to open
Select File name you want to open
25
Exporting data from SPSS to Excel.
In SPSS, go to:
File, Save As
Select Type of file (for example, Excel) you want to save as
Give the File name you want to save to
26
Frequency counts
 Used with categorical and ranked variables
 e.g. gender of students taking Health and
Illness option
Sex of student
                 Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Female          25      73.5            73.5                 73.5
        Male             9      26.5            26.5                100.0
        Total           34     100.0           100.0
27
e.g. Number of GCSEs passed by students taking
Health and Illness option
Number of GCSEs
                 Frequency   Percent   Valid Percent   Cumulative Percent
Valid   0                1       2.9             2.9                  2.9
        1                1       2.9             2.9                  5.9
        2                4      11.8            11.8                 17.6
        3                6      17.6            17.6                 35.3
        4                4      11.8            11.8                 47.1
        5                2       5.9             5.9                 52.9
        6                6      17.6            17.6                 70.6
        7                3       8.8             8.8                 79.4
        8                2       5.9             5.9                 85.3
        9                3       8.8             8.8                 94.1
        13               1       2.9             2.9                 97.1
        14               1       2.9             2.9                100.0
        Total           34     100.0           100.0
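The columns of a frequency table can be reproduced by hand. This is a small Python sketch under the assumption that there are no missing cases (so Percent and Valid Percent coincide); the function name `frequency_table` is made up for illustration, and the data are rebuilt from the 'Sex of student' counts shown earlier.

```python
from collections import Counter

def frequency_table(values):
    """Return rows of (value, frequency, percent, cumulative percent)."""
    counts = Counter(values)
    total = sum(counts.values())
    rows, cumulative = [], 0
    for value in sorted(counts):
        cumulative += counts[value]
        rows.append((value,
                     counts[value],
                     round(100 * counts[value] / total, 1),
                     round(100 * cumulative / total, 1)))
    return rows

# Rebuild the 'Sex of student' variable from its counts: 25 Female, 9 Male.
sex = ["Female"] * 25 + ["Male"] * 9
for row in frequency_table(sex):
    print(row)
```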
28
Central Tendency
 Mean
 = average value
 sum of all the values divided by the number of values
 Mode
 = the most frequent value in a distribution
 (N.B. it is possible to have 2 or more modes, e.g. bimodal
distribution)
 Median
 = the half-way value, or the value that divides the ordered
distribution in the middle
 The middle score when scores are ordered
 N.B. need to put values into order first
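As a quick check of the three definitions, here is a short Python sketch using the standard `statistics` module. The scores are hypothetical and deliberately left unsorted to show that the median routine does the ordering itself.

```python
import statistics

scores = [3, 1, 4, 3, 2]                 # hypothetical scores, not in order

mean = statistics.mean(scores)           # sum of the values / number of values
median = statistics.median(scores)       # orders the values, then takes the middle one
modes = statistics.multimode(scores)     # a list, because there may be 2 or more modes

# A bimodal example: two values tie for most frequent.
bimodal_modes = statistics.multimode([1, 1, 2, 2, 3])

print(mean, median, modes, bimodal_modes)
```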
29
Dispersion and variability
 Quartiles
 The three values that split the sorted data into
four equal parts.
 Second Quartile = median.
 Lower quartile = median of lower half of the data
 Upper quartile = median of upper half of the data
 Need to order the individuals first
 One quarter of the individuals fall in each of the four parts
 The inter-quartile range (upper quartile minus lower quartile)
spans the middle half of the individuals
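The 'median of each half' rule described above can be sketched in Python as follows. The ages are hypothetical and the function name `quartiles` is illustrative; statistical packages, SPSS included, may use slightly different interpolation rules, so their quartiles can differ a little from this simple version.

```python
import statistics

def quartiles(values):
    """Lower quartile, median and upper quartile by the 'median of each half' rule."""
    ordered = sorted(values)                 # need to order the individuals first
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]              # middle value is left out when n is odd
    upper_half = ordered[half + (n % 2):]
    return (statistics.median(lower_half),
            statistics.median(ordered),
            statistics.median(upper_half))

ages = [19, 23, 20, 35, 21, 19, 26, 20]      # hypothetical ages
q1, q2, q3 = quartiles(ages)
iqr = q3 - q1                                # spread of the middle half
print(q1, q2, q3, iqr)
```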
30
Used on Box Plot
Statistics: Age
N   Valid       34
    Missing      0
Mean         24.03
Median       21.00
[Figure: box plot of age of Health and Illness students, with the upper quartile, median and lower quartile labelled]
31
Variance
 Average of the squared deviations from the mean
 5.20 is the Sum of Squares (the total of the squared deviations)
 This depends on the number of individuals, so we divide by n (5)
 Gives 1.04, which is the variance
Score   Mean   Deviation   Squared Deviation
1       2.6    -1.6        2.56
2       2.6    -0.6        0.36
3       2.6     0.4        0.16
3       2.6     0.4        0.16
4       2.6     1.4        1.96
Total                      5.20
32
Standard Deviation
 The variance has one problem: it is
measured in units squared.
 This isn’t a very meaningful metric so we take
the square root value.
 This is the Standard Deviation
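The worked example (scores 1, 2, 3, 3, 4) can be retraced step by step. This Python sketch follows the slide in dividing by n; it also shows the n - 1 version for comparison, which is the one SPSS reports for a sample, so SPSS output for these five scores would show a variance of about 1.30 rather than 1.04.

```python
import math

scores = [1, 2, 3, 3, 4]
n = len(scores)

mean = sum(scores) / n                                    # 2.6
sum_of_squares = sum((x - mean) ** 2 for x in scores)     # about 5.20

population_variance = sum_of_squares / n                  # divide by n: about 1.04
sample_variance = sum_of_squares / (n - 1)                # divide by n - 1: about 1.30

# Square root brings the measure back to the original units.
standard_deviation = math.sqrt(population_variance)       # about 1.02

print(round(sum_of_squares, 2), round(population_variance, 2),
      round(sample_variance, 2), round(standard_deviation, 2))
```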
33
Using SPSS
 ‘Analyze > Descriptive Statistics > Explore’ menu.
 Gives mean, median, SD, variance, min,
max, range, skew and kurtosis.
 Can also produce stem and leaf, and
histogram.
34
Charts in SPSS
 Use ‘Chart Builder’ from the ‘Graphs’ menu, or the
‘Legacy Dialogs’ submenu
 And/or double click chart to edit it.
 E.g. double click to edit bars (e.g. to change
from colour to fill pattern).
 Do this in SPSS first before cut and paste to
Word
 Label the chart (in SPSS or in Word)
35
Stem and leaf plots
 e.g. age of students taking Health and Illness
option
 good at showing
 distribution of data
 outliers
 range
36
Stem and leaf plots e.g.
Age Stem-and-Leaf Plot
Frequency Stem & Leaf
6.00 1 . 999999
17.00 2 . 00000000001111134
5.00 2 . 55678
3.00 3 . 123
1.00 3 . 5
2.00 Extremes (>=36)
Stem width: 10
Each leaf: 1 case(s)
37
Box Plot
38
Box Plot
Fill colour
changed.
N.B. numbers refer
to case numbers.
39
Histograms and bar charts
 Length/height of bar indicates frequency
40
Histogram
Fill pattern suitable
for black and white
printing
41
Changing the bin size
Bin size made
smaller to show
more bars
42
Pie chart
 angle of segment indicates proportion of the
whole
Pie Chart
Shadow and one
slice moved out for
emphasis
Analysing relationships
 Contingency tables or crosstabulations
 Compares nominal/categorical variables
 But can include ordinal variables
 N.B. table contains counts (= frequency data)
 One variable on horizontal axis
 One variable on vertical axis
 Row and column total counts known as marginals
Example
 In the Health and
Illness class, are
women more
likely to be under
21 than men?
Crosstabulations
 e.g.
 Use column and row percentages to look for
relationships
SPSS output
Chi-square (χ²)
Crosstabulations and the Chi-square test can be used
together to look for a relationship between
two variables:
 When the variables are categorical, so the
data are nominal (or frequency counts).
 For example, if we wanted to look at the
relationship between gender and age group.
 There are several different types of Chi-square
(χ²) test; we will be using the 2 x 2 Chi-square
2x2 Chi-square results in SPSS
Another example
 The Bank employees data
Bank Employees
Chi-Square tests
Chi-Square analysis on SPSS
 http://www.youtube.com/watch?v=Ahs8jS5mJKk 4m15s
 http://www.youtube.com/watch?v=IRCzOD27NQU
 From 6m:30s to 9m:50s
 http://www.youtube.com/watch?v=532QXt1PM-Q&feature=plcp&context=C3ba91a4UDOEgsToPDskJ-ABupdp-Yfvuf4j4fJGzV 12m30s
Low values in cells
 Get SPSS to output expected values
 Look where these are <5
 Consider recoding to combine columns or rows
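To see where expected values come from, here is a hedged Python sketch. The four cell counts are hypothetical (only their row and column totals, 25/9 and 16/18, match the Health and Illness class figures); the function names are illustrative, and the p-value is left to SPSS because it needs the chi-square distribution.

```python
def expected_counts(table):
    """Expected cell counts: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square(table):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))

# Hypothetical 2 x 2 counts: rows = Female, Male; columns = Under 21, 21 and over.
observed = [[13, 12],
            [3, 6]]

expected = expected_counts(observed)
low_cells = sum(e < 5 for row in expected for e in row)   # cells with expected value < 5
statistic = chi_square(observed)

print(expected, low_cells, round(statistic, 3))
```

With these made-up counts both cells in the Male row have expected values below 5, which is exactly the situation where combining rows or columns is worth considering.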
Tabulating questionnaire
responses
 Categorical survey data often “collapsed” for purposes of data
analysis
Original category   Frequency   Collapsed category   Frequency
White British             284   White                      304
White Irish                 7
Other White                13
Indian                     40   South Asian                105
Pakistani                  32
Bangladeshi                33
Chinese                    16   Chinese                     16
Black British              30   Black                       44
Afro-Caribbean             12
African                     2
An analysis on a sample of 2 (e.g. Black African) would not have been very meaningful!
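Collapsing is simply a recode from each original category to a broader one. A minimal Python sketch, using the frequencies from the table above; the dictionary name `RECODE` is illustrative and the scheme itself is the one shown in the table.

```python
from collections import Counter

# Frequencies of the original categories, taken from the table above.
original = {
    "White British": 284, "White Irish": 7, "Other White": 13,
    "Indian": 40, "Pakistani": 32, "Bangladeshi": 33,
    "Chinese": 16,
    "Black British": 30, "Afro-Caribbean": 12, "African": 2,
}

# Recoding scheme: original category -> collapsed category.
RECODE = {
    "White British": "White", "White Irish": "White", "Other White": "White",
    "Indian": "South Asian", "Pakistani": "South Asian", "Bangladeshi": "South Asian",
    "Chinese": "Chinese",
    "Black British": "Black", "Afro-Caribbean": "Black", "African": "Black",
}

collapsed = Counter()
for category, frequency in original.items():
    collapsed[RECODE[category]] += frequency

print(dict(collapsed))
```

Because the detailed categories are kept in `original`, a different grouping can be tried later; the reverse is not possible once only the collapsed counts are stored.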
Recoding variables
 http://www.youtube.com/watch?v=uzQ_522F2SM&feature=related
 Ignore t-test for now 6m11s
 http://www.youtube.com/watch?v=FUoYZ_f6Lxc
 Uses old version of SPSS, no submenu now. 6m
Scatterplots and correlations
 Looks for association between variables, e.g.
 Population size and GDP
 crime and unemployment rates
 height and weight
 Both variables must be rank, interval or
ratio (scale or ordinal in SPSS).
 Thus cannot use nominal variables like gender,
ethnicity, town of birth, occupation.
56
57
Scatterplots
 e.g. age (in years) versus Number of GCSEs
Interpretation
 As Y increases
X increases
 Called
correlation
 Regression line
model in red
58
Correlation measures
association not causation
 The older the child the better s/he is at reading
 The less your income the greater the risk of
schizophrenia
 Height correlates with weight
 But weight does not cause height
 Height is one of the causes of weight (also body
shape, diet, fitness level etc.)
 Numbers of ice creams sold is correlated with
the rate of drowning
 Ice creams do not cause drowning (nor vice versa)
 Third variable involved – people swim more and buy
more ice creams when it’s warm
59
Scatterplot in SPSS
 Use Graph menu
 http://www.youtube.com/watch?v=74BjgPQvIEg 8m34s
 http://www.youtube.com/watch?v=blfflA-34pQ&feature=related 4m04s
 http://www.youtube.com/watch?v=UVylQoG4hZM 1m50s, ignore polynomial regression
60
Modifying the Scatterplot
 http://www.youtube.com/watch?v=803YCYA2AoQ&feature=related 4m04s
 http://www.youtube.com/watch?v=vPzvuMuVXk8&feature=related 3m40s
61
If mixed data sets
 Change point icon and/or colour to see
different subsets.
 Overall data may have no relationship but
subsets might.
 E.g. show male and female respondents.
 Use Chart builder
62
63
Correlation
 Correlation coefficient = measure of strength
of relationship, e.g. Pearson’s r
 varies from -1 to +1: the sign gives the direction,
the size (0 to 1) gives the strength
Correlations
                                        Number of GCSEs   Age
Number of GCSEs   Pearson Correlation   1                 -.415*
                  Sig. (2-tailed)                         .015
                  N                     34                34
Age               Pearson Correlation   -.415*            1
                  Sig. (2-tailed)       .015
                  N                     34                34
*. Correlation is significant at the 0.05 level (2-tailed).
64
Positive correlation
 as x increases, y increases
r = 0.7
65
Negative correlation
 as x increases, y decreases
r = -0.7
66
Strong correlation (i.e. close to 1)
r = 0.9
67
Weak correlation (i.e. close to 0)
r = 0.2
Interpretation cont.
 r² is a measure of the proportion of variation in
one variable accounted for by variation
in the other.
 E.g. If r=0.7 then r²=0.49 i.e. just under half
the variation is accounted for (rest
accounted for by other factors).
 If r=0.3 then r²=0.09 so 91% of the
variation is explained by other things.
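For readers who want to see what lies behind r and r², this Python sketch computes Pearson's r from its definition. The six pairs of values are hypothetical (they are not the class data) and the function name `pearson_r` is illustrative.

```python
import math

def pearson_r(x, y):
    """Pearson's r: co-variation of x and y scaled by their separate variation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: age and number of GCSEs for six students.
age = [18, 19, 21, 24, 30, 38]
gcses = [9, 7, 8, 5, 4, 5]

r = pearson_r(age, gcses)      # negative here: older students have fewer GCSEs
r_squared = r ** 2             # proportion of variation accounted for

print(round(r, 3), round(r_squared, 3))
```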
68
Significance of r
 SPSS reports if r is significant at α=0.05
 N.B. this is dependent on sample size to a
large extent.
 Other things being equal, larger samples
more likely to be significant.
 Usually, size of r is more important than
its significance
69
Pearson’s r in SPSS
 http://www.youtube.com/watch?v=loFLqZmvfzU 6m57s
70
Parametric and non-parametric
 Some statistics rely on the variables being
investigated following a normal distribution. –
Called Parametric statistics
 Others can be used if variables are not
distributed normally – called Non-parametric
statistics.
 Pearson’s r is a parametric statistic
 Kendall’s tau and Spearman’s rho (rank
correlation) are non-parametric.
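Spearman's rho can be understood as Pearson's r applied to ranks instead of raw values. The Python sketch below illustrates that idea; the data are hypothetical and the helper names (`ranks`, `pearson_r`, `spearman_rho`) are made up for this example.

```python
import math

def ranks(values):
    """Rank values from 1 upwards; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1          # positions i..j are 0-based
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson_r(x, y):
    """Pearson's r from sums of squares and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r calculated on the ranks."""
    return pearson_r(ranks(x), ranks(y))

# Hypothetical skewed data: y always rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]

print(round(pearson_r(x, y), 3), round(spearman_rho(x, y), 3))
```

Because only the order of the values matters, the one extreme value of y pulls Pearson's r down but leaves rho at 1.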
71
Assessing normality
 Produce histogram and normal plot
72
Use statistical test
 SPSS provides two formal tests for normality:
Kolmogorov-Smirnov (K-S) and Shapiro-Wilk
(S-W)
 But, there is debate about KS
 Extremely sensitive to departures from normality in large samples
 May erroneously imply parametric test not
suitable; in small samples it may miss real departures
 So, always use a histogram as well.
73
Often can use parametric tests
 Parametric tests (e.g. Pearson’s r) are robust
to departures from normality
 Small, non-normal samples OK
 But use non-parametric if
 Data are skewed (questionnaire data often is)
 Data are bimodal
74
Spearman’s rho
 http://www.youtube.com/watch?v=r_WQe2c-ISU From 4.14 to 4.56
 http://www.youtube.com/watch?v=POkFi5vKvI8&feature=fvwrel 6m16s
75
So far…
 Looked at relationships between nominal
variables
 Gender vs age group
 Looked at relationships between scale
variables
 Height vs. Weight
 Now combine the two
 Groups vs a scale variable
 E.g. Gender vs income
76
Reminder – IV vs DV
 IV = independent variable
 What makes a difference, causes effects, is responsible
for differences.
 DV = dependent variable
 What is affected by things, what is changed by the IV.
 Gender vs income. Gender = IV, income = DV
 So we investigate the effect of gender on income
77
Example 1
Age group vs. no. of GCSEs
 Using the Health and Illness class data
 Age group defines 2 groups
 Under 21
 21 and over
 Just two groups
 Can use independent samples t-test
 Independent because the two groups consist
of different people.
 t-test compares the means of the 2 groups.
78
79
Difference of means
 Do under 21s have more or fewer GCSEs
than 21 and overs?
 Means are different (6.44 & 4.28) but is that
significant?
Group Statistics
                  Age group     N    Mean   Std. Deviation   Std. Error Mean
Number of GCSEs   Under 21      16   6.44   3.140            .785
                  21 and over   18   4.28   2.906            .685
80
Independent Samples Test (Number of GCSEs)
Levene's Test for Equality of Variances: F, Sig.
t-test for Equality of Means: t, df, Sig. (2-tailed), Mean Difference, Std. Error Difference, 95% Confidence Interval of the Difference (Lower, Upper)
                              F      Sig.   t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower   Upper
Equal variances assumed       .164   .689   2.082   32       .045              2.160             1.037                   .047    4.272
Equal variances not assumed                 2.073   30.789   .047              2.160             1.042                   .034    4.285
Levene's test: no significant difference (Sig. = .689), therefore
assume equal variances
t-test: means are statistically significantly different
(Sig. (2-tailed) = .045)
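The t and df values in the SPSS output can be approximately recovered from the Group Statistics alone. This Python sketch uses the rounded means and standard deviations reported there (16 under-21s, 18 aged 21 and over), so the results agree with SPSS only to about two decimal places; the function name is illustrative and the p-values are left to SPSS.

```python
import math

def t_from_summary(n1, mean1, sd1, n2, mean2, sd2):
    """Independent-samples t from group sizes, means and standard deviations."""
    difference = mean1 - mean2
    v1, v2 = sd1 ** 2, sd2 ** 2

    # Equal variances assumed: pool the two variances.
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se_pooled = math.sqrt(pooled * (1 / n1 + 1 / n2))
    equal = (difference / se_pooled, n1 + n2 - 2)

    # Equal variances not assumed (Welch): separate variances, adjusted df.
    a, b = v1 / n1, v2 / n2
    se_welch = math.sqrt(a + b)
    df_welch = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    unequal = (difference / se_welch, df_welch)
    return equal, unequal

(t_equal, df_equal), (t_welch, df_welch) = t_from_summary(
    16, 6.44, 3.140,    # Under 21
    18, 4.28, 2.906,    # 21 and over
)
print(round(t_equal, 2), df_equal, round(t_welch, 2), round(df_welch, 1))
```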
Parametric vs non-parametric
 Just as in the case of correlations, there are
both kinds of tests.
 Need to check if DV is normally distributed.
 Do this visually
 Also use statistical tests
81
Tests for normality
 Kolmogorov-Smirnov and Shapiro-Wilk
 If n>50 use KS
 If n≤50 use SW
 Null hypothesis is ‘data are normally distributed’.
 So if p<0.05 then data are significantly different
from a normal distribution – use non-
parametric tests
 If p≥0.05 then no significant difference – use
parametric tests
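The decision rule above can be written out as a small function. This is only a sketch of the rule of thumb given on this slide; the function name and the p-values are hypothetical, standing in for figures read from SPSS output.

```python
def normality_decision(n, p_ks, p_sw, alpha=0.05):
    """Pick the test by sample size, then interpret its p-value."""
    if n > 50:
        test, p = "Kolmogorov-Smirnov", p_ks
    else:
        test, p = "Shapiro-Wilk", p_sw
    if p < alpha:
        advice = "significantly non-normal: use non-parametric tests"
    else:
        advice = "no significant departure from normality: use parametric tests"
    return test, advice

# Hypothetical p-values for a sample of 34 cases.
print(normality_decision(34, p_ks=0.20, p_sw=0.03))
```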
82
Checking normality
 Produce histogram of DV
 Tick box to undertake statistical test
 Interpret results.
83
t-test
 Identify your two groups.
 Determine what values in the data indicate
those two groups (e.g. 1=female, 2=male)
 Select Analyze:Compare Means:Independent
samples t-test
 http://www.youtube.com/watch?v=_KHI3ScO8sc 9m40s
84
Mann-Whitney U test
 Use this when comparing two groups and the
DV is not normally distributed
 http://www.youtube.com/watch?v=7iTvv3m9d_g 3m45s
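The U statistic itself is easy to compute by comparing every value in one group with every value in the other. The following Python sketch uses hypothetical GCSE counts and an illustrative function name; judging significance still needs tables or SPSS.

```python
def mann_whitney_u(group1, group2):
    """Mann-Whitney U: over all pairs, count how often one group beats the other."""
    u1 = 0.0
    for a in group1:
        for b in group2:
            if a > b:
                u1 += 1.0        # value from group1 is larger
            elif a == b:
                u1 += 0.5        # ties count half
    u2 = len(group1) * len(group2) - u1
    return min(u1, u2)           # the smaller U is the one usually reported

# Hypothetical numbers of GCSEs in two age groups.
under_21 = [9, 8, 7, 6, 10]
over_21 = [4, 5, 3, 7, 2]

u = mann_whitney_u(under_21, over_21)
print(u)
```

A U close to 0 means the two groups barely overlap; a U close to half the number of pairs means they are thoroughly mixed.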
85
Comparing 3 or more groups
 ANOVA = Analysis of Variance
 Analyze: Compare Means: One-way ANOVA
 http://www.youtube.com/watch?v=wFq1b3QjI1U 4m04s
Useful to get table of means (descriptives) and
means plots from ANOVA options.
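The F value that one-way ANOVA reports is the ratio of between-groups to within-groups variation. A minimal Python sketch with three small hypothetical groups; the function name is illustrative and the p-value is again left to SPSS.

```python
def one_way_anova_f(groups):
    """F = between-groups mean square / within-groups mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]

    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical scores for three groups (group means 3, 5 and 7).
groups = [[2, 3, 4], [4, 5, 6], [6, 7, 8]]
f_value, df_between, df_within = one_way_anova_f(groups)
print(f_value, df_between, df_within)
```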
86
ANOVA Means and F value
87
ANOVA Means Plot
88

9599632723 Top Call Girls in Delhi at your Door Step Available 24x7 Delhi
 
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
Keppel Ltd. 1Q 2024 Business Update  Presentation SlidesKeppel Ltd. 1Q 2024 Business Update  Presentation Slides
Keppel Ltd. 1Q 2024 Business Update Presentation Slides
 
Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...
Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...
Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...
 
Call Girls in Gomti Nagar - 7388211116 - With room Service
Call Girls in Gomti Nagar - 7388211116  - With room ServiceCall Girls in Gomti Nagar - 7388211116  - With room Service
Call Girls in Gomti Nagar - 7388211116 - With room Service
 
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
Lucknow 💋 Escorts in Lucknow - 450+ Call Girl Cash Payment 8923113531 Neha Th...
 

Intro to SPSS.ppt

  • 1. 1 Introduction to SPSS Data types and SPSS data entry and analysis
  • 2. 2 In this session  What does SPSS look like?  Types of data (revision)  Data Entry in SPSS  Simple charts in SPSS  Summary statistics  Contingency tables and crosstabulations  Scatterplots and correlations  Tests of differences of means
  • 4. 4 Aspects of SPSS  Menus - Analyse and Charts esp.  Spreadsheet view of data  Rows are cases (people, respondents etc.)  Columns are Variables  Variable view of data  Shows detail of each variable type
  • 6. 6 In SPSS  We change ticks etc. on a questionnaire into numbers  One number for each variable for each case  How we do this depends on the type of variable/data
  • 7. 7 Types of data  Nominal  Ranked  Scales/measures  Mixed types  Text answers (open ended questions)
  • 8. 8 Nominal (categorical)  order is arbitrary  e.g. sex, country of birth, personality type, yes or no.  Use numeric in SPSS and give value labels. (e.g. 1=Female, 2=Male, 99=Missing) (e.g. 1=Yes, 2=No, 99=Missing) (e.g. 1=UK, 2=Ireland, 3=Pakistan, 4=India, 5=other, 99=Missing)
  • 9. 9 Ranks or Ordinal  in order, 1st, 2nd, 3rd etc.  e.g. status, social class  Use numeric in SPSS with value labels  E.g. 1=Working class, 2=Middle class, 3=Upper class  E.g. Class of degree, 1=First, 2=Upper second, 3=Lower second, 4=Third, 5=Ordinary, 99=Missing
  • 10. 10 Measures, scales 1. Interval - equal units  e.g. IQ 2. Ratio - equal units, zero on scale  e.g. height, income, family size, age  Makes sense to say one value is twice another  Use numeric (or comma, dot or scientific) in SPSS  E.g. family size, 1, 2, 3, 4 etc.  E.g. income per year, 25000, 14500, 18650 etc.
  • 11. 11 Mixed type  Categorised data  Actually ranked, but used to identify categories or groups  e.g. age groups  = ratio data put into groups  Use numeric in SPSS and use value labels.  E.g. Age group, 1=‘Under 18’, 2=‘18-24’, 3=‘25-34’, 4=‘35-44’, 5=‘45-54’, 6=‘55 or greater’
  • 12. 12 Text answers  E.g. answers to open-ended questions  Either enter text as given (Use String in SPSS)  Or  Code or classify answers into one of a small number of types. (Use numeric/nominal in SPSS)
  • 13. Quantifying Data  Before we can do any kind of analysis, we need to quantify our data  “Quantification” is the process of converting data to a numeric format  Convert social science data into a “machine-readable” form, a form that can be read & manipulated by computer programs
  • 14. Quantifying Data Some transformations are simple:  Assign numeric representations to nominal or ordinal variables:  Turning male into “1” and female into “2”  Assigning “3” to Very Interested, “2” to Somewhat Interested, “1” to Not Interested  Assign numeric values to continuous variables:  Turning born in 1973 to “35”
  • 15. Developing Code Categories Some data are more challenging. Open-ended responses must be coded.  Two basic approaches:  Begin with a coding scheme derived from the research purpose.  Generate codes from the data.
  • 16. Coding Quantitative Data  Goal – reduce a wide variety of information to a more limited set of variable attributes:  “What is your occupation?”  Use pre-established scheme: Professional, Managerial, Clerical, Semi-skilled, etc.  Create a scheme after reviewing the data  Assign value to each category in the scheme: Professional = 1, Managerial = 2, etc.  Classify the response: “Secretary” is “clerical” and is coded as “3”
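The occupation example above can be sketched in a few lines of Python. This is only an illustration of the idea of a codebook: Professional = 1, Managerial = 2, Clerical = 3 and 99 = Missing come from the slides, while the Semi-skilled code, the `CLASSIFICATION` lookup and the sample answers are assumptions made up for the sketch.

```python
# Codebook: category -> numeric code.
# Professional/Managerial/Clerical codes follow the slide; Semi-skilled = 4 is an assumption.
CODEBOOK = {"Professional": 1, "Managerial": 2, "Clerical": 3, "Semi-skilled": 4}
MISSING = 99  # same missing-value convention as the earlier value-label examples

# Hypothetical classification of raw open-ended answers into codebook categories.
CLASSIFICATION = {
    "secretary": "Clerical",
    "doctor": "Professional",
    "office manager": "Managerial",
}


def code_response(answer):
    """Return the numeric code for one open-ended answer, or MISSING if unclassified."""
    category = CLASSIFICATION.get(answer.strip().lower())
    return CODEBOOK[category] if category else MISSING


responses = ["Secretary", "Doctor", "", "Office manager"]
codes = [code_response(r) for r in responses]
print(codes)  # [3, 1, 99, 2]
```

Keeping the classification step separate from the codebook mirrors the slide's two steps: first classify the response, then assign the value.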
  • 17. Coding Quantitative Data  Points to remember:  If the data are coded to maintain a good amount of detail, they can always be combined (reduced) later  However, if you start off with too little detail, you can’t get it back  If you’re using a survey / questionnaire, it’s a good idea to do your coding on the form so that it can be entered properly (i.e. create a “codebook”)
  • 18. Codebook Construction Purposes:  Primary guide used in the coding process.  Should note the value assigned to each variable attribute (response)  Guide for locating variables and interpreting codes in the data file during analysis.  If you’re doing your own input, this will also guide data set construction
  • 19. 19 Data Entry in SPSS  Video by Andy Field  https://www.youtube.com/watch?v=b163iBByycw&index=1&list=PL25257A24840423AE
  • 21. 21 Data Entry into SPSS There are 2 ways to enter data into SPSS: 1. Directly enter in to SPSS by typing in Data View 2. Enter into other database software such as Excel then import into SPSS Let’s start with the second option, using data in Excel.
  • 24. 24 Importing data from Excel spreadsheet into SPSS. In SPSS, go to: File, Open, Data Select Type of file (for example, Excel) you want to open Select File name you want to open
  • 25. 25 Exporting data from SPSS to Excel. In SPSS, go to: File, Save As, Select Type of file (for example, Excel) you want to save into Give File name you want to save into
  • 26. 26 Frequency counts  Used with categorical and ranked variables  e.g. gender of students taking Health and Illness option
      Sex of student   Frequency   Percent   Valid Percent   Cumulative Percent
      Female           25          73.5      73.5            73.5
      Male             9           26.5      26.5            100.0
      Total            34          100.0     100.0
  • 27. 27 e.g. Number of GCSEs passed by students taking Health and Illness option
      Number of GCSEs   Frequency   Percent   Valid Percent   Cumulative Percent
      0                 1           2.9       2.9             2.9
      1                 1           2.9       2.9             5.9
      2                 4           11.8      11.8            17.6
      3                 6           17.6      17.6            35.3
      4                 4           11.8      11.8            47.1
      5                 2           5.9       5.9             52.9
      6                 6           17.6      17.6            70.6
      7                 3           8.8       8.8             79.4
      8                 2           5.9       5.9             85.3
      9                 3           8.8       8.8             94.1
      13                1           2.9       2.9             97.1
      14                1           2.9       2.9             100.0
      Total             34          100.0     100.0
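As a rough illustration of what a frequency table contains, the sketch below rebuilds the Number of GCSEs table in Python. The raw list `gcses` is reconstructed from the frequencies on the slide; the function name `frequency_table` is just a label chosen for this example, not an SPSS command.

```python
from collections import Counter

# One value per student (n = 34), reconstructed from the slide's frequency column.
gcses = [0, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5,
         6, 6, 6, 6, 6, 6, 7, 7, 7, 8, 8, 9, 9, 9, 13, 14]


def frequency_table(values):
    """Return rows of (value, frequency, percent, cumulative percent)."""
    counts = Counter(values)
    n = len(values)
    rows = []
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        rows.append((value,
                     counts[value],
                     round(100 * counts[value] / n, 1),
                     round(100 * cumulative / n, 1)))
    return rows


table = frequency_table(gcses)
for value, freq, pct, cum_pct in table:
    print(f"{value:>3} {freq:>3} {pct:>6} {cum_pct:>6}")
```

With no missing values, Percent and Valid Percent are the same, which is why the sketch computes only one of them.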
  • 28. 28 Central Tendency  Mean  = average value  sum of all the values divided by the number of values  Mode  = the most frequent value in a distribution  (N.B. it is possible to have 2 or more modes, e.g. bimodal distribution)  Median  = the half-way value, or the value that divides the ordered distribution in the middle  The middle score when scores are ordered  N.B. need to put values into order first
  • 29. 29 Dispersion and variability  Quartiles  The three values that split the sorted data into four equal parts.  Second Quartile = median.  Lower quartile = median of lower half of the data  Upper quartile = median of upper half of the data  Need to order the individuals first  One quarter of the individuals fall into each of the four parts
  • 30. 30 Used on Box Plot
      Statistics (Age): N Valid 34, Missing 0; Mean 24.03; Median 21.00
      [Box plot: Age of Health and Illness students, with the upper quartile, median and lower quartile labelled]
  • 31. 31 Variance  Average of the squared deviations from the mean  5.20 is the Sum of Squares  This depends on number of individuals so we divide by n (5)  Gives 1.04 which is the variance
      Score   Mean   Deviation   Squared Deviation
      1       2.6    -1.6        2.56
      2       2.6    -0.6        0.36
      3       2.6    0.4         0.16
      3       2.6    0.4         0.16
      4       2.6    1.4         1.96
      Total                      5.20
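The definitions of mean, median, mode and quartiles can be tried out on the Number of GCSEs data from the frequency table. This is a minimal sketch using Python's standard `statistics` module; the `quartiles` helper follows the "median of each half" definition given on the slide, and statistics packages may use slightly different quartile rules, so small differences from SPSS output are possible.

```python
import statistics

# Number of GCSEs for the 34 students, reconstructed from the frequency table.
gcses = [0, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5,
         6, 6, 6, 6, 6, 6, 7, 7, 7, 8, 8, 9, 9, 9, 13, 14]

mean = statistics.mean(gcses)        # sum of values / number of values
median = statistics.median(gcses)    # middle value of the ordered data
modes = statistics.multimode(gcses)  # all most-frequent values (Python 3.8+)


def quartiles(values):
    """Lower quartile, median, upper quartile using the 'median of each half' rule."""
    ordered = sorted(values)          # the data must be put in order first
    half = len(ordered) // 2
    lower_half = ordered[:half]
    upper_half = ordered[-half:]      # the middle value is left out when n is odd
    return (statistics.median(lower_half),
            statistics.median(ordered),
            statistics.median(upper_half))


q1, q2, q3 = quartiles(gcses)
print(round(mean, 2), median, modes)  # the data turn out to have two modes
print(q1, q2, q3)
```

The two modes (3 and 6 GCSEs, six students each) give a concrete case of the bimodal distribution mentioned under Central Tendency.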
  • 32. 32 Standard Deviation  The variance has one problem: it is measured in units squared.  This isn’t a very meaningful metric so we take the square root value.  This is the Standard Deviation
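The variance table and the square-root step can be reproduced directly. The sketch below uses the five scores from the Variance slide and divides by n, as the slide does. Note, as a general point rather than something stated on the slides, that statistical packages commonly divide by n - 1 when describing a sample, which gives a slightly larger value.

```python
import math

scores = [1, 2, 3, 3, 4]            # the five scores from the Variance slide
n = len(scores)

mean = sum(scores) / n              # 2.6
deviations = [x - mean for x in scores]
sum_of_squares = sum(d ** 2 for d in deviations)   # about 5.20

variance = sum_of_squares / n       # about 1.04, dividing by n as on the slide
standard_deviation = math.sqrt(variance)           # back in the original units

# Dividing by n - 1 instead (the usual sample estimate) gives about 1.30.
sample_variance = sum_of_squares / (n - 1)

print(round(sum_of_squares, 2), round(variance, 2), round(standard_deviation, 2))
```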
  • 33. 33 Using SPSS  ‘Analyze > Descriptive Statistics > Explore’ menu.  Gives mean, median, SD, variance, min, max, range, skew and kurtosis.  Can also produce stem and leaf, and histogram.
  • 34. 34 Charts in SPSS  Use ‘Chart Builder’ from the ‘Graphs’ menu or the ‘Legacy Dialogs’ submenu  And/or double click chart to edit it.  E.g. double click to edit bars (e.g. to change from colour to fill pattern).  Do this in SPSS first before cut and paste to Word  Label the chart (in SPSS or in Word)
  • 35. 35 Stem and leaf plots  e.g. age of students taking Health and Illness option  good at showing  distribution of data  outliers  range
  • 36. 36 Stem and leaf plots e.g. Age Stem-and-Leaf Plot
      Frequency    Stem &  Leaf
          6.00        1 .  999999
         17.00        2 .  00000000001111134
          5.00        2 .  55678
          3.00        3 .  123
          1.00        3 .  5
          2.00 Extremes    (>=36)
      Stem width: 10
      Each leaf: 1 case(s)
  • 38. 38 Box Plot Fill colour changed. N.B. numbers refer to case numbers.
  • 39. 39 Histograms and bar charts  Length/height of bar indicates frequency
  • 40. 40 Histogram Fill pattern suitable for black and white printing
  • 41. 41 Changing the bin size Bin size made smaller to show more bars
  • 42. 42 Pie chart  angle of segment indicates proportion of the whole
  • 43. Pie Chart Shadow and one slice moved out for emphasis
  • 44. Analysing relationships  Contingency tables or crosstabulations  Compares nominal/categorical variables  But can include ordinal variables  N.B. table contains counts (= frequency data)  One variable on horizontal axis  One variable on vertical axis  Row and column total counts known as marginals
  • 45. Example  In the Health and Illness class, are women more likely to be under 21 than men?
  • 46. Crosstabulations  e.g.  Use column and row percentages to look for relationships
  • 48. Chi-square (χ²) Cross tabulations and Chi-square are tests that can be used to look for a relationship between two variables:  When the variables are categorical so the data are nominal (or frequency).  For example, if we wanted to look at the relationship between gender and age.  There are several different types of Chi-square (χ²), we will be using the 2 x 2 Chi-square
  • 50. Another example  The Bank employees data
  • 52. Chi-Square analysis on SPSS  http://www.youtube.com/watch?v=Ahs8jS5mJKk 4m15s  http://www.youtube.com/watch?v=IRCzOD27NQU  From 6m:30s to 9m:50s  http://www.youtube.com/watch?v=532QXt1PM-Q&feature=plcp&context=C3ba91a4UDOEgsToPDskJ-ABupdp-Yfvuf4j4fJGzV 12m30s
  • 53. Low values in cells  Get SPSS to output expected values  Look where these are <5  Consider recoding to combine cols or rows
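To make the idea of expected values concrete, here is a minimal Python sketch of a 2 x 2 chi-square calculation. The cell counts in `observed` are hypothetical: only the totals (25 women, 9 men, 16 under 21, 18 aged 21 and over) match the Health and Illness class, and the split within each cell is invented for illustration. The sketch uses the plain Pearson chi-square without a continuity correction and does not compute a p-value.

```python
# Rows: Female, Male. Columns: Under 21, 21 and over.
# HYPOTHETICAL cell counts; only the row and column totals match the slides.
observed = [[13, 12],
            [3, 6]]


def chi_square(observed):
    """Return (chi-square statistic, table of expected counts)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count = row total x column total / grand total
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))
    return chi2, expected


chi2, expected = chi_square(observed)
low_cells = sum(1 for row in expected for e in row if e < 5)

print(round(chi2, 3))
print([[round(e, 2) for e in row] for row in expected])
print("cells with expected count < 5:", low_cells)
```

With these made-up counts, both cells in the Male row have expected values below 5, which is exactly the situation where the slide suggests recoding to combine columns or rows. The statistic is also well below 3.841, the critical value for one degree of freedom at the 0.05 level.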
  • 54. Tabulating questionnaire responses  Categorical survey data often “collapsed” for purposes of data analysis
      Original category   Frequency   Collapsed category   Frequency
      White British       284         White                304
      White Irish         7
      Other White         13
      Indian              40          South Asian          105
      Pakistani           32
      Bangladeshi         33
      Chinese             16          Chinese              16
      Black British       30          Black                44
      Afro-Caribbean      12
      African             2
      An analysis on a sample of 2 (e.g. Black African) would not have been very meaningful!
  • 55. Recoding variables  http://www.youtube.com/watch?v=uzQ_522F2SM&feature=related  Ignore t-test for now 6m11s  http://www.youtube.com/watch?v=FUoYZ_f6Lxc  Uses old version of SPSS, no submenu now. 6m
  • 56. Scatterplots and correlations  Looks for association between variables, e.g.  Population size and GDP  crime and unemployment rates  height and weight  Both variables must be rank, interval or ratio (scale or ordinal in SPSS).  Thus cannot use nominal variables such as gender, ethnicity, town of birth or occupation.
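Collapsing categories amounts to mapping each original category onto a broader one and adding up the frequencies. The sketch below does this in Python with the figures from the table above; the dictionary names are arbitrary choices for the example and do not correspond to anything in SPSS.

```python
# Frequencies for the original categories, as given on the slide.
original = {
    "White British": 284, "White Irish": 7, "Other White": 13,
    "Indian": 40, "Pakistani": 32, "Bangladeshi": 33,
    "Chinese": 16,
    "Black British": 30, "Afro-Caribbean": 12, "African": 2,
}

# Recoding scheme: original category -> collapsed category.
recode = {
    "White British": "White", "White Irish": "White", "Other White": "White",
    "Indian": "South Asian", "Pakistani": "South Asian", "Bangladeshi": "South Asian",
    "Chinese": "Chinese",
    "Black British": "Black", "Afro-Caribbean": "Black", "African": "Black",
}

collapsed = {}
for category, frequency in original.items():
    group = recode[category]
    collapsed[group] = collapsed.get(group, 0) + frequency

print(collapsed)
# {'White': 304, 'South Asian': 105, 'Chinese': 16, 'Black': 44}
```

Because the detailed categories are kept in `original`, the collapse can be redone with a different scheme later, which is the point made earlier about coding with plenty of detail.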
  • 57. 57 Scatterplots  e.g. age (in years) versus Number of GCSEs
  • 58. Interpretation  As Y increases X increases  Called correlation  Regression line model in red 58
  • 59. Correlation measures association not causation  The older the child the better s/he is at reading  The less your income the greater the risk of schizophrenia  Height correlates with weight  But weight does not cause height  Height is one of the causes of weight (also body shape, diet, fitness level etc.)  Numbers of ice creams sold is correlated with the rate of drowning  Ice creams do not cause drowning (nor vice versa)  Third variable involved – people swim more and buy more ice creams when it’s warm 59
  • 60. Scatterplot in SPSS  Use Graphs menu  http://www.youtube.com/watch?v=74BjgPQvIEg 8m34s  http://www.youtube.com/watch?v=blfflA-34pQ&feature=related 4m04s  http://www.youtube.com/watch?v=UVylQoG4hZM 1m50s, ignore polynomial regression
  • 61. Modifying the Scatterplot  http://www.youtube.com/watch?v=803YCYA2AoQ&feature=related 4m04s  http://www.youtube.com/watch?v=vPzvuMuVXk8&feature=related 3m40s
  • 62. If mixed data sets  Change point icon and/or colour to see different subsets.  Overall data may have no relationship but subsets might.  E.g. show male and female respondents.  Use Chart builder 62
  • 63. 63 Correlation  Correlation coefficient = measure of strength of relationship, e.g. Pearson’s r  varies from -1 to +1: the size (0 to 1) shows the strength, the sign shows the direction
      Correlations
                                              Number of GCSEs   Age
      Number of GCSEs   Pearson Correlation   1                 -.415*
                        Sig. (2-tailed)                         .015
                        N                     34                34
      Age               Pearson Correlation   -.415*            1
                        Sig. (2-tailed)       .015
                        N                     34                34
      *. Correlation is significant at the 0.05 level (2-tailed).
  • 64. 64 Positive correlation  as x increases, y increases r = 0.7
  • 65. 65 Negative correlation  as x increases, y decreases r = -0.7
  • 66. 66 Strong correlation (i.e. close to 1) r = 0.9
  • 67. 67 Weak correlation (i.e. close to 0) r = 0.2
  • 68. Interpretation cont.  r² is a measure of degree of variation in one variable accounted for by variation in the other.  E.g. If r=0.7 then r²=0.49 i.e. just under half the variation is accounted for (rest accounted for by other factors).  If r=0.3 then r²=0.09 so 91% of the variation is explained by other things.
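For readers who want to see where r and r² come from, here is a small Python sketch that computes Pearson's r from its definition. The x and y values are hypothetical, chosen only so that the arithmetic is easy to follow; they are not taken from the Health and Illness data.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: the co-variation of x and y scaled by their separate variation."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# HYPOTHETICAL data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r ** 2

print(round(r, 3))          # about 0.775
print(round(r_squared, 2))  # 0.6, i.e. 60% of the variation accounted for
```

Here r is about 0.77, so r² is 0.6: 60% of the variation in one variable is accounted for by the other and 40% is left to other factors.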
  • 69. Significance of r  SPSS reports if r is significant at α=0.05  N.B. this is dependent on sample size to a large extent.  Other things being equal, larger samples more likely to be significant.  Usually, size of r is more important than its significance 69
  • 70. Pearson’s r in SPSS  http://www.youtube.com/watch?v=loFLqZmvfzU 6m57s
  • 71. Parametric and non-parametric  Some statistics rely on the variables being investigated following a normal distribution. – Called Parametric statistics  Others can be used if variables are not distributed normally – called Non-parametric statistics.  Pearson’s r is a parametric statistic  Kendall’s tau and Spearman’s rho (rank correlation) are non-parametric.
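One way to see the difference between the two kinds of correlation is that Spearman's rho is Pearson's r calculated on the ranks of the data rather than on the raw values. The sketch below shows this in Python; the data are hypothetical (y is simply x squared), and the helper names `ranks`, `pearson_r` and `spearman_rho` are labels chosen for this example.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r computed from its definition."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Rank values from 1 upwards; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(xs, ys):
    """Spearman's rho = Pearson's r applied to the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


# HYPOTHETICAL data: y always rises with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

print(round(pearson_r(x, y), 3))     # below 1: the relationship is not linear
print(round(spearman_rho(x, y), 3))  # 1.0: the rank order agrees perfectly
```

Because only the ordering matters, rho is not thrown off by skewed values or outliers in the way r can be.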
  • 72. Assessing normality  Produce histogram and normal plot 72
  • 73. Use statistical test  SPSS provides two formal tests for normality: Kolmogorov-Smirnov (K-S) and Shapiro-Wilk (S-W)  But, there is debate about K-S  Extremely sensitive to departure from normality  May erroneously imply parametric test not suitable – especially in small sample  So, always use a histogram as well.
  • 74. Often can use parametric tests  Parametric tests (e.g. Pearson’s r) are robust to departures from normality  Small, non-normal samples OK  But use non-parametric if  Data are skewed (questionnaire data often is)  Data are bimodal 74
  • 75. Spearman’s rho  http://www.youtube.com/watch?v=r_WQe2c-ISU From 4.14 to 4.56  http://www.youtube.com/watch?v=POkFi5vKvI8&feature=fvwrel 6m16s
  • 76. So far…  Looked at relationships between nominal variables  Gender vs age group  Looked at relationships between scale variables  Height vs. Weight  Now combine the two  Groups vs a scale variable  E.g. Gender vs income 76
  • 77. Reminder – IV vs DV  IV = independent variable  What makes a difference, causes effects, is responsible for differences.  DV = dependent variable  What is affected by things, what is changed by the IV.  Gender vs income. Gender = IV, income = DV  So we investigate the effect of gender on income 77
  • 78. Example 1 Age group vs. no. of GCSEs  Using the Health and Illness class data  Age group defines 2 groups  Under 21  21 and over  Just two groups  Can use independent samples t-test  Independent because the two groups consist of different people.  t-test compares the means of the 2 groups. 78
  • 79. 79 Difference of means  Do under 21s have more or fewer GCSEs than 21 and overs?  Means are different (6.44 & 4.28) but is that significant?
      Group Statistics: Number of GCSEs
      Age group     N    Mean   Std. Deviation   Std. Error Mean
      Under 21      16   6.44   3.140            .785
      21 and over   18   4.28   2.906            .685
  • 80. 80 Independent Samples Test: Number of GCSEs
      Levene's Test for Equality of Variances: F = .164, Sig. = .689
       No significant difference therefore assume equal variances
      t-test for Equality of Means
                                    t       df       Sig. (2-tailed)   Mean Difference   Std. Error Difference   95% CI Lower   95% CI Upper
      Equal variances assumed       2.082   32       .045              2.160             1.037                   .047           4.272
      Equal variances not assumed   2.073   30.789   .047              2.160             1.042                   .034           4.285
       Means are statistically significantly different
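The t and df values in the SPSS output can be checked by hand from the group statistics (N, mean and standard deviation for each age group). The following Python sketch works through both rows of the output: the pooled-variance version (equal variances assumed) and the unpooled version with adjusted degrees of freedom (equal variances not assumed). It stops short of the p-value, which needs the t distribution, and small discrepancies in the third decimal place are expected because the means are rounded on the slide.

```python
import math

# Group statistics from the slide: Number of GCSEs by age group.
n1, mean1, sd1 = 16, 6.44, 3.140   # Under 21
n2, mean2, sd2 = 18, 4.28, 2.906   # 21 and over

mean_difference = mean1 - mean2    # about 2.16

# Equal variances assumed: pool the two variances.
pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
se_pooled = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
t_pooled = mean_difference / se_pooled
df_pooled = n1 + n2 - 2

# Equal variances not assumed: separate variances and adjusted df.
v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
se_unpooled = math.sqrt(v1 + v2)
t_unpooled = mean_difference / se_unpooled
df_unpooled = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

print(round(se_pooled, 3), round(t_pooled, 2), df_pooled)
print(round(se_unpooled, 3), round(t_unpooled, 2), round(df_unpooled, 2))
```

The standard errors come out at about 1.037 and 1.042, matching the Std. Error Difference column, and the adjusted df is close to the 30.789 reported by SPSS.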
  • 81. Parametric vs non-parametric  Just as in the case of correlations, there are both kinds of tests.  Need to check if DV is normally distributed.  Do this visually  Also use statistical tests 81
  • 82. Tests for normality  Kolmogorov-Smirnov and Shapiro-Wilk  If n>50 use KS  If n≤50 use SW  Null hypothesis is ‘data are normally distributed’.  So if p<0.05 then data are significantly different from a normal distribution – use non-parametric tests  If p≥0.05 then no significant difference – use parametric tests
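The decision rule on this slide is simple enough to write down as two small functions. This Python sketch only encodes the rule of thumb as stated above (sample size decides which test to read, the p-value decides which family of tests to use); the function names are made up for the example, and it does not run either normality test itself.

```python
def choose_normality_test(n):
    """Which normality test to read, following the slide's sample-size rule."""
    return "Kolmogorov-Smirnov" if n > 50 else "Shapiro-Wilk"


def recommended_tests(p_value, alpha=0.05):
    """Null hypothesis: the data are normally distributed."""
    if p_value < alpha:
        # Significantly different from a normal distribution.
        return "non-parametric"
    return "parametric"


# The Health and Illness class has 34 students.
print(choose_normality_test(34))   # Shapiro-Wilk
print(recommended_tests(0.20))     # parametric
print(recommended_tests(0.01))     # non-parametric
```

As the earlier slides stress, the result of a formal test should still be read alongside a histogram of the variable.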
  • 83. Checking normality  Produce histogram of DV  Tick box to undertake statistical test  Interpret results. 83
  • 84. t-test  Identify your two groups.  Determine what values in the data indicate those two groups (e.g. 1=female, 2=male)  Select Analyze > Compare Means > Independent-Samples T Test  http://www.youtube.com/watch?v=_KHI3ScO8sc 9m40s
  • 85. Mann-Whitney U test  Use this when comparing two groups and the DV is not normally distributed  http://www.youtube.com/watch?v=7iTvv3m9d_g 3m45s
  • 86. Comparing 3 or more groups  ANOVA = Analysis of Variance  Analyze > Compare Means > One-Way ANOVA  http://www.youtube.com/watch?v=wFq1b3QjI1U 4m04s  Useful to get table of means (descriptives) and means plots from ANOVA options.
  • 87. ANOVA Means and F value 87
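The F value in a one-way ANOVA compares the variation between the group means with the variation within the groups. The sketch below computes it from first principles in Python. The three groups are hypothetical numbers picked to keep the arithmetic simple, and the sketch reports only F and its degrees of freedom, not the significance level.

```python
def one_way_anova(groups):
    """Return (F, df between groups, df within groups) for a list of groups."""
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(group) * (m - grand_mean) ** 2
                     for group, m in zip(groups, group_means))
    # Variation of individual values around their own group mean.
    ss_within = sum((value - m) ** 2
                    for group, m in zip(groups, group_means)
                    for value in group)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# HYPOTHETICAL scores for three groups, for illustration only.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df_between, df_within = one_way_anova(groups)

print(round(f_value, 2), df_between, df_within)
```

For these made-up groups the means are 2, 3 and 6, the between-groups mean square is 13 and the within-groups mean square is 1, so F is 13 on 2 and 6 degrees of freedom.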