Table of Contents
COVER
TITLE PAGE
COPYRIGHT
DEDICATION
PREFACE
ACKNOWLEDGMENTS
CHAPTER 1: INTRODUCTION
Big Data Analysis
Visual Data Analysis
Importance of Statistics for the Social and Health Sciences and Medicine
Historical Notes: Early Use of Statistics
Approach of the Book
Cases from Current Research
Research Design
Focus on Interpretation
CHAPTER 2: DESCRIPTIVE STATISTICS: CENTRAL TENDENCY
What is the Whole Truth? Research Applications (Spuriousness)
Descriptive and Inferential Statistics
The Nature of Data: Scales of Measurement
Descriptive Statistics: Central Tendency
Using SPSS® and Excel to Understand Central Tendency
Distributions
Describing the Normal Distribution: Numerical Methods
Descriptive Statistics: Using Graphical Methods
Terms and Concepts
Data Lab and Examples (with Solutions)
Data Lab: Solutions
CHAPTER 3: DESCRIPTIVE STATISTICS: VARIABILITY
Range
Percentile
Scores Based on Percentiles
Using SPSS® and Excel to Identify Percentiles
Standard Deviation and Variance
Calculating the Variance and Standard Deviation
Population SD and Inferential SD
Obtaining SD from Excel and SPSS®
Terms and Concepts
Data Lab and Examples (with Solutions)
Data Lab: Solutions
CHAPTER 4: THE NORMAL DISTRIBUTION
The Nature of the Normal Curve
The Standard Normal Score: Z Score
The Z Score Table of Values
Navigating the Z Score Distribution
Calculating Percentiles
Creating Rules for Locating Z Scores
Calculating Z Scores
Working with Raw Score Distributions
Using SPSS® to Create Z Scores and Percentiles
Using Excel to Create Z Scores
Using Excel and SPSS® for Distribution Descriptions
Terms and Concepts
Data Lab and Examples (with Solutions)
Data Lab: Solutions
CHAPTER 5: PROBABILITY AND THE Z DISTRIBUTION
The Nature of Probability
Elements of Probability
Combinations and Permutations
Conditional Probability: Using Bayes' Theorem
Z Score Distribution and Probability
Using SPSS® and Excel to Transform Scores
Using the Attributes of the Normal Curve to Calculate Probability
“Exact” Probability
From Sample Values to Sample Distributions
Terms and Concepts
Data Lab and Examples (with Solutions)
Data Lab: Solutions
CHAPTER 6: RESEARCH DESIGN AND INFERENTIAL STATISTICS
Research Design
Experiment
Non-Experimental or Post Facto Research Designs
Inferential Statistics
Z Test
The Hypothesis Test
Statistical Significance
Practical Significance: Effect Size
Z Test Elements
Using SPSS® and Excel for the Z Test
Terms and Concepts
Data Lab and Examples (with Solutions)
Data Lab: Solutions
CHAPTER 7: THE T TEST FOR SINGLE SAMPLES
Introduction
Z Versus T: Making Accommodations
Research Design
Parameter Estimation
The T Test
The T Test: A Research Example
Interpreting the Results of the T Test for a Single Mean
The T Distribution
The Hypothesis Test for the Single Sample T Test
Type I and Type II Errors
Effect Size
Effect Size for the Single Sample T Test
Power, Effect Size, and Beta
One- and Two-Tailed Tests
Point and Interval Estimates
Using SPSS® and Excel with the Single Sample T Test
Terms and Concepts
Data Lab and Examples (with Solutions)
Data Lab: Solutions
CHAPTER 8: INDEPENDENT SAMPLE T TEST
A Lot of “Ts”
Research Design
Experimental Designs and the Independent T Test
Dependent Sample Designs
Between and Within Research Designs
Using Different T Tests
Independent T Test: The Procedure
Creating the Sampling Distribution of Differences
The Nature of the Sampling Distribution of Differences
Calculating the Estimated Standard Error of Difference with Equal Sample Size
Using Unequal Sample Sizes
The Independent T Ratio
Independent T Test Example
Hypothesis Test Elements for the Example
Before–After Convention with the Independent T Test
Confidence Intervals for the Independent T Test
Effect Size
The Assumptions for the Independent T Test
SPSS® Explore for Checking the Normal Distribution Assumption
Excel Procedures for Checking the Equal Variance Assumption
SPSS® Procedure for Checking the Equal Variance Assumption
Using SPSS® and Excel with the Independent T Test
SPSS® Procedures for the Independent T Test
Excel Procedures for the Independent T Test
Effect Size for the Independent T Test Example
Parting Comments
Nonparametric Statistics: The Mann–Whitney U Test
Terms and Concepts
Data Lab and Examples (With Solutions)
Data Lab: Solutions
Graphics in the Data Summary
CHAPTER 9: ANALYSIS OF VARIANCE
A Hypothetical Example of ANOVA
The Nature of ANOVA
The Components of Variance
The Process of ANOVA
Calculating ANOVA
Effect Size
Post Hoc Analyses
Assumptions of ANOVA
Additional Considerations with ANOVA
The Hypothesis Test: Interpreting ANOVA Results
Are the Assumptions Met?
Using SPSS® and Excel with One-Way ANOVA
The Need for Diagnostics
Non-Parametric ANOVA Tests: The Kruskal–Wallis Test
Terms and Concepts
Data Lab and Examples (With Solutions)
Data Lab: Solutions
CHAPTER 10: FACTORIAL ANOVA
Extensions of ANOVA
ANCOVA
MANOVA
MANCOVA
Factorial ANOVA
Interaction Effects
Simple Effects
2XANOVA: An Example
Calculating Factorial ANOVA
The Hypotheses Test: Interpreting Factorial ANOVA Results
Effect Size for 2XANOVA: Partial η²
Discussing the Results
Using SPSS® to Analyze 2XANOVA
Summary Chart for 2XANOVA Procedures
Terms and Concepts
Data Lab and Examples (With Solutions)
Data Lab: Solutions
CHAPTER 11: CORRELATION
The Nature of Correlation
The Correlation Design
Pearson's Correlation Coefficient
Plotting the Correlation: The Scattergram
Using SPSS® to Create Scattergrams
Using Excel to Create Scattergrams
Calculating Pearson's r
The Z Score Method
The Computation Method
The Hypothesis Test for Pearson's r
Effect Size: the Coefficient of Determination
Diagnostics: Correlation Problems
Correlation Using SPSS® and Excel
Nonparametric Statistics: Spearman's Rank Order Correlation (rs)
Terms and Concepts
Data Lab and Examples (with Solutions)
Data Lab: Solutions
CHAPTER 12: BIVARIATE REGRESSION
The Nature of Regression
The Regression Line
Calculating Regression
Effect Size of Regression
The Z Score Formula for Regression
Testing the Regression Hypotheses
The Standard Error of Estimate
Confidence Interval
Explaining Variance Through Regression
A Numerical Example of Partitioning the Variation
Using Excel and SPSS® with Bivariate Regression
The SPSS® Regression Output
The Excel Regression Output
Complete Example of Bivariate Linear Regression
Assumptions of Bivariate Regression
The Omnibus Test Results
Effect Size
The Model Summary
The Regression Equation and Individual Predictor Test of Significance
Advanced Regression Procedures
Detecting Problems in Bivariate Linear Regression
Terms and Concepts
Data Lab and Examples (with Solutions)
Data Lab: Solutions
CHAPTER 13: INTRODUCTION TO MULTIPLE LINEAR REGRESSION
The Elements of Multiple Linear Regression
Same Process as Bivariate Regression
Some Differences between Bivariate Linear Regression and Multiple Linear Regression
Stuff not Covered
Assumptions of Multiple Linear Regression
Analyzing Residuals to Check MLR Assumptions
Diagnostics for MLR: Cleaning and Checking Data
Extreme Scores
Distance Statistics
Influence Statistics
MLR Extended Example Data
Assumptions Met?
Analyzing Residuals: Are Assumptions Met?
Interpreting the SPSS® Findings for MLR
Entering Predictors Together as a Block
Entering Predictors Separately
Additional Entry Methods for MLR Analyses
Example Study Conclusion
Terms and Concepts
Data Lab and Example (with Solution)
Data Lab: Solution
CHAPTER 14: CHI-SQUARE AND CONTINGENCY TABLE ANALYSIS
Contingency Tables
The Chi-square Procedure and Research Design
Chi-square Design One: Goodness of Fit
A Hypothetical Example: Goodness of Fit
Effect Size: Goodness of Fit
Chi-square Design Two: The Test of Independence
A Hypothetical Example: Test of Independence
Special 2 × 2 Chi-square
Effect Size in 2 × 2 Tables: PHI
Cramer's V : Effect Size for the Chi-square Test of Independence
Repeated Measures Chi-square: McNemar Test
Using SPSS® and Excel with Chi-square
Using SPSS® for the Chi-square Test of Independence
Using Excel for Chi-square Analyses
Terms and Concepts
Data Lab and Examples (with Solutions)
Data Lab: Solutions
CHAPTER 15: REPEATED MEASURES PROCEDURES: Tdep AND ANOVAWS
Independent and Dependent Samples in Research Designs
Using Different T Tests
The Dependent T Test Calculation: The “Long” Formula
Example: The Long Formula
The Dependent T Test Calculation: The “Difference” Formula
Tdep and Power
Conducting the Tdep Analysis Using SPSS®
Conducting the Tdep Analysis Using Excel
Within-Subject ANOVA (ANOVAws)
Experimental Designs
Post Facto Designs
Within-Subject Example
Using SPSS® for Within-Subject Data
The SPSS® Procedure
The SPSS® Output
Nonparametric Statistics
Terms and Concepts
APPENDIX A: SPSS® BASICS
Using SPSS®
General Features
Management Functions
Additional Management Functions
APPENDIX B: EXCEL BASICS
Data Management
The Excel Menus
Using Statistical Functions
Data Analysis Procedures
Missing Values and “0” Values in Excel Analyses
Using Excel with “Real Data”
APPENDIX C: STATISTICAL TABLES
REFERENCES
INDEX
END USER LICENSE AGREEMENT
List of Illustrations
CHAPTER 1: INTRODUCTION
Figure 1.1 William Playfair's pie chart.
Figure 1.2 John Snow's map showing deaths in the London cholera epidemic of 1854.
Figure 1.3 Florence Nightingale's polar chart comparing battlefield and nonbattlefield deaths.
CHAPTER 2: DESCRIPTIVE STATISTICS: CENTRAL TENDENCY
Figure 2.1 The possible spurious relationship between ice cream consumption and crime.
Figure 2.2 The BRFSS GENHLTH variable values.
Figure 2.3 Graph of bimodal distribution.
Figure 2.4 Descriptive frequencies menus in SPSS® .
Figure 2.5 Frequencies submenus in SPSS® .
Figure 2.6 SPSS® frequency output.
Figure 2.7 SPSS® Descriptive – Descriptives output.
Figure 2.8 Excel spreadsheet showing GENHLTH data.
Figure 2.9 Excel database showing “Data Analysis” submenu.
Figure 2.10 “Descriptive Statistics” drop box for calculating central tendency.
Figure 2.11 Descriptive statistics results worksheet.
Figure 2.12 Example of the normal distribution of values.
Figure 2.13 Illustration of a positively skewed distribution.
Figure 2.14
Figure 2.15 Excel output showing the Histogram specification and the data columns.
Figure 2.16 Excel output showing the Histogram specification and the data columns.
Figure 2.17 Excel chart output from the “INSERT” menu ribbon.
Figure 2.18 SPSS® procedure for creating the histogram.
Figure 2.19 SPSS® procedure for specifying the features for the histogram.
Figure 2.20 SPSS® histogram in the output file.
Figure 2.21 SPSS® histogram showing the distribution of JS scores.
Figure 2.22 SPSS® (Frequency) central tendency results for AP scores.
Figure 2.23 Excel (Frequency) central tendency results for AP scores.
Figure 2.24 SPSS® histogram of AP scores.
Figure 2.25 Excel column chart of AP scores.
CHAPTER 3: DESCRIPTIVE STATISTICS: VARIABILITY
Figure 3.1 The characteristics of the range.
Figure 3.2 The uneven scale of percentile scores.
Figure 3.3 Specifying a percentile with SPSS® “Frequencies: Statistics” functions.
Figure 3.4 SPSS® output for percentile calculation.
Figure 3.5 Using the Excel functions to create percentiles.
Figure 3.6 Using the specification window for PERCENTILE.EXC.
Figure 3.7 The components of the SD.
Figure 3.8 The Excel descriptive statistics output for AP scores.
Figure 3.9 Using the Excel functions to calculate the “actual” SD.
Figure 3.10 Using the “Descriptives” menus in SPSS® .
Figure 3.11 The Descriptive Statistics output from SPSS® .
Figure 3.12 The AP score histogram from SPSS® .
Figure 3.13
Figure 3.14
Figure 3.15
Figure 3.16 The SPSS® histogram output for Problem 2.
CHAPTER 4: THE NORMAL DISTRIBUTION
Figure 4.1 The normal curve with known properties.
Figure 4.2 The location of z = (−)1.96.
Figure 4.3 The (partial) Z Score Table of Values.
Figure 4.4 Using the Z Score Table to identify the percent of the distribution below a z score.
Figure 4.5 Using the Z Score Table of Values to identify the percent of the distribution below z = −1.96.
Figure 4.6 Identifying the area between z scores.
Figure 4.7 Identifying the tabled values of z scores of −1.96 and −1.35.
Figure 4.8 Subtracting the areas to identify the given area of the distribution.
Figure 4.9 Visualizing the area between two z scores.
Figure 4.10 An example AP raw score distribution.
Figure 4.11 The histogram of AP score values (from Figure 3.12).
Figure 4.12 Using SPSS® to create z scores.
Figure 4.13 Creating a z score variable using SPSS® descriptive menu.
Figure 4.14 Using the SPSS® Frequencies menu to locate percentiles.
Figure 4.15 Using the SPSS® Frequencies output to locate percentiles.
Figure 4.16 The Function Argument STANDARDIZE with AP score data.
Figure 4.17 Entering formulas directly in Excel using the enter formula (“=”) key.
Figure 4.18 Entering formulas by dragging a formula to other values in a spreadsheet.
Figure 4.19 Using the Excel NORM.S.DIST function.
Figure 4.20 Using the Excel NORM.DIST function.
Figure 4.21 Using SPSS® to create z scores.
Figure 4.22 Using the Excel Standardize function to create z scores (using SD population).
CHAPTER 5: PROBABILITY AND THE Z DISTRIBUTION
Figure 5.1 Specifying combinations using Excel “COMBIN” in the “Math & Trig” formulas.
Figure 5.2 Specifying permutations using Excel “PERMUT” from “Statistical” functions.
Figure 5.3 Visualizing the transformation of a percentile to a z score.
Figure 5.4 Visualizing the 67th percentile of the raw score test distribution.
Figure 5.5 Using the Compute Variable menu in SPSS® to create z scores.
Figure 5.6 SPSS® data file with two z score variables.
Figure 5.7 Visualizing the probabilities as a preliminary solution.
Figure 5.8 Visualizing the middle 90% area of the test scores.
Figure 5.9 Visualizing the excluded 5% of the test score values.
Figure 5.10 Using the Excel NORM.DIST function for exact probabilities (probability density function).
Figure 5.11 Using the Excel dragging capability to calculate probability density values.
Figure 5.12 Excel NORM.DIST example for calculating exact probabilities.
Figure 5.13 Estimating an exact probability using the z score table.
Figure 5.14 Using NORM.DIST to identify the exact probability for 10.
CHAPTER 6: RESEARCH DESIGN AND INFERENTIAL STATISTICS
Figure 6.1 The process of social research.
Figure 6.2 The nature of experimental designs.
Figure 6.3 The sampling process for inferential statistics.
Figure 6.4 Creating a sampling distribution of means.
Figure 6.5 The nature of the sampling distribution.
Figure 6.6 Using the sampling distribution to “locate” the sample mean.
Figure 6.7 Using the sampling distribution and the standard error of the mean.
Figure 6.8 Using the sampling distribution to make a statistical decision.
Figure 6.9 The Excel Z test formula specification menu.
CHAPTER 7: THE T TEST FOR SINGLE SAMPLES
Figure 7.1 The “one-shot case study.”
Figure 7.2 Using the sampling distribution with estimated population values.
Figure 7.3 Estimating the population standard deviation.
Figure 7.4 The T Test elements used to compare a sample mean to all possible means from a population.
Figure 7.5 Understanding the concept of degrees of freedom.
Figure 7.6 Excel descriptive statistics for QL.
Figure 7.7 SPSS® descriptive statistics for QL.
Figure 7.8 Transforming the sample mean value to a value on the sampling distribution.
Figure 7.9 The nature of the T distribution.
Figure 7.10 The T test exclusion areas.
Figure 7.11 Visualizing beta and nonbeta areas.
Figure 7.12 The two-tailed test.
Figure 7.13 One- and two-tailed exclusion values.
Figure 7.14 Confidence interval values for the QL example.
Figure 7.15 Comparing CI0.95 and CI0.99 .
Figure 7.16 Selecting the single sample t test in SPSS® .
Figure 7.17 Specifying the single sample T test in SPSS® .
Figure 7.18 The SPSS® output tables with t test results.
Figure 7.19 The SPSS® output tables with t test results.
Figure 7.20 The T.DIST.2T function in Excel.
Figure 7.21 The Excel descriptive data for the sample group Overall scores.
Figure 7.22 The SPSS® results of the single sample t test.
CHAPTER 8: INDEPENDENT SAMPLE T TEST
Figure 8.1 The research design with randomness and two comparison groups.
Figure 8.2 Using dependent sample measures in experimental designs.
Figure 8.3 Using matched groups in experimental designs.
Figure 8.4 Example of experimental research design using T test with two groups.
Figure 8.5 The post facto comparison for independent t test.
Figure 8.6 The independent sample T test process.
Figure 8.7 All possible pairs of samples.
Figure 8.8 The sampling distribution of differences created from pairs of samples.
Figure 8.9 Symbols in the distribution of differences.
Figure 8.10 The statistical decision for the FOP example question.
Figure 8.11 Accessing the SPSS® Explore procedure.
Figure 8.12 Specifying the variables to check the normality assumption.
Figure 8.13 Explore output for assessing normal distribution of sample groups.
Figure 8.14 The Excel equal variance test menu.
Figure 8.15 The Excel output for the two-sample variance test.
Figure 8.16 The F distribution and exclusion area.
Figure 8.17 Specifying the variables of interest for equality of variances.
Figure 8.18 Specifying Levene's test for equality of variances.
Figure 8.19 The SPSS® output assessing equality of variance.
Figure 8.20 The SPSS® menus for the independent sample T test.
Figure 8.21 The Independent-Samples T Test callout window for specifying the analysis.
Figure 8.22 Specifying group values for the independent T test.
Figure 8.23 The SPSS® output for the independent T test.
Figure 8.24 The Excel specifications for the independent T test.
Figure 8.25 The Excel callout window for locating data.
Figure 8.26 The Excel output for the independent T test with equal variances.
Figure 8.27 The SPSS® options for the Mann–Whitney U Test.
Figure 8.28 The SPSS® specification for the Mann–Whitney U Test.
Figure 8.29 The SPSS® output for the Mann–Whitney U Test.
Figure 8.30 Comparison of training groups on patient satisfaction.
Figure 8.31 Comparison of training groups on patient satisfaction (SPSS® ).
CHAPTER 9: ANALYSIS OF VARIANCE
Figure 9.1 The four groups in the noise-learning experiment.
Figure 9.2 The paired comparisons in the experiment with four groups.
Figure 9.3 The components of variance in ANOVA.
Figure 9.4 Different ANOVA possibilities.
Figure 9.5 ANOVA possibilities of groups with different within variances.
Figure 9.6 Venn diagram showing effect size.
Figure 9.7 Post hoc test possibilities.
Figure 9.8 The post hoc summary for the example.
Figure 9.9 The descriptive output (Excel) to test the normal distribution assumption.
Figure 9.10 SPSS® descriptive output for normal distribution assumption.
Figure 9.11 SPSS® graphs for FR groups.
Figure 9.12 The SPSS® means procedure.
Figure 9.13 Three indicators of health by 6, 7, or 8 hours sleep.
Figure 9.14 The SPSS® menu options for accessing the one-way ANOVA.
Figure 9.15 The one-way ANOVA specification windows.
Figure 9.16 The post hoc choices from SPSS® one-way ANOVA.
Figure 9.17 Options for SPSS one-way ANOVA.
Figure 9.18 The descriptives report in SPSS® one-way ANOVA.
Figure 9.21 The Tukey post hoc results in the SPSS® one-way ANOVA procedure.
Figure 9.19 The Levene's test results in SPSS® one-way ANOVA.
Figure 9.20 The SPSS® one-way ANOVA summary table.
Figure 9.22 The single factor ANOVA menu option in Excel.
Figure 9.23 The Excel single factor ANOVA output.
Figure 9.24 The specification window for the Kruskal–Wallis test.
Figure 9.25 The output for the Kruskal–Wallis test.
Figure 9.26 The data to check for normal distribution.
Figure 9.27 The ANOVA results for Problem 2.
Figure 9.28 The post hoc analysis for the ANOVA result.
CHAPTER 10: FACTORIAL ANOVA
Figure 10.1 Main effects analyses in 2XANOVA.
Figure 10.2 The interaction effect of sex and noise conditions.
Figure 10.3 Ordinal interaction patterns compared to no interaction.
Figure 10.4 The interaction graph for the 2XANOVA example (sex by sleep).
Figure 10.5 The SPSS® data file for the 2XANOVA example.
Figure 10.6 The SPSS® 2XANOVA menus.
Figure 10.7 The SPSS® menus for specifying the 2XANOVA procedure.
Figure 10.8 The Plots window that specifies the results graph.
Figure 10.9 The “Post Hoc” window specifying the Tukey analysis for sleep.
Figure 10.10 The choices in the “Univariate: Options” window.
Figure 10.11 The SPSS® 2XANOVA summary table.
Figure 10.12 The SPSS® simple effects table for levels of schools on subject areas.
Figure 10.13 The simple effects graph for the 2XANOVA example (sleep by sex).
Figure 10.14 The simple effects table for sleep levels within sex categories.
Figure 10.15 The 2XANOVA procedure chart.
Figure 10.16 The simple effects analyses for condition.
Figure 10.17 Simple effects plot for condition.
Figure 10.18 Simple effects analyses for provider.
Figure 10.19 Simple effects plot for provider.
CHAPTER 11: CORRELATION
Figure 11.1 Examples of Pearson's r values.
Figure 11.2 The scattergram between health and income.
Figure 11.3 Correlation patterns in scattergrams.
Figure 11.4 Strength of correlations in the scattergram.
Figure 11.5 The SPSS® graph menu for creating scattergrams.
Figure 11.6 The Scatter/Dot menus.
Figure 11.7 The scattergram specification window in SPSS® .
Figure 11.8 The Excel scattergram specification.
Figure 11.9 The Excel scattergram.
Figure 11.10 The Z score scattergram in Excel.
Figure 11.11 The effect size of correlation–explaining variance.
Figure 11.12 The effect size components produced by correlation.
Figure 11.13 The correlation problem of restricted range.
Figure 11.14 The correlation conditions of homoscedasticity and heteroscedasticity.
Figure 11.15 The SPSS® descriptive output for the study variables.
Figure 11.16 The Excel descriptive statistics.
Figure 11.17 The SPSS® histogram for HealthOP.
Figure 11.18 The Excel histogram for HealthOP.
Figure 11.19 The Excel scattergram between IncomeOP and HealthOP.
Figure 11.20 The SPSS® Correlation menu.
Figure 11.21 The Correlation specification window.
Figure 11.22 The SPSS® correlation matrix.
Figure 11.23 The “Correlation” window in the Excel Data – Data Analysis menu.
Figure 11.24 The “Correlation” specification window in Excel.
Figure 11.25 The Excel correlation matrix.
Figure 11.26 Funding for study hospitals.
Figure 11.27 The Spearman's Rho correlation between the study variables.
Figure 11.28 The Pearson's r correlation between the study variables.
Figure 11.29 Descriptive findings for study variables.
Figure 11.30 The histogram for community involvement.
Figure 11.31 The histogram for job opportunities.
Figure 11.32 The tests of normality for the study variables.
Figure 11.33 The scattergram for the study variables.
Figure 11.34 The correlation findings for the study variables.
CHAPTER 12: BIVARIATE REGRESSION
Figure 12.1 The regression line for job opportunities and community involvement.
Figure 12.2 The effect of correlation on prediction accuracy.
Figure 12.3 The lack of meaningful prediction with no significant correlation.
Figure 12.4 The scattergram between income class and healthy days.
Figure 12.5 The formula in pieces.
Figure 12.6 The completed regression formula.
Figure 12.7 Using the regression formula to predict a value of Y at X = 4.
Figure 12.8 The SPSS® Z score scattergram of income class and healthy days.
Figure 12.9 The elements of the Y variance.
Figure 12.10 The regression options in SPSS® .
Figure 12.11 The SPSS® regression specification windows.
Figure 12.12 The ANOVA table providing data for the omnibus test.
Figure 12.13 The SPSS® model summary results panel.
Figure 12.14 The SPSS® coefficients panel results for the bivariate regression.
Figure 12.15 The Excel regression specification window.
Figure 12.16 The Excel regression Statistics output.
Figure 12.19 The Excel predicted values and residuals for the study data.
Figure 12.20 The SPSS® descriptive summaries of the study variables.
Figure 12.21 The Excel descriptive summaries for the study variables.
Figure 12.22 The scattergram between recreational opportunities and community involvement.
Figure 12.23 The Curve Estimation procedure in SPSS® .
Figure 12.24 The Curve Estimation procedure for recreational opportunities and community involvement.
Figure 12.25 The curve estimation model summary.
Figure 12.26 The quadratic coefficients analysis for Curve Estimation.
Figure 12.27 The SPSS® omnibus test results.
Figure 12.28 The SPSS® model summary results for recreational opportunities–community involvement study.
Figure 12.29 The SPSS® coefficients output for the recreational opportunities–community involvement study.
Figure 12.30 The Excel regression output for the reading assessment – FR study.
Figure 12.31 The multiple correlation relationship in the fictitious study example.
Figure 12.32 Sk and Ku data for the two study variables.
Figure 12.33 Tests of normality are nonsignificant for the study variables.
Figure 12.34 The scattergram of the study variables.
Figure 12.35 Model summary comparisons for linear and quadratic equations.
Figure 12.36 The scattergram of linear and quadratic regression equations.
Figure 12.37 Correlation matrix indicating a significant correlation.
Figure 12.38 The model summary indicating the correlation and squared correlation (effect size).
Figure 12.39 The omnibus test result from the ANOVA result.
Figure 12.40 The coefficients findings summary.
CHAPTER 13: INTRODUCTION TO MULTIPLE LINEAR REGRESSION
Figure 13.1 The SPSS® report of diagnostic values for the study data.
Figure 13.2 The SPSS® specification menu for linear regression.
Figure 13.3 The SPSS® specification menu for diagnostic values.
Figure 13.4 The scattergram of the study variables.
Figure 13.5 The histogram for the dependent variable in the study.
Figure 13.6 The SPSS® descriptives report showing skewness and kurtosis findings.
Figure 13.7 The histogram of standardized residuals from the study.
Figure 13.8 The P–P plot of standardized residuals from the study.
Figure 13.9 The scatterplot between standardized residuals and predicted values.
Figure 13.10 The specification menus for SPSS® MLR.
Figure 13.11 The SPSS® MLR specification for assumptions and individual predictors.
Figure 13.12 The Partial Regression Plot showing Rank Opinion predicting Health Opinion.
Figure 13.13 The MLR omnibus test result.
Figure 13.14 The effect size summary for the MLR procedure.
Figure 13.15 The SPSS® output for individual predictors.
Figure 13.16 Isolating the effects of a predictor variable on an outcome variable through part or semipartial correlation.
Figure 13.17 The MLR entry method for the first predictor.
Figure 13.18 The omnibus test results for hierarchical MLR.
Figure 13.19 The hierarchical results for effect size.
Figure 13.20 The individual predictor summary for separate entry.
Figure 13.21 The specification menu for the Stepwise (and other) entry method(s).
Figure 13.22 The histogram for the outcome variable.
Figure 13.23 The findings indicating all study variables are normally distributed.
Figure 13.24 The residuals plot for normality assessment.
Figure 13.25 The scatterplot between standardized residuals and predicted values.
CHAPTER 14: CHI-SQUARE AND CONTINGENCY TABLE ANALYSIS
Figure 14.1 The chi-square series of distributions.
Figure 14.2 The SPSS® “Weight Cases” specification window.
Figure 14.3 The SPSS® “Weight Cases” specification window.
Figure 14.4 The SPSS® Crosstabs specification window.
Figure 14.5 The “Crosstabs: Statistics” menu in SPSS® .
Figure 14.6 The SPSS® “Crosstabs: Cell Display” menu.
Figure 14.7 The SPSS® crosstabs contingency table.
Figure 14.8 The SPSS® chi-square significance test output.
Figure 14.9 The SPSS® effect size measures.
Figure 14.10 The CHISQ.TEST function in Excel for chi-square analysis.
Figure 14.11 The Excel CHIDIST function to identify the chi-square probability.
Figure 14.12 The SPSS® variables used for the chi-square analysis.
Figure 14.13 The SPSS® chi-square findings.
Figure 14.14 The SPSS® effect size findings.
Figure 14.15 The SPSS® contingency table output for the crosstabs analysis.
Figure 14.16 The Excel CHISQ.DIST.RT results for the test of independence.
Figure 14.17 The Excel CHISQ.TEST function using observed and expected frequencies.
CHAPTER 15: REPEATED MEASURES PROCEDURES: Tdep AND ANOVAWS
Figure 15.1 The SPSS® Tdep specification window.
Figure 15.2 The SPSS® descriptive output for Tdep.
Figure 15.4 The SPSS® Tdep test summary.
Figure 15.3 The SPSS® correlation output for Tdep.
Figure 15.5 The Excel Tdep specification window.
Figure 15.6 The Tdep findings from the Excel paired two-sample test.
Figure 15.7 The mixed design that includes within-subject and between-group elements.
Figure 15.8 The SPSS® specification window for the ANOVAws procedure.
Figure 15.9 The SPSS® “Repeated Measures” window.
Figure 15.10 The “Contrasts” window for specifying repeated contrasts.
Figure 15.11 The “Options” menu for the ANOVAws procedure.
Figure 15.12 The “Descriptive Statistics” output.
Figure 15.13 The Mauchly's test of sphericity results.
Figure 15.14 The within-subject effects output for Time.
Figure 15.15 The effect size output for ANOVAws .
Figure 15.16 The post hoc output for the study.
Figure 15.17 The comparison plot for the time conditions.
List of Tables
CHAPTER 2: DESCRIPTIVE STATISTICS: CENTRAL TENDENCY
Table 2.1 Typical Ordinal Response Scale
Table 2.2 Perceived Distances in Ordinal Response Items
Table 2.3 Comparison of Interval and Ordinal Scales
Table 2.4 Aggregated School Percentages of Students Passing the Math Standard
Table 2.5 Adjusted School Percentages
Table 2.6 Math Achievement Percentages Demonstrating a Bimodal Distribution of Scores
Table 2.7 BRFSS Responses to the General Health Question
Table 2.8 GENHLTH Responses as Unordered and Ordered
Table 2.9 Frequency of GENHLTH Responses in Five Bins
Table 2.10 Job Satisfaction Ratings of Assembly Workers
Table 2.11 Exam Scores in an AP Class
CHAPTER 3: DESCRIPTIVE STATISTICS: VARIABILITY
Table 3.1 Using the Deviation Method to Calculate SD
Table 3.2 Using the Computation Method to Calculate SD
Table 3.3 Neighborhood Characteristics Ratings Sample
Table 3.4 Housing Survey Data: Quality of Life Index
CHAPTER 4: THE NORMAL DISTRIBUTION
Table 4.1 Neighborhood Characteristics Ratings Sample
Table 4.2 Housing Survey Data – Quality of Life Index
CHAPTER 5: PROBABILITY AND THE Z DISTRIBUTION
Table 5.1 Soccer Injuries by Incidence
Table 5.2 Elements of the Bayes' Theorem Problem
Table 5.3 Health Score Ratings of Baristas
Table 5.4 Conditional Probability Elements for OCD
CHAPTER 6: RESEARCH DESIGN AND INFERENTIAL STATISTICS
Table 6.1 Experimental and Control Groups
Table 6.2 Quasi-Experimental Design
Table 6.3 Population and Sample Symbols for Inferential Statistics
CHAPTER 7: THE T TEST FOR SINGLE SAMPLES
Table 7.1 Population and Sample Symbols for Inferential Statistics
Table 7.2 Quality of Life (Hypothetical) Sample Data
Table 7.3 Exclusion Values for the Z Distribution
Table 7.4 Exclusion Values for T Distribution (df = 9)
Table 7.5 The STAR Classroom Observation Protocol™ Data
CHAPTER 8: INDEPENDENT SAMPLE T TEST
Table 8.1 New Entries to the List of Population and Sample Symbols
Table 8.2 Procedure Information and Fear of Procedure Scores
Table 8.3 Minutes Pacing for Low and High Perceived Stress Assessments
Table 8.4 Mann–Whitney U Test Data
Table 8.5 Practitioner Training and Patient Satisfaction
CHAPTER 9: ANALYSIS OF VARIANCE
Table 9.1 Hypothetical Experiment Data
Table 9.2 Hypothetical Experiment Data with Squared Values
Table 9.3 The ANOVA Results Table
Table 9.4 The ANOVA Results Table with Calculated MS Values
Table 9.5 The Final ANOVA Results Table
Table 9.6 The F Table of Values Exclusion Areas
Table 9.7 Example of Values from a Studentized Range Table
Table 9.8 The Group Means
Table 9.9 Matrix of Group Means
Table 9.10 The Data for One-Way ANOVA Example
Table 9.11 The ANOVA Example Database with Calculation Values
Table 9.12 The Completed ANOVA Summary Table for the Extended Example
Table 9.13 The Group Mean Difference Matrix
Table 9.14 Hypothetical Data for Kruskal–Wallis Test
Table 9.15 The Paired Comparison Results
Table 9.16 Blood Sugar: Pay Method Data
Table 9.17 Post Hoc Analysis for Problem 1
CHAPTER 10: FACTORIAL ANOVA
Table 10.1 The Data for the Sleep–Sex Impact on Health
Table 10.2 Data Summaries for 2XANOVA Calculations
Table 10.3 The 2XANOVA Summary Table
Table 10.4 The Completed 2XANOVA Summary Table
Table 10.5 Patient Satisfaction Ratings of Providers and Medical Conditions
Table 10.6 Manual Calculations for Problem 1
Table 10.7 ANOVA Table for Problem 1
CHAPTER 11: CORRELATION
Table 11.1 Measures of Correlation
Table 11.2 Data for Correlation Example
Table 11.3 The Data Table Showing Z Scores
Table 11.4 Hypothetical Correlation of Hospital Rankings by Patient Satisfaction Ranking
Table 11.5 Ranking an Interval Variable
Table 11.6 Test Taking Minutes and Test Score Data
Table 11.7 The Job Opportunity–Community Involvement Study Data
CHAPTER 12: BIVARIATE REGRESSION
Table 12.1 The Fictitious Data on Income Class and Healthy Days
Table 12.2 The Calculated Sums of the Fictitious Data
Table 12.3 The Calculations for the Components of Variance
Table 12.4 The Stress-Absence Data
Table 12.5 The Job Opportunity and Community Involvement Data
Table 12.6 Manual Calculations for Problem 1
CHAPTER 13: INTRODUCTION TO MULTIPLE LINEAR REGRESSION
Table 13.1 The Diagnostic Study Values
Table 13.2 The Change Values Resulting from Adding Predictors
Table 13.3 The Study Variables and Labels
Table 13.4 Problem 1 MLR Data
Table 13.5 The Unique Contribution to Outcome Variance by Predictor Variables
CHAPTER 14: CHI-SQUARE AND CONTINGENCY TABLE ANALYSIS
Table 14.1 The Hypothetical Treatment Differences Data
Table 14.2 The Reporting Data for the Hypothetical Study
Table 14.3 Observed Frequencies for the Test of Independence
Table 14.4 The Expected Frequencies for the Study
Table 14.5 The Calculated Chi-Square for the Test of Independence
Table 14.6 The Percentages of the Study Data for Interpretation of Findings
Table 14.7 The 2 × 2 Chi-Square Contingency Table
Table 14.8 The Example Data in a 2 × 2 Table
Table 14.9 Using the General Chi-Square Formula on the 2 × 2 Table
Table 14.10 The Dependent Sample Chi-Square for the Resolve Study
Table 14.11 Changing the Dependent Sample Chi-Square Categories
Table 14.12 The Chi-Square Values for the Hypothetical Problem
Table 14.13 The Database for the Chi-Square Example
Table 14.14 The Example Data
Table 14.15 The Contingency Table with Expected Frequencies
CHAPTER 15: REPEATED MEASURES PROCEDURES: Tdep AND ANOVAWS
Table 15.1 Independent and Dependent Sample in Experimental and Post Facto Designs
Table 15.2 The Study Data
Table 15.3 The Calculated Elements of the Study Data
Table 15.4 The Difference Procedure for Calculating Tdep
Table 15.5 The Tind Comparison with Tdep
Table 15.6 The Experimental Design
Table 15.7 Data for Within-Subject Study with Three Categories
Table 15.8 The Within-Subject Example Data
Using Statistics in the Social and Health Sciences with SPSS® and EXCEL®
Copyright © 2017 by John Wiley & Sons, Inc. All rights reserved
Published by John Wiley & Sons, Inc., Hoboken, New Jersey
Published simultaneously in Canada
No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning, or otherwise, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 750-4470, or on the web at www.copyright.com. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing this book, they make no representations or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. Neither the publisher nor author shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.
For general information on our other products and services or for technical support, please contact our Customer Care Department within the United States at (800) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic formats. For more information about Wiley products, visit our web site at www.wiley.com.
Library of Congress Cataloging-in-Publication Data:
Names: Abbott, Martin, 1949-
Title: Using statistics in the social and health sciences with SPSS® and Excel® / Martin Lee Abbott.
Description: Hoboken, New Jersey : John Wiley & Sons, Inc., [2017] | In the title, both SPSS and Excel are accompanied by the trademark symbol. | Includes bibliographical references and index.
Identifiers: LCCN 2016009168| ISBN 9781119121046 (cloth) | ISBN 9781119121060 (epub) | ISBN 9781119121053 (epdf)
Subjects: LCSH: Mathematical statistics--Data processing. | Multivariate analysis--Data processing. | Social sciences--Statistical methods. | Medical sciences--Statistical methods. | Microsoft Excel (Computer file) | SPSS (Computer file)
Classification: LCC QA276.45.M53 A23 2017 | DDC 005.5/5--dc23 LC record available at https://lccn.loc.gov/2016009168
To my longsuffering, wonderful wife Kathy;
-and-
To those seeking to understand the nature of social systems so that, like Florence Nightingale, they might better understand God's character.
The study of statistics is gaining recognition in a great many fields. In particular, researchers in the social and health sciences note its importance for problem solving and its practical value in their areas. Statistics has always been important, for example, among those hoping to enter careers in medicine, but even more so now due to the increasing emphasis on “Scientific Inquiry & Reasoning Skills” in preparation for the Medical College Admission Test (MCAT). Sociology, which has always relied on statistics and research for its core emphases, is now included in the MCAT as well.
This book focuses squarely on the procedures important to an essential understanding of statistics and how it is used in the real world for problem solving. Moreover, my discussion in the book repeatedly ties statistical methodology with research design (see the “companion” volume my colleague and I wrote to emphasize research and design skills in social science; Abbott and McKinney, 2013).
I emphasize applied statistical analysis and as such use examples throughout the book drawn from my own research as well as from national databases such as the General Social Survey (GSS) and the Behavioral Risk Factor Surveillance System (BRFSS). Using data from these sources allows students to see how statistical procedures apply to research in their fields and to examine “real data.” A central feature of the book is my discussion and use of SPSS® and Microsoft Excel® to analyze data for problem solving.
Throughout my teaching and research career, I have developed an approach to helping students understand difficult statistical concepts in a new way. I find that the great majority of students are visual learners, so I developed diagrams and figures over the years that help create a conceptual picture of the statistical procedures that are often problematic to students (like sampling distributions!).
Another reason for writing this book was to give students a way to understand statistical computing without having to rely on comprehensive and expensive statistical software programs. Since most students have access to Microsoft Excel, I developed a step-by-step approach to using the powerful statistical procedures in Excel to analyze data and conduct research in each of the statistical topics I cover in the book.1
I also wanted to make those comprehensive statistical programs more approachable to statistics students, so I have included a “hands-on” guide to SPSS in parallel with the Excel examples. In some cases, SPSS provides the only means of performing a given statistical procedure, but in most cases both Excel and SPSS can be used.
Here are some of the features of the book:
1. Emphasis on the interpretation of findings.
2. Use of clear examples from my current and former research projects and from large databases to illustrate statistical procedures. “Real-world” data can be cumbersome, so I introduce straightforward procedures and examples to help students focus on interpretation of findings.
3. Inclusion of a data lab section in each chapter that provides relevant, clear examples.
4. Introduction to advanced statistical procedures in chapter sections (e.g., regression diagnostics) and separate chapters (e.g., multiple linear regression) for greater relevance to real-world research needs.
5. Strengthening of the connection between statistical application and research designs.
6. Inclusion of detailed sections in each chapter explaining applications from Excel and SPSS.
I use SPSS2 (versions 22 and 23) screenshots of menus and tables by permission from the IBM® Company. IBM, the IBM logo, ibm.com, and SPSS are trademarks or registered trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “IBM Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml. Microsoft Excel references and screenshots in this book are used with permission from Microsoft. I use Microsoft Excel® 2013 in this book.3
I use GSS (2014) data and codebook for examples in this book.4 The BRFSS Survey Questionnaire and Data are used with permission from the CDC.5
1 One limitation to teaching statistics procedures with Excel is that the data analysis features are different depending on whether the user is a “Mac” user or a “PC” user. I am using the PC version, which features a “Data Analysis” suite of statistical tools. This feature may no longer be included in the Mac version of Excel.
2 SPSS screen reprints throughout the book are used courtesy of International Business Machines Corporation, ©International Business Machines Corporation. SPSS was acquired by IBM in October 2009.
3 Excel references and screenshots in this book are used with permission from Microsoft®.
4 Smith, Tom W., Peter Marsden, Michael Hout, and Jibum Kim. General Social Surveys, 1972–2012 [machine-readable data file]/Principal Investigator, Tom W. Smith; Coprincipal Investigator, Peter V. Marsden; Coprincipal Investigator, Michael Hout; Sponsored by National Science Foundation. NORC ed. Chicago: National Opinion Research Center [producer]; Storrs, CT: The Roper Center for Public Opinion Research, University of Connecticut [distributor], 2013. 1 data file (57,061 logical records) + 1 codebook (3432 pp.). (National Data Program for the Social Sciences, No. 21).
5 Centers for Disease Control and Prevention (CDC). Behavioral Risk Factor Surveillance System Survey Questionnaire. Atlanta, Georgia: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, 2013; and Centers for Disease Control and Prevention (CDC). Behavioral Risk Factor Surveillance System Survey Data. Atlanta, Georgia: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, 2013.
I wish to thank my daughter Kristin Hovaguimian for her outstanding work on the Index to this book (and all the others!) – not an easy task with a book of this nature.
I thank my wife Kathleen Abbott for her dedication and amazing contributions to the editing process.
I thank my son Matthew Abbott for the inspiration he has always provided in matters statistical and philosophical.
Thank you Jon Gurstelle and the team at Wiley for your continuing support of this project.