SPSS Statistics for Data Analysis and Visualization
Dive deeper into SPSS Statistics for more efficient, accurate, and sophisticated data analysis and visualization

SPSS Statistics for Data Analysis and Visualization goes beyond the basics of SPSS Statistics to show you advanced techniques that exploit the full capabilities of SPSS. The authors explain when and why to use each technique, and then walk you through the execution with a pragmatic, nuts and bolts example. Coverage includes extensive, in-depth discussion of advanced statistical techniques, data visualization, predictive analytics, and SPSS programming, including automation and integration with other languages like R and Python. You'll learn the best methods to power through an analysis, with more efficient, elegant, and accurate code.

IBM SPSS Statistics is complex: true mastery requires a deep understanding of statistical theory, the user interface, and programming. Most users don't encounter all of the methods SPSS offers, leaving many little-known modules undiscovered. This book walks you through tools you may have never noticed, and shows you how they can be used to streamline your workflow and enable you to produce more accurate results.

Conduct a more efficient and accurate analysis
Display complex relationships and create better visualizations
Model complex interactions and master predictive analytics
Integrate R and Python with SPSS Statistics for more efficient, more powerful code

These "hidden tools" can help you produce charts that simply wouldn't be possible any other way, and the support for other programming languages gives you better options for solving complex problems. If you're ready to take advantage of everything this powerful software package has to offer, SPSS Statistics for Data Analysis and Visualization is the expert-led training you need.
Foreword
Introduction

Part I Advanced Statistics

Chapter 1 Comparing and Contrasting IBM SPSS AMOS with Other Multivariate Techniques
  T-Test
  ANCOVA
  MANOVA
  Factor Analysis and Unobserved Variables in SPSS
  AMOS
  Revisiting Factor Analysis and a General Orientation to AMOS
  The General Model

Chapter 2 Monte Carlo Simulation and IBM SPSS Bootstrapping
  Monte Carlo Simulation
  Monte Carlo Simulation in IBM SPSS Statistics
  Creating an SPSS Model File
  IBM SPSS Bootstrapping
  Proportions
  Bootstrap Mean
  Bootstrap and Linear Regression

Chapter 3 Regression with Categorical Outcome Variables
  Regression Approaches in SPSS
  Logistic Regression
  Ordinal Regression Theory
  Assumptions of Ordinal Regression Models
  Ordinal Regression Dialogs
  Ordinal Regression Output
  Categorical Regression Theory
  Assumptions of Categorical Regression Models
  Categorical Regression Dialogs
  Categorical Regression Output

Chapter 4 Building Hierarchical Linear Models
  Overview of Hierarchical Linear Mixed Models
  A Two-Level Hierarchical Linear Model Example
  Mixed Models…Linear
  Mixed Models…Linear (Output)
  Mixed Models…Generalized Linear
  Mixed Models…Generalized Linear (Output)
  Adjusting Model Structure

Part II Data Visualization

Chapter 5 Take Your Data Visualizations to the Next Level
  Graphics Options in SPSS Statistics
  Understanding the Revolutionary Approach in The Grammar of Graphics
  Bar Chart Case Study
  Bubble Chart Case Study

Chapter 6 The Code Behind SPSS Graphics: Graphics Production Language
  Introducing GPL: Bubble Chart Case Study
  GPL Help
  Bubble Chart Case Study Part Two
  Double Regression Line Case Study
  Arrows Case Study
  MBTI Bubble Chart Case Study

Chapter 7 Mapping in IBM SPSS Statistics
  Creating Maps with the Graphboard Template Chooser
  Creating a Choropleth of Counts Map
  Creating Other Map Types
  Creating Maps Using Geographical Coordinates

Chapter 8 Geospatial Analytics
  Geospatial Association Rules
  Case Study: Crime and 311 Calls
  Spatio-Temporal Prediction
  Case Study: Predicting Weekly Shootings

Chapter 9 Perceptual Mapping with Correspondence Analysis, GPL, and OMS
  Starting with Crosstabs
  Correspondence Analysis
  Multiple Correspondence Analysis
  Crosstabulations
  Applying OMS and GPL to the MCA Perceptual Map

Chapter 10 Display Complex Relationships with Multidimensional Scaling
  Metric and Nonmetric Multidimensional Scaling
  Nonmetric Scaling of Psychology Sub-Disciplines
  Multidimensional Scaling Dialog Options
  Multidimensional Scaling Output Interpretation
  Subjective Approach to Dimension Interpretation
  Statistical Approach to Dimension Interpretation

Part III Predictive Analytics

Chapter 11 SPSS Statistics versus SPSS Modeler: Can I Be a Data Miner Using SPSS Statistics?
  What Is Data Mining?
  What Is IBM SPSS Modeler?
  Can Data Mining Be Done in SPSS Statistics?
  Hypothesis Testing, Type I Error, and Hold-Out Validation
  Significance of the Model and Importance of Each Independent Variable
  The Importance of Finding and Modeling Interactions
  Classic and Important Data Mining Tasks
  Partitioning and Validating
  Feature Selection
  Balancing
  Comparing Results from Multiple Models
  Creating Ensembles
  Scoring New Records

Chapter 12 IBM SPSS Data Preparation
  Identify Unusual Cases
  Identify Unusual Cases Dialogs
  Identify Unusual Cases Output
  Optimal Binning
  Optimal Binning Dialogs
  Optimal Binning Output

Chapter 13 Model Complex Interactions with IBM SPSS Neural Networks
  Why “Neural” Nets?
  The Famous Case of Exclusive OR and the Perceptron
  What Is a Hidden Layer and Why Is It Needed?
  Neural Net Results with the XOR Variables
  How the Weights Are Calculated: Error Backpropagation
  Creating a Consistent Partition in SPSS Statistics
  Comparing Regression to Neural Net with the Bank Salary Case Study
  Calculating Mean Absolute Percent Error for Both Models
  Classification with Neural Nets Demonstrated with the Titanic Dataset

Chapter 14 Powerful and Intuitive: IBM SPSS Decision Trees
  Building a Tree with the CHAID Algorithm
  Review of the CHAID Algorithm
  Adjusting the CHAID Settings
  CRT for Classification
  Understanding Why the CRT Algorithm Produces a Different Tree
  Missing Data
  Changing the CRT Settings
  Comparing the Results of All Four Models
  Alternative Validation Options
  The Scoring Wizard

Chapter 15 Find Patterns and Make Predictions with K Nearest Neighbors
  Using KNN to Find “Neighbors”
  The Titanic Dataset and KNN Used as a Classifier
  The Trade-Offs between Bias and Variance
  Comparing Our Models: Decision Trees, Neural Nets, and KNN
  Building an Ensemble

Part IV Syntax, Data Management, and Programmability

Chapter 16 Write More Efficient and Elegant Code with SPSS Syntax Techniques
  A Syntax Primer for the Uninitiated
  Making the Connection: Menus and the Grammar of Syntax
  What Is “Inefficient” Code?
  The Case Study
  Customer Dataset
  Fixing the ZIP Codes
  Addressing Case Sensitivity of City Names with UPPER() and LOWER()
  Parsing Strings and the Index Function
  Aggregate and Restructure
  Pasting Variable Names, TO, Recode, and Count
  DO REPEAT Spend Ratios
  Merge
  Final Syntax File

Chapter 17 Automate Your Analyses with SPSS Syntax and the Output Management System
  Overview of the Output Management System
  Running OMS from Menus
  Automatically Writing Selected Categories of Output to Different Formats
  Suppressing Output
  Working with OMS data
  Running OMS from Syntax

Chapter 18 Statistical Extension Commands
  What Is an Extension Command?
  TURF Analysis—Designing Product Bundles
  Large Problems
  Quantile Regression—Predicting Airline Delays
  Comparing Ordinary Least Squares with Quantile Regression Results
  Operational Considerations
  Support Vector Machines—Predicting Loan Default
  Background
  An Example
  Operational Issues
  Computing Cohen’s d Measure of Effect Size for a T-Test

Index
KEITH MCCORMICK is a data mining consultant, trainer, and speaker. A passionate user of SPSS for 25 years, he has trained thousands on how to effectively use SPSS Statistics and SPSS Modeler. He blogs at keithmccormick.com.

JESUS SALCEDO is an independent statistical consultant. He is a former SPSS Curriculum Team Lead and Senior Education Specialist who has written numerous SPSS training courses and trained thousands of users.

JON PECK, now retired from IBM, was a senior engineer, statistician, and product strategist for SPSS and IBM for 32 years. He designed and contributed to many features of SPSS Statistics and has consulted with and trained many users. He remains active on social media.

ANDREW WHEELER is a researcher in criminal justice and a former crime analyst. He has used SPSS for over 8 years, and often blogs SPSS tutorials at andrewpwheeler.wordpress.com.
Explore advanced techniques that unlock the full capabilities of SPSS

IBM® SPSS Statistics is a complex software package with more than a dozen specialized, high-octane tools. Unlock its most powerful aspects with this comprehensive tutorial. In-depth study of advanced statistical techniques, data visualization tools, predictive analysis, and SPSS programming will enable you to take advantage of the many SPSS features that are often overlooked, making your work more efficient and accurate.

Each chapter introduces an advanced feature, explains when and why to use it, and then walks you step-by-step through executing the technique. You will learn to use advanced regression, work with maps, display complex relationships with multidimensional scaling, and take advantage of powerful and intuitive SPSS decision trees. You'll discover how to write more efficient code with advanced SPSS syntax, combine SPSS syntax with Python, and master SPSS bootstrapping and Monte Carlo simulation. Conquering these advanced features helps you conduct more accurate and thorough analysis and create elegant, sophisticated visualizations that tell the whole story.

SPSS Statistics for Data Analysis and Visualization will show you how to:

Tell a more complete story with SPSS Amos
Build hierarchical linear models with SPSS Advanced Statistics
Turn crosstabulations into pictures with Correspondence Analysis
Map your data with advanced geospatial analysis
Model complex interactions with SPSS Neural Networks
Harness the power of R and Python in SPSS Statistics
Automate your analyses with SPSS Syntax and the Output Management System
Take advantage of SPSS® Data Preparation and SPSS® Bootstrapping, now included in SPSS Subscription Base