Cover Page

Introduction to Quantitative Data Analysis in the Behavioral and Social Sciences

Michael J. Albers
East Carolina University

Wiley

Preface

This book strives to be an introduction to quantitative data analysis for students who have little or no previous training either in statistics or in data analysis. It does not attempt to cover all types of data analysis situations, but works to impart the proper mindset in performing a data analysis. Too often the problem with poorly analyzed studies is not the number crunching itself, but a lack of the critical thinking process required to make sense of the statistical results. This book works to provide some of that training.

Statistics is a tool. Knowing how to perform a t-test or an ANOVA is similar to knowing how to use styles and page layout in Word. Just because you know how to use styles does not make you a writer. It will not even make you a good layout person if you do not know when and why to apply those styles. Likewise, statistics is not data analysis. Learning how to use a software package to perform a t-test is relatively easy and quick for a student. But knowing when and why to perform a t-test is a different, and more complex, learning outcome. I had a student, who had taken two graduate-level business statistics courses, remark when she turned in a statistics-heavy report in a writing class: “In the stat classes, I only learned enough to get me through the test problems. I have no idea how to analyze this data.” She had learned how to crunch numbers, but not how to analyze data. Bluntly, she wasted her time and money in those two classes.

The issue for researchers in the social sciences is not to learn statistics, but to learn to analyze data. The goal is not to learn how to use the statistical tests to crunch numbers, but to be able to use those tests to interpret the data and draw valid conclusions from it. There is a wide range of statistical tests relevant to data analysis: some that every researcher should be able to perform, and some that require the advice or help of a statistical expert. Good quantitative data analysis does not require a comprehensive knowledge of statistics, but, rather, knowing enough to know when it is time to ask for help and what questions to ask.

Every quantitative research study (essentially by definition) collects some type of data that must then be analyzed to help draw the study's conclusions. A great study design is useless unless the data is properly analyzed. But teaching that data analysis to students is a difficult task. What I have found is that most textbooks fall into one of these categories.

This book differs from textbooks in these two categories because it focuses on teaching how to analyze data from a study, rather than how to perform a study or how to perform individual statistical tests. Notice that in the first sentence of a previous paragraph I said “data that must then be analyzed to help draw the study's conclusions.” The key phrase in that sentence is “help draw,” as opposed to “give,” the study's conclusions. The results of statistical tests are not the final conclusions for research data analysis. The researcher must study the test results, apply them to the situational context, and then draw conclusions that make sense (see Figure 1.1 in Chapter 1). To support that process, this book works to place statistical tests within the context of a data analysis problem and provide the background to connect a specific type of data with the appropriate test. The work is placed within long examples and the entire process of data analysis is covered in a contextualized manner. It looks at the data analysis from different viewpoints and uses different tests to enable a student to learn how and when to apply different analysis methods.

Two major goals are to teach what questions to ask during all phases of a data analysis and how to judge the relevance of potential questions. It is easy to run statistical tests on all combinations of the data, but most of those tests have no relevance or validity regardless of the actual research question.

This book strives to explain the when, why, and what for, rather than the button-pushing how-to. The data analysis chapters of many research textbooks are little more than an explanation of various statistical tests. As a result, students come away thinking the important questions are procedural, such as: “How do I run a chi-squared test?” “What is the best procedure, a Kruskal–Wallis test or a standard ANOVA?” and “Let me tell you about my data, and you can tell me what procedure to run.” (Rogers, 2010, p. 8). These are the wrong questions to be asking at the beginning of a data analysis. Rather, students need to think along the lines of “what relationships do I need to understand?” and “what are the important practical issues I need to worry about?” Unfortunately, most data analysis texts get them lost in the trees of individual tests and never explain where they are within a data analysis forest.

Besides knowing when and why to perform a statistical test, there is a need for a researcher to get at the data's deep structure and not be content with the superficial structure that appears at first glance. And certainly not to be content with poor/inadequate data analysis in which the student sees the process as “run a few statistics tests, report the p-value, and call the analysis complete.”

Statistics is a tool to get where you want to go, but far too many view it either as an end for itself and the rest view it as a way of manipulating raw data in order to get a justification for what they want to do to begin with. Further, being able to start to quantify relationships and being able to quantify results does not mean that you are beginning to understand these, let alone being able to quantify anything like the risk involved (Briggs, 2008).

I recently had to review a set of undergraduate honors research project proposals; they consistently had several weeks scheduled for data collection and one week for data analysis. Unfortunately, with only one week, these students will never get more than a superficial level of understanding of their data. In many of the cases of superficial analysis, I am more than willing to place a substantial part of the blame on the instructor. There is a substantial difference between a student who chooses not to do a good data analysis and a student who does not know how to do a good data analysis. Unless students are taught how to perform an in-depth analysis, they will never perform one because they lack the knowledge. More importantly, they will lack the understanding to realize their analysis was superficial. If someone was taught the task as “do a t-test and report a p-value,” then who is to blame for the lack of data analysis knowledge?

A goal of this book is to teach that data analysis is not just crunching numbers, but a way of thinking that works to reveal the underlying patterns and trends that allow a researcher to gain an understanding of the data and its connection to the research situation. I am content with students knowing when and why to use statistical tests, even if the test's internal logic is little more than a black box.

I expect many research methods instructors will be appalled at this book's contents. The heavy statistics-based researcher or a statistics instructor will be appalled at the statistical tests I left out or at the lack of rigorous discussion of many concepts. The instructor who touches on statistics in a research methods course will be appalled at the number of tests I include and the depth of the analysis. (Yes, I fully appreciate the inherent contradiction in these two sentences.) But I sincerely hope both groups appreciate my attempt at defining statistical tests as a part of data analysis (NOT as either its totality or its end) and my goal of teaching students to approach a data analysis with the mind-set that they must analyze the data and not simply run a bunch of statistical tests.

With that said, here are some research issues this book will not address:

This book assumes the research methodology and data collection methods are valid. For instance, some examples discuss how to analyze the results of survey questions using Likert scales. Neither the design of the survey questions nor the development of Likert items will be discussed; they are assumed to be valid.

This book assumes the data's reliability and validity. The reliability and validity of the data are research design questions that a well-designed study must consider up front, but they do not affect the data analysis per se. Obviously, with poor quality data, the conclusions are questionable, but the analysis process does not change.

There are no step-by-step software instructions. There are several major statistical software packages and a researcher might use any one of them. With multiple packages, detailed-level software instructions would result in an overly long book with many pages irrelevant to any single reader. All the major software packages provide all of the basic tests covered in this book and there are essentially an infinite number of help sites and YouTube videos that explain the button-pushing aspects. Plus, the how-to is much more effectively taught one-on-one with an instructor than from a book.

The basic terminology used in research study design is used with minimal definition. For example, if the analysis differs for within-subjects and between-subjects designs, the discussion assumes the student already understands the concepts of within subjects and between subjects, since those must be understood before collecting data. Terminology relevant to a quantitative analysis will, of course, be fully defined and explained. Also, there are extensive references to definitions and concepts.

There is no attempt to cover statistical proofs or deal with the edge cases of when a test does or does not apply. Readers desiring that level of understanding need a full statistics course. There are many places where I refer the researcher to a statistician. The complexities of statistical theory and the more advanced tests may be relevant to the research, but are out of place here. This book is an introduction to data analysis, not an exhaustive data analysis tome.

This book focuses on the overall methodology and research mind-set for how to approach quantitative data analysis and how to use statistical tests as part of analyzing research data. It works to show that the goal of data analysis is to reveal the underlying patterns, trends, and relationships between the variables, and to connect those patterns, trends, and relationships to the data's contextual situation.

References

  1. Briggs, W. (2008) The limits of statistics: black swans and randomness [Web log comment]. Retrieved from http://wmbriggs.com/blog/?p=204.
  2. Rogers, J.L. (2010) The epistemology of mathematical and statistical modelling: a quiet methodological revolution. American Psychologist, 65 (1), 1–12.

About the Companion Website

This book is accompanied by a companion website:

  1. http://www.wiley.com/go/albers/quantitativedataanalysis

The website includes: