CONTENTS

Preface

Acknowledgments

About the Companion Website

1: Experimental Design

1.1 Experimental Design Background
1.2 Sampling Design
1.3 Sample Analysis
1.4 Hypotheses
1.5 Variables

2: Central Tendency and Distribution

2.1 Central Tendency and Other Descriptive Statistics
2.2 Distribution
2.3 Descriptive Statistics in Excel
2.4 Descriptive Statistics in SPSS
2.5 Descriptive Statistics in Numbers
2.6 Descriptive Statistics in R

3: Showing Your Data

3.1 Background on Tables and Graphs
3.2 Tables
3.3 Bar Graphs, Histograms, and Box Plots
3.4 Line Graphs and Scatter Plots
3.5 Pie Charts

4: Parametric versus Nonparametric Tests

4.1 Overview
4.2 Two-Sample and Three-Sample Tests

5: t-Test

5.1 Student's t-Test Background
5.2 Example t-Tests
5.3 Case Study
5.4 Excel Tutorial
5.5 Paired t-Test SPSS Tutorial
5.6 Independent t-Test SPSS Tutorial
5.7 Numbers Tutorial
5.8 R Independent/Paired-Samples t-Test Tutorial

6: ANOVA

6.1 ANOVA Background
6.2 Case Study
6.3 One-Way ANOVA Excel Tutorial
6.4 One-Way ANOVA SPSS Tutorial
6.5 One-Way Repeated Measures ANOVA SPSS Tutorial
6.6 Two-Way Repeated Measures ANOVA SPSS Tutorial
6.7 One-Way ANOVA Numbers Tutorial
6.8 One-Way R Tutorial
6.9 Two-Way ANOVA R Tutorial

7: Mann–Whitney U and Wilcoxon Signed-Rank

7.1 Mann–Whitney U and Wilcoxon Signed-Rank Background
7.2 Assumptions
7.3 Case Study – Mann–Whitney U Test
7.4 Case Study – Wilcoxon Signed-Rank
7.5 Mann–Whitney U Excel Tutorial
7.6 Wilcoxon Signed-Rank Excel Tutorial
7.7 Mann–Whitney U SPSS Tutorial
7.8 Wilcoxon Signed-Rank SPSS Tutorial
7.9 Mann–Whitney U Numbers Tutorial
7.10 Wilcoxon Signed-Rank Numbers Tutorial
7.11 Mann–Whitney U/Wilcoxon Signed-Rank R Tutorial

8: Kruskal–Wallis

8.1 Kruskal–Wallis Background
8.2 Case Study 1
8.3 Case Study 2
8.4 Kruskal–Wallis Excel Tutorial
8.5 Kruskal–Wallis SPSS Tutorial
8.6 Kruskal–Wallis Numbers Tutorial
8.7 Kruskal–Wallis R Tutorial

9: Chi-Square Test

9.1 Chi-Square Background
9.2 Case Study 1
9.3 Case Study 2
9.4 Chi-Square Excel Tutorial
9.5 Chi-Square SPSS Tutorial
9.6 Chi-Square Numbers Tutorial
9.7 Chi-Square R Tutorial

10: Pearson's and Spearman's Correlation

10.1 Correlation Background
10.2 Example
10.3 Case Study – Pearson's Correlation
10.4 Case Study – Spearman's Correlation
10.5 Pearson's Correlation Excel and Numbers Tutorial
10.6 Spearman's Correlation Excel Tutorial
10.7 Pearson/Spearman's Correlation SPSS Tutorial
10.8 Pearson/Spearman's Correlation R Tutorial

11: Linear Regression

11.1 Linear Regression Background
11.2 Case Study
11.3 Linear Regression Excel Tutorial
11.4 Linear Regression SPSS Tutorial
11.5 Linear Regression Numbers Tutorial
11.6 Linear Regression R Tutorial

12: Basics in Excel

12.1 Opening Excel
12.2 Installing the Data Analysis ToolPak
12.3 Cells and Referencing
12.4 Common Commands and Formulas
12.5 Applying Commands to Entire Columns
12.6 Inserting a Function
12.7 Formatting Cells

13: Basics in SPSS

13.1 Opening SPSS
13.2 Labeling Variables
13.3 Setting Decimal Placement
13.4 Determining the Measure of a Variable
13.5 Saving SPSS Data Files
13.6 Saving SPSS Output

14: Basics in Numbers

14.1 Opening Numbers
14.2 Common Commands
14.3 Applying Commands
14.4 Adding Functions

15: Basics in R

15.1 Opening R
15.2 Getting Acquainted with the Console
15.3 Loading Data
15.4 Installing and Loading Packages
15.5 Troubleshooting

16: Appendix

Flow Chart

Literature Cited

Glossary

Index

EULA

List of Illustrations

Chapter 1

Figure 1.1 A representation of a random sample of individuals within a population.
Figure 1.2 A systematic sample of individuals within a population, starting at the third individual and then selecting every sixth subsequent individual in the group.
Figure 1.3 A stratified sample of individuals within a population. A minimum of 20% of the individuals within each subpopulation were selected.
Figure 1.4 Bar graph comparing the body mass index (BMI) of men who eat less than 38 g of fiber per day to men who eat more than 38 g of fiber per day.
Figure 1.5 Bar graph comparing the daily dietary fiber (g) intake of men and women.

Chapter 2

Figure 2.1 Frequency distribution of the body length of the marine iguana during a normal year and an El Niño year.
Figure 2.2 Display of normal distribution.
Figure 2.3 Histogram illustrating a normal distribution.
Figure 2.4 Histogram illustrating a right skewed distribution.
Figure 2.5 Histogram illustrating a left skewed distribution.
Figure 2.6 Histogram illustrating a platykurtic curve where tails are lighter.
Figure 2.7 Histogram illustrating a leptokurtic curve where tails are heavier.
Figure 2.8 Histogram illustrating a bimodal, or double-peaked, distribution.
Figure 2.9 Histogram illustrating a plateau distribution.
Figure 2.10 Estimated lung volume of the human skeleton (590 mL), compared with the distribution of lung volumes in the nearby sea level population.
Figure 2.11 Distributions of lung volumes for the sea level population (mean = 420 mL), compared with the lung volumes of the Aymara population (mean = 590 mL).

Chapter 3

Figure 3.1 Clustered bar chart comparing the mean snowfall of alpine forests between 2013 and 2015 in Mammoth, CA; Mount Baker, WA; and Alyeska, AK.
Figure 3.2 Clustered bar chart comparing the mean snowfall of alpine forests between 2013 and 2015 in Mount Baker, WA and Alyeska, AK. An improperly scaled axis exaggerates the differences between groups.
Figure 3.3 Clumped bar chart comparing the mean snowfall of alpine forests by year (2013, 2014, and 2015) in Mammoth, CA; Mount Baker, WA; and Alyeska, AK.
Figure 3.4 Stacked bar chart comparing the mean snowfall of alpine forests by month (January, February, and March) for 2015 in Mammoth, CA; Mount Baker, WA; and Alyeska, AK.
Figure 3.5 Histogram of seal size.
Figure 3.6 Example box plot showing the median, first and third quartiles, as well as the whiskers.
Figure 3.7 Comparison of the box plot to the normal distribution of a sample population.
Figure 3.8 Sample box plot with an outlier.
Figure 3.9 Line graph comparing the monthly water temperatures (°F) for Woods Hole, MA and Avalon, CA.
Figure 3.10 Scatter plot with a line of best fit showing the relationship between temperature (°C) and the relative abundance of Mytilus trossulus to Mytilus edulis (from 0 to 100%).
Figure 3.11 Pie chart comparing the fatty acid content (saturated fat, linoleic acid, alpha-linolenic acid, and oleic acid) in standard canola oil.
Figure 3.12 Pie chart comparing the fatty acid content (saturated fat, linoleic acid, alpha-linolenic acid, and oleic acid) in standard olive oil.

Chapter 4

Figure 4.1 Example of a survey question asking the effectiveness of a new antihistamine in which the response is based on a Likert scale.
Figure 4.2 Visual representation of the SPSS menu showing how to test for homogeneity of variance.

Chapter 5

Figure 5.1 Visual representation of the error distribution in a one- versus two-tailed t-test. In a one-tailed t-test (a), all of the error (5%) is in one direction. In a two-tailed t-test (b), the error (5%) is split into the two directions.
Figure 5.2 SPSS output showing the results from an independent t-test.
Figure 5.3 Bar graph with standard deviations illustrating the comparison of mean pH levels for Upper and Lower Klamath Lake, OR.

Chapter 6

Figure 6.1 One-way ANOVA example protocol using three groups (A, B, and C).
Figure 6.2 Two-way ANOVA example protocol using three groups (A, B, and C) with subgroups (1 and 2).
Figure 6.3 One-way repeated measures ANOVA study protocol for the measurement of muscle power output at pre-, mid-, and post-season.
Figure 6.4 Two-way repeated measures ANOVA study protocol for the measurement of muscle power output at pre-, mid-, and post-season for three resistance training groups (morning, mid-day, and evening).
Figure 6.5 An intervention design layout to compare the effects of time of day for strength training (morning, mid-day, and evening) on muscle power output across a season (pre-, mid-, and post-season).
Figure 6.6 Diagram illustrating the relationship between distribution curves where groups B and C are similar but A is significantly different.
Figure 6.7 One-way ANOVA case study experimental design diagram.
Figure 6.8 One-way ANOVA SPSS output.
Figure 6.9 Bar graph illustrating the average blood lactate levels (A significantly different than B) for the control and experimental groups (SSE and HIIE).
Figure 6.10 SPSS post hoc options when analyzing data for multiple comparisons.
Figure 6.11 Post hoc multiple comparison SPSS output.

Chapter 7

Figure 7.1 Mann–Whitney U SPSS output.
Figure 7.2 Bar graph illustrating the mean ranks of land cleared for the unprotected surrounding areas and park areas.
Figure 7.3 Wilcoxon signed-rank SPSS output.
Figure 7.4 Bar graph illustrating the median changes in metabolic rate (CO₂/mL/g) pre and post meal of Gromphadorhina portentosa.

Chapter 8

Figure 8.1 Kruskal–Wallis SPSS output.
Figure 8.2 Bar graph illustrating the median number of parasites observed among the three snail species, Bulinus forskalii, Bulinus beccarii, and Bulinus cernicus.
Figure 8.3 Kruskal–Wallis SPSS output.
Figure 8.4 Bar graph illustrating the mean ranks of sleep satisfaction score for the four treatment groups.

Chapter 9

Figure 9.1 Illustration depicts the experimental setup (treatment, no response, and control) utilized in both case studies assessing leech attraction.

Chapter 10

Figure 10.1 Scatter plot of data points depicting (a) no relationship, (b) a positive relationship, and (c) a negative relationship.
Figure 10.2 Different relationships between parent and offspring beak size. (a) shows a positive relationship, (b) shows a negative relationship, and (c) shows no relationship between the two variables.
Figure 10.3 Representation of the strength of correlation based on the spread of data on a scatterplot, with higher r values indicating stronger correlation.
Figure 10.4 Scatter plots illustrating the features used to determine normality of a dataset: (a) homoscedastic data that display both a linear and elliptical shape satisfy the normality assumption, (b) homoscedastic data that display an elliptical shape satisfy the normality assumption, (c) heteroscedastic data that are funnel shaped, rather than elliptical or circular, violate the normality assumption, (d) the presence of outliers violates the normality assumption, and (e) data that are non-linear also violate the normality assumption.
Figure 10.5 Pearson's correlation SPSS output.
Figure 10.6 Scatter plot illustrating number of hours studied and student exam scores for 28 students.
Figure 10.7 Spearman's correlation SPSS output.
Figure 10.8 Scatter plot illustrating number of hours studied and feeling of preparedness based on a Likert scale (1–5) for 28 students.

Chapter 11

Figure 11.1 Scatter plot with regression line representing a typical regression analysis.
Figure 11.2 Graphs depicting the spread around the trend line. Orientation of the slope determines the type of relationship between x and y and R² describes the strength of the relationship.
Figure 11.3 Linear regression SPSS output.
Figure 11.4 Scatter plot with regression line illustrating the relationship between distance from the cattle farm (kilometer) and the number of antibiotic resistant colonies.

Examples from the Book

The tutorials in this book are built to show a variety of approaches to using Microsoft Excel, SPSS, Apple Numbers, and R, so the student can find their own unique style in working with statistical software, as well as to enrich the student learning experience through exposure to more and varied examples. Most of the data used in this book were obtained directly from published articles or were drawn from unpublished datasets with permission from the faculty at the University of La Verne. In some tutorials, data were generated strictly for teaching purposes; however, data were based on actual trends observed in the literature.

1
Experimental Design

1.1 Experimental Design Background

As scientists, our knowledge of the natural world comes from direct observations and experiments. A good experimental design is essential for making inferences and drawing appropriate conclusions from our observations. Experimental design starts by formulating an appropriate question and then knowing how data can be collected and analyzed to help answer your question. Let us take the following example.

Case Study

Observation: A healthy body weight is correlated with good diet and regular physical activity. One component of a good diet is consuming enough fiber; therefore, one question we might ask is: do Americans who eat more fiber on a daily basis have a healthier body weight or body mass index (BMI) score?

How would we go about answering this question?

In order to get the most accurate data possible, we would need to design an experiment that would allow us to survey the entire population (all possible test subjects – all people living in the United States) regarding their eating habits and then match those to their BMI scores. However, it would take a lot of time and money to survey every person in the country. In addition, if too much time elapses from the beginning to the end of collection, then the accuracy of the data would be compromised.

More practically, we would choose a representative sample with which to make our inferences. For example, we might survey 5000 men and 5000 women to serve as a representative sample. We could then use that smaller sample as an estimate of our population to evaluate our question. In order to get a proper (and unbiased) sample and estimate of the population, the researcher must decide on the best (and most effective) sampling design for a given question.

1.2 Sampling Design

Below are some examples of sampling strategies that a researcher could use in setting up a research study. The strategy you choose will be dependent on your research question. Also keep in mind that the sample size (N) needed for a given study varies by discipline. Check with your mentor and look at the literature to verify appropriate sampling in your field.

Some of the sampling strategies introduce bias. Bias occurs when certain individuals are more likely to be selected than others in a sample. A biased sample can change the predictive accuracy of your sample; however, sometimes bias is acceptable and expected as long as it is identified and justifiable. Make sure that your question matches and acknowledges the inherent bias of your design.

Random Sample

In a random sample all individuals within a population have an equal chance of being selected, and the choice of one individual does not influence the choice of any other individual (as illustrated in Figure 1.1). A random sample is assumed to be the best technique for obtaining an accurate representation of a population. This technique is often associated with a random number generator, where each individual is assigned a number and then selected randomly until a preselected sample size is reached. A random sample is preferred in most situations, unless there are limitations to data collection or there is a preference by the researcher to look specifically at subpopulations within the larger population.

Diagram shows 8 columns and 5 rows of smiley faces where few of them are colored. — **Figure 1.1** A representation of a random sample of individuals within a population.

In our BMI example, a person in Chicago and a person in Seattle would have an equal chance of being selected for the study. Likewise, selecting someone in Seattle does not eliminate the possibility of selecting other participants from Seattle. As easy as this seems in theory, it can be challenging to put into practice.
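The random number generator approach described above can be sketched in a few lines of Python. This is only an illustration: the population of 40 numbered individuals, the sample size of 8, and the function name `random_sample` are assumptions made for the example, not values taken from the study.

```python
import random


def random_sample(population, sample_size, seed=None):
    """Draw a simple random sample: every individual has the same
    chance of being selected, and nobody is selected twice."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(population, sample_size)


# Hypothetical population: 40 individuals, each assigned a number from 1 to 40
population = list(range(1, 41))

# Preselected sample size of 8 (an arbitrary choice for illustration)
sample = random_sample(population, 8, seed=1)
print(sorted(sample))
```

Changing or omitting the seed produces a different, equally valid random sample.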

Systematic Sample

A systematic sample is similar to a random sample. In this case, potential participants are ordered (e.g., alphabetically), a random first individual is selected, and every kth individual afterward is picked for inclusion in the sample. It is best practice to randomly choose the first participant and not to simply choose the first person on the list. A random number generator is an effective tool for this. To determine k, divide the number of individuals within a population by the desired sample size.

This technique is often used within institutions or companies where there are a large number of potential participants and a subset is desired. In Figure 1.2, the third person (going down the first column) is the first individual selected and every sixth person afterward is selected for a total of 7 out of 40 possible.
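The selection in Figure 1.2 can be reproduced with a short Python sketch. The function names are hypothetical, and rounding k to the nearest whole number is an assumption (40 divided by 7 is about 5.7, which rounds to the interval of 6 used in the figure).

```python
import random


def sampling_interval(population_size, sample_size):
    """k = population size / desired sample size, rounded to a whole number."""
    return max(1, round(population_size / sample_size))


def systematic_sample(population, k, start=None, seed=None):
    """Select every kth individual from an ordered list.

    `start` is the 0-based position of the first individual chosen.
    If it is not supplied, it is picked at random from the first k positions.
    """
    if start is None:
        start = random.Random(seed).randrange(k)
    return population[start::k]


population = list(range(1, 41))              # 40 ordered individuals
k = sampling_interval(len(population), 7)    # 40 / 7 is about 5.7, so k = 6

# Figure 1.2: begin with the third individual (0-based position 2)
figure_sample = systematic_sample(population, k, start=2)
print(figure_sample)  # [3, 9, 15, 21, 27, 33, 39]

# Best practice: let the generator choose the starting individual
print(systematic_sample(population, k, seed=0))
```

Note that with a random start the final sample can be one individual short of the target (here 6 instead of 7) when the interval does not divide the population evenly.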

Stratified Sample

A stratified sample is necessary if your population includes a number of different categories and you want to make sure your sample includes all categories (e.g., gender, ethnicity, other categorical variables). In Figure 1.3, the population is organized first by category (i.e., strata) and then random individuals are selected from each category.

Diagram shows 8 columns and 5 rows of smiley faces where most of them are colored. — **Figure 1.3** A stratified sample of individuals within a population. A minimum of 20% of the individuals within each subpopulation were selected.

In our BMI example, we might want to make sure all regions of the country are represented in the sample. For example, you might want to randomly choose at least one person from each city represented in your population (e.g., Seattle, Chicago, New York, etc.).
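A stratified draw can be sketched as follows. The three cities, the number of people in each, and the 20% minimum (borrowed from the caption of Figure 1.3) are illustrative assumptions, as is the function name `stratified_sample`.

```python
import math
import random


def stratified_sample(strata, percent, seed=None):
    """Randomly select at least `percent` percent of each stratum,
    and never fewer than one individual per stratum."""
    rng = random.Random(seed)
    chosen = {}
    for name, members in strata.items():
        n = max(1, math.ceil(len(members) * percent / 100))
        chosen[name] = rng.sample(members, n)
    return chosen


# Hypothetical population organized by city (the strata)
strata = {
    "Seattle": [f"Seattle-{i}" for i in range(1, 11)],    # 10 people
    "Chicago": [f"Chicago-{i}" for i in range(1, 16)],    # 15 people
    "New York": [f"New York-{i}" for i in range(1, 16)],  # 15 people
}

chosen = stratified_sample(strata, percent=20, seed=1)
for city, people in chosen.items():
    print(city, len(people), people)
```

Because the random draw happens separately inside each category, every category is guaranteed to appear in the final sample.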

Volunteer Sample

A volunteer sample is used when participants volunteer for a particular study. Bias would be assumed for a volunteer sample because people who are likely to volunteer typically have certain characteristics in common. As with all other sample types, collecting demographic data is important for a volunteer study, so that you can identify most of the potential biases in your data.

Sample of Convenience

A sample of convenience is not representative of a target population because it gives preference to individuals within close proximity. The reality is that samples are often chosen based on the availability of a sample to the researcher.

Here are some examples:

A university researcher interested in studying BMI versus fiber intake might choose to sample from the students or faculty she has direct access to on her campus.
A skeletal biologist might observe skeletons buried in a particular cemetery, although there are other cemeteries in the same ancient city.
A malacologist with a limited time frame may only choose to collect snails from populations in close proximity to roads and highways.

In any of these cases, the researcher assumes that the sample is biased and may not be representative of the population as a whole.

For all studies involving living human participants, you need to ensure that you have submitted your research proposal to your campus’ Institutional Review Board (IRB) or Ethics Committee prior to initiating the research protocol. For studies involving animals, submit your research proposal to the Institutional Animal Care and Use Committee (IACUC).

Counterbalancing

When designing an experiment with paired data (e.g., testing multiple treatments on the same individuals), you may need to consider counterbalancing to control for bias. Bias in these cases may take the form of the subjects learning and changing their behavior between trials, slight differences in the environment during different trials, or some other variable whose effects are difficult to control between trials. By counterbalancing we try to offset the slight differences that may be present in our data due to these circumstances. For example, if you were investigating the effects of caffeine consumption on strength, compared to a placebo, you would want to counterbalance the order of the caffeine and placebo sessions. By dividing the entire test population into two groups (A and B), and testing them on two separate days under alternating conditions, you would counterbalance the laboratory sessions. One group (A) would come to the laboratory and undergo testing following caffeine consumption, and the other group (B) would come to the laboratory and consume the placebo on the same day. To ensure washout of the caffeine, each group would come back one week later, on the same day of the week and at the same time, and undergo the strength tests under the opposite conditions from day 1. Thus, group B would consume the caffeine and group A would consume the placebo on testing day 2. By counterbalancing the sessions you reduce the risk of one group having an advantage or a different experience over the other, which can ultimately impact your data.
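The caffeine and placebo schedule above can be laid out in a short Python sketch. The participant labels are made up, and shuffling participants before splitting them into groups A and B is an assumption; the text only requires that the population be divided into two groups.

```python
import random


def counterbalance(participants, conditions=("caffeine", "placebo"), seed=None):
    """Split participants into groups A and B and give the two groups
    the conditions in opposite order across the two testing days."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)          # assumption: random assignment to groups
    half = len(shuffled) // 2
    groups = {"A": shuffled[:half], "B": shuffled[half:]}

    first, second = conditions
    schedule = {
        "A": {"day 1": first, "day 2": second},   # A: caffeine, then placebo
        "B": {"day 1": second, "day 2": first},   # B: placebo, then caffeine
    }
    return groups, schedule


# Hypothetical participants P1..P10
participants = [f"P{i}" for i in range(1, 11)]
groups, schedule = counterbalance(participants, seed=3)

for name in ("A", "B"):
    print(name, groups[name], schedule[name])
```

Every participant experiences both conditions, and on each testing day half of the participants are under each condition, so an effect of the day itself does not favor one treatment.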

1.3 Sample Analysis

Once we take a sample of the population, we can use descriptive statistics to characterize the population. Our estimate may include the mean and variance of the sample group. For example, we may want to compare the mean BMI score of men who consume more than 38 g of dietary fiber per day with that of men who consume less than 38 g of dietary fiber per day (as indicated in Figure 1.4). We cannot sample all men; therefore, we might randomly sample 100 men from the larger population for each category (<38 g and >38 g). In this study, our sample group, or subset, of 200 men (N = 200) is assumed to be representative of the whole.

Bar graph shows dietary fiber intake between less than 38 and more than 38 versus body mass index from 20 to 35. — **Figure 1.4** Bar graph comparing the body mass index (BMI) of men who eat less than 38 g of fiber per day to men who eat more than 38 g of fiber per day.

Although this estimate would not yield the exact same results as a larger study with more participants, we are likely to get a good estimate that approximates the population mean. We can then use inferential statistics to determine the quality of our estimate in describing the sample and determine our ability to make predictions about the larger population.
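As a minimal sketch of the descriptive step, the mean and variance of each group can be computed with Python's standard `statistics` module. The BMI scores below are invented purely for illustration (six per group rather than 100) and are not data from the book.

```python
import statistics

# Hypothetical BMI scores, invented for illustration only
bmi_low_fiber = [27.1, 29.4, 31.0, 26.5, 30.2, 28.8]    # < 38 g fiber per day
bmi_high_fiber = [24.3, 25.9, 23.7, 26.4, 22.8, 25.1]   # > 38 g fiber per day


def describe(values):
    """Return the sample size, mean, and sample variance (n - 1 denominator)."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "variance": statistics.variance(values),
    }


low = describe(bmi_low_fiber)
high = describe(bmi_high_fiber)

for label, summary in (("< 38 g", low), ("> 38 g", high)):
    print(label, summary)
```

The sample variance is used here because the values are treated as a sample drawn from a larger population, not as the whole population.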

If we wanted to compare dietary fiber intake between men and women, we could go beyond descriptive statistics to evaluate whether the two groups (populations) are different, as in Figure 1.5. Inferential statistics allows us to estimate how confident we can be that the two samples come from the same population or from two different populations. To compare men and women, we could use an independent t-test for statistical analysis. In this case, we would obtain the means for both groups, as well as a p-value, which estimates how likely it is that a difference at least as large as the one observed would arise by chance if the two groups came from the same population.

Bar graph shows gender between men and women versus daily dietary fiber in grams from 0 to 40. — **Figure 1.5** Bar graph comparing the daily dietary fiber (g) intake of men and women.
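To make the comparison of men and women concrete, the sketch below computes the test statistic for an independent t-test using only Python's standard library. It assumes the classic equal-variance (pooled) form of the test, and the fiber intake values are invented for illustration. The p-value itself would normally be obtained from a t-distribution table or from statistical software, so it is not computed here.

```python
import math
import statistics


def independent_t(sample_a, sample_b):
    """Return the t statistic and degrees of freedom for an independent
    two-sample t-test, assuming equal variances (pooled variance)."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)

    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df


# Hypothetical daily fiber intake (g), invented for illustration only
men = [18.2, 21.5, 16.9, 19.4, 22.1, 17.8]
women = [15.1, 14.3, 17.0, 13.8, 16.2, 15.5]

t_stat, df = independent_t(men, women)
print(f"t = {t_stat:.3f}, df = {df}")
```

The further the t statistic lies from zero for a given number of degrees of freedom, the smaller the corresponding p-value.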

1.4 Hypotheses

In essence, statistics is hypothesis testing. A hypothesis is a testable statement that provides a possible explanation for an observable event or phenomenon. A good, testable hypothesis implies that the independent variable (established by the researcher) and dependent variable (also called a response variable) can be measured. Often, hypotheses in science laboratories (general biology, cell biology, chemistry, etc.) are written as “If…then…” statements; however, in scientific publications, hypotheses are rarely spelled out in this way. Instead, you will see them formulated in terms of possible explanations for a problem. In this book, we will introduce formalized hypotheses used specifically for statistical analysis. Hypotheses are formulated as either the null hypothesis or alternative hypotheses. Within certain chapters of this book, we mark opportunities to formulate hypotheses with an inline symbol.

In the simplest scenario, the null hypothesis (H₀) assumes that there is no difference between groups. Therefore, the null hypothesis assumes that any observed difference between groups is based merely on variation in the population. In the dietary fiber example, our null hypothesis would be that there is no difference in fiber consumption between the sexes.

The alternative hypotheses (H₁, H₂, etc.) are possible explanations for the significant differences observed between study populations. In the example above, we could have several alternative hypotheses. An example for the first alternative hypothesis, H₁, is that there will be a difference in the dietary fiber intake between men and women.

Good hypothesis statements will include a rationale or reason for the difference. This rationale will correspond with the background research you have gathered on the system.

It is important to keep in mind that differences between groups could be due to other variables that were not accounted for in our experimental design. For example, if you surveyed men and women over the telephone but did not ask about other dietary choices (e.g., Atkins, South Beach, vegan diets), you may have introduced bias unexpectedly. If, by chance, all the men were on a high protein diet and the women were vegan, this could bring bias into your sample. It is important to plan out your experiments and consider all variables that may influence the outcome.

1.5 Variables

An important component of experimental design is to define and identify the variables inherent in your sample. To explain these variables, let us look at another example.

Case Study

In 1995, wolves were reintroduced to Yellowstone National Park after an almost 70-year absence. Without the wolf, many predator–prey dynamics had changed, with one prominent consequence being an explosion of the elk population. As a result, much of the vegetation in the park was consumed, resulting in obvious changes, such as to nesting bird habitat, but also more obscure effects like stream health. With the reintroduction of the wolf, park rangers and scientists began noticing dramatic and far reaching changes to food webs and ecosystems within the park. One question we could ask is how trout populations were impacted by the reintroduction of the wolf. To design this experiment, we will need to define our variables.

The independent variable, also known as the treatment, is the part of the experiment established by or directly manipulated by the researcher that causes a potential change in another variable (the dependent variable). In the wolf example, the independent variable is the presence/absence of wolves in the park.

The dependent variable, also known as the response variable, changes because it “depends” on the influence of the independent variable. There is often only one independent variable (depending on the level of research); however, there can potentially be several dependent variables. In the question above, there is only one dependent variable – trout abundance. However, in a separate question, we could examine how wolf introduction impacted populations of beavers, coyotes, bears, or a variety of plant species.

Controlled variables are other variables or factors that cause direct changes to the dependent variable(s) unrelated to the changes caused by the independent variable. Controlled variables must be carefully monitored to avoid error or bias in an experiment. Examples of controlled variables in our example would be abiotic factors (such as sunlight) and biotic factors (such as bear abundance). In the Yellowstone wolf/trout example, researchers would need to survey the same streams at the same time of year over multiple seasons to minimize error.

Here is another example: In a general biology laboratory, the students in the class are asked to determine which fertilizer is best for promoting plant growth. Each student in the class is given three plants; the plants are of the same species and size. For the experiment, each plant is given a different fertilizer (A, B, and C). What are the other variables that might influence a plant's growth?

Let us say that the three plants are not receiving equal sunlight: the one on the right (C) is receiving the most sunlight and the one on the left (A) is receiving the least. In this experiment, the results would likely show that the plant on the right became more mature, with larger and fuller flowers. This might lead the experimenter to conclude that fertilizer C is the best fertilizer for flowering plants. However, the results are biased because the variables were not controlled. We cannot determine whether the larger flowers were the result of a better fertilizer or just more sunlight.

Types of Variables

Categorical variables are those that fall into two or more categories. Examples of categorical variables are nominal variables and ordinal variables.

Nominal variables are counted, not measured, and they have no numerical value or rank. Instead, nominal variables classify information into two or more categories. Here are some examples:

Sex (male, female)
College major (Biology, Kinesiology, English, History, etc.)
Mode of transportation (walk, cycle, drive alone, carpool)
Blood type (A, B, AB, O)

Ordinal variables, like nominal variables, have two or more categories; however, the order of the categories is significant. Here are some examples:

Satisfaction survey (1 = “poor,” 2 = “acceptable,” 3 = “good,” 4 = “excellent”)
Levels of pain (mild, moderate, severe)
Stage of cancer (I, II, III, IV)
Level of education (high school, undergraduate, graduate)

Ordinal variables are ranked; however, no arithmetic-like operations are possible (i.e., rankings of poor (1) and acceptable (2) cannot be added together to get a good (3) rating).

Quantitative variables are variables that are counted or measured on a numerical scale. Examples of quantitative variables include height, body weight, time, and temperature. Quantitative variables fall into two categories: discrete and continuous.

Discrete variables are variables that are counted:

Number of wing veins
Number of people surveyed
Number of colonies counted

Continuous variables are numerical variables that are measured on a continuous scale and can be either ratio or interval.

Ratio variables have a true zero point and comparisons of magnitude can be made. For instance, a snake that measures 4 feet in length can be said to be twice the length of a 2 foot snake. Examples of ratio variables include: height, body weight, and income.

Interval variables have an arbitrarily assigned zero point. Unlike with ratio data, comparisons of magnitude among different values on an interval scale are not possible. An example of an interval variable is temperature (Celsius or Fahrenheit scale): 20 °C is not twice as warm as 10 °C, because 0 °C does not represent an absence of heat.
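The contrast between ratio and interval scales can be checked with a little arithmetic. In the sketch below, which uses hypothetical helper names, changing the units of a ratio variable (feet to inches) leaves the ratio between two values untouched, while changing the units of an interval variable (Celsius to Fahrenheit) changes it, which is why such ratios carry no meaning.

```python
def feet_to_inches(feet):
    """Length is a ratio variable: zero feet and zero inches both mean no length."""
    return feet * 12


def celsius_to_fahrenheit(celsius):
    """Temperature in Celsius or Fahrenheit is an interval variable:
    the zero points of the two scales are arbitrary and do not coincide."""
    return celsius * 9 / 5 + 32


# Ratio variable: a 4 foot snake versus a 2 foot snake
ratio_feet = 4 / 2
ratio_inches = feet_to_inches(4) / feet_to_inches(2)
print(ratio_feet, ratio_inches)          # 2.0 2.0

# Interval variable: 20 degrees Celsius versus 10 degrees Celsius
ratio_celsius = 20 / 10
ratio_fahrenheit = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)
print(ratio_celsius, ratio_fahrenheit)   # 2.0 1.36
```

Differences on an interval scale remain meaningful: the 10 degree gap in Celsius always corresponds to an 18 degree gap in Fahrenheit, wherever it falls on the scale.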

An Introduction to Statistical Analysis in Research

With Applications in the Biological and Life Sciences
