# Econometrics For Dummies®

Visit www.dummies.com/cheatsheet/econometrics to view this book's cheat sheet.

Introduction

Foolish Assumptions

Icons Used in This Book

Beyond the Book

Where to Go from Here

Part I: Getting Started with Econometrics

Chapter 1: Econometrics: The Economist’s Approach to Statistical Analysis

Evaluating Economic Relationships

Using economic theory to describe outcomes and make predictions

Relying on sensible assumptions

Applying Statistical Methods to Economic Problems

Recognizing the importance of data type, frequency, and aggregation

Avoiding the data-mining trap

Incorporating quantitative and qualitative information

Using Econometric Software: An Introduction to STATA

Getting acquainted with STATA

Creating new variables

Estimating, testing, and predicting

Chapter 2: Getting the Hang of Probability

Reviewing Random Variables and Probability Distributions

Looking at all possibilities: Probability density function (PDF)

Summing up the probabilities: Cumulative distribution function (CDF)

Putting variable information together: Bivariate or joint probability density

Predicting the future using what you know: Conditional probability density

Understanding Summary Characteristics of Random Variables

Making generalizations with expected value or mean

Measuring variance and standard deviation

Looking at relationships with covariance and correlation

Chapter 3: Making Inferences and Testing Hypotheses

Getting to Know Your Data with Descriptive Statistics

Calculating parameters and estimators

Determining whether an estimator is good

Laying the Groundwork of Prediction with the Normal and Standard Normal Distributions

Recognizing usual variables: Normal distribution

Putting variables on the same scale: Standard normal distribution (Z)

Working with Parts of the Population: Sampling Distributions

Simulating and using the central limit theorem

Defining the chi-squared (χ²), t, and F distributions

Making Inferences and Testing Hypotheses with Probability Distributions

Performing a hypothesis test

The confidence interval approach

The test of significance approach

Part II: Building the Classical Linear Regression Model

Chapter 4: Understanding the Objectives of Regression Analysis

Making a Case for Causality

Getting Acquainted with the Population Regression Function (PRF)

Setting up the PRF model

Walking through an example

Collecting and Organizing Data for Regression Analysis

Taking a snapshot: Cross-sectional data

Looking at the past to explain the present: Time-series data

Combining the dimensions of space and time: Panel or longitudinal data

Joining multiple snapshots: Pooled cross-sectional data

Chapter 5: Going Beyond Ordinary with the Ordinary Least Squares Technique

Defining and Justifying the Least Squares Principle

Estimating the Regression Function and the Residuals

Obtaining Estimates of the Regression Parameters

Finding the formulas necessary to produce optimal coefficient values

Calculating the estimated regression coefficients

Interpreting Regression Coefficients

Seeing what regression coefficients have to say

Standardizing regression coefficients

Measuring Goodness of Fit

Decomposing variance

Measuring proportion of variance with R²

Adjusting the goodness of fit in multiple regression

Evaluating fit versus quality

Chapter 6: Assumptions of OLS Estimation and the Gauss-Markov Theorem

Characterizing the OLS Assumptions

Linearity in parameters and additive error

Random sampling and variability

Imperfect linear relationships among the independent variables

Error term has a zero conditional mean; correct specification

Error term has a constant variance

Correlation of error observations is zero

Relying on the CLRM Assumptions: The Gauss-Markov Theorem

Proving the Gauss-Markov theorem

Summarizing the Gauss-Markov theorem

Chapter 7: The Normality Assumption and Inference with OLS

Describing the Role of the Normality Assumption

The error term and the sampling distribution of OLS coefficients

Revisiting the standard normal distribution

Deriving a chi-squared distribution from the random error

OLS standard errors and the t-distribution

Testing the Significance of Individual Regression Coefficients

Picking an approach

Choosing the level of significance and p-values

Analyzing Variance to Determine Overall or Joint Significance

Normality, variance, and the F distribution

The reported F-statistic from OLS

Slope coefficients and the relationship between t and F

Joint significance for subsets of variables

Applying Forecast Error to OLS Predictions

Mean prediction and forecast error

Variance of mean prediction

Not all predictions are the same: The prediction confidence interval

Part III: Working with the Classical Regression Model

Chapter 8: Functional Form, Specification, and Structural Stability

Employing Alternative Functions

Quadratic function: Best for finding minimums and maximums

Cubic functions: Good for inflection points

Inverse function: Limiting the value of the dependent variable

Giving Linearity to Nonlinear Models

Working both sides to keep elasticity constant: The log-log model

Making investments and calculating rates of return: The log-linear model

Decreasing the change of the dependent variable: The linear-log model

Checking for Misspecification

Too many or too few: Selecting independent variables

Sensitivity isn’t a virtue: Examining misspecification with results stability

Chapter 9: Regression with Dummy Explanatory Variables

Defining a dummy variable when you have only two possible characteristics

Juggling multiple characteristics with dummy variables

Finding Average Differences by Using a Dummy Variable

Specification

Interpretation

Combining Quantitative and Qualitative Data in the Regression Model

Specification

Interpretation

Interacting Quantitative and Qualitative Variables

Specification

Interpretation

Interacting Two (or More) Qualitative Characteristics

Specification

Interpretation

Segregate and Integrate: Testing for Significance

Revisiting the F-test for joint significance

Revisiting the Chow test

Part IV: Violations of Classical Regression Model Assumptions

Chapter 10: Multicollinearity

Distinguishing between the Types of Multicollinearity

Pinpointing perfect multicollinearity

Zeroing in on high multicollinearity

Rules of Thumb for Identifying Multicollinearity

Pairwise correlation coefficients

Auxiliary regression and the variance inflation factor (VIF)

Knowing When and How to Resolve Multicollinearity Issues

Get more data

Use a new model

Expel the problem variable(s)

Chapter 11: Heteroskedasticity

Distinguishing between Homoskedastic and Heteroskedastic Disturbances

Homoskedastic error versus heteroskedastic error

The consequences of heteroskedasticity

Detecting Heteroskedasticity with Residual Analysis

Examining the residuals in graph form

Brushing up on the Breusch-Pagan test

Getting acquainted with the White test

Trying out the Goldfeld-Quandt test

Conducting the Park test

Correcting Your Regression Model for the Presence of Heteroskedasticity

Weighted least squares (WLS)

Robust standard errors (also known as White-corrected standard errors)

Chapter 12: Autocorrelation

Examining Patterns of Autocorrelation

Positive versus negative autocorrelation

Misspecification and autocorrelation

Illustrating the Effect of Autoregressive Errors

Analyzing Residuals to Test for Autocorrelation

Taking the visual route: Graphical inspection of residuals

Using the normal distribution to identify residual sequences: The run test

Detecting autocorrelation of an AR(1) process: The Durbin-Watson test

Detecting autocorrelation of an AR(q) process: The Breusch-Godfrey test

Remedying Harmful Autocorrelation

Feasible generalized least squares (FGLS)

Serial correlation robust standard errors

Part V: Discrete and Restricted Dependent Variables in Econometrics

Chapter 13: Qualitative Dependent Variables

Modeling Discrete Outcomes with the Linear Probability Model (LPM)

Estimating LPM with OLS

Presenting the Three Main LPM Problems

Non-normality of the error term

Heteroskedasticity

Unbounded predicted probabilities

Specifying Appropriate Nonlinear Functions: The Probit and Logit Models

Working from the standard normal CDF: The probit model

Basing off of the logistic CDF: The logit model

Using Maximum Likelihood (ML) Estimation

Constructing the likelihood function

The log transformation and ML estimates

Interpreting Probit and Logit Estimates

Probit coefficients

Logit coefficients

Chapter 14: Limited Dependent Variable Models

The Nitty-Gritty of Limited Dependent Variables

Censored dependent variables

Truncated dependent variables

Modifying Regression Analysis for Limited Dependent Variables

Tobin’s Tobit

Truncated regression

Oh, what the heck if I self-select? Heckman’s selection bias correction

Part VI: Extending the Basic Econometric Model

Chapter 15: Static and Dynamic Models

Using Contemporaneous and Lagged Variables in Regression Analysis

Examining problems with dynamic models

Testing and correcting for autocorrelation in dynamic models

Projecting Time Trends with OLS

Spurious correlation and time series

Detrending time-series data

Estimating seasonality effects

Deseasonalizing time-series data

Chapter 16: Diving into Pooled Cross-Section Analysis

Adding a Dynamic Time Element to the Mix

Examining intercepts and/or slopes that change over time

Incorporating time dummy variables

Using Experiments to Estimate Policy Effects with Pooled Cross Sections

Benefitting from random assignment: A true experiment

Working with predetermined subject groups: A natural (or quasi) experiment

Chapter 17: Panel Econometrics

Estimating the Uniqueness of Each Individual Unit

First difference (FD) transformation

Dummy variable (DV) regression

Fixed effects (FE) estimator

Increasing the Efficiency of Estimation with Random Effects

The composite error term and assumptions of the random effects model

The random effects (RE) estimator

Testing Efficiency against Consistency with the Hausman Test

Part VII: The Part of Tens

Chapter 18: Ten Components of a Good Econometrics Research Project

Introducing Your Topic and Posing the Primary Question of Interest

Discussing the Relevance and Importance of Your Topic

Reviewing the Existing Literature

Describing the Conceptual or Theoretical Framework

Discussing the Estimation Method(s)

Providing a Detailed Description of Your Data

Constructing Tables and Graphs to Display Your Results

Interpreting the Reported Results

Summarizing What You Learned

Chapter 19: Ten Common Mistakes in Applied Econometrics

Failing to Use Your Common Sense and Knowledge of Economic Theory

Ignoring the Work and Contributions of Others

Failing to Familiarize Yourself with the Data

Making It Too Complicated

Being Inflexible to Real-World Complications

Looking the Other Way When You See Bizarre Results

Obsessing over Measures of Fit and Statistical Significance

Appendix: Statistical Tables

Cheat Sheet

Roberto Pedace is an associate professor of economics at Scripps College in Claremont, California. Prior to joining the faculty at Scripps College, he held positions at Claremont Graduate University, the University of Redlands, Claremont McKenna College, and the U.S. Census Bureau. He holds a PhD in economics from the University of California, Riverside.

Roberto regularly teaches courses in the areas of statistics, microeconomics, labor economics, and econometrics. While at the University of Redlands, he was nominated for both the Innovative Teaching Award and the Outstanding Teaching Award. At Scripps College, he was recognized for his scholarly achievements by winning the Mary W. Johnson Faculty Achievement Award in Scholarship.

Roberto’s academic research interests are in the area of labor and personnel economics. His work addresses a variety of important public policy issues, including the effects of immigration on domestic labor markets and the impact of minimum wages on job training and unemployment. He also examines salary determination and personnel decisions in markets for professional athletes. His published work appears in the Southern Economic Journal, the Journal of Sports Economics, Contemporary Economic Policy, Industrial Relations, and other outlets.

Roberto is also a soccer fanatic. He’s been playing soccer since the age of 5, paid for most of his undergraduate education with a soccer scholarship, and had a short semi-professional stint in the USISL (now known as the United Soccer League). He continues to participate in leagues and tournaments but now mostly enjoys sitting on the sidelines watching his children play soccer.

Dedication

To my wife, Cynthia, for supporting me emotionally and being a wonderful mother to our children. To my children, Vincent and Emily, for brightening up my days.

Author’s Acknowledgments

None of this would have been possible if my professors hadn’t motivated me and given me a solid foundation in economics. My undergraduate adviser at California State University, San Bernardino, Thomas Pierce, opened my eyes to the world of economics and gave me wonderful advice in preparation for graduate school. I was fortunate to have taken several courses from Nancy Rose and Mayo Toruño, who helped me see economics in a different light when standard theory just wasn’t helping me understand certain aspects of the world. Kazim Konyar was the first to introduce me to the realm of econometrics and helped me understand how it could be a powerful complement to economic theory. At the University of California, Riverside, Aman Ullah’s uncanny ability to make advanced econometric theory comprehensible to a first-year graduate student solidified my interest in the topic. Finally, in his labor economics course and as my dissertation adviser, David Fairris taught me the art of using econometrics to address important economic policy issues.

Many of my econometrics students deserve special gratitude. Several of them stand out: Lora Brill, Megan Cornell, Guadalupe De La Cruz, Matthew Lang, Chandler Lutz, India Mullady, and Stephanie Rohn. Some became friends, a few colleagues, and a couple coauthors, but all inspired me to think of effective approaches to making econometrics accessible, useful, and interesting.

I thank Sean Flynn, my friend and colleague, for believing that I’d be the best person to write this book and Linda Roghaar, my literary agent, for listening to Sean and having faith in my ability to complete the project.

The folks at Wiley have also been incredibly supportive. In particular, I’d like to thank Jennifer Tebbe, my project editor, for working with me every step of the way and keeping me motivated to stay on track with my deadlines. No matter how long the tunnel, she always helped me see the light at the end. Erin Calligan Mooney, my acquisitions editor at Wiley, also helped me get through my sample chapter and ensured that it would meet the standards of others on the editorial team. My copy editor, Caitie Copple, and technical reviewers, Ariel Belasen and Nicole Bissessar, were ideal for this project. Their “eagle eyes” were instrumental in finding my mistakes and improving the finished product.

My research assistant, Anne Miles, gathered data for some of the examples I use in the book and assisted with the imaging of figures and graphs. Her turnaround time was amazing, and I’ll be forever grateful for all the hard work she provided on this project. I also want to thank my friend and colleague, Latika Chaudhary, for responding immediately to an urgent request for a sample of panel data.

Last, but not least, I’d like to thank my family and friends for being patient with me while I wrote this book. I know that sometimes I wasn’t myself and that I’ll need to make up for lost time.

Publisher’s Acknowledgments

Some of the people who helped bring this book to market include the following:

Acquisitions, Editorial, and Vertical Websites

Project Editor: Jennifer Tebbe

Acquisitions Editor: Erin Calligan Mooney

Copy Editor: Caitlin Copple

Assistant Editor: David Lutton

Editorial Program Coordinator: Joe Niesen

Technical Editors: Ariel Belasen, Nicole Bissessar

Editorial Manager: Christine Meloy Beck

Editorial Assistants: Rachelle S. Amick, Alexa Koschier

Composition Services

Project Coordinator: Sheree Montgomery

Layout and Graphics: Carrie A. Cesavice, Christin Swinford

Indexer: Riverside Indexes, Inc.

Publishing and Editorial for Consumer Dummies

Kathleen Nebenhaus, Vice President and Executive Publisher

David Palmer, Associate Publisher

Kristin Ferguson-Wagstaffe, Product Development Director

Publishing for Technology Dummies

Andy Cummings, Vice President and Publisher

Composition Services

Debbie Stailey, Director of Composition Services

Introduction

My appreciation for econometrics grew out of my interest in trying to figure out how the world works. I discovered that empirical techniques tailored to specific circumstances could help explain all sorts of economic outcomes. As I came to understand how the theoretical structure of economics combines with information contained in real-world data, I began to see observed phenomena in a different light. I’d often ask myself questions about my observations. Could I determine whether the outcomes were random and simply appeared to be related? If I believed that two or more things I observed had a logical connection, could I use data to test my assertions? Increasingly, I found myself relying on the tools of econometrics to answer these types of questions.

I’ve written Econometrics For Dummies to help you get the most out of your economics education. By now, your classes have taught you some economic theory, but you’re craving more precision in the predicted outcomes of those theories. Perhaps you’re even questioning whether the theories are consistent with what you observe in the real world. I find that one of the most attractive characteristics of properly applied econometrics is that it’s “school of thought neutral.” In other words, you can adapt an econometric approach to a variety of initial assumptions and check the results for consistency. By using econometrics carefully and conscientiously, you can get the data to speak. But you better learn the language if you hope to understand what it’s saying!

Econometrics For Dummies provides you with a short and simple version of a first-semester course in econometrics. I don’t cite the seminal work or anything from the large collection of econometric theory papers published in scholarly journals. The organization of topics may bear some resemblance to traditional econometrics textbooks, but my goal is to present the material in a more straightforward manner. Even if you’re taking a second-semester (advanced) econometrics course or a graduate course, you may find this book to be a useful, one-stop, nuts-and-bolts resource.

Of course, some technical sophistication is essential in econometrics. Besides, you’ve taken introductory economics, statistics, and maybe even intermediate economic theory, so now you’re ready to show off your technical prowess. But wait a minute! Sometimes, amid all the technical skills being mastered in learning econometrics, students fail to appreciate the insights that come from simplicity. In fact, you may even forget why you’re approaching a problem with a particular technique. That’s where this book can help.

Please note that I have tried to remain consistent with my terminology throughout the book, but econometricians sometimes have several different words for the same thing. Also, note that I use the statistical software STATA 12.1 throughout, but sometimes I refer to it simply as econometrics software or just STATA.

Foolish Assumptions

If you’re following the normal course of action, you take an econometrics course after you complete courses on principles of microeconomics, principles of macroeconomics, and statistics. In some cases, depending on the school, you may also be required to complete intermediate economic theory courses before taking econometrics. I cover the topics in a way that accommodates some variation in preexisting knowledge, but I’ve had to make the following assumptions about you:

You’re a college student taking your first econometrics class taught in a traditional manner — emphasizing a combination of theoretical proofs and practical applications.

Or you’re a graduate student (or are taking an advanced undergraduate econometrics class) and would like to refresh your memory of basic econometric concepts so you can feel more comfortable with the transition into advanced material.

You remember basic algebra, principles of economics, and statistics. I review the concepts from your statistics course that are most important for econometrics, but I also assume that a quick overview is all you need to get up to speed (and you can skip it if you’re ready to dig right in).

Numbers, equations, and Greek letters don’t intimidate you. I know that on the surface, combining the so-called dismal science with quantitative methods isn’t exactly the most attractive pairing of topics. By this point in your studies, however, I’m sure you’re over the fear people often have at the mere mention of these subjects.

You’ll be using some econometrics software in your class and are willing to adapt my examples in STATA to the software you’re using (although chances are high you’re using STATA in your class anyway).

Icons Used in This Book

Throughout the book, you may notice several different icons along the left margin. I use them to grab your attention and make the book easier to read. Each icon has an important function.

If you see this icon, it means I’m applying the techniques of a particular chapter or section with STATA. I briefly summarize the data I’m using to produce the output, show you how to format the data or create the variables required for the analysis, and point you to the most important components of the output.

I use this icon to signal that the information that follows is essential for your success in applying econometric analysis. To the extent possible, I explain these important, big-picture ideas in a nontechnical manner. However, keep in mind that this book is about econometrics, and therefore some technical sophistication may be required for even the most basic principles.

This icon appears next to information that’s interesting but not essential for your understanding of the main ideas. You’re welcome to skip these paragraphs, but if your econometrics class is more theory based (something that usually depends on the professor’s preferences), you may need to spend more time with this material.

I use this icon to indicate shortcuts that can save you time or provide alternative ways of thinking about a concept.

This icon flags information that helps you steer away from misconceptions, common pitfalls, and inappropriate applications of a particular econometric technique.

Beyond the Book

You may not always have your e-reader or a copy of this book handy, but I’m guessing you have almost constant access to the Internet courtesy of a smartphone or tablet. That’s why I include a wealth of accessible-from-anywhere additional information at `www.dummies.com`.

In need of some of the most useful formulas in econometrics? Looking for a breakdown of how you can give your econometric model some flexibility? Head to `www.dummies.com/cheatsheet/econometrics` to access this book’s helpful e-Cheat Sheet, which covers these topics and more.

But that’s not all. Because econometrics is synonymous with forecasting in some fields, I’ve put a bonus chapter online at `www.dummies.com/extras/econometrics`. It’s all about helping you hone your forecasting skills so you can select the right method to predict an outcome based on the information you have and later vet the accuracy of your forecast.

Where to Go from Here

Unlike with most books, you don’t need to start at the beginning and read through to the end to gain an understanding of fundamental econometric concepts. Simply turn to the topic that most interests you. Are you struggling with the intuition or justification for a particular type of econometric model? Do you think that a specific econometric tool will help you reveal more insights from your data? You can find that topic in the table of contents or the index and then jump right to it.

Maybe you’re not puzzled and are simply curious about the various tools econometrics has to offer for data analysis. Feel free to browse through the chapters. Maybe an interesting paragraph or a fascinating equation will catch your eye and give you ideas about approaching a problem — hey, it’s possible!

If your statistics knowledge is rusty, I recommend you begin with the first couple of chapters. On the other hand, if your experience with statistics wasn’t a good one, you’d like to avoid disturbing flashbacks, and you’re confident in your ability to catch on quickly, then by all means start at any other point. No matter where you start, you’ll never look at data the same way after learning econometrics (for better or for worse!).

Part I

Getting Started with Econometrics

For Dummies can help you get started with lots of subjects. Visit `www.dummies.com` to learn more and do more with For Dummies.

In this part . . .

Get familiar with the approach economists use when investigating empirical issues, which rarely come from the sort of controlled experiments that never seem to contradict standard statistical assumptions.

Find out the basic commands you need to work with data files in STATA 12.1, a popular form of econometric software, and discover the syntax structure for executing estimation commands.

Review the probability concepts that are most relevant for your study of econometrics: topics that focus on the properties of probability distributions and their use in calculating descriptive statistics of random variables.

Reinforce your knowledge of statistical inference so you can be better equipped to use surveys and other forms of sample data to test your hypotheses and draw conclusions.