This section contains 4,110 words (approx. 14 pages at 300 words per page) |
In a linear regression model, sample selection bias occurs when data on the dependent variable are missing nonrandomly, conditional on the independent variables. For example, if a researcher uses ordinary least squares (OLS) to estimate a regression model in which large values of the dependent variable are underrepresented in a sample, estimates of slope coefficients typically will be biased.
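The effect of underrepresenting large values of the dependent variable can be illustrated with a small simulation. The sketch below is not drawn from any study: the true slope, the noise level, the cutoff, and the helper name `ols_slope` are all hypothetical choices made for illustration. It generates data from a known linear model, discards every observation whose dependent variable is at or above the cutoff, and compares the OLS slope in the full and truncated samples.

```python
import numpy as np


def ols_slope(x, y):
    """Return the OLS slope from regressing y on x with an intercept."""
    x_centered = x - x.mean()
    return float(x_centered @ (y - y.mean()) / (x_centered @ x_centered))


# All numbers below are hypothetical, chosen only to make the bias visible.
rng = np.random.default_rng(0)
n = 200_000
true_slope = 2.0

x = rng.normal(0.0, 1.0, n)                          # independent variable
y = 1.0 + true_slope * x + rng.normal(0.0, 2.0, n)   # dependent variable

# Explicit selection on the dependent variable: keep only y below the cutoff.
cutoff = 2.0
keep = y < cutoff

full_slope = ols_slope(x, y)
truncated_slope = ols_slope(x[keep], y[keep])

print(f"true slope:             {true_slope:.2f}")
print(f"OLS slope, full sample: {full_slope:.2f}")
print(f"OLS slope, truncated:   {truncated_slope:.2f}")
print(f"share of sample kept:   {keep.mean():.2f}")
```

Under these assumptions the full-sample slope should sit close to the true value, while the truncated-sample slope is pulled toward zero, because observations with high values of the independent variable survive truncation only when their errors are unusually negative.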
Hausman and Wise (1977) studied the problem of estimating the effect of education on income in a sample of persons with incomes below $15,000. Such a sample is known as a truncated sample, and it is an example of explicit selection on the dependent variable. Figure 1 illustrates the problem for individuals sampled at three education levels: low (L), middle (M), and high (H). In the figure, sample truncation leads to an estimate of the effect of schooling that is biased downward from the true regression line as a...