*It's true that the average woman working full-time earns 23 percent less than the average man who works full-time does. Yet this tells us a lot less than it might seem at first glance.*

Men and women differ in occupations, work experience, education, hours and so many other qualities that “average” doesn’t get you close to the real concern about pay equity: that women earn less simply because they are women, and that a woman would be paid less than a man who is otherwise exactly like her.

In this post, I'm going to do a cursory analysis of data from the March Current Population Survey from 1990 to 2013. You should check out the classic study on gender pay gaps here, from Francine Blau and Lawrence Kahn; these two essays by Claudia Goldin are also a terrific grounding in what economists think about the issue and how they think about it.

There's no doubt that, in raw average terms, women earn less than men on an hourly basis. Here's a kernel-density plot of the distribution of the natural logarithm of the hourly wage in 2013, divided according to gender. The black vertical line denotes the federal minimum wage. You can see the wage gap for yourself: the men's wage distribution, in red, is further to the right than the women's wage distribution, in blue.
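For readers who want to build a plot like this themselves, here is a minimal sketch of a Gaussian kernel-density estimate. It runs on simulated log wages, not the CPS data; the function name `gaussian_kde_on_grid`, the bandwidth rule, and the distribution parameters are all assumptions made for illustration. The only real number in it is the $7.25 federal minimum wage that applied in 2013.

```python
import numpy as np

def gaussian_kde_on_grid(samples, grid, bandwidth=None):
    """Evaluate a Gaussian kernel-density estimate of `samples` at each grid point."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    if bandwidth is None:
        # Normal-reference rule of thumb; an assumed default, not taken from the post.
        bandwidth = 1.06 * samples.std(ddof=1) * samples.size ** (-0.2)
    # One row per grid point, one column per observation.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.mean(axis=1) / bandwidth

# Simulated log hourly wages (hypothetical parameters, for illustration only).
rng = np.random.default_rng(0)
log_wage_men = rng.normal(3.0, 0.6, size=2000)
log_wage_women = rng.normal(2.8, 0.6, size=2000)

grid = np.linspace(0.0, 6.0, 601)
density_men = gaussian_kde_on_grid(log_wage_men, grid)
density_women = gaussian_kde_on_grid(log_wage_women, grid)

# Position of the vertical reference line: log of the 2013 federal minimum wage.
log_min_wage = np.log(7.25)
```

Plotting `density_men` and `density_women` against `grid`, with a vertical line at `log_min_wage`, gives a figure of the same general shape as the one described above.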

But you might think the right question to ask about gender equity isn't about this raw average, but rather about the counterfactual: two people who are identical but for their gender. Econometric analysis can get us a good deal of the way there, and that's what this post is about.

Let's think about the factors that tend to be associated with higher or lower hourly wages. People with more work experience and more education seem likely to earn more, as do people who enter occupations that are generally well paid, like law or medicine, or unionized, like manufacturing. And it's well known that there are substantial racial differences in pay. We might also expect that pay varies according to whether you work in an urban area or a rural one and according to your geographic region of the country. We also know that hourly wages have generally risen over time as a result of inflation and productivity growth. And we might imagine that marriage and the number of children, especially young ones, have some effect on pay.

When we control for all of these influences -- which, let's suppose for the moment, are all independent of the gender pay gap -- how much of the gap persists? Whatever goes away is explained by these factors rather than by gender alone.

The technical specification of the regression is that we're constructing a Mincer earnings function with controls for NAICS occupation, age, race, SMSA urban status, US Census region, marital status, union membership, number of children, number of children under age five, the logarithm of reported usual weekly hours, and time fixed effects by gender.
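In symbols, a specification of this general kind can be sketched as follows. The notation is assumed here for illustration and is not copied from the .do file: $w_{it}$ is the hourly wage of person $i$ in year $t$, $F_i$ is an indicator for being a woman, $h_{it}$ is usual weekly hours, and $\mathbf{x}_{it}$ collects the remaining controls listed above.

$$
\ln w_{it} = \alpha_t + \beta_t F_i + \gamma \ln h_{it} + \mathbf{x}_{it}'\boldsymbol{\delta} + \varepsilon_{it}
$$

Under this reading, $\alpha_t$ is a year fixed effect and $\beta_t$ is the adjusted gender pay gap in year $t$, in log points.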

Using Stata to run the regression on my CPS data, which includes a sample of 209,000 people, I get the following table. From left to right, the columns are year, my point estimate for the gender pay gap in that year, the standard error, the t-statistic, the p-value, and then the 95-percent confidence interval around my point estimate.

My estimate of the gender pay gap is that women were paid about 7.7 percent less per hour than men on average in 2013, holding everything else equal. The gap was 14.3 percent in 1990. Depending on my regression specification, I was able to push the gender pay gap coefficients around only somewhat -- I think the range of reasonable estimates of the 2013 adjusted gender pay gap is probably 4 percent to 10 percent.
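A note on units: in a regression with a log wage on the left-hand side, a coefficient on a gender indicator is measured in log points, and the exact percentage difference it implies is exp(coefficient) minus one. The post does not say whether the figures above have already been converted, so the helper below is only a sketch of the arithmetic; the function name and the example coefficient are hypothetical.

```python
import math

def log_points_to_percent(beta):
    """Convert a log-point coefficient into the exact percent difference it implies."""
    # expm1(beta) computes exp(beta) - 1 accurately for small beta.
    return 100.0 * math.expm1(beta)

# Hypothetical coefficient, chosen only to illustrate the conversion.
example_gap = log_points_to_percent(-0.08)
```

For coefficients this small the log-point figure and the exact percentage are close, so the distinction does not change the story.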

We're talking about a hypothetical woman working in the same occupation, in the same region of the country, with the same work experience, education, and race, and with the same family circumstances and working hours -- that woman is paid significantly less than a man who is like her in all these respects, though the pay gap is smaller than the raw version.
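The difference between a raw gap and an adjusted gap can be illustrated with a small simulation. This is a sketch on made-up data, not the CPS regression: every number in `simulate` is an assumption, the single `high_pay_occ` dummy stands in for a full set of occupation controls, and the function names are hypothetical. The adjusted gap is the coefficient on a woman-by-year interaction in an ordinary least squares regression of log wage on year dummies and controls.

```python
import numpy as np

def simulate(n_per_year=10000, seed=0):
    """Made-up data: women are less often in the high-paying occupation."""
    rng = np.random.default_rng(seed)
    year = np.repeat([1990, 2013], n_per_year)
    n = year.size
    female = rng.integers(0, 2, size=n)
    high_pay_occ = (rng.random(n) < np.where(female == 1, 0.3, 0.5)).astype(float)
    school = rng.normal(13.0, 2.0, size=n)
    exper = rng.uniform(0.0, 40.0, size=n)
    log_hours = np.log(rng.uniform(20.0, 60.0, size=n))
    # Assumed "true" adjusted gaps in log points, for the simulation only.
    true_gap = np.where(year == 1990, -0.14, -0.08)
    log_wage = (1.0 + 0.3 * (year == 2013) + true_gap * female
                + 0.4 * high_pay_occ + 0.08 * school
                + 0.03 * exper - 0.0005 * exper ** 2
                + 0.10 * log_hours + rng.normal(0.0, 0.4, size=n))
    controls = np.column_stack([high_pay_occ, school, exper, exper ** 2, log_hours])
    return year, female, controls, log_wage

def raw_gap_by_year(year, female, log_wage):
    """Difference in mean log wage, women minus men, within each year."""
    return {int(y): float(log_wage[(year == y) & (female == 1)].mean()
                          - log_wage[(year == y) & (female == 0)].mean())
            for y in np.unique(year)}

def adjusted_gap_by_year(year, female, controls, log_wage):
    """OLS coefficient on the woman-by-year interaction, holding controls fixed."""
    years = np.unique(year)
    year_dummies = (year[:, None] == years[None, :]).astype(float)
    female_by_year = year_dummies * female[:, None]
    X = np.hstack([year_dummies, female_by_year, controls])
    beta, *_ = np.linalg.lstsq(X, log_wage, rcond=None)
    k = years.size
    return {int(y): float(b) for y, b in zip(years, beta[k:2 * k])}

year, female, controls, log_wage = simulate()
raw = raw_gap_by_year(year, female, log_wage)
adjusted = adjusted_gap_by_year(year, female, controls, log_wage)
```

In this simulated setup the raw gap is larger in magnitude than the adjusted gap because part of it runs through occupation, which the regression holds fixed.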

It's not clear, though, that we really should be controlling for all these things. It's fair enough that people with more experience earn more. But it's reasonable to think that things like occupational choice and working hours are themselves influenced by the same gender discrimination we're seeking to detect. Insofar as that's true, the pay gaps that result from women ending up in lower-paying fields are part of the pay gap -- they're not something to explain away. More on this soon.

*Note: This post is a re-do of an earlier one, which had a technical issue in the regression specification. Hat tip to Justin Wolfers, who spotted the problem, and to John Schmitt for some helpful further comments. You can download the .do file here, and the data from IPUMS here.*

Why would you use log hours -- to catch overtime?

Not quite. Because the LHS variable is log hourly wage, I used log hours on the RHS -- that way, the effect of a marginal increase in hours can be linear in the wage, rather than exponential, which would be strange. Usually we would think of the production function as concave in worker hours, so I'd rather not build an assumption of convexity in worker hours into the regression model.
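To spell out the functional-form point, hold the other regressors fixed and write $w$ for the hourly wage, $h$ for weekly hours, and $c$ for everything else (notation assumed here for illustration). The two choices imply:

$$
\ln w = c + \gamma h \;\Rightarrow\; w = e^{c} e^{\gamma h},
\qquad
\ln w = c + \gamma \ln h \;\Rightarrow\; w = e^{c} h^{\gamma}.
$$

With hours in levels, the wage is exponential in hours and therefore convex in $h$ for any $\gamma \neq 0$. With log hours, $\gamma$ is an elasticity: the wage is linear in hours when $\gamma = 1$ and concave when $0 < \gamma < 1$.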
