# ECO 440/640 — Problem Sets 4 and 5

You should answer the following question: does having legal access to oral contraceptives cause women to get married later? This is a big task, so the work will span two assignments. There are multiple goals for this task:
• Understand and use difference-in-differences strategies
• Get experience interpreting methods in research papers
• Get experience writing out methods and data descriptions
• Understand that precision is crucial in scientific writing
• Get experience cleaning data

You will be replicating some of the work in the following report: Goldin, Claudia, and Lawrence Katz (2002) “The Power of the Pill: Oral Contraceptives and Women's Career and Marriage Decisions”, Journal of Political Economy, Vol. 110. You should therefore read that report. You do not need to read all of it, since this is not a generic economics course. Instead, read everything that is necessary to understand the justification for the “identification strategy” and the estimation process. In other words, you may ignore the “Frameworks to Understand the Effect of the Pill on Marriage and Career” section. You can also ignore the sections that deal only with professional choices or that use cohort-level regressions (“Career and Marital Status Outcomes: Aggregate Cohort Analysis”). Remember that statistical work is never separate from understanding the qualities of the data and research design.

## Problem Set 4

Suppose you just regressed age at first marriage on pill use at age 18. What are two potential confounds in this strategy? What is the identification strategy you will be using to get around these confounds? Why does it work? (As in other cases, do not copy and paste these questions with answers. These are questions that should guide your description of the problem at hand and the solution G&K developed.)

What are the methods G&K use to evaluate their hypotheses? Do they provide any evidence other than a regression table?

What was the result that G&K found? Are you convinced by it?

Write out the econometric model you will estimate. Explain the reason why each variable is included and what it means. Read their description, internalize it, understand it, and then write out a description. What is the level of observation in this model?

Write out a detailed explanation for exactly what sample you will use. Think of this as a recipe. Someone should be able to follow your recipe to reproduce your work exactly. There should be no ambiguity.

Evaluate this claim: difference-in-differences models solve the problem of omitted variable bias/selection bias.
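As a reference point for the claim above, here is a minimal sketch of the two-group, two-period difference-in-differences calculation. All numbers are invented for illustration, and the column names `treated`, `post`, and `y` are hypothetical; this is not G&K's specification.

```r
# Hypothetical 2x2 example: all numbers are invented for illustration.
# Mean outcome (say, age at first marriage) by group and period.
d <- data.frame(
  treated = c(0, 0, 1, 1),   # 1 = group exposed to the policy change
  post    = c(0, 1, 0, 1),   # 1 = period after the change
  y       = c(21.0, 21.5, 20.0, 21.2)
)

# Difference-in-differences by hand:
# (treated group's change over time) minus (comparison group's change over time)
did_manual <- (d$y[4] - d$y[3]) - (d$y[2] - d$y[1])

# The same number is the coefficient on the interaction term
fit <- lm(y ~ treated * post, data = d)
did_lm <- unname(coef(fit)["treated:post"])

print(c(manual = did_manual, regression = did_lm))
```

Whether the resulting number can be read causally depends on assumptions that the arithmetic itself does not test.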

## Problem Set 5

Here is a spreadsheet I made with the laws in the US: http://randycragun.com/courses/640/Laws_US.csv. For the “14 or 18” and “14 or 19” cases, I used the higher age. Perhaps this is the reason my results differ from G&K's.

This report has higher-quality measures of the laws: Myers, Caitlin Knowles (2017) “The Power of Abortion Policy: Re-examining the Effects of Young Women's Access to Reproductive Control”, Journal of Political Economy, Forthcoming. Myers has the most complete set of measures of the legal environment in the US compiled so far, so you could use her measures of early legal access to contraception and abortion (in Table 1) even though you are trying to replicate the work by Goldin and Katz (G&K). (Also note the misspelling of my name in the acknowledgments. Thanks, Caitlin.)

Replicate the results from G&K. Write up a report with your results. You do not need to replicate any of the columns of the age-of-first-marriage regression table except the first two. I chose not to word this as “you only need to replicate the first two columns of the age-of-first-marriage regression table” because there is more to writing up the report than just getting regression results. You need to check the definitions of the variables you downloaded and their codes to make sure that you are using them correctly and are actually using the right sample. Your report needs to state exactly what sample you use (it is likely that most of you wrote incomplete descriptions last week, because it is hard to notice the details without actually trying to do the work).

The following categories should guide your write-up. These are some things that every empirical research report covers:

• the question
• the methods (how you will answer the question)
• the data (with extremely precise descriptions)
• the results

Some of this will be the same as what you wrote last week, but you should rewrite it to clarify things now that you have actually tried working with the data (e.g., your description of the data was likely not sufficiently precise).

Example uses of merge(), factor(), and ifelse():

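    # Left join: keep every row of acph1986 and attach the law for its state of birth
    acph1986 <- merge(acph1986, laws, by.x = "BPL", by.y = "STATEFIP", all.x = TRUE)
    # Treat the state-of-birth code as categorical, e.g. inside a regression formula
    factor(BPL)
    # Indicator equal to 1 if first married before age 23, 0 otherwise
    acph1986$marr23 <- ifelse(acph1986$AGEMARR < 23, 1, 0)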
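
To see what these calls do before running them on the real data, here is a minimal sketch on toy data. Every value is invented, and the names `acph_toy`, `laws_toy`, and `legal_age` are hypothetical; only `BPL`, `STATEFIP`, and `AGEMARR` come from the calls above.

```r
# Toy data: every value below is invented for illustration.
acph_toy <- data.frame(
  BPL     = c(1, 1, 6, 6, 99),      # state-of-birth code (99 has no match in laws_toy)
  AGEMARR = c(19, 24, 22, 27, 23)   # age at first marriage
)
laws_toy <- data.frame(
  STATEFIP  = c(1, 6),
  legal_age = c(21, 18)             # hypothetical column name
)

# Left join: all.x = TRUE keeps every person, matched or not
merged <- merge(acph_toy, laws_toy, by.x = "BPL", by.y = "STATEFIP", all.x = TRUE)

# People whose code has no matching law get NA; count them before going on
n_unmatched <- sum(is.na(merged$legal_age))

# Indicator for marrying before age 23 (note that exactly 23 is coded 0)
merged$marr23 <- ifelse(merged$AGEMARR < 23, 1, 0)

# factor() turns the numeric code into categories
bpl_levels <- levels(factor(merged$BPL))

print(merged)
print(table(merged$marr23))
```

Counting unmatched rows after a merge is one quick way to catch codes that do not mean what you assumed, which is part of describing your sample precisely.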