A Social Scientist’s Guide to Inference with Linked Data

Casey Breen, University of Oxford

The explosion in administrative datasets and record linkage methods has revolutionized social science research. Although record linkage algorithms have proliferated, researchers have limited guidance on conducting inference with linked data. We use simulations and empirical examples to illustrate how false matches and missed matches can affect research results. The empirical examples examine three outcomes (social mobility, shifts in ethnoracial identification, and the educational gradient of longevity) using publicly available linkages from the CenSoc, IPUMS-MLP, and ABE projects. We conclude with practical recommendations for researchers conducting inference with linked data.

Presented in Session 45. New Developments in Linked Data Infrastructure