Casey Breen, University of Oxford
The rapid growth of administrative datasets and record linkage methods has transformed social science research. Although record linkage algorithms have proliferated, researchers have little guidance on how to conduct inference with linked data. In this study, we use a series of simulation studies and empirical examples to illustrate how false matches and missed matches can affect research results. For our empirical examples, we investigate three outcomes (social mobility, shifts in ethnoracial identification, and the educational gradient of longevity) using publicly available linkages from the CenSoc, IPUMS-MLP, and ABE projects. The paper concludes with practical recommendations for researchers conducting inference with linked data.
Presented in Session 45: New Developments in Linked Data Infrastructure