Jarmo Peltola, University of Helsinki
I have built a population database for the city of Tampere. The database is based on the city's address calendars, which were published at irregular intervals and drawn from the police's address register. The address calendars have been read into the database from OCR-processed PDF documents and currently comprise 400,000 records from 1910–1946. The database is combined with other databases collected for research on people in the city: marriages 1926–1939; children born in the Maternity Hospital 1921–1945; jobseekers at the employment agency 1927–1929; unemployed people admitted to the official unemployment registers 1929–1937; lists of those who fought on the Red and White sides in the civil war of 1918; those who fell ill with or died of typhoid in 1916; and death registers 1916–1960. The census taken in the city and its suburbs in 1930, which has been transcribed for us at the National Archives with a program similar to Transkribus, will also be connected to the database. In addition to basic information about people, the census contains information about their places of residence and living amenities. Combining databases is challenging because the set of fields available for matching differs from one database to another. The address calendars contain name, profession, address and phone number. Linking them to the census is difficult because the address calendars lack a date of birth and the spelling of names varies. The source material of the address calendars, i.e. the police's address register, contains name, date of birth, profession and address, which makes it possible to "enrich" the address calendars so that they can be connected to other databases. So far the address calendars have been enriched manually, but we are looking for ways to automate the process. When fully enriched, the address calendars offer a very dense description of the life of the city's population.
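One way the enrichment step could be automated is sketched below. This is a minimal illustration and not the project's actual method: it assumes that records are first narrowed to the same address and that names are then compared with a string-similarity score to tolerate spelling variation. The record classes, the `enrich` function, the thresholds and all sample names and dates are hypothetical and invented for the example.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass
class CalendarEntry:
    """Address calendar record (hypothetical layout; no date of birth)."""
    name: str
    profession: str
    address: str


@dataclass
class RegisterEntry:
    """Police address register record (hypothetical layout)."""
    name: str
    birth_date: str
    profession: str
    address: str


def normalize(text: str) -> str:
    """Lowercase, drop periods and commas, collapse whitespace."""
    cleaned = text.lower().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def enrich(entry: CalendarEntry,
           register: List[RegisterEntry],
           threshold: float = 0.85,
           margin: float = 0.05) -> Optional[str]:
    """Return a birth date for the entry, or None if no safe match exists.

    Assumed strategy: block on identical normalized address, then accept
    the best name match only if it clears the threshold and is clearly
    better than the runner-up.
    """
    candidates = [r for r in register
                  if normalize(r.address) == normalize(entry.address)]
    scored = sorted(((similarity(entry.name, r.name), r) for r in candidates),
                    key=lambda pair: pair[0], reverse=True)
    if not scored or scored[0][0] < threshold:
        return None  # nobody at this address resembles the name closely enough
    if len(scored) > 1 and scored[0][0] - scored[1][0] < margin:
        return None  # ambiguous: leave for manual review
    return scored[0][1].birth_date


# Invented sample data, for illustration only.
register = [
    RegisterEntry("Wirtanen Juho", "1889-03-14", "carpenter", "Satakunnankatu 10"),
    RegisterEntry("Lahtinen Aino", "1901-11-02", "weaver", "Satakunnankatu 10"),
]
entry = CalendarEntry("Virtanen, Juho", "carpenter", "Satakunnankatu 10")
print(enrich(entry, register))
```

Records for which `enrich` returns `None` would still need manual enrichment. In practice the profession field and year-to-year address changes could serve as additional evidence, and the thresholds would have to be tuned against the manually enriched records.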
No extended abstract or paper available
Presented in Session 21. Evaluating Data Quality II