“Building Multi-Source Databases for Comparative Analyses”
December 16-20, 2019
Institute of Philosophy and Sociology, Polish Academy of Sciences, Warsaw, Poland
General Description of the Event
From December 16 to 20, 2019, CONSIRT (Cross-national Studies: Interdisciplinary Research and Training program of The Ohio State University and the Polish Academy of Sciences) is organizing the international event Building Multi-Source Databases for Comparative Analyses. The event comprises two days of conference-style presentations on survey data harmonization in the social sciences, followed by a three-day workshop on ex-post survey data harmonization methodology.
Both the conference and the workshop will be held at the Institute of Philosophy and Sociology, Polish Academy of Sciences, Warsaw, Poland. They are jointly set within the Survey Data Recycling (SDR) Project (NSF 1738502) and the Political Voice and Economic Inequality across Nations and Time (POLINQ) Project (NCN 2016/23/B/HS6/03916).
About the Conference
The Conference (December 16-17) aims to facilitate discussions on the methodology of survey data harmonization, and collaboration on a co-edited book that Christof Wolf (University of Mannheim, and GESIS) and the PIs of the Survey Data Recycling (SDR) Project, Kazimierz M. Slomczynski, Irina Tomescu-Dubrow and J. Craig Jenkins, are preparing. To garner insights from discipline-specific and interdisciplinary views on the challenges inherent to harmonization and how these challenges are met, the conference will bring together contributions from sociology, political science, demography, economics, and health and medicine.
Currently, presenters include Claire Durand, Department of Sociology, University of Montreal, Canada; Isabel Fortier, Research Institute of McGill University Health Centre, Canada; Ewa Jarosz-Gugushvili, Marie Curie Fellow, University of Vienna, Austria; Dean Lillard, College of Human Ecology, The Ohio State University, USA; Steven Ruggles, Institute for Social Research and Data Innovation, University of Minnesota; Christof Wolf (U. Mannheim, and GESIS); and members of the SDR Team. The full list of presenters and the conference schedule will be distributed closer to the event.
About the Workshop
The Workshop (December 18-20) will feature the SDR database as a key empirical resource to discuss substantive and methodological considerations in building multi-source databases for comparative analyses. The SDR database covers more than four million respondents surveyed from 1966 to 2017 in ca. 140 countries. It contains individual-level measures of socio-demographics, political attitudes and behaviors, social capital, and well-being, constructed via ex-post harmonization of social survey data pooled from ca. 3,400 national surveys stemming from 23 major cross-national survey projects, including the World Values Survey, the European Social Survey, and the International Social Survey Programme.
The SDR database also contains source survey quality and harmonization process metadata that we stored as control variables in the database and that are available for analyses. An initial version of the SDR database, covering the period 1966 – 2013, 1721 national surveys from 22 cross-national projects, and 2.2 million respondents, is available by contacting the SDR project, and from Harvard Dataverse.
Experiences within the Survey Data Recycling (SDR) and the Political Voice and Economic Inequality across Nations and Time (POLINQ) projects inform the Workshop. Using the SDR and other databases, POLINQ constructs a dataset of survey-based aggregate measures of political participation and representation featuring young and established democracies since the 1990s.
Day 1 of the Workshop will be devoted to discussing (a) survey data recycling (SDR) as a framework for reprocessing extant cross-national survey data and ex-post harmonization, (b) the structure of the SDR database, and (c) conceptual and practical issues of constructing datasets stemming from the SDR database, including for the POLINQ project. Discussions will be led by members of the SDR and POLINQ projects.
Day 2 will be devoted to missing data imputation. Stef van Buuren, professor of Statistical Analysis of Incomplete Data at the University of Utrecht and statistician at the Netherlands Organisation for Applied Scientific Research TNO in Leiden, will deliver the lectures on missing data imputation for survey datasets with a multi-level structure, focusing on how to solve comparability problems by multiple imputation. Dr. Michał Kotnarowski from the Institute of Philosophy and Sociology, Polish Academy of Sciences, will lead the computer lab session. The materials for Dr. Kotnarowski’s missing data imputation computer lab session are here.
Day 3 will be devoted to discussing the use of individual-level data from cross-national surveys to construct measures of characteristics of countries in given years (macro-level). Social scientists frequently aggregate survey data, yet they rarely discuss the extent to which country-year indicators constructed via aggregation are valid and reliable. The task is especially difficult when aggregation involves behavioral and attitudinal survey items that lack ‘external benchmarks’ against which to judge the summary statistics derived from survey data. Discussions will be led by members of the SDR and POLINQ projects.
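The aggregation step described above can be illustrated with a minimal sketch. It assumes hypothetical respondent records (country, year, and a 0/1 participation item) and is not the SDR or POLINQ procedure; it only shows how individual-level answers are collapsed into a country-year indicator, and why keeping the number of valid cases alongside each estimate matters when judging its reliability.

```python
from collections import defaultdict

# Hypothetical respondent records: (country, year, item value).
# None marks a missing answer; the values are illustrative only.
records = [
    ("PL", 2010, 1),
    ("PL", 2010, 0),
    ("PL", 2010, None),
    ("DE", 2010, 1),
    ("DE", 2012, 0),
    ("DE", 2012, 0),
]

def aggregate_country_year(rows):
    """Collapse individual answers into country-year means.

    Missing answers are skipped, and the count of valid cases is
    returned with each mean so thin cells can be spotted.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for country, year, value in rows:
        if value is None:
            continue
        sums[(country, year)] += value
        counts[(country, year)] += 1
    return {
        key: {"mean": sums[key] / counts[key], "n": counts[key]}
        for key in counts
    }

indicators = aggregate_country_year(records)
```

A real application would also apply survey weights and compare estimates across source projects; both are omitted here for brevity.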
We are organizing the 13th edition of the OSU Summer School in the Social Sciences “Central and Eastern Europe in Comparative Perspective: Assessing Social and Political Change” in Poland, June to July of 2020.
Summer School students will earn 12 semester credit hours for: (a) SOC 3549: Statistics in Sociology (3 credit hours), (b) its application to substantive problems pertaining to social and political change in Central and Eastern Europe, subsumed by SOC 4699: Undergraduate Research in Sociology (6 credit hours), and (c) SOC 5503: Social Change in Central and Eastern Europe (3 credit hours).
The Summer School organizers work closely with students to help them apply for OSU and extra-OSU funding to cover the costs of the Program. In each of the last five years, many of our OSU students enrolled in the Warsaw Summer School received funding from OSU and/or other sources. For questions about the OSU Study Abroad in Warsaw, please contact Dr. Irina Tomescu-Dubrow (firstname.lastname@example.org). Please see the OSU OIA website for more information, and the Summer School website for information on previous editions.
In March 2019, CONSIRT co-organized, with colleagues from the University of Michigan, the annual Comparative Survey Design and Implementation (CSDI) International Workshop (March 18–20, Warsaw, Poland). This conference, held at IFiS PAN, gave researchers from academia and non-academic organizations the opportunity to exchange best practices in comparative survey methods.
In summer 2019, CONSIRT administration and affiliates organized two sessions at the 8th conference of the European Survey Research Association (July 15–19, 2019, Zagreb, Croatia). One session was Survey Data Harmonization: Potentials and Challenges. Survey data harmonization – its theory and methodology – is growing into a new scientific field that pushes forward the methods of survey data analysis while emphasizing the continuous relevance of surveys for understanding society. Depending on whether researchers intend to design a study to collect comparable data, or use existing data not designed a priori as comparative, the literature distinguishes between input and ex-ante output harmonization, and ex-post harmonization. In both its forms, ex-ante and ex-post, harmonization is a complex, labor-intensive and multistage process, which poses numerous challenges at different stages of the survey lifecycle. This session welcomed papers on both opportunities and difficulties inherent in survey data harmonization.
The other session was Messiness in Extant Cross-national Survey Data: New Approaches to Old News, which focused on survey quality. Cross-national survey projects exhibit wide variation in data quality, both within and across projects. Some departures from quality standards that the specialized literature has established for data collection, cleaning, and documentation, such as the presence of non-unique records (or duplicates), are unequivocal instances of “bad data,” while others, such as certain types of processing errors, are more ambiguous. Between the clearly bad and clearly good survey data there may be a range of “decent” quality surveys, with potentially interesting and important information collected from under-surveyed countries and less well covered time periods. However, to date there is little research that systematically assesses the quality of extant international survey data, or that looks at whether and how the “messiness” in existing surveys can be minimized ex-post, and with what consequences for empirical analyses. This session invited theoretical and empirical papers on evaluating the quality of extant surveys, after the stages of data gathering and documentation are completed.
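One of the quality problems named above, non-unique records, can be checked with a short sketch. It assumes a hypothetical layout in which each respondent is a dictionary with a `case_id` field plus answer fields; the function name and field names are illustrative and do not come from any of the projects mentioned. Two records count as duplicates here when all their answers are identical, ignoring the case identifier.

```python
from collections import Counter

def find_duplicate_records(rows, id_field="case_id"):
    """Return answer profiles that occur more than once in one survey.

    Each row is a dict of variable name -> value. The identifier field
    is ignored, so two respondents with different IDs but identical
    answers are flagged as a non-unique record.
    """
    profiles = Counter(
        tuple(sorted((name, value) for name, value in row.items()
                     if name != id_field))
        for row in rows
    )
    return {profile: n for profile, n in profiles.items() if n > 1}

# Hypothetical survey file with one duplicated answer profile.
survey = [
    {"case_id": 1, "age": 34, "trust": 3, "voted": 1},
    {"case_id": 2, "age": 51, "trust": 2, "voted": 0},
    {"case_id": 3, "age": 34, "trust": 3, "voted": 1},
]

duplicates = find_duplicate_records(survey)
```

With only a handful of variables, identical profiles can occur by chance, so a check like this is informative mainly when applied across the full set of survey items.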