Conference Presentations 2012

  • IASSIST 2012 - Data Science for a Connected World: Unlocking and Harnessing the Power of Information, Washington, DC
    Host Institution: National Opinion Research Center (NORC)

G1: Classification, Harmonization (Fri, 2012-06-08)

  • Research on Cognitive Aspects of Classification: Effects on Metadata Practice and Standards
    Daniel W Gillman (US Bureau of Labor Statistics)
    John Bosley (US Bureau of Labor Statistics)
    Scott Fricker (US Bureau of Labor Statistics)


    Metadata practitioners and standards developers typically take classifications as given. Rarely do they look at how these were created and whether they make sense for respondents or data users. This talk will break with tradition and discuss this issue. We considered a question on self-employment in the US Current Population Survey, called Class of Worker (COW). The COW question reads: “Were you employed by government, a private company, a non-profit organization, or were you self-employed (or working in the family business)?” The basic question was whether these four response options make sense together. Said another way, does the COW classification make sense for data users? Based on research done by the Small Business Administration and independent researchers, we suspected the answer was No. To investigate this, we paid 90 volunteers to come to BLS and classify a set of twelve job description vignettes based on two different groupings of the COW classification. The data show that answering the COW question is a difficult task. Using research from cognitive psychology, we are now able to provide reasons for this, and we propose new roles for metadata practitioners and new considerations for metadata standards developers.

  • Data coding and harmonization: How DataCoH and Charmstats are transforming social science data
    Kristi M Winters (GESIS - Leibniz Institute for the Social Sciences)
    Alexia Katsanidou (GESIS - Leibniz Institute for the Social Sciences)
    Martin Friedrichs (GESIS - Leibniz Institute for the Social Sciences)


    Comparative social researchers are often confronted with the challenge of making key theoretical concepts comparable across nations and/or time. One example is the socio-demographic variable ‘education’. To operationalize ‘education’, researchers must review multiple educational systems across nations and/or changing educational structures within one nation across time. Further, researchers have multiple ways to recode education into a harmonized variable, including (inter alia): the Hoffmeyer-Zlotnik/Warner matrix; the CASMIN education scheme; the International Standard Classification of Education; or a harmonized variable provided by the dataset itself. GESIS is developing two electronic resources to assist social researchers. The website DataCoH (Data Coding and Harmonization) will provide a centralized online library of data coding and harmonization for existing variables to increase transparency and variable replication. DataCoH will initially contain socio-demographic variables used across the social sciences and then expand to discipline-specific variables. The software program Charmstats (Coding and Harmonizing Statistics) will provide a structured approach to data harmonization by allowing researchers to: 1) download harmonization protocols; 2) document variable coding and harmonization processes; 3) access variables from existing datasets for harmonization; and 4) create harmonization protocols for publication and citation. This paper explains DataCoH and Charmstats and demonstrates how they work.

  • A DDI resource package for the International Standard for Classification of Education (ISCED)
    Joachim Wackerow (GESIS - Leibniz Institute for the Social Sciences)
    Hilde Orten (Norwegian Social Science Data Services (NSD))


    The International Standard for Classification of Education (ISCED) is at the heart of national and international statistical agencies' reporting on education. In recent years, ISCED has also been increasingly used by official and non-official survey programmes to measure educational attainment. This is so even though ISCED 1997, until recently the current version of the standard, is quite complex and does not contain a classification of educational attainment. In November 2011, UNESCO launched a new ISCED version. ISCED 2011 has a numeric coding framework, separate classifications for educational programmes and educational attainment, and more detail at the level of tertiary education. It also makes conceptual clarifications compared to the previous version, and in sum ISCED has become more user-friendly. Its use is thus expected to increase in the coming years. DDI Lifecycle has a storage module called a resource package, which structures materials for publication that are intended for reuse by multiple studies, projects or user communities. This presentation focuses on how ISCED 2011 metadata components can usefully be structured in a DDI resource package, so that national and international statistical agencies, as well as official and non-official survey programmes, can reuse them by reference.

