Conference Presentations 2014

  • IASSIST 2014: Aligning Data and Research Infrastructure, Toronto
    Host Institutions: University of Toronto, Ryerson University, and York University

1A: Building data collections (Wed, 2014-06-04)
Chair: Maria A. Jankowska

  • New and/or unique data-capture practices on a limited budget
    Paula Lackie (Carleton College)


    This is a collection of four special data projects that show innovation in gathering data from human subjects on a limited budget. Each has solved, or hopes to solve, some nagging issues:
    1) The responsible execution and processing of a survey for native ASL communicators. Issues: interpreting cultural differences, applying scientific processes to an interview-style survey, and then translating these data into numbers.
    2) Innovations in the census of 3 rural Bengali villages on remittances and perceptions of wealth. Issues: survey design and managing local graduate students, mitigating a propensity toward data fabrication, data validation on hand-written survey forms, training and norming processes, and working with little electricity and no internet access.
    3) Experiences with non-technically inclined researchers conducting interviews and managing their data using smart pens. Issues: making the process easy and fruitful while remaining technically responsible for qualitative data analysis.
    4) An experimental technique to capture data on paper and convert it to CSV using smart pen technology.


1B: Panel. Data Service Infrastructure for the Social Sciences and Humanities (Wed, 2014-06-04)
Chair: Johan Fihn

  • DASISH - data service infrastructure for the Social Sciences and Humanities
    Johan Fihn (Swedish National Data Service (SND))
    John Shepherdson (Data Archiving and Networked Services (KNAW-DANS))
    Vigdis Kvalheim (Norwegian Social Science Data Services (NSD))
    Alexia Katsanidou (GESIS – Leibniz Institute for the Social Sciences)
    Katrine Utaaker Segadal (Norwegian Social Science Data Services (NSD))
    Catharina Wasner (GESIS – Leibniz Institute for the Social Sciences)


    We are currently seeing an explosion in research infrastructure initiatives at domain, local, national and international levels. How does a data archive/repository provide services to multiple infrastructures whilst aligning the requests with its long-term strategies and responsibilities? How does a research infrastructure secure viability and long-term sustainability by maintaining an engaged community of service providers as well as users? Five European research infrastructures in the social sciences and humanities have come together in the form of DASISH – Data Service Infrastructure for the Social Sciences and Humanities, to find solutions to some of the prevailing issues that we face as service providers and research infrastructures, and to support shared development. The activities in DASISH are consequently broad and cover areas such as: reference architecture for research infrastructure alignment, preservation challenges & deposit service convergence, metadata quality improvement, legal & ethical challenges, multilinguality in questionnaire tools and question databank, data, tools and services discovery, annotation frameworks, and training & workshops for data managers, service providers and those working in infrastructures.


1C: Integrated Data Discovery and Access: Building Data Collections (Wed, 2014-06-04)
Chair: Katherine McNeill

  • Introducing da|ra SearchNet: the integrated data portal for the social sciences
    Tanya Friedrich (GESIS – Leibniz Institute for the Social Sciences)
    Brigitte Hausstein (GESIS – Leibniz Institute for the Social Sciences)
    Daniel Hienert (GESIS – Leibniz Institute for the Social Sciences)


    Data sharing is largely dependent on infrastructure that facilitates search for and retrieval of research data. Currently, however, the landscape of data repositories and metadata services is uneven and incoherent even within disciplinary boundaries. In our project da|ra SearchNet we address this problem for the case of the social sciences by designing and implementing an integrated search infrastructure that aims at fostering data sharing within the discipline. We build on the outcomes of the completed project “da|ra – development of a registration agency for social science data”, which already integrates metadata from nearly 20,000 data files and other resources from more than 30 data providers in one database and search application. We are planning to extend this service by automating our existing metadata recording workflows and by incorporating even more metadata from large comparative survey programmes, from data archives around the world, from qualitative data providers, and from other relevant players. Our goal is to establish da|ra SearchNet as a comprehensive, easy-to-use metadata store for secondary researchers. In our presentation we describe in detail how we will approach this task in terms of international networking and cooperation, metadata standardization, and search engine technology.

  • Linking health data with social science data
    Amalie Nielsen (Danish Data Archive)


    Archiving, dissemination and reuse of research data enhance the opportunities to conduct secondary analyses on new subjects. By using the DDI-Lifecycle standard, you can carry out meta-analyses and comparative analyses on data materials from different research areas. In the Danish Data Archive we archive and disseminate health data as well as data from the social sciences. By linking those kinds of data we can gain new forms of knowledge about the social impact on public health. The Danish Data Archive facilitates the use of registries and databases from the health care system and other administrative resources, along with health surveys, and offers the possibility to link them to classic social surveys such as surveys of cultural habits. The different kinds of data material make the comparison complicated, because the data have been collected by different researchers and institutions for different purposes. The DDI-Lifecycle standardized metadata and comprehensive study descriptions are the key to making linking and comparison possible.

  • Questasy, a DDI based data dissemination tool
    Edwin De Vet (CentERdata)


    Questasy is a data dissemination tool based on DDI3 developed at CentERdata. It is written in CakePHP and uses a MySQL database. It supports documentation of longitudinal studies as well as the creation of custom datasets. OAI-PMH support for harvesting of studies can be easily set up. The metadata of studies can be exported as DDI 3.1 XML (since version 4.2). Since CakePHP is an MVC framework and supports theming, Questasy can be easily modified to meet specific customer needs. Version 5.0 saw two main features implemented: multilingual support and DDI 3.1 import. For future developments we will investigate whether we can implement DDI 3.2 export and import. Further features to be implemented are better documentation of questionnaire routing, summary statistics of variables, and dissemination of datasets and syntax created by researchers.

  • A metadata portal for complex research data in Germany: An application of DDI
    David Schiller (Institute for Employment Research (IAB))
    Dana Mueller (Institute for Employment Research (IAB))
    Ingo Barkow (German Institute for International Educational Research (DIPF))


    The research data centre of the Federal Employment Agency in the Institute for Employment Research provides different types of data for the scientific community. These include, for example, register data, survey data, and data linking surveys with register data. Available metadata tools do not accommodate the variety of research data held by the research data centre, the workflow of data documentation within departments, or the collaboration between departments. Therefore, in 2012 we started an international development project in cooperation with TBA 21 Assessment Systeme GmbH (Germany), OPIT Consulting Kft. (Hungary) and Colectica (USA) to handle and standardize the metadata for all kinds of data in one technical application. Besides the metadata portal, there will be a web application for the data users. The presentation focuses first on the appropriate subset of the DDI standard and second on the application of the metadata portal for users and the workflow within departments (e.g. user administration), and less on the technical structure behind the metadata portal.

4N: Integrated Data Discovery and Access (Wed, 2014-06-04)
Chair: Lynda Kellam

  • Creating catalog records that reflect data use restrictions
    Michele Hayslett (University of North Carolina at Chapel Hill)
    Amanda Henley (University of North Carolina at Chapel Hill)
    Wanda Gunther (University of North Carolina at Chapel Hill)
    Margaretta Yarborough (University of North Carolina at Chapel Hill)
    Joe Collins (University of North Carolina at Chapel Hill)


    Many researchers don’t realize the restrictions involved in using licensed data for research. The Data Services staff at UNC – Chapel Hill are working with catalogers to present that information within our catalog records, so that researchers can discover it before deciding on a given data set, better aligning the research and data infrastructures. This session will discuss this collaborative effort, the specific MARC fields we’re using, and the implications for sustainability and for outreach (to increase visibility).

  • Powerful access to qualitative data: What's behind the UK QualiBank
    Darren Bell (UK Data Archive)


    In this paper we discuss how we have implemented a digital data browsing system for qualitative data based on highly structured data and metadata. Our system also enables paragraph-level citation. The crux of this exciting project has been the incorporation of object and sub-object level metadata using the QuDEx metadata schema in addition to DDI study-level metadata, and using the Text Encoding Initiative (TEI) for encoding textual data. The QuDEx schema was initially released in 2006 following a project undertaken by the UK Data Archive and Metadata Technologies. QuDEx enables simple description of collections, data objects and parts of data objects, captures the formal relationships between them, and describes analytical elements such as categories, codes and memos. The TEI has further provided a powerful tool for marking up bodies of text to enable rich web display. We will discuss how we have implemented the DDI, QuDEx and the TEI in the UK QualiBank browsing system, and describe our use of technologies: an XML database (BaseX) to store and deliver both metadata and textual data, and Solr and XQuery for powerful searching.

  • A question database for the German Longitudinal Election Study
    Wolfgang Zenk-Moeltgen (GESIS – Leibniz Institute for the Social Sciences)


    The GLES Question Database contains all the questions from the German Longitudinal Election Study (GLES). The GLES is a large and ambitious election study in Germany. It is structured in eleven components, conducted in different modes, which are connected by a common core questionnaire. The GLES Question Database enables users to search for questions and is currently used as an internal tool for the GLES project. A public release is planned by summer 2014. Questions are shown with their answer categories, their variables, and their association to studies, along with some basic study-level information. Studies may also be compared to see differences in methodology and topics covered. The GLES Question Database is a first product based on the STARDAT development that was shown at IASSIST before. It is based on DDI-Lifecycle re-usable components that build a basic common infrastructure for several applications. The GLES documentation was imported from a project-specific database into DDI-Lifecycle format and combined with study-level information from the GESIS Data Archive. The presentation will show the challenges and solutions in developing the GLES Question Database and the possibilities for using it in similar data collection projects.

  • New infrastructure for harmonized longitudinal data with MIDUS and DDI
    Jeremy Iverson (Colectica)
    Barry Radler (University of Wisconsin)
    Dan Smith (Colectica)


    Researchers wishing to use data from longitudinal studies or to replicate others’ research must currently navigate thousands of variables across multiple waves and datasets to answer simple analysis questions. A tool that allows researchers to create documented and citable data extracts that are directly related to their queries would allow more time to be spent on public health research questions instead of data management. MIDUS (Midlife in the United States) is a national longitudinal study of approximately 10,000 Americans designed to study aging as an integrated biopsychosocial process. The study has a unique blend of social, health, and biomarker data collected over several decades. In late 2013, the United States National Institutes of Health funded MIDUS to create a DDI-based, harmonized data extraction system. This tool will facilitate identification and harmonization of similar MIDUS variables, while enhancing the MIDUS online repository with a data extract function. This will accomplish something unprecedented: the ability to obtain customized cross-project downloads of harmonized MIDUS data that are DDI-compliant. Doing so will greatly enhance efficient and effective public use of the large longitudinal and multi-disciplinary datasets that comprise the MIDUS study. This session will discuss project background and demonstrate the current state of the software.

