Conference Presentations 2011

  • IASSIST 2011-Data Science Professionals: A Global Community of Sharing, Vancouver, BC
    Host Institution: Simon Fraser University and the University of British Columbia

A1: Recent Developments in the DDI Implementation Landscape I (Wed, 2011-06-01)
Chair: Arofan Gregory, Open Data Foundation

  • Can DDI eXist?
    Johan Fihn (Swedish National Data Service)
    Olaf Olsson (Swedish Language Bank)
    Leif-Jöran Olsson (Swedish Language Bank)


    This paper describes how the XML database eXist can store and index DDI instances. Technical advantages of the implementation include very fast full-text search in documents, fast and flexible indexing with the built-in Lucene indexing engine, and rapid development of XML web services directly within eXist. We will look at the implementation of the Swedish National Data Service's DDI3-based question bank in eXist and show examples of indexing and accessing DDI using eXist.
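    As a sketch of how such a service might be reached from outside eXist, the snippet below builds the REST URL for an XQuery full-text search (Lucene ft:query) over a DDI collection. The server address and the /db/ddi collection path are illustrative assumptions, not the SND deployment.

```python
from urllib.parse import urlencode

# Hypothetical eXist server and collection path; adjust to your deployment.
EXIST_REST = "http://localhost:8080/exist/rest/db/ddi"

def build_fulltext_query(term: str, howmany: int = 10) -> str:
    """Build a REST URL that runs an XQuery full-text search (Lucene
    ft:query) over DDI instances stored in an eXist collection."""
    xquery = (
        "for $q in collection('/db/ddi')//*[ft:query(., '%s')] "
        "return $q" % term
    )
    params = urlencode({"_query": xquery, "_howmany": howmany})
    return f"{EXIST_REST}?{params}"

url = build_fulltext_query("income")
```

    An HTTP GET on such a URL would return the matching DDI fragments directly from the database, which is what makes eXist attractive for serving a question bank.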

  • Recent Developments in the DDI Implementation Landscape 1
    Arofan Gregory (Open Data Foundation)
    Joachim Wackerow (GESIS - Leibniz Institute for the Social Sciences)


    This panel addresses recent developments in the DDI implementation landscape: new tool developments and related developments in the standards. There has been much activity in the DDI community within the past year. Some of the existing tools have been extended with major new features; there are several new tools which support and use DDI 3; and there have been some developments to the DDI standards by the DDI Alliance itself, which will influence ongoing tool support. Discussion covers open-source and freely available tools, as well as commercial tools which support DDI 3. This panel summarizes these recent developments and highlights the points of interest. The panel is divided into three blocks: A1, C1 and D1.

  • Collaboration in Data Documentation: Developing "STARDAT - The Data Archiving Suite"
    Wolfgang Zenk-Möltgen (GESIS - Leibniz Institute for the Social Sciences)


    Providing high-quality data and documentation is a major demand on archives and researchers in the field of survey research. GESIS has been developing tools to support standardized documentation processes at the study level and the dataset level (e.g. DBKEdit, a web-based editing system for bilingual study descriptions, or DSDM for language-specific documentation at the variable level). However, the challenges of DDI Version 3 and the collaboration needs at different stages of the data life cycle led to the recognition that an integrated management system for metadata is needed. GESIS therefore started a project to develop "STARDAT - The Data Archiving Suite". It will be based on DDI 3 and will contain modules for structured metadata capture, management, and administration. STARDAT will be integrated into the workflow between data depositors, data managers, and data users. For this reason, a web-based data ingest module will be provided that allows researchers to deliver metadata as soon as a project starts. Requirements of data curation, data documentation, data publication, and long-term availability will be incorporated. STARDAT will allow multi-language documentation at the study and variable level, as well as the inclusion of further information, e.g. about related publications, classifications, continuity guides, scales or trends.

  • Converting MS Word based Questionnaires to DDI: A demonstration application for uses of metadata throughout the data lifecycle
    Benjamin Clark (INDEPTH Network)


    This tool demonstrates how a structured questionnaire design can be leveraged to harvest metadata, which can then be used to drive downstream tasks in the data lifecycle with greater ease. The application takes in a questionnaire designed as a Microsoft Office Word 2007 (.docx) document and uses LINQ to extract the different meta components that make up the questionnaire. The extracted components are then translated into a XAML document that describes the data entry screen and a corresponding DDI document that describes the questionnaire. These two documents are then used to drive the data entry process and perform basic validations such as restricting entered values to the codes in a coding scheme, basic data typing, data length, and skips. Once the data is entered, the DDI is again leveraged to configure the export process, which allows for selecting which variables to export and filtering by error status. The export process also uses the DDI to construct setup files for STATA and produce customized user documentation. Overall, this application demonstrates that creating DDI documentation does not have to be a painstaking, time-consuming process done by hand, and shows the advantages of documenting early in the data lifecycle so that the documentation can be used to drive onward data management activities consistently and efficiently.
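    The application itself uses C# and LINQ; the underlying idea, that a .docx file is simply a ZIP archive of XML parts from which question text can be harvested, can be sketched in a few lines of Python (the one-paragraph toy document below is invented for illustration):

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

def extract_question_texts(docx_bytes: bytes) -> list[str]:
    """Pull the text out of word/document.xml inside a .docx,
    returning one string per paragraph (its concatenated <w:t> runs)."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    return [
        "".join(t.text or "" for t in p.iter(W + "t"))
        for p in root.iter(W + "p")
    ]

# Build a toy one-paragraph document.xml to demonstrate.
doc_xml = (
    '<w:document xmlns:w="http://schemas.openxmlformats.org/'
    'wordprocessingml/2006/main"><w:body>'
    "<w:p><w:r><w:t>Q1. What is your age?</w:t></w:r></w:p>"
    "</w:body></w:document>"
)
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("word/document.xml", doc_xml)

questions = extract_question_texts(buf.getvalue())
```

    A real converter would of course look for the questionnaire's structural conventions (numbering, code lists, skip instructions) in those paragraphs rather than just collecting raw text.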

  • Using DDI3 in a Technology Based Assessment environment - opportunities and problems
    Ingo Barkow (DIPF - German Institute for International Educational Research)


    The Technology Based Assessment (TBA) group at the German Institute for International Educational Research (DIPF) uses DDI3 in several settings to record metadata for paper-and-pencil environments. Nevertheless, as the original aim of the group is to transition studies from paper and pencil to computer-based assessment, it is planned to evolve electronic questionnaires (e.g. CAPI and CATI instruments) as used in studies like PISA or PIAAC towards a common standard. At the moment TBA uses its own proprietary XML structure to describe the items within a questionnaire, but it is considering moving towards DDI3 in the rendering engine as well, which means the metadata and item development tool programmed for the paper-and-pencil instruments could be used as an item editor for computer-based questionnaires. This presentation will show the opportunities and challenges of this process and introduce a workflow from the metadata editor to the rendering engine. Furthermore, the newest version of the Metadata Editor as well as a first prototype of the rendering engine will be shown. All tools in this presentation are open-source software and can be requested for one's own use.
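    In outline, a mapping from a proprietary item XML to DDI might look like the Python sketch below. Both the <item> source format and the simplified QuestionItem target are hypothetical stand-ins; TBA's actual schema is proprietary, and real DDI3 uses namespaced elements and a richer response-domain model.

```python
import xml.etree.ElementTree as ET

# Hypothetical source format standing in for a proprietary item schema.
src = ET.fromstring(
    "<item id='q1'><prompt>How often do you read?</prompt>"
    "<option code='1'>Daily</option><option code='2'>Weekly</option></item>"
)

def to_ddi_question(item: ET.Element) -> ET.Element:
    """Map a proprietary <item> onto a simplified DDI3-style QuestionItem."""
    q = ET.Element("QuestionItem", id=item.get("id", ""))
    ET.SubElement(q, "QuestionText").text = item.findtext("prompt")
    domain = ET.SubElement(q, "CodeDomain")
    for opt in item.findall("option"):
        code = ET.SubElement(domain, "Code", value=opt.get("code", ""))
        code.text = opt.text
    return q

ddi = to_ddi_question(src)
```

    The payoff of such a mapping is the one described in the abstract: once items live in a standard representation, the same editor can feed both the paper-and-pencil workflow and a computer-based rendering engine.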


A2: Data Management Services: New Roles and Connections (Wed, 2011-06-01)
Chair: Ann Green, Digital Life Cycle Research & Consulting

  • Curation in the age of complexity: reworking an ancient art
    Michael Jones (Australian Data Archive - Melbourne)
    Gavan McCarthy (Australian Data Archive - Melbourne)


    Data curation can trace a lineage back to the very earliest records and the beginnings of materially recorded history. In the world of art galleries and museums, curation evolved to have specific meanings associated with selection, presentation, and interpretation for public exposure; that is, making sure that objects are understandable to the audience of the time. However, in Australia, curation has not been a term associated with traditional archival practice. With the advent of digital data archive programs and the cyberinfrastructure movement in the late twentieth century, curation was appropriated to cover a grab bag of roles and functions associated with data capture, documentation, preservation and access. Mapping complexity, that vast array of interconnectedness that characterises human endeavour, has become acknowledged as necessary in establishing contexts of meaning and interpretation. Visualised networks of connectivity provide a means of understanding larger-scale worlds, lifting data out of isolation. These tools provide us with an opportunity to reinstate data curation within the intellectual, scholarly research endeavour, as seen in the art world, releasing it from the more technical aspects of preservation and access. Network visualisations of some Australian data worlds will guide an interpretative reworking of curation.

  • Data Management & Preservation: Creating a new service for researchers
    Carol Perry (University of Guelph)


    In January 2010, the University of Guelph Library launched a new data management and preservation service for researchers on campus. While the new service was in its initial stages of development, research groups on campus were eager to volunteer their research projects to serve as pilot test cases. Ultimately, two large, long-term research groups were selected and, following the administration of needs assessments, management plans have been developed around the issues identified for each research group. This paper will track the development of the service with particular attention being paid to the complex needs of the different groups involved in the pilot project. Subsidiary services for graduate students which have been incorporated into the program will also be discussed. Reactions to these new services will be explored.

  • UVa Library SciDac: New Partnerships and Services to Support Scientific Data in the Library
    Andrew Sallans (University of Virginia Library)
    Sherri Lake (University of Virginia Library)


    The University of Virginia Library created a new unit called the Scientific Data Consulting Group (SciDaC) in May 2010 to respond to the increasing need for data management support in the sciences. Since its creation, the SciDaC group has focused on three main priorities: 1) data assessment interviews to establish a baseline on scientific research data management processes, 2) development of institutional support for researchers in response to new data management regulations (e.g. NSF), and 3) integration of institutional scientific research data with the new institutional repository. This effort built upon previous experience providing scientific research support through a partnership with UVA’s central IT group. In order to build and scale this new endeavor, we have aggressively worked to establish and grow new partnerships inside and outside of the institution. The talk will focus on our experiences providing these consulting services and the challenges we have faced in integrating data management expertise into the library environment. It will also cover some specifics of the partnerships that we have developed and the opportunities that we are looking toward in the future.

  • Data archiving and cooperation with medical researchers - An example from Denmark
Bodil Stenvig (Danish Data Archives)


    In the DDA, a special unit called DDA Health collects, preserves and disseminates health-related research data. A very important and necessary part of the effort of DDA Health aims at increasing medical researchers’ awareness of the need for good archiving practices. We have therefore stimulated voluntary archiving through collaboration with medical research groups and medical researchers in many ways. The efforts have focused on the visibility of DDA Health and its values; the means have been a website, a newsletter, site visits, presentations in connection with various kinds of scientific gatherings, contributions to undergraduate courses and research training courses, and communication via a network of interested researchers. In the paper I will focus on the development of cooperation between DDA Health and medical researchers, for instance by demonstrating that DDA Health can add value to research data from medical research by supporting researchers with DDI 3 and GCP.


A3: Building Capacity to Link, Visualize, Identify, and Discover (Wed, 2011-06-01)
Chair: Steven McEachern, Australian Social Science Data Archive

  • Statistical Data Analysis based on Linked Open Data
    Benjamin Zapilko (GESIS - Leibniz Institute for the Social Sciences)
    Brigitte Mathiak (GESIS - Leibniz Institute for the Social Sciences)
    Oliver Hopt (GESIS - Leibniz Institute for the Social Sciences)


    Research in the social sciences is based on the analysis of social phenomena via quantitative evidence. Scientists typically need to perform major and complex analyses on statistical data, but as part of their main tasks they also require tedious secondary examinations of heterogeneous and distributed datasets, for example to verify prior or referenced assumptions or to detect correlations between two or more datasets. Many tools already exist that support researchers in processing and analysing their data effectively, but raw data often has to be converted to particular formats in order to be processed and analysed. In this paper, we propose a method to perform such statistical analyses on Linked Open Data resources in order to support common tasks that researchers encounter when working with heterogeneous and distributed datasets. The idea of Linked Open Data provides a technical basis for exposing, sharing and linking data on the web, based on an established web architecture comprising standardised formats and interfaces. However, statistical calculations cannot yet be performed directly on these data sources. Our prototype covers some common exemplary tasks for data analysis in the social sciences.
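    The core step the paper targets, joining distributed datasets on shared identifiers and then running a statistical calculation over the linked values, can be illustrated without an RDF stack. In this sketch the two dictionaries are invented stand-ins for indicators fetched from hypothetical Linked Open Data endpoints, keyed by a common region identifier:

```python
from math import sqrt

# Two toy "distributed" indicators sharing region identifiers,
# standing in for values fetched from separate data sources.
unemployment = {"DE1": 5.2, "DE2": 4.1, "DE3": 7.8, "DE4": 6.0}
turnout      = {"DE1": 71.0, "DE2": 75.5, "DE3": 63.2, "DE4": 68.9}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Link the datasets on their common keys, then correlate.
keys = sorted(unemployment.keys() & turnout.keys())
r = pearson([unemployment[k] for k in keys], [turnout[k] for k in keys])
```

    In a Linked Open Data setting the join key would be a shared URI and the values would come from SPARQL queries against remote endpoints, but the linking-then-computing pattern is the same.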