Conference Presentations 2010

  • IASSIST 2010-Social Data and Social Networking: Connecting Social Science Communities across the Globe, Ithaca, NY
    Host Institution: Cornell Institute for Social and Economic Research (CISER) and Cornell University Library (CUL)

G2: Social Network Dynamics (Fri, 2010-06-04)
Chair: Harrison Dekker, University of California, Berkeley

  • The Data Archive as a Social Network: An Analysis of the Australian Social Science Data Archive
    Steven McEachern (Australian Social Science Data Archive)


    The study of the network structure of academic disciplines through the analysis of publication citation data has a long history in the field of social network analysis (e.g. Mullins et al., 1977). Such studies have examined network characteristics such as the centrality of researchers within collaborative networks (Bollen et al., 2005), and the characteristics distinguishing core and peripheral members of networks (Borgatti and Everett, 1999). The prevalence of such research can in part be attributed to the relatively easy collection of the relational data required for social network analysis, in a consistent and structured form, within a definable population. It is therefore somewhat surprising that the holdings of data archives have not been subject to similar analysis. The metadata describing social science data deposits represents a rich source of data to explore the emergence of the data networks supporting social science research activity. This paper therefore seeks to fill this gap. It presents a longitudinal social network analysis of the data holdings of the Australian Social Science Data Archive, exploring the patterns of deposit by discipline and institution over the 30-year history of the archive, as well as a visualisation of the emergence of the deposit network over time.

  • The Diffusion of Information Technology in the United States and Its Impact on Social Science Research across Institutions and Countries
    Anne Winkler (University of Missouri-St. Louis)
    Sharon G. Levin (University of Missouri-St. Louis)
    Paula E. Stephan (Georgia State University and NBER)
    Wolfgang Glanzel (Katholieke Universiteit Leuven, Steunpunt O&O)


    This study examines the extent to which IT has differentially affected collaboration in the social sciences relative to the natural sciences. IT’s impact on the social sciences may be larger because much research can be conducted virtually, while working in close proximity (in labs) may be more crucial to producing research in the natural sciences. To undertake the research, the authors match an explicit measure of institutional IT adoption (domain names) with institutional data on all published papers indexed by ISI for 1,348 four-year colleges, universities and medical schools for the years 1991-2007. The publication data cover the social sciences, humanities, and natural sciences and narrower fields such as economics and biology. Three measures of co-authorship are examined: (1) average number of coauthors by institution; (2) percent of papers from an institution with one or more co-authors at another U.S. institution; and (3) percent of papers with one or more non-U.S. coauthors. The study describes collaboration patterns and then uses regression analysis to examine the impact of IT “exposure” on co-authorship. Preliminary results suggest: (1) dramatic growth in co-authorship within and across fields; and (2) differential effects of IT by field.
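
    The three co-authorship measures listed in this abstract could be computed along the following lines. This is a minimal sketch and not the authors' method: the `Author` and `Paper` record types, the function name, and the sample data are hypothetical, and "number of coauthors" is assumed here to mean the authors on a paper other than the focal one.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Author:
    institution: str  # hypothetical institution label
    country: str      # hypothetical country code, e.g. "US"


@dataclass
class Paper:
    authors: List[Author]


def coauthorship_measures(papers: List[Paper], institution: str,
                          home_country: str = "US") -> Dict[str, float]:
    """Compute three co-authorship measures for one institution's papers.

    Assumption: every paper in `papers` has at least one author
    affiliated with `institution`.
    """
    if not papers:
        raise ValueError("need at least one paper")

    total_coauthors = 0  # authors beyond the focal one, summed over papers
    with_other_home = 0  # papers with a co-author at another home-country institution
    with_foreign = 0     # papers with a co-author outside the home country

    for paper in papers:
        total_coauthors += max(len(paper.authors) - 1, 0)
        if any(a.country == home_country and a.institution != institution
               for a in paper.authors):
            with_other_home += 1
        if any(a.country != home_country for a in paper.authors):
            with_foreign += 1

    n = len(papers)
    return {
        "avg_coauthors": total_coauthors / n,
        "pct_other_us_institution": 100.0 * with_other_home / n,
        "pct_non_us": 100.0 * with_foreign / n,
    }


# Hypothetical sample: four papers from institution "A".
sample = [
    Paper([Author("A", "US"), Author("A", "US")]),
    Paper([Author("A", "US"), Author("B", "US"), Author("C", "BE")]),
    Paper([Author("A", "US")]),
    Paper([Author("A", "US"), Author("B", "US")]),
]
measures = coauthorship_measures(sample, "A")
```

    Computing these per institution and per year would give the panel on which a regression of co-authorship on IT exposure, as the abstract describes, could be run.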

  • Applications of Social Networking in International Collaboration, Multisite-research, Knowledge Re-use and Data Configuration Management
    Kartikeya Bolar (University of Toledo)


    Researchers often wish to collaborate with others who are far away but share an interest in examining a particular phenomenon or conducting a project. The phenomenon in focus, or the project parameters, will vary in intensity across contexts, so knowledge or patterns developed at one site need validation and further enrichment through examination at multiple sites. Effective synthesis of knowledge about any phenomenon therefore requires international collaboration along with multi-site research. Knowledge is also discovered from databases: a different perspective on, and a different analysis of, the same database can lead to the discovery of new knowledge. It is therefore essential to preserve not only the knowledge discovered but also the underlying data, which can then be configured to the needs of new research projects. This paper tries to provide insights into how social networking can be used effectively to identify potential research collaborators and to resolve issues in multi-site research, knowledge re-use and data configuration management.


G3: Preservation: Interoperability and Reproducibility (Fri, 2010-06-04)
Chair: Bo Wandschneider, University of Guelph

  • Replicated & Distributed Storage Technologies: “Impact on Social Science Data Archive Policies”
    Jonathon Crabtree (University of North Carolina, Chapel Hill)


    The Data-PASS partnership engages in collaboration at three levels: coordinated operations, development of best practices, and creation and use of open-source shared infrastructure. The first talk in the session provides an update on our search for replication and distributed storage technologies for preservation. Systems like iRODS and LOCKSS can be developed into preservation environments for social science data archives. The key when implementing these preservation environments will be the modification of existing archive policies and procedures to reflect new dependence on collaboration. The second talk discusses the international public opinion data gathered by the USIA, a collection that began in 1952 and extended through 1999. Until recently, these data were difficult to access. The Roper Center and the National Archives and Records Administration have identified, rescued, and made these data available to the research community. The third talk describes a new alliance between ICPSR and Institutional Repositories (IRs) with the goal of preserving and re-using social science data. This talk focuses on the formation of these partnerships; how an archiving guide for IRs will be developed; and new services that ICPSR can offer to IRs to assist with social science data. The fourth talk summarizes the efforts of ICPSR and the Roper Center to migrate punched card data to modern preservation formats. This presentation focuses on the recovery of the Cornell Retirement Study, a longitudinal study that began in 1952. The final talk discusses the current collaborative structure of Data-PASS, our agreements, infrastructure, and the services and infrastructure available to new partners.

  • Towards a Federated Infrastructure for the Preservation and Analysis of Archival Data
    Chien-Yi Hou (University of North Carolina at Chapel Hill)
    Richard Marciano (University of North Carolina at Chapel Hill)


    The Sustainable Archives and Leveraging Technologies group (SALT) at UNC is pursuing a number of projects that address issues of interoperability and reproducibility. This presentation will discuss three projects: (1) e-Legacy, an NHPRC-funded project which is developing preservation infrastructure and services for state government geospatial data, (2) PoDRI, an IMLS-funded project which explores policy-based interoperability between Fedora and iRODS repositories, and (3) T-RACES (Testbed for the Redlining Archives of California's Exclusionary Spaces), an IMLS-funded project which builds on these other projects and will publish an online archive of historical redlining and racial discrimination data. T-RACES documents the confidential security maps and surveys produced in the 1930s by the Home Owners' Loan Corporation, a New Deal federal agency. These surveys form the genesis of neighborhood discrimination and restricted mortgage lending, known as redlining. A digital library interface based on interactive databases and Google Maps and Google Earth interfaces will integrate data from 8 California cities, including Los Angeles, San Francisco, and Oakland. The intent of this archive is to serve as a core reference data set that can be augmented and customized through social networking mechanisms and through overlays of social science data. Across all three projects, the authors are interested in the roles and responsibilities of data services and data repositories that support concepts of policies and customization.

  • Automated DDI Metadata Harvesting and Replication for Preservation Purposes within iRODS
    Jon Crabtree (H. W. Odum Institute for Research in Social Science)
    Antoine de Torcy (DICE)
    Mason Chua (H. W. Odum Institute for Research in Social Science)


    This prototype demonstrated that the migration of collections between digital libraries and preservation data archives is now possible using automated batch load for both data and metadata. We used this capability to enable collection interoperability between the H.W. Odum Institute for Research in Social Science (Odum) Data Archive and the integrated Rule-Oriented Data System (iRODS) extension of the National Archives and Records Administration's (NARA) Transcontinental Persistent Archive Prototype (TPAP). We extracted data and metadata from a Dataverse data archive and ingested them into the iRODS server and metadata catalog using OAI-PMH, Java, XML/XSL, and iRODS rules and microservices. We validated ingest of the files and retained the required Terms & Conditions for the social science data after ingest.
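
    As a rough illustration of the harvesting step, the sketch below parses an OAI-PMH `ListIdentifiers` response and keeps the identifiers of records not flagged as deleted, which is the kind of list a batch ingest could work from. It is not the authors' Java implementation: the sample response and its identifiers are hypothetical, and only the OAI-PMH 2.0 namespace and the `header`, `identifier` and `status` elements come from the protocol itself. No iRODS calls are shown.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical response body; a real harvester would fetch this over HTTP.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>hypothetical:study-001</identifier>
      <datestamp>2010-01-15</datestamp>
    </header>
    <header status="deleted">
      <identifier>hypothetical:study-002</identifier>
      <datestamp>2010-02-01</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""


def live_identifiers(xml_text: str) -> List[str]:
    """Return identifiers of records that are not marked as deleted."""
    root = ET.fromstring(xml_text)
    identifiers = []
    for header in root.iterfind(".//oai:header", OAI_NS):
        # OAI-PMH marks withdrawn records with status="deleted".
        if header.get("status") == "deleted":
            continue
        ident = header.findtext("oai:identifier", namespaces=OAI_NS)
        if ident:
            identifiers.append(ident.strip())
    return identifiers


to_ingest = live_identifiers(SAMPLE_RESPONSE)
```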

  • Encoding Archival Context: An Australian Perspective on Situating Data in Frameworks of Meaning
    Gavan McCarthy (Australian Social Science Data Archive)


    In 2008 the University of Melbourne eScholarship Research Centre joined the Australian Social Science Data Archive, not as social science researchers nor as experienced data archivists but as a group with significant experience in pushing the boundaries of generalised archival practice. We had been studying and developing tools to systematically document the larger contexts in which archival materials are located, to understand the cultural informatics of meaning and how it is ascribed both by the archivists and users of records. This paper examines the use of two tools we use (the Online Heritage Resource Manager – OHRM, and the Heritage Documentation Management System – HDMS) while working directly with social science researchers and their data. In particular it explores metadata interchange with DDI (Versions 2 and 3) and the positioning of these tools within the Open Archival Information System reference model. The systematic documentation of contexts (there is often more than one) has numerous benefits, but for social science data the most obvious benefit probably lies in the management of rich and highly interconnected authority records. The paper will conclude with reference to recent work on the development and utilisation of the Encoded Archival Context XML schema.


Plenary III (Fri, 2010-06-04)

  • The End of Marketing As We Know it, and the Rise of Sociological Metrics
    Dave Linabury (Campbell-Ewald)


    Social media has forever changed the way people interact, communicate and do business. No industry has been more affected, crippled and mutated by it than marketing. With social media growing in importance each day, in part due to the integration with mobile, advertisers are finding that their tried and true methods of measurement just aren’t cutting it anymore. This presentation proposes that sociology holds the key to understanding social media, not only for understanding people’s communicative behaviors, but also their shopping, research and buying behaviors. Case studies will be presented.

