Community of Data Professionals

News about IASSIST members.

Research Data Management Issues Across Environments

Lots of conversations are going on these days in different venues where people are asking many of the same questions:  how do we teach researchers about data management with limited staff, and what data management services should we offer?  How do we find sustainable ways to manage data that leverage the efforts of many different repositories, whether governmental, institutional, or disciplinary?  How do we coalesce standard practice and reasonable but effective policies at least at the national level, and preferably on a global scale?  What roles should governments play?  How much can we as data professionals accomplish on our own?  The Data Management and Curation SIG will host a workshop to talk about these and other issues across different countries and environments next Tuesday. Our speakers will include:

  • Dan Gillman, U.S. Bureau of Labor Statistics
  • Marcel Hebing, DIW Berlin
  • Chuck Humphrey, University of Alberta
  • Steven McEachern, Australian Data Archive
  • Barry Radler, Institute on Aging, University of Wisconsin-Madison
  • Robin Rice, EDINA and Data Library at the University of Edinburgh
  • Kathleen Shearer, Confederation of Open Access Repositories and Research Data Canada

Looking forward to seeing many of you in Toronto!

Michele Hayslett, University of North Carolina at Chapel Hill & Stefan Kramer, American University

New 'Special Issue' IQ now available!

Editor’s notes

Special issue: A pioneer data librarian

Welcome to the special volume of the IASSIST Quarterly (IQ (37):1-4, 2013). This special issue started as an exchange of ideas between Libbie Stephenson and Margaret Adams to collect papers relating to the work of Sue A. Dodd. Margaret Adams (Peggy) acted as the guest editor, and the background and content of this volume are described in her preface on the following page. As editor I want to especially thank Peggy and Libbie for pursuing and finalizing their excellent idea. I also want to thank all the authors who contributed to this volume. As one of the authors I can attest that Peggy did a great job.

Articles for the IASSIST Quarterly are always very welcome. They can be papers from IASSIST conferences or other conferences and workshops, from local presentations or papers especially written for the IQ. When you are preparing a presentation, give a thought to turning your one-time presentation into a lasting contribution to continuing development. As an author you are permitted “deep links” where you link directly to your paper published in the IQ. Chairing a conference session with the purpose of aggregating and integrating papers for a special issue IQ is also much appreciated as the information reaches many more people than the session participants, and will be readily available on the IASSIST website at http://www.iassistdata.org.

Authors are very welcome to take a look at the instructions and layout:
http://iassistdata.org/iq/instructions-authors

Authors can also contact me via e-mail: kbr@sam.sdu.dk. Should you be interested in compiling a special issue for the IQ as guest editor(s) I will also be delighted to hear from you.

 

Karsten Boye Rasmussen

April 2014

Editor

IASSIST Africa Regional Report 2013-2014

Freeing African Data

Two regional developments have the potential to get African government data into the public domain. Putting their disaggregated data out there can benefit African governance through ensuring transparency and allowing feedback from policy analysis to support better government planning. The World Bank’s Central microdata catalog has been around since 2012 and continues to expand its listing of data sources. This is currently the only comprehensive online source for microdata produced by African official data producers, as a listing of country datasets is not available on most African government websites.

While the World Bank project supports improved data discovery, a second donor project aims for more Open Government Data. The Accelerated Data Program (ADP) is an OECD project to make African government data more accessible. This project works to install data dissemination software with government data producers such as ministries and statistics offices. Currently, data are available from statistics offices in several countries that use this platform, including Ghana, Kenya, Namibia, Nigeria, Rwanda, Senegal, Tanzania, and Tunisia.

The ADP also trains data managers in African National Statistics Offices. While data expertise is necessary to leverage national data resources, data curation training projects are scarce in African countries. In 2013-2014 the ADP ran data management training workshops in Malawi, Mozambique, and Zambia, and ADP trainers teamed up with staff from the University of Cape Town’s Data Service to conduct data curation training workshops in Botswana, Lesotho, and Rwanda.

Another move towards openness is the establishment of a Research Data Centre at the University of Cape Coast in Ghana.  This will make Ghanaian data more widely available to local researchers and to the wider research community. Currently Ghanaian data can be purchased from the government data producer, which may keep out researchers from poorly resourced institutions. The University of Cape Town in South Africa and the University of Michigan in the US are working with University of Cape Coast staff to support data curation best practice at the new centre, with funding from the University of Michigan’s African Social Research Initiative.

African Universities Managing their Data Assets

The University of Cape Town in South Africa has been engaged in research data management policymaking in 2013-2014. IASSIST member Lynn Woolfrey and a team from the University Library undertook a University data needs survey and a scoping study of the policies of other universities. They completed a report and a draft policy document, which stakeholders at the University will build on to produce a university-wide policy for managing research data into the future. The policy will ensure the University is at the forefront of what will become standard practice at universities in the future.

African Data Conferences

The 5th African Conference for Digital Scholarship and Curation was held in Durban, South Africa, in June 2013.  The Conference brought together data experts from African countries under the theme of Research data in the advance of education, research, and innovation. IASSIST’s Lynn Woolfrey gave a presentation on data curation best practice at a post-conference workshop organised by South Africa’s Network of Data and Information Curation Communities (NeDiCC).

The first Isibalo data users’ conference was organised by Statistics South Africa at the University of Stellenbosch in Stellenbosch, South Africa, in July 2013. The Conference was an opportunity for feedback on the relevance of South African data for academia and local government decision makers, and it augurs well for future producer-user interactions around data quality issues.

The annual eResearch Africa Conference was held in Cape Town, South Africa, in October 2013. Under the banner ICT Enabling Research, presenters from Australia, the UK, and African countries discussed eResearch projects and brainstormed future eResearch strategies. IASSIST member Lynn Woolfrey presented research undertaken on data accessibility for research on Africa.

IASSIST Fellows 2014

The IASSIST Fellows Committee is glad to announce through this post the four recipients of the 2014 IASSIST Fellowship award. We are extremely excited to have such a diverse and interesting group with different backgrounds and experience and encourage IASSISTers to welcome them at our conference in Toronto, Canada.
Please find below their names, countries and brief bios:

Antonin Benoit, Head Librarian at the African Institute for Economic Development and Planning. Dakar, Senegal.

"As the head Librarian I am the manager of our online database, called the IDEP document server (http://www.unidep.org/library). Via this tool we provide access to bibliographical and textual references. In addition, I am the IDEP focal point for work with the African Centre of Statistics (ACS) to compile an inventory of all existing data resources in my Institute. The ACS is a division of UNECA and is located in Addis Ababa (Ethiopia). I am therefore devoted to providing data used for statistical analysis and publications in the Existing Data Resources of UNECA (http://ecastats.uneca.org/cdsr/). I am also very familiar with metadata standards like MarcXML and Dublin Core, which I use frequently in my job through our document server. My main objective is to make our Institute the first African library catalog to enter the Open Linked Data project. So, attending the IASSIST conference could improve my capacities in data management, because my initial professional background is librarianship and I still have some weaknesses in data management."

Fei Yu, Acting Manager of Research Data Collections  at the University of Queensland Library. Brisbane, Australia.

"Fei has gained a wide range of experience in academic libraries, including bibliometrics and research data management.  She was recently successful in being appointed as Manager, Research Data Collections.  This has involved drafting the Research Data Management Procedures which will underpin the University of Queensland Research Data Management Policy that was approved at the end of 2013.  She is involved in promoting best practice in data management for all of UQ, has established a wide range of Data Information Literacy training courses for UQ researchers, and ensures that their research data collection metadata is accurate and available on the institutional repository, UQ eSpace.  She is presently rolling out the online data management tool (based on the UK Digital Curation Centre (DCC) tool) university wide to ensure that all university researchers and research students have an easy and accessible tool to create their data management plans.  The Research Data Collections team led by Fei created the Research Data Management Guide, a one stop shop containing detailed information on all aspects of data management.  Fei also works collaboratively with the University's Research Computing Centres and the Queensland Cyber Infrastructure to ensure that staff are aware of the many data storage options."

Aileen O'Carroll, Policy Manager at the Digital Repository of Ireland. Dublin, Ireland.

"I am currently Policy Manager of the Digital Repository of Ireland (DRI). DRI is a newly established national organisation (the project was established in September 2011) whose remit is to link together and preserve the rich and varied cultural, historical, and qualitative social science data held by Irish institutions. It will be a central access point to this digital data and provide multimedia tools to research and interact with archived data. My role requires me to have a thorough understanding of international best practice in licensing frameworks, digitisation policy, and archival management, and an understanding of the different needs and perspectives of a wide range of stakeholders and users. It is of key importance that this emerging national infrastructure is aligned both with European and international best practice and with practice and policy already in place in a diverse field of Irish cultural, educational and social scientific organisations."

Winny Nekesa, Senior Library and Documentation Officer at the Public Procurement and Disposal of Public Assets Authority. Kampala, Uganda.

"Winny Nekesa Akullo obtained a Bachelor's degree in Library and Information Science in 2003 and a Postgraduate Diploma in Demography in 2014 from Makerere University, and finalized her thesis for the Master's Degree in Information Science. Before joining the Public Procurement and Disposal of Public Assets Authority as a Senior Library and Documentation Officer in 2014, she worked as an Information Officer/Librarian at the Uganda Bureau of Statistics, where she was in charge of information management and data dissemination and was spearheading the establishment of a UBOS Digital Library, and as a School Senior Librarian. She has international training and exposure in establishing digital libraries, preservation, and the construction and application of information systems. She is the Country Coordinator of the International Librarians’ Network, Publicity Secretary of the Uganda Library and Information Association, and General Secretary of the Uganda Textbook-Academic and Non-Fiction Authors’ Association.  Her area of expertise is digital preservation and data dissemination. Currently her main research interests are information retrieval, digital preservation and open access repositories. She presented “Establishing a National Statistical Information Repository in Uganda; Challenges and Opportunities” at the 2013 IASSIST Conference, where she got a lot of exposure and new ideas about data and information management. This year, I hope to gain more information which I can apply to my new institution, especially in the area of data management, which is still virgin."

White Paper Urges New Approaches to Assure Access to Scientific Data

Press release posted on behalf of Mark Thompson-Kolar, ICPSR.

12/12/2013:  (Ann Arbor, MI)—More than two dozen data repositories serving the social, natural, and physical sciences today released a white paper recommending new approaches to funding sharing and preservation of scientific data. The document emphasizes the need for sustainable funding of domain repositories—data archives with ties to specific scientific communities.

“Sustaining Domain Repositories for Digital Data: A White Paper,” is an outcome of a meeting convened June 24-25, 2013, in Ann Arbor. The meeting, organized by the Inter-university Consortium for Political and Social Research (ICPSR) and supported by the Alfred P. Sloan Foundation, was attended by representatives of 22 data repositories from a wide spectrum of scientific disciplines.

Domain repositories accelerate intellectual discovery by facilitating data reuse and reproducibility. They leverage in-depth subject knowledge as well as expertise in data curation to make data accessible and meaningful to specific scientific communities. However, domain repositories face an uncertain financial future in the United States, as funding remains unpredictable and inadequate. Unlike our European competitors who support data archiving as necessary scientific infrastructure, the US does not assure the long-term viability of data archives.

“This white paper aims to start a conversation with funding agencies about how secure and sustainable funding can be provided for domain repositories,” said ICPSR Director George Alter. “We’re suggesting ways that modifications in US funding agencies’ policies can help domain repositories to achieve their mission.”

Five recommendations are offered to encourage data stewardship and support sustainable repositories: 

  • Commit to sustaining institutions that assure the long-term preservation and viability of research data
  • Promote cooperation among funding agencies, universities, domain repositories, journals, and other stakeholders
  • Support the human and organizational infrastructure for data stewardship as well as the hardware
  • Establish review criteria appropriate for data repositories
  • Incentivize Principal Investigators (PIs) to archive data

While a single funding model may not fit all disciplines, new approaches are urgently needed, the paper says.

“What’s really remarkable about this effort—the meeting and the resulting white paper—has been the consensus across disciplines from astronomy to archaeology to proteomics,” Alter said. “More than two dozen domain repositories from so many disciplines are saying the same thing: Data sharing can produce more science, but data stewards must know the needs of their scientific communities.”

This white paper is a must read for anyone who wants to understand the role of scientific domain repositories and their critical role in the advancement of science. It can be downloaded at http://datacommunity.icpsr.umich.edu

 

The Inter-university Consortium for Political and Social Research (ICPSR), based in Ann Arbor, MI, is the largest archive of behavioral and social science research data in the world. It advances research by acquiring, curating, preserving, and distributing original research data. www.icpsr.umich.edu

The Alfred P. Sloan Foundation is a philanthropic, not-for-profit grantmaking institution based in New York City. Established in 1934, the Foundation makes grants in support of original research and education in science, technology, engineering, mathematics, and economic performance. www.sloan.org

###

I am he as you are he as you are me and we are all together

I'm just in the process of updating who we follow from our @iassistdata twitter account (we follow members who follow us - when I get round to updating things, sorry).

Given the huge* number of followers we now have, (595, thank you one and all) I thought it would be interesting to see what we looked like according to our twitter bios.

No surprises: we define ourselves as data people or organisations, in terms of "research", "librarian" (and library related terms), "social" "science", "digital", "information", and "universities". It suggests people following us are the type of people that should be following us given the organisation's goals, and hopefully are getting some value from following @iassistdata.

*Obviously a subjective assessment when Justin Bieber has 44,625,042.

 @iassistdata twitter follower bios

Congratulations to Dan Tsang and Wendy Watkins!

As some of you may know, Dan Tsang and Wendy Watkins have been named the 2013 ICPSR Flanigan Award winners for distinguished service as an ICPSR OR: http://www.icpsr.umich.edu/icpsrweb/ICPSR/support/announcements/2013/07/icpsr-announces-2013-warren-e-miller

UC-Irvine recognizes Dan here, http://www.lib.uci.edu//features/spotlights/dt-award.html
Perhaps a Canadian colleague has a similar link for Wendy.

Congratulations to both Dan and Wendy!

The Role of Data Repositories in Reproducible Research

Cross posted from ISPS Lux et Data Blog

These questions were on my mind as I was preparing to present a poster at the Open Repositories 2013 conference in Charlottetown, PEI earlier this month. The annual conference brings the digital repositories community together with stakeholders, such as researchers, librarians, publishers and others to address issues pertaining to “the entire lifecycle of information.” The conference theme this year, “Use, Reuse, Reproduce,” could not have been more relevant to the ISPS Data Archive. Two plenary sessions bookended the conference, both discussing the credibility crisis in science. In the opening session, Victoria Stodden set the stage with her talk about the central role of algorithms and code in the reproducibility and credibility of science. In the closing session, Jean-Claude Guédon made a compelling case that open repositories are vital to restoring quality in science.

My poster, titled, “The Repository as Data (Re) User: Hand Curating for Replication,” illustrated the various data quality checks we undertake at the ISPS Data Archive. The ISPS Data Archive is a small archive, for a small and specialized community of researchers, containing mostly small data. We made a key decision early on to make it a "replication archive," by which we mean a repository that holds data and code for the purpose of being used to replicate and verify published results.

The poster presents ISPS Data Archive’s answer to the questions of who is responsible for the quality of data and what that means: We think that repositories do have a responsibility to examine the data and code we receive for deposit before making the files public, and that this data review involves verifying and replicating the original research outputs. In practice, this means running the code against the data to validate published results. These steps in effect expand the role of the repository and more closely integrate it into the research process, with implications for resources, expertise, and relationships, which I will explain here.

First, a word about what data repositories usually do, the special obligations reproducibility imposes, and who is fulfilling them now. This ties in with a discussion of data quality, data review, and the role of repositories.

Data Curation and Data Quality

A well-curated data repository is more than a place to put data. The Digital Curation Centre (DCC) explains that data curation means ensuring data are accessible to designated users for first time use and reuse. This involves a set of curatorial practices – maintaining, preserving and adding value to digital research data throughout its lifecycle – which reduces threats to the long-term research value of the data, minimizes the risk of its obsolescence, and enables sharing and further research. An example of a standard-setting curation process is that of the Inter-university Consortium for Political and Social Research (ICPSR). This process involves organizing, describing, cleaning, enhancing, and preserving data for public use and includes format conversions, reviewing the data for confidentiality issues, creating documentation and metadata records, and assigning digital object identifiers. Similar data curation activities take place at many data repositories and archives.

These activities are understood as essential for ensuring and enhancing data quality. Dryad, for example, states that its curatorial team “works to enforce quality control on existing content.” But there are many ways to assess the quality of data. One criterion is verity: whether the data reflect actual facts, responses, observations or events. This is often assessed by the existence and completeness of metadata. The UK’s Economic and Social Research Council (ESRC), for example, requests documentation of “the calibration of instruments, the collection of duplicate samples, data entry methods, data entry validation techniques, methods of transcription.” Another way to assess data quality is by its degree of openness. Shannon Bohle recently listed no fewer than eight different standards for assessing the quality of open data on this dimension. Others argue that data quality consists of a mix of technical and content criteria that all need to be taken into account. Wang & Strong’s 1996 article claims that “high-quality data should be intrinsically good, contextually appropriate for the task, clearly represented, and accessible to the data consumer.” More recently, Kevin Ashley observed that quality standards may be at odds with each other. For example, some users may prize the completeness of the data while others prize their timeliness. These standards can go a long way toward ensuring that data are accurate, complete, and timely and that they are delivered in a way that maximizes their use and reuse.

Yet these procedures are “rather formal and do not guarantee the validity of the content of the dataset” (Doorn et al.). Leaving aside the question of whether they are always adhered to, these quality standards are insufficient when viewed through the lens of “really reproducible research.” Reproducible science requires that data and code be made available alongside the results, to allow regeneration of the published results. For a replication archive, such as the ISPS Data Archive, the reproducibility standard is imperative.

Data Review

The imperative to provide data and code, however, only achieves the potential for verification of published results. It remains unclear as to how actual replication occurs. That’s where a comprehensive definition of the concept of “data review” can be useful: At ISPS, we understand data review to mean taking that extra step – examining the data and code received for deposit and verifying and replicating the original research outputs.
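As a rough illustration of that extra step, the comparison at the end of a replication run could be sketched in Python as below. This is a minimal sketch under stated assumptions, not the ISPS Data Archive's actual tooling: the function name `compare_results`, the statistic names, the numbers, and the tolerance are all hypothetical, and it assumes the deposited code has already been re-run and its estimates collected into a dictionary.

```python
import math


def compare_results(published, reproduced, rel_tol=1e-6, abs_tol=1e-9):
    """Return (name, published_value, reproduced_value) for every statistic
    that is missing from the re-run or differs beyond the tolerance."""
    mismatches = []
    for name, expected in published.items():
        actual = reproduced.get(name)  # None if the re-run did not produce it
        if actual is None or not math.isclose(
            expected, actual, rel_tol=rel_tol, abs_tol=abs_tol
        ):
            mismatches.append((name, expected, actual))
    return mismatches


# Hypothetical values: what the article reports vs. what re-running the
# deposited code on the deposited data produced.
published = {"treatment_effect": 0.042, "std_error": 0.013, "n_obs": 1200}
reproduced = {"treatment_effect": 0.042, "std_error": 0.015, "n_obs": 1200}

mismatches = compare_results(published, reproduced, rel_tol=1e-3)
for name, expected, actual in mismatches:
    print(f"{name}: published {expected}, reproduced {actual}")
```

The tolerance is a judgment call: a looser value may be appropriate when the publication reports only a few rounded digits, a tighter one when full-precision output is available.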

In a recent talk, Christine Borgman pointed out that most repositories and archives follow the letter, not the spirit, of the law. They take steps to share data, but they do not review the data. “Who certifies the data? Gives it some sort of imprimatur?” she asks. This theme resonated at Open Repositories. Stodden asked: “Who, if anyone, checks replication pre-publication?” Chuck Humphrey lamented the lack of an adequate data curation toolkit and best practices regarding the extent of data processing prior to ingest. And Guédon argued that repositories have a key role to play in bringing quality to the foreground in the management of science.

Stodden’s call for the provision of data and code underlying publication echoes Gary King’s 1995 definition of the “replication standard” as the provision of, “sufficient information… with which to understand, evaluate, and build upon a prior work if a third party could replicate the results without any additional information from the author.” Both call on the scientific community to take up replication for the good of science as a matter of course in their scientific work. However, both are vague as to how this can be accomplished. Stodden suggested at Open Repositories that this activity is community-dependent, often done by students or by other researchers continuing a project, and that community norms can be adjusted by rewarding high integrity, verifiable research. King, on the other hand, argues that “the replication standard does not actually require anyone to replicate the results of an article or book. It only requires sufficient information to be provided – in the article or book or in some other publicly accessible form – so that the results could in principle be replicated” (emphasis added in italics). Yet, if we care about data quality, reproducibility, and credibility, it seems to me that this is exactly the kind of review in which we should be engaging.

A quick survey of various stakeholders in the research data lifecycle reveals that data review of this sort is not widely practiced:

  • Researchers, on the whole, do not do replication tests as part of their own work, or even as part of the peer review process. In the future, there may be incentives for researchers to do so, and post-publication crowd-sourced peer review in the mold of Wikipedia, as promoted by Edward Curry, may prove to be a successful model.
  • Academic institutions, and their libraries, are increasingly involved in the data management process, but are not involved in replication as a matter of course (note some calls for libraries to take a more active role in this regard).
  • Large or general data repositories like Dryad, FigShare, Dataverse, and ICPSR provide useful guidelines and support varying degrees of file inspection, as well as make it significantly easier to include materials alongside the data, but they do not replicate analyses for the purpose of validating published results. Efforts to encourage compliance with (some of) these standards (e.g., Data Seal of Approval) typically regard researchers as responsible for data quality, and generally leave repositories to self-regulate.
  • Innovative services, such as RunMyCode, offer a dissemination platform for the necessary pieces required to submit the research to scrutiny by fellow scientists, allowing researchers, editors, and referees to “replicate scientific results and to demonstrate their robustness.” RunMyCode is an excellent facilitator for people who wish to have their data and code validated; but it relies on crowd sourcing, and does not provide the service per se.
  • Some argue that scholarly journals should take an active role in data review, but this view is controversial. A document produced by the British Library recently recommended that, “publishers should provide simple and, where appropriate, discipline-specific data review (technical and scientific) checklists as basic guidance for reviewers.” In some disciplines, reviewers do check the data. The F1000 group identifies the “complexity of the relationship between the data/article peer review conducted by our journal and the varying levels of data curation conducted by different data repositories.” The group provides detailed guidelines for authors on what they are expected to submit and ensures that everything is submitted and all checklists are completed. It is not clear, however, if they themselves review the data to make sure it replicates results. Allan Dafoe, a political scientist at Yale, calls for better replication practices in political science. He places responsibility on authors to provide quality replication files, but then also suggests that journals encourage high standards for replication files and that they conduct a “replication audit” which will “evaluate the replicability and robustness of a random subset of publications from the journal.”

The ISPS Data Archive and Reproducible Research

This brings us to the ISPS Data Archive. As a small, on-the-ground, specialized data repository, we are dedicated to serious data review. All data and code – as well as all accompanying files – that are made public via the Archive are closely reviewed and adhere to standards of quality that include verity, openness, and replication. In practice it means that we have developed curatorial practices that include assessing whether the files underlying a published (or soon to be published) article, and provided by the researchers, actually reproduce the published results.

This requires significant investment in staffing, relationships, and resources. The ISPS Data Archive staff has data management and archival skills, as well as domain and statistical expertise. We invest in relationships with researchers and learn about their research interests and methods to facilitate communication and trust. All this requires the right combination of domain, technical and interpersonal skills as well as more time, which translates into higher costs.

How do we justify this investment? Broadly speaking, we believe that stewardship of data in the context of “really reproducible research” dictates this type of data review. More specifically, we think this approach provides better quality, better science, and better service.

  • Better quality. By reviewing all data and code files and validating the published results, the ISPS Data Archive essentially certifies that all its research outputs are held to a high standard. Users are assured that code and data underlying publications are valid, accessible, and usable.
  • Better science. Organizing data around publications advances science because it helps root out error. “Without access to the data and computer code that underlie scientific discoveries, published findings are all but impossible to verify” (Stodden et al.). Joining the publication to the data and code combats the disaggregation of information in science associated with open access to data and to publications on the Web. In effect, the data review process is a first-order case of data reuse: the use of research data for a research activity or purpose other than that for which it was intended. This makes the Archive an active partner in the scientific process, as it performs a sort of “internal validity” check on the data and analysis (i.e., do these data and this code actually produce these results?).

    It’s important to note that the ISPS Data Archive is not reviewing or assessing the quality of the research itself. It is not engaged in questions such as, was this the right analysis for this research question? Are there better data? Did the researchers correctly interpret the results? We consider this aspect of data review to be an “external validity” check and one which the Archive staff is not in a position to assess. This we leave to the scientific community and to peer review. Our focus is on verifying the results by replicating the analysis and on making the data and code usable and useful.

  • Better service. The ISPS Data Archive provides high-level, boutique service to our researchers. We can think of a continuum of data curation that progresses from a basic level, where data are accepted “as is” for the purpose of storage and discovery, to a higher level of curation, which includes processing for preservation, improved usability, and compliance, to an even higher level of curation, which also undertakes the verification of published results.

This model may not be applicable to other contexts. A larger lab, greater volume of research, or simply more data will require greater resources and may prove this level of curation untenable. Further, the reproducibility imperative does not neatly apply to more generalized data, or to data that is not tied to publications. Such data would be handled somewhat differently, possibly with less labor-intensive processes. ISPS will need to consider accommodating such scenarios and the trade-offs a more flexible approach no doubt involves.

For those of us who care about research data sharing and preservation, the recent interest in the idea of a “data review” is a very good sign. We are a long way from having all the policies, technologies, and long-term models figured out. But a conversation about reviewing the data we put in repositories is a sign of maturity in the scholarly community – a recognition that simply sharing data is necessary, but not sufficient, when held up to the standards of reproducible research.

OR2013: Open Repositories Confront Research Data

Open Repositories 2013 was hosted by the University of Prince Edward Island from July 8-12. A strong research data stream ran throughout this conference, which was attended by over 300 participants from around the globe.  To my delight, many IASSISTers were in attendance, including the current IASSIST President and four Past-Presidents!  Rarely do such sightings happen outside an IASSIST conference.

This was my first Open Repositories conference and after the cool reception that research data received at the SPARC IR meetings in Baltimore a few years ago, I was unsure how data would be treated at this conference.  I was pleasantly surprised by the enthusiastic interest of this community toward research data.  It helped that there were many IASSISTers present but the interest in research data was beyond that of just our community.  This conference truly found an appropriate intersection between the communities of social science data and open repositories. 

Thanks go to Robin Rice (IASSIST), Angus Whyte (DCC), and Kathleen Shearer (COAR) for organizing a workshop entitled, “Institutional Repositories Dealing with Data: What a difference a ‘D’ makes!”  Michael Witt, Courtney Matthews, and I joined these three organizers to address a range of issues that research data pose for those operating repositories.  The registration for this workshop was capped at 40 because of our desire to host six discussion tables of approximately seven participants each.  The workshop was fully subscribed and Kathleen counted over 50 participants prior to the coffee break.  The number clearly expresses the wider interest in research data at OR2013.

Our workshop helped set the stage for other sessions during the week.  For example, we talked about environmental drivers popularizing interest in research data, including topics around academic integrity.  Regarding this specific issue, we noted that the focus is typically directed toward specific publication-related datasets and the access needed to support the reproducibility of published research findings.  Both the opening and closing plenary speakers addressed aspects of academic integrity and the role of repositories in supporting the reproducibility of research findings.  Victoria Stodden, the opening plenary speaker, presented a compelling and articulate case for access to both the data and computer code upon which published findings are based.  She calls herself a computational scientist and defends the need to preserve computer code as well as data to facilitate the reproducibility of scientific findings.  Jean-Claude Guédon, the closing plenary speaker, bracketed this discussion on academic integrity.  He spoke about scholarly publishing and how the commercial drive toward indicators of excellence has resulted in cheating.  He likened some academics to Lance Armstrong, cheating to become number one.  He feels that quality rather than excellence is a better indicator of scientific success.

Between these two stimulating plenary speakers, there were a number of sessions during which research data were discussed.  I was particularly interested in a panel of six entitled, “Research Data and Repositories,” especially because the speakers were from the repository community instead of the data community.  They each took turns responding to questions about what their repositories do now regarding research data and what they see happening in the future.  In a nutshell, their answers tended to describe the desire to make better connections between the publications in their repositories and the data underpinning the findings in these articles.  They also spoke about the need to support more stages of the research lifecycle, which often involves aspects of the data lifecycle within research.  There were also statements that reinforced the need for our (IASSIST’s) continued interaction with the repository community.  The use of readme files in the absence of standards-based metadata, and other practices where our data community has moved the best-practice yardstick well beyond, demonstrate the need for our communities to continue in dialogue.

Chuck Humphrey

IASSIST Fellows 2013

 

The IASSIST Fellows Committee is glad to announce through this post the six recipients of the 2013 IASSIST Fellowship award. We are extremely excited to have such a diverse and interesting group with different backgrounds and experience and encourage IASSISTers to welcome them at our conference in Cologne, Germany.

Please find below their names, countries and brief bios:

Chifundo Kanjala (Tanzania) 

Chifundo currently works as a data manager and data documentalist for an HIV research group called the ALPHA network, based at the London School of Hygiene and Tropical Medicine's Department of Population Health. Chifundo spends most of his time in Mwanza, Tanzania, but does travel from time to time around Southern and Eastern Africa to work with colleagues in the ALPHA network. Before joining the London School of Hygiene and Tropical Medicine, he worked as a data analyst consultant at UNICEF, Zimbabwe. He is currently working part time on a PhD with the London School of Hygiene and Tropical Medicine. He has an MPhil in Demography from the University of Cape Town, South Africa, and a BSc Statistics Honours degree from the University of Zimbabwe.


Judit Gárdos (Hungary) 

Judit Gárdos studied Sociology and German Language and Literature in Budapest, Vienna and Berlin. She is a PhD candidate in sociology, with a topic on the philosophy, sociology and anthropology of quantitative sociology. She is a young researcher at the Institute of Sociology of the Hungarian Academy of Sciences. Judit has been working at the digital archive and research group called "voicesofthe20century.hu", which is collecting qualitative, interview-based sociological research collections of the last 50 years. She is coordinating the work at the newly funded Research Documentation Center of the Center for Social Sciences at the Hungarian Academy of Sciences.


Cristina Ribeiro (Portugal) 

Cristina Ribeiro is an Assistant Professor in Informatics Engineering at Universidade do Porto and a researcher at INESC TEC. She graduated in Electrical Engineering and holds a Master's in Electrical and Computer Engineering and a Ph.D. in Informatics. Her teaching includes undergraduate and graduate courses in information retrieval, digital libraries, knowledge representation and markup languages. She has been involved in research projects in the areas of cultural heritage, multimedia databases and information retrieval. Currently her main research interests are information retrieval, digital preservation and the management of research data.


Aleksandra Bradić-Martinović (Serbia) 

Aleksandra Bradić-Martinović, PhD, is a Research Fellow at the Institute of Economic Sciences, Belgrade, Serbia. Her field of expertise is research on the implementation of information and communication technology in the economy, especially in banking, payment system operations and stock exchange operations. Aleksandra also teaches at the Belgrade Banking Academy in the following subjects: E-banking and Payment Systems, Stock Market Dealings, and Management Information Systems. She has been engaged in several projects in the field of education. On the FP7 SERSCIDA project she is a coordinator of the Serbia team.


Anis Miladi (Tunisia) 

Anis Miladi earned his Bachelor's degree in computer sciences and multimedia in 2007 and a Master's degree in Management of Information Systems and Organizations in 2008, and he is currently finalizing his master's degree in project management (projected date: summer 2013). Before joining the Social and Economic Survey Research Institute at Qatar University as a Survey Research Technology Specialist in 2009, he worked as a programmer analyst in a private IT services company in Tunisia. His area of expertise includes managing computer-assisted surveys (CAPI and CATI, using the Blaise surveying system), in addition to enterprise document management systems and enterprise portals (SharePoint).


Lejla Somun-Krupalija (Bosnia and Herzegovina) 

Lejla currently serves as the Senior Program and Research Officer at the Human Rights Centre of the University of Sarajevo. She has over 15 years of experience in research and policy development on social inclusion issues. She is the Project Coordinator of the SERSCIDA FP7 project, which aims to open data services/archives in the Western Balkan region in cooperation with CESSDA members. She was previously engaged in the NGO sector, particularly on issues of capacity building and policy development in the areas of gender equality, the rights of persons with disabilities, social inclusion and forced migration. She teaches academic writing, qualitative research, and gender and nationalism at the University of Sarajevo.
