Regional Report 2007-2008: Canada
IASSIST Regional Report 2007-2008
University of Guelph
May 16, 2008
As per usual, it is a challenge to try to capture what is happening in such a broad and engaged community. I can only imagine the challenge for the other regional secretaries. I thank all those who submitted content and I apologize if I missed something. The items below are in no particular order.
Research Data Centre (RDC) Network
The Canadian Research Data Centre Network's infrastructure project has begun with a test-bed implementation of an articulated private virtual network running on a dedicated lightpath on Canada's national high-speed research network (managed by CANARIE - Canada's advanced Internet development organization). Beginning in the fall of 2008, this intranet will be expanded to incorporate the rest of the RDC Network, which now consists of 15 RDCs and nine branches across the country. In addition, Sonia Latour has been hired as the Metadata Project Manager. Sonia is on secondment from Statistics Canada where she has over 20 years of experience in Census operations. An RFP for software tools development to support DDI 3.0 metadata production will be announced sometime in June 2008. This RFP includes tools to help with the conversion of metadata from earlier versions of DDI to DDI 3.0 and tools to help researchers produce DDI-compliant metadata of working data files, that is, the data files researchers produce in RDCs while conducting their analyses. A second RFP will be announced in early 2009 for the development of tools to analyze DDI metadata for comparative purposes.
<odesi> Ontario Data Extraction System and Infrastructure
Just one year ago, the <odesi> project, which is building a data portal designed to facilitate data exploration and extraction, received funding from the Ontario Council of University Libraries (OCUL) and the Broader Public Sector (BPS) Supply Chain Secretariat, Ontario Ministry of Finance (OntarioBuys). Since then, much has been accomplished: Paula Hurtibise, the project manager, was hired; co-op students have marked up over 400 datasets, including demographic data from Statistics Canada and polling data from Gallup; and on May 4, 2008, <odesi> was launched on the highly successful Scholars Portal website.
Key to the success of <odesi> is the DDI Best Practices Document (BPD) created by the <odesi> project team. The BPD is widely used by Canadian institutions for marking up files in DDI format and is available through odesi.ca in both French and English.
The western Canadian library data services group (ACCOLEDS) is undertaking a pilot project with the editors of a couple of open access journals to mark up statistical tables. The objective is to mark up content about tables in a way that will allow better searching for statistical information in these tables and that will enable linking between tables and the data sources from which they were derived. Maxine Tedesco (an IASSIST member from the University of Lethbridge) will be spearheading this pilot while on study leave in 2008.
Two significant new national policies have been announced in Canada since the last IASSIST conference. First, the Canadian Institutes of Health Research (CIHR) announced in the fall of 2007 its new open access policy regarding research outputs (http://www.cihr-irsc.gc.ca/e/32005.html). While the emphasis of the new policy is primarily on creating access to research findings, it does require the deposit of bioinformatics, atomic, and molecular coordinate data with appropriate data repositories. In these cases, international repositories already exist. As trusted data repositories relevant to health data are developed in Canada, there will be a movement to expand the policy's coverage to include the deposit of more data produced under CIHR funding.
Second, Library and Archives Canada concluded at the end of 2007 a public consultation to develop a Canadian Digital Information Strategy. This process resulted in a document detailing the principles and activities needed in Canada to produce, preserve and provide access to digital content arising in the heritage, scientific and government sectors (http://www.collectionscanada.gc.ca/cdis/012033-1000.01-e.html). Research data are clearly identified in this strategy document as valuable digital resources that Canadians need to preserve and make accessible.
In a related national activity, the Canada Institute for Scientific and Technical Information (CISTI), which is Canada's national science library, is sponsoring a new working group, known as Research Data Canada, to address issues raised in the Consultation on Access to Scientific Research Data, which have remained untouched for most of the past two years. This new working group represents several Canadian organizations, including universities, federal granting agencies, institutes, libraries and individual researchers. It will focus on the necessary actions and leadership roles that researchers and institutions must take to ensure Canada's research data are accessible and usable for current and future generations of researchers.
CARL (Canadian Association of Research Libraries)
CARL formed a new Data Management Working Group after a 2007 survey it conducted revealed that most of its member libraries are interested in holding researcher-generated data, but few have a formal data preservation policy. This working group consists of five directors from member libraries. (http://www.carl-abrc.ca/about/working_groups/data_mgt_mandate-e.html)
Discussions in CANDDI tapered off with the emergence of the DLI DDI and <odesi> projects. While the CANDDI template helped facilitate discussion around the best practices document that was subsequently developed in conjunction with <odesi>, all other activity in CANDDI largely ceased. Recent interest, however, has surfaced to revitalize CANDDI to address some of the other issues for which it was originally organized, including shared variable group names and controlled vocabulary.
In Canada, there have been some positive developments in addition to some continuing concerns. The group working on the 2006 Census PUMFs has come up with a compromise that would see two files developed. The first would be an individual file (2% sample); the second would be hierarchical in nature (1% sample). This represents a move from the original position, which would have eliminated the individual file. The committee is to be commended for its extensive consultation and creativity in the face of competing demands.
The Canadian Household Panel Survey has been working on producing a longitudinal, synthetic file that would allow users to create models that mimic the actual confidential data without risk of disclosure. While not PUMFs, these will be public and available for preliminary research. Both the above innovations have been driven by international demand.
On the other hand, there is a committee studying the future of PUMFs. The fact that the value of PUMFs greatly outweighs their cost may not be fully appreciated. With 74 post-secondary institutions and projects such as <odesi> serving to make PUMFs ever-more accessible, the cost of using these files is approaching zero. Curtailing the production of PUMFs would result in the same sort of situation we were in pre-DLI.
OCUL (Ontario Council of University Libraries) Map Group
The OCUL map group is developing a proposal to create a Geospatial Data Portal. The proposal envisions a shared infrastructure, operated as part of OCUL’s Scholars Portal services, for storing and delivering geospatial data to students, faculty, staff, and researchers affiliated with Ontario’s 19 universities. This infrastructure will enable the development of a shared interface for search, discovery and access to geospatial data and new models for acquiring and licensing geospatial data through consortium purchasing.
DLI continues to work on making Public Use Microdata File (PUMF) documentation available on the DLI site at Statistics Canada. It provides access to this metadata using NESSTAR. DLI is also working closely with Statistics Canada's subject-matter divisions in order to eventually have DDI-compliant files exported directly from the survey production line to tools like NESSTAR. Work is also being done on converting from IMDB, which is the standard used in Statistics Canada for metadata, to DDI and from DDI to IMDB. This will greatly facilitate the exchange of information from one standard to the other. In addition, DLI membership at the college level is beginning to grow.
In addition, DLI has budgeted for two R&D projects focusing on making metadata more compatible. The first is to develop a two-way crosswalk between the DDI and the IMDB. The second is to investigate the feasibility of creating a wizard to take Beyond 20/20 files and mark them up as DDI-compliant cubes. That would mean that we could augment the metadata and move the B20/20 files out of their proprietary format and into a preservable mode.
We are now creating data about data. :-) The second survey on the Statistics Canada DLI program was conducted in March-April 2008. The participation of the Canadian data librarians has been fantastic. Final response rates: overall, 82% (97/118); DLI contacts, 92% (66/72); designates, 67% (31/46). Gaetan Drolet and folks are now cleaning the data to produce an SPSS file. The data will be analysed this summer by Wendy Watkins and Gaetan Drolet. They plan to split the data by DLI regions for the DLI training coordinators. They will present a report about the training information collected to the DLI training committee in September 2008 and a report to the EAC for future planning of the DLI program. They also plan to disseminate the data to the DLI community later in a DDI format with NESSTAR and possibly make presentations to DLI workshops and eventually IASSIST.
Public Opinion Data in Canada
In the last report we mentioned the use of NESSTAR for the Canadian Opinion Research Archive (CORA) at Queen’s University. Queen’s has added approximately 100 files to its Data Centre's NESSTAR service, and is continuing to support the NESSTAR server for CORA. The CORA data are available to the public by request. In addition, Queen’s has been soliciting data files from its researchers and is currently negotiating access to the "International surveillance and privacy opinion research" survey from its Department of Sociology.
During the year, the Laurier Institute for the Study of Public Opinion and Policy (LISPOP) received a generous donation from IPSOS-Reid. This includes many public opinion surveys, election surveys and other types of surveys. LISPOP is working closely with <odesi> to mark up this rich collection, with a target of making it publicly available.
Discussions are beginning with <odesi> to explore other sources of opinion poll data housed at partner institutions.
Anastassia Khouri, who worked to create comprehensive digital data, government information and maps support services at McGill, has announced that she will retire at the end of April. In 1996-1997 she began the process of designing EDRS (Electronic Data Resources Service) within the mandate of the McGill Libraries. In 2003, Anastassia was given the mandate to coordinate the Hitschfeld Geographic Information Centre and the Government Documents Division, along with EDRS, to form the Government Information, Maps and Data Centre. We wish Anastassia well in her retirement.
In 2006, Suzette Giles received the Ontario Confederation of University Faculty Associations (OCUFA) award for Academic Librarianship. Suzette was recognized for her many contributions in the field of data, map, and GIS librarianship. Congratulations, Suzette.
In 2007, the Council of Prairie and Pacific University Libraries (COPPUL) presented Chuck Humphrey with their Outstanding Contribution Award. Chuck was specifically recognized for all his contributions related to ACCOLEDS, the COPPUL data group. Congratulations, Chuck.
That's all for this year.