Sun 23 May 2010
This session, from the Federal Libraries Section, featured several speakers on issues of data curation and sharing. I’m not terribly familiar with the “e-science” jargon, but for the purposes of this session it seemed to mean a focus on data sharing, management and curation, and the role of librarians in supporting data curation and scientific discovery in general. Some notes from each:
Carol discussed how the practice of discovery in science is being transformed by access to information and data sets, computational capacities and tools, and increased ability to collaborate. She noted that librarians have an important role in e-science through making connections, serving as part of teams, and providing access to data. Carol spoke about the DataONE project and its goal to make data available to people now and in the future, crossing institutional boundaries, focusing specifically on earth and environmental sciences data and making that data interoperable to aid discovery.
Carol also spoke about the data loss that can occur (such as when people retire), and the sociocultural issues of scientists that must be overcome in order to get them to share their data. Further, she addressed issues of interoperability, and the problem that one data standard cannot be imposed across groups and institutions.
Carol also spoke about ScienceLinks2, an IMLS-funded initiative to educate PhD students who serve as educators of library science students on roles in data management.
Carole also spoke about data curation and the role of librarianship in that process, noting that we as a profession have spent a long time becoming masters of the bibliographic world, and are now being asked to become masters of the data world in short order.
She spoke about the Data Conservancy, a project to manage research data through its lifecycle of interest and usefulness to scholarship, science and education. Carole offered metaphors from two other individuals who referred to data centers as either the new stacks or the new special collections.
She also spoke of LIS as a meta-science, with access to a broad landscape of information across disciplines and generations, promoting sharing (what we would previously call circulation), and of the need to get scientists back to science, using their time and expertise on science instead of data management issues.
Carole noted that some of the challenges for data curation are coming to terms with what a unit of data is, what will be described and made available, what is a data set, what is a version, and which of these are preservation targets as we can’t save them all. Carole also explained that they have implemented some strategies for building the data curation workforce in the GSLIS program at Illinois – through their Data Curation Education Program – including concentrations for LIS students in data curation for the sciences and humanities, as well as summer institutes available to librarians for professional development.
Note: a paper of Carole’s in Science was mentioned by the moderator; that seems to be “Strategic Reading, Ontologies, and the Future of Scientific Publishing” in the August 2009 issue.
James spoke about virtual research environments at the NIH Library. He described one major problem in getting buy-in for data sharing: once scientists have done the work of creating data, they want to get every paper they can out of it before sharing the dataset. He noted that there is a trust factor, and that publishers and funders may need to mandate data deposit to really move this forward.
He also described the NIH Library’s use of Drupal to build collaborative websites for their research communities, as well as the Pandemic Influenza Digital Archive, which includes about 5,000 items (with 5,000 more identified) on the 1918 flu pandemic. Its purpose is to help inform how we deal with pandemics going forward by looking at historical pandemic data and applying today’s technology to it, such as geospatial data. He indicated that similar sites are in the works on Down Syndrome and on a dental/oral interventions database.
Link please? I missed the link for the flu archive and can’t seem to dig it up – if you have it, please share in the comments!
Elaine spoke about how hard it is for librarians to get in and talk to researchers about their needs and how librarians can serve them, in part because researchers don’t think they have needs beyond journals delivered to the desktop. She then described a project to assess the needs of librarians who would be involved in e-science, which found that librarians expressed a need for education, not just on advanced topics such as genetics, informatics, and data curation, but on basic science topics.
In response, Elaine and her colleagues have been offering science boot camps, one-day workshops on science topics based on research at the campuses (such as stem cell technology). They are holding annual e-science symposiums, and have received funding from NLM to build an e-science portal to bring together tools for librarians. A previous boot camp covered science topics such as medical informatics, nanotechnology, and GIS, and an upcoming symposium is going to cover genetics, climate change, and remote sensing.
She also noted that many of their campus research centers have K-12 education as part of their mission, and they can readily adapt that to educating librarians. Ahem.
Thanks to the presenters for sharing their info with us! If anyone has notes from the session, feel free to amend or append in the comments!