Agricultural research can take seasons to come to fruition, meaning the data researchers gather is voluminous, tracking things like weather patterns and crop yields over years. A failure to establish data standards and sharing practices means that most of these raw figures never make it out of the hands of the researchers who gather them. With new open access standards coming to federally funded research, though, agricultural researchers will need to share their data more effectively, and a team of scientists and librarians at Purdue University, West Lafayette, IN, may have the first blueprint for the field.
The Purdue team was the driving force behind the Smarter Agriculture conference that took place at the Bolger Center, Potomac, MD, on October 10. Despite being hampered by a lack of representation from federal agencies owing to the government shutdown, the conference brought together attendees from universities, farming concerns, and private enterprise to discuss how to create a system for sharing agricultural data. In large part, said Sylvie Brouder, a Purdue agronomist who's been working on the problem of data-sharing in agricultural sciences for years, it's a system that stakeholders will have to build from the ground up.
"Some disciplines are further along the path in sharing data, but agronomists are pretty far behind," Brouder said. "The idea of making your data shareable is new to the field. We share the synthesized results of experiments in published papers but not much else, and the result is we don't have strong standards and norms for data or metadata in the way other sciences do."
Getting that data to a point where it's shareable could not only improve understanding of complicated subject areas but make study more efficient by taking advantage of work that's already been done. Raw data from Brouder's work on water quality, for example, could be used by researchers looking at related topics as well--if only they could access it. "On an hourly basis, I'm collecting data on the flow of water through soil," said Brouder. "Someone else may want to know the day's rainfall and the nutrient load that's taken away by a rain. That person could do that with my data, but that data is not available to them."
It's not that most researchers are unwilling to share the information they've gathered, Brouder said. When a fellow agricultural scientist calls looking for data, she's generally more than willing to offer it up. But sharing as a rule, rather than on request, needs to become part of the culture of the field. And that will start with students, said Brouder. "We need to start by figuring out how to prepare students for a data-intensive world."
That's where Marianne Stowell-Bracke comes in. Stowell-Bracke is Purdue's agricultural sciences information specialist and acts as a liaison between the library and the ag science department. She and her colleague Jake Carlson have been working on ways to make data management part of the agricultural science curriculum, and they'll start testing two styles of doing so in the spring semester.
On one front, the department will start offering its first data management class to students in the biochemistry major, which at Purdue is under the umbrella of the agricultural sciences department. Previously, Stowell-Bracke and Carlson had been bringing the subject into classes here and there as visitors, but this approach hasn't been sufficient to get the concepts to take root.
That's why they're trying multiple methods to find what works best. In addition to making a class available, Stowell-Bracke and Carlson will be taking a cohort of graduate students under their wing to teach them how to use best practices to manage their own data, something many haven't been exposed to previously. "People don't give their data a thought until something bad happens," Stowell-Bracke said. "They store everything on their laptop, and leave the laptop on a bus in Madrid."
The Purdue team is also working to build a model of what a shared, searchable agricultural science database would look like with its recently launched data management tool, the Purdue University Research Repository (PURR). Built on the HubZero platform and operated by Purdue Libraries and the Information Technology department, PURR provides primers on how to manage data, collaborate on projects, and publish data with metadata tags and DOI references for tracking how often and where it's cited.
While they have a model to work from and a lot of new interest from farmers, agribusinesses, and academics, there are still obstacles to getting agricultural scientists to manage and share their data more effectively, not the least of which is the cost involved. "Someone has to pay for this, but it's unclear who or what the funding model looks like," said Brouder, who has been surprised by the costs of managing data properly. "One clear outcome we'd like to see is that our funding agencies have an explicit funding line for data management, which we've never had before."
It's not just funding that's an issue, though, said Brouder. Agricultural research often involves a wide variety of stakeholders, from university researchers to federal funding agencies like the Department of Agriculture, from individual farmers to agribusiness concerns like Monsanto. Getting them all on the same page about what information should be shared, how it should be made available, and who should have access to it is a challenge in its own right. Independent farmers don't generally like the idea of having information about their land made common knowledge, said Brouder, while businesses may have concerns about footing the bill for research that could benefit their competitors as well.
While many concerns remain about how to address the issue, there's little doubt it needs to be addressed, and soon. The drive toward open access to federally funded research, outlined earlier this year in a memo from the Office of Science and Technology Policy, means that better data management likely won't be optional for agricultural researchers in the future.