Open data, open research discussion at ICTD 2012

The Global Impact Study uses an open research approach, meaning the study’s data, instruments, publications, and other materials will be available to the public for free under a Creative Commons license. Over the course of the study, the Global Impact Study team has spent a lot of time thinking about the complexities, practicalities, and implications of open research and of opening up large datasets to the public. We’ve discussed some of these issues at other venues in the past, such as IFLA 2010. However, these issues have recently become even more important: we have released our survey instruments, are preparing our data for release, and will have findings and resources to share. As such, we held a session during ICTD 2012 in Atlanta to discuss open data and open research.

Photo courtesy of Sara Vannini

After providing an overview of the Global Impact Study, survey instruments, and open research and open data, we divided the participants into three groups (the public/users, researchers, and sponsors/funders) to hear perspectives from different types of stakeholders. Each group discussed five questions regarding open data and open research:

When does data sharing begin?
How do you determine conditions of use (of the data)?
What data are open? Are some data not?
How can value be added to open data?
How can people add data to the existing open database?

After discussing the questions, we came back together as a group to share what each “audience” of open data discussed. Even though it is challenging to talk about the multitude of issues surrounding open data in a short session, we received a considerable amount of relevant and useful feedback. It was interesting to see similarities, as well as glaring differences, among the three different stakeholder groups. Below is a synopsis of how each group responded to the questions:

The public (users): Data should be released as soon as possible. More than the data itself, however, users want findings in a usable form, such as summaries. They want the data too, but in a format and on a platform that is easy to use. The data should be shared under the least-restrictive Creative Commons license, and all data should be open. Value can be added by offering summaries, snapshots of findings, and examples of how to use the data. The public would likely not be adding much data to the database, so this was not a top issue in their discussion.

Researchers: Data should not be open until the researchers have released their initial findings, unless the sponsor/funder of the research requires otherwise. The data should fall under a non-commercial Creative Commons license that requires attribution and citation of the original research. Some data should not be open due to confidentiality issues and the potential for misuse. Value can be added to the data by providing rigorous data cleaning, clear methodology notes, definitions of terms, an overview of the research process, and a discussion of the limitations of the research and data. A tool for data analysis and visualization, like GapMinder, would also be appreciated.

Sponsors/funders: Data should be open as quickly as possible. There is debate about whether open data should be available for non-commercial use only or for commercial use as well. The question is sharpest for publicly funded research, where the concern is that commercial entities should not make money off of it, but the answer also depends on the type of data. All data should be made available to researchers, with a subset of data made available to the public. No data should be available only to the sponsors/donors. Value can be added to the data by providing subsets of data based on particular variables, along with descriptions and analysis. Some analysis needs to be done in order for the data to be useful. Perhaps the open database could be community-owned and community-run so people can add other data in the future.

As is evident in the summaries above, the public, researchers, and sponsors/funders all have different considerations, priorities, and ideals for an open data approach. Many factors, from a variety of viewpoints, go into an open research approach and into opening up data to the public. In our experience so far, we have learned a lot about open research and open data, and we are sure to learn more. We hope to share some of our lessons learned as we move forward. As always, we encourage you to share your comments, ideas, and relevant resources with us.