By Linda See, IIASA Ecosystems Services and Management Program
One of the biggest questions when it comes to citizen science is the quality of the data. Scientists worry that citizens are not as rigorous in their data collection as professionals might be, which calls into question the reliability of the data. At a meeting this month in Brussels on using citizen science to track invasive species, we grappled with the question of what it will take to trust this data source, particularly if it is going to be used to alert authorities to the presence of an invasive species in a timely manner.
This discussion got me thinking about other types of data supplied by citizens that authorities simply trust. When a citizen calls the emergency services to report an incident such as a fire, the veracity of the alert is not questioned; instead, the authorities are obliged to investigate.
Yet the statistics show that false alarms do occur. For example, in 2015, there were more than 2.5 million false fire alarms in the United States, of which just under a third were due to system malfunctions. The remaining calls were unintentional, malicious, or other types of false alarms, such as a bomb scare. Statistics for calls to the emergency services more generally show similar trends in different European countries, where the percentage of false reports ranges from 40% in Latvia up to 75% in Lithuania and Norway. So why is it that we inherently trust this data source, despite the false alarm rate, and not data from citizen scientists? Is it because life is threatened, because fires are easier to spot than invasive species, or simply because emergency services are mandated to investigate?

Volunteers monitor butterflies in Mount Rainier National Park, as part of the Cascade Butterfly Project, a citizen science effort organized by the US National Park Service © Kevin Bacher | US National Park Service
A recent encouraging development for citizen science was the signing of the American Innovation and Competitiveness Act by President Obama on 6 January 2017, which gave federal agencies the authority to use citizen science and crowdsourced data in their operations. Do we need something similar in the EU or at the level of member states? And what will it really take for authorities to trust scientific data from citizens?
To move from the current situation of general distrust in citizen science data to one in which the data are viewed as a potentially useful source of information, we need further action. First, we need to showcase examples of where data collected by citizens are already being used for monitoring. At the meeting in Brussels, Kyle Copas of the Global Biodiversity Information Facility (GBIF) noted that up to 40% of the data records in GBIF are supplied by citizens, which surprised many of the meeting participants. Data from GBIF are used for national and international monitoring of biodiversity. Second, we need to quantify the value of information coming from citizen scientists. For example, how much money could have been saved if reports on invasive species from citizens had been acted upon? Third, we need to forge partnerships with government agencies to institutionally embed citizen science data streams into everyday operations. The LandSense citizen observatory, a new project, aims to do exactly this. We are working with the National Mapping Agency in France to use citizen science data to update their maps, and many similar examples with other local and national agencies will be tested over the next 3.5 years.
Finally, we need to develop quality assurance systems that can be easily plugged into the infrastructure of existing organizations. The EU-funded COBWEB project began building such a citizen science-based quality assurance system, which we are continuing to develop in LandSense as a service. Providing out-of-the-box tools may be one solution to help organizations to begin working with citizen science data more seriously at an institutional level.

IIASA researchers test the Fotoquest app, a citizen science game developed at IIASA. © Katherine Leitzell | IIASA
These measures will clearly take time to implement, so I don’t expect the discussion on data quality to be removed from any agenda for some time to come. However, I look forward to the day when the main issue is how we can possibly handle the masses of big data coming from citizens, a situation that many of us would like to be in.
More information about the meeting: https://ec.europa.eu/jrc/en/event/workshop/citizen-science-open-data-model-invasive-alien-species-europe
This article gives the views of the author, and not the position of the Nexus blog, nor of the International Institute for Applied Systems Analysis.
Having been, and still being, involved in citizen science projects on air pollution and radiation, I found that the biggest step towards acceptance of the data was showing that they were reliable and not just the product of ill-calibrated cheap sensors.
Validation and prior calibration are essential processes that are not always easy to perform, but skipping them can severely limit the usefulness of the data.
In the air pollution case, an environmental association was also interested in using the results to hold the local government responsible, so there was a good predisposition towards citizen data, but the data had to be accurate enough to hold up, for example, in court.
I take your point regarding air pollution measurements or when any sensors are involved that require calibration. There is always going to be an extra level of quality assurance required in these situations, particularly if the data are being used in legal proceedings. However, it’s encouraging to know that the attitude toward the citizen data was positive.
We had quite similar discussions in an EU COST Action that I was recently involved in (TD1202: Mapping and the Citizen Sensor) regarding citizen-based data that would be used to update authoritative databases, such as the official topographic or land cover/land use database of a national mapping agency. Where does the liability rest if the data are subsequently used in court cases? I think that, as with the internet, the legal frameworks lag behind and will catch up as more of these types of data are used by public bodies.
An example of how to cope with data of diverse quality from a bottom-up citizen science project can be found here: https://link.springer.com/article/10.1007/s10841-016-9924-4
Thanks for the article! In an ongoing citizen science project we dealt with many of your questions in the context of biodiversity monitoring. We investigated if and how supervised pupils are able to systematically collect data about the occurrence of diurnal butterflies, and how this data could contribute to a permanent butterfly monitoring system. Results are promising:
https://link.springer.com/article/10.1007/s10841-017-0010-3