What will it take to trust scientific data from citizens?

By Linda See, IIASA Ecosystems Services and Management Program

One of the biggest questions when it comes to citizen science is the quality of the data. Scientists worry that citizens are not as rigorous in their data collection as professionals might be, which calls into question the reliability of the data. At a meeting this month in Brussels on using citizen science to track invasive species, we grappled with the question of what it will take to trust this data source, particularly if it is going to be used to alert authorities to the presence of an invasive species in a timely manner.

This discussion got me thinking about other types of data supplied by citizens that authorities simply trust, for example when a citizen calls the emergency services to report an incident such as a fire. The veracity of the alert is not questioned; instead, authorities are obliged to investigate such reports.

Yet the statistics show that false alarms do occur. For example, in 2015, there were more than 2.5 million false fire alarms in the United States, of which just under a third were due to system malfunctions. The remaining calls were unintentional, malicious, or other types of false alarms, such as a bomb scare. Statistics for calls to the emergency services more generally show similar trends in different European countries, where the percentage of false reports ranges from 40% in Latvia up to 75% in Lithuania and Norway. So why is it that we inherently trust this data source, despite the false alarm rate, and not data from citizen scientists? Is it because life is threatened, because fires are easier to spot than invasive species, or simply because emergency services are mandated to investigate?

Volunteers monitor butterflies in Mount Rainier National Park, as part of the Cascade Butterfly Project, a citizen science effort organized by the US National Park Service © Kevin Bacher | US National Park Service

A recent encouraging development for citizen science was the signing of an executive order by President Obama on 6 January 2017, which gave federal agencies the jurisdiction to use citizen science and crowdsourced data in their operations. Do we need something similar in the EU or at the level of member states? And what will it really take for authorities to trust scientific data from citizens?

To move from the current situation of general distrust in citizen science data to one in which the data are viewed as a potentially useful source of information, we need further action. First, we need to showcase examples of where data collected by citizens are already being used for monitoring. At the meeting in Brussels, Kyle Copas of the Global Biodiversity Information Facility (GBIF) noted that up to 40% of the data records in GBIF are supplied by citizens, which surprised many of the meeting participants. Data from GBIF are used for national and international monitoring of biodiversity. Second, we need to quantify the value of information coming from citizen scientists. For example, how much money could have been saved if reports on invasive species from citizens had been acted upon? Third, we need to forge partnerships with government agencies to institutionally embed citizen science data streams into everyday operations. The LandSense citizen observatory, a new project, aims to do exactly this. We are working with the National Mapping Agency in France to use citizen science data to update their maps, and there are many similar examples with other local and national agencies that will be tested over the next 3.5 years.

Finally, we need to develop quality assurance systems that can be easily plugged into the infrastructure of existing organizations. The EU-funded COBWEB project began building such a citizen science-based quality assurance system, which we are continuing to develop in LandSense as a service. Providing out-of-the-box tools may be one solution to help organizations to begin working with citizen science data more seriously at an institutional level.

IIASA researchers test the Fotoquest app, a citizen science game developed at IIASA. ©Katherine Leitzell | IIASA

These measures will clearly take time to implement so I don’t expect that the discussion on the quality of the data will be removed from any agenda for some time to come. However, I look forward to the day when the main issue revolves around how we can possibly handle the masses of big data coming from citizens, a situation that many of us would like to be in.

More Information about the meeting: https://ec.europa.eu/jrc/en/event/workshop/citizen-science-open-data-model-invasive-alien-species-europe

Picture Pile: Gaming for Science

By Dilek Fraisl, IIASA Ecosystems Services and Management Program

In October 2015, we launched our latest game, Picture Pile. The idea is simple: look at a pair of satellite images from different years and tell us if you can see any evidence of deforestation. Thanks to the participation of many volunteers, 2.69 million pictures have already been sorted in our pile of 5 million pairs. But we still have a long way to go, and we need your help to get us there!


Screenshot from the game: click for more information (Image credit Tobias Sturn)

Deforestation is one of the most serious environmental problems in the world today. Forests cover a third of the land area on Earth, providing vital oxygen, habitats for a diversity of wildlife, and important ecosystem services. According to the World Wildlife Fund (WWF), some 46,000 to 58,000 square miles of forest are lost each year, which is equivalent to 48 football fields every minute. But this is a rough estimate, since deforestation is very difficult to track. There are several reasons: satellite imagery can be of insufficient spatial resolution to map deforestation accurately; deforestation mostly occurs in small chunks that may not be visible in medium-resolution imagery; and very high-resolution data sets are expensive and can require big data processing capabilities, so they can only be used for limited areas.
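The football-field comparison above can be checked with a short calculation. This is only a sketch under a stated assumption: the source does not say which kind of football field is meant, so the code assumes an American football field including end zones (120 by 53 1/3 yards, or 6,400 square yards). The function and constant names are made up for illustration.

```python
# Assumption: "football field" means an American football field with end zones,
# 120 yd x 53 1/3 yd = 6,400 square yards. The source does not specify this.
FIELD_SQ_YARDS = 6400
SQ_YARDS_PER_SQ_MILE = 1760 ** 2   # 1 mile = 1,760 yards
MINUTES_PER_YEAR = 365 * 24 * 60   # ignoring leap years


def fields_per_minute(sq_miles_per_year):
    """Convert an annual forest loss in square miles to football fields per minute."""
    fields_per_year = sq_miles_per_year * SQ_YARDS_PER_SQ_MILE / FIELD_SQ_YARDS
    return fields_per_year / MINUTES_PER_YEAR


if __name__ == "__main__":
    for loss in (46_000, 58_000):
        print(loss, "square miles per year ->", round(fields_per_minute(loss)), "fields per minute")
```

Under this assumption the two ends of the WWF range work out to roughly 42 and 53 fields per minute, so the quoted figure of 48 sits near the middle of the range.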

To help contribute to better mapping of deforestation, researchers in IIASA’s Earth Observation Systems (EOS) group, led by Steffen Fritz, have been working on novel projects to engage citizens in scientific data collection that can complement traditional satellite-based deforestation monitoring. One of the latest applications is Picture Pile, a game that makes use of very high-resolution satellite images spanning the last decade. Designed by Tobias Sturn, the game aims to provide data that can help researchers build a better map of deforestation. Players are provided with a pair of images that span two time periods and are then asked to answer a simple question: “Do you see tree loss over time?” After examining the images, the player drags them to the right for “yes,” left for “no,” or down to indicate “maybe” when the deforestation is not clearly visible.

Every image is sorted multiple times by numerous independent players, in order to build confidence in the results, and also to gain an understanding of how good the players are at recognizing visible patterns of deforestation. Once enough data are collected at a single location, the images are taken out of the game and new ones are added, thereby increasing the spatial coverage of our mapped area over time. Right now we are focusing on Tanzania and Indonesia, two regions where we know there are problems with existing maps of deforestation.
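To illustrate how repeated sorting by independent players can build confidence in a result, here is a minimal sketch of a majority vote with a retirement rule. It is not the actual Picture Pile logic, which the post does not describe; the function name and both thresholds (`min_votes`, `min_agreement`) are hypothetical.

```python
from collections import Counter


def aggregate_votes(votes, min_votes=5, min_agreement=0.7):
    """Return (label, agreement) once an image pair has enough consistent votes.

    Returns None while the pair should stay in the game, either because too few
    players have sorted it or because they disagree too much.
    Both thresholds are illustrative assumptions, not values from the project.
    """
    if len(votes) < min_votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    agreement = count / len(votes)
    if agreement < min_agreement:
        return None
    return label, agreement


if __name__ == "__main__":
    # Votes use the three answers offered to players: "yes", "no", "maybe".
    print(aggregate_votes(["yes", "yes", "yes", "yes", "no"]))
    print(aggregate_votes(["yes", "no", "maybe", "yes", "no"]))
```

A pair that returns a label could be retired and replaced with a new one, while a pair that returns None keeps collecting votes.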

Picture Pile is focusing first on Indonesia (pictured) and Tanzania – two regions where there are problems with existing maps of deforestation. Photo (cc) Aulia Erlangga for Center for International Forestry Research (CIFOR).

Once the pile is fully sorted, the 5 million photos in the data set will be used to develop better maps of forest cover and forest loss, both through hybrid techniques developed by the group and as inputs to classification algorithms. We will also use the data to validate the accuracy of existing global land cover maps. Finally, we will mine the data set to look for patterns regarding quality (for example, how many samples do we need to provide to the “crowd” before we can be confident enough to use their data in further research). In short, by integrating citizens in scientific research, Picture Pile will also help us improve the science of land cover monitoring through crowdsourcing mechanisms.

So please join in and help us get to the finish line. You can play Picture Pile in your browser or you can download the free iOS/Android app from the Apple and Google Play stores and play on your smartphone or tablet. Your contributions will help scientists like those at IIASA to tackle global problems such as deforestation and environmental degradation. At the same time you may win some great prizes: a brand new smartphone, a tablet, or a mini tablet.

Beating the heat with more data on urban form and function

By Linda See, IIASA Ecosystems Services and Management Program

We had another very hot summer this year in Europe and many other parts of the world. Many European cities, including London, Madrid, Frankfurt, Paris and Geneva, broke new temperature records.

Cities are particularly vulnerable to increasing temperatures because of a phenomenon known as the urban heat island effect. First measured more than half a century ago by Tim Oke, the increased temperatures in urban areas are a result of urban land use: larger amounts of impervious surfaces such as concrete, and concentrated urban structures. The urban heat island effect impacts human health and well-being. It’s not just a matter of comfort: during the heat wave in 2003, more than 70,000 people in Europe are estimated to have perished, mostly urban dwellers.


Summer 2015 in Ljubljana, Slovenia. ©K. Leitzell | IIASA

While climate models have many uncertainties, they do all agree that the urban heat island effect will increase in frequency and duration in the future. A recent article by Hannah Hoag in Nature paints a bleak picture of just how unprepared cities are for dealing with increasing temperatures. The paper cites positive and negative examples of mitigation from various cities but it falls short of suggesting a more widely applicable solution.

What we need is a standardized way of approaching the problem. Underlying this lack of standards is the paucity of data on the form and function of cities. By form I mean the geometry of the city (a 3D model of the buildings and road network, and information on the building materials) as well as a map of the basic land cover, including impervious surfaces like roads and sidewalks, and areas of vegetation such as gardens, parks, and fields. Function refers to the building use, road types, use of irrigation and air conditioning, and other factors that affect local atmospheric conditions. As climate models become more highly resolved, they will need vast amounts of such information to feed into them.

These issues are what led me and my colleagues (Prof Gerald Mills of UCD, Dr Jason Ching of UNC and many others) to conceive the World Urban Database and Access Portal Tools (WUDAPT) initiative (www.wudapt.org). WUDAPT is a community-driven data collection effort that draws upon the considerable network of urban climate modelers around the world. We start by dividing a city into atmospherically distinct areas, or Local Climate Zones (LCZs), a scheme developed by Stewart and Oke that provides a standard methodology for characterizing cities and can improve the parameters needed for data-hungry urban climate models.

The approach uses freely available satellite imagery of the Earth’s surface, and its success relies on local urban experts to provide representative examples of different LCZs across their city. We are currently working towards creating an LCZ classification for all C40 cities (a network of cities committed to addressing climate change) but are encouraging volunteers to work on any cities that are of interest to them. We refer to this as Level 0 data collection because it provides a basic classification for each city. Further detailed data collection efforts (referred to as Levels 1 and 2) will use a citizen science approach to gather information on building materials and function, landscape morphology and vegetation types.

The Local Climate Zone (LCZ) map for Kiev.

WUDAPT will equip climate modelers and urban planners with the data needed to examine a range of mitigation and adaptation scenarios. For example, what effect will green roofs, changes in land use or changes in the urban energy infrastructure have on the urban heat island and future climate?

The ultimate goal of WUDAPT is to develop a very detailed open access urban database for all major cities in the world, which will be valuable for many other applications from energy modelling to greenhouse gas assessment. If we want to improve the science of urban climatology and help cities develop their own urban heat adaptation plans, then WUDAPT represents one concrete step towards reaching this goal. Contact us if you want to get involved.

Beyond sharing Earth observations

By Linda See and Ian McCallum, IIASA Ecosystems Services and Management Program, Earth Observation Team

Land cover is of fundamental importance for environmental research. It serves as critical baseline information for many large-scale models, for example in developing future scenarios of land use and climate change. However, current land cover products are not accurate enough for many applications, and to improve them we need better and more accessible validation data. We recently argued this point in a Nature correspondence, and here we take the opportunity to expand on our brief letter.

In the last decade, multiple global land cover data products have been developed. But when these products are compared, there are significant amounts of spatial disagreement across land cover types. Where one map shows cropland, another might show forest domains. These discrepancies persist even when you take differences in the legend definitions into account. The reasons for this disagreement include the use of different satellite sensors, different classification methodologies, and the lack of sufficient data from the ground, which are needed to train, calibrate, and validate land cover maps.
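As a rough illustration of what "spatial disagreement" between two products means, the sketch below counts the share of grid cells where two maps assign different classes. It assumes both maps have already been resampled to the same grid and translated to a common legend, which is the hard part in practice; the function name and the tiny example grids are hypothetical.

```python
def disagreement_fraction(map_a, map_b):
    """Share of grid cells where two land cover maps assign different classes.

    Assumes both maps are lists of rows on the same grid and use the same
    class names; neither assumption comes for free with real products.
    """
    if len(map_a) != len(map_b) or any(len(a) != len(b) for a, b in zip(map_a, map_b)):
        raise ValueError("maps must have the same shape")
    cells = [(a, b) for row_a, row_b in zip(map_a, map_b) for a, b in zip(row_a, row_b)]
    if not cells:
        raise ValueError("maps must not be empty")
    return sum(a != b for a, b in cells) / len(cells)


if __name__ == "__main__":
    # Toy 2 x 2 grids: one map shows forest where the other shows cropland.
    first = [["cropland", "forest"], ["forest", "water"]]
    second = [["cropland", "cropland"], ["forest", "water"]]
    print(disagreement_fraction(first, second))
```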

An artist’s illustration of the NASA Landsat Data Continuity Mission spacecraft, one of the many satellites that collects data about Earth’s surface. Credit: NASA/GSFC/Landsat

A recent Comment in Nature (Nature 513, 30-31; 2014) argued that freely available satellite imagery will improve science and environmental-monitoring products. Although we fully agree that greater open access and sharing of satellite imagery is urgently needed, we believe that this plea neglects a crucial component of land cover generation: the data required to calibrate and validate these products.

At present, remotely sensed global land cover is not accurate enough for monitoring biodiversity loss and ecosystem dynamics or for many of the other applications for which baseline land cover and change over time are critical inputs. When Sentinel-2, a new Earth observation satellite to be launched in 2015 by the European Space Agency, comes online, it will be possible to produce land cover maps at a resolution of 10 meters. Although this has incredible potential for society as a whole, these products will only be useful if they represent the land cover more accurately than the current products available. To improve accuracy, more calibration and validation data are required. Although more investment is clearly needed in ground-based measurements, there are other, complementary solutions to this problem.

Map showing cropland disagreement between two different land cover maps, GlobCover and GLC2000: all colors represent disagreement. Credit: Geo-Wiki.org, Google Earth

Not only should governments and research institutes be urged to share imagery, they should also share their calibration and validation data. Some efforts have been made in this direction by the Global Observation for Forest Cover and Land Dynamics (GOFC-GOLD), but there is an incredible amount of data that remains locked within institutes and agencies. The atmospheric community shares its data much more readily than the Earth Observation (EO) community, even though we would only benefit by doing the same.

Crowdsourcing of calibration and validation data also has real potential for vastly increasing the amount of data available to improve classification algorithms and the accuracy of land cover products. The IIASA Geo-Wiki project is one example of a growing community of crowdsourcing applications that aim to improve the mapping of the Earth’s surface.


New apps developed by IIASA’s Earth Observation Team aim to involve people around the world in on-the-ground data validation efforts.

Geo-Wiki is a platform which provides citizens with the means to engage in environmental monitoring of the earth by providing feedback on existing spatial information overlaid on satellite imagery or by contributing entirely new data. Data can be input via the traditional desktop platform or mobile devices, with campaigns and games used to incentivize input. Resulting data are available without restriction.

Another major research project we are using to address many of the issues identified above is the ERC Project Crowdland.

Interview: Taking Geo-Wiki to the ground

Steffen Fritz has just been awarded an ERC Consolidator Grant to fund a research project on crowdsourcing and ground data collection on land-use and land cover. In this interview he talks about his plans for the new project, CrowdLand. 

Farmers in Kenya are one group which the Crowdland Project aims to involve in their data gathering. Photo credit: Neil Palmer, CIAT

What’s the problem with current land cover data?
There are discrepancies between current land cover products, especially in cropland data. It’s all based on satellite data, and in these data, it is extremely difficult to distinguish between cropland and natural vegetation in certain parts of the world if you do not use so-called very high resolution imagery, similar to a picture you take from space. With this high-resolution data you can see structures like fields and so on, which you can then use to distinguish between natural vegetation and cropland. But this is a task at which people are currently still better than computers, and there is a huge amount of data to look at.

In our Geo-Wiki project and related efforts such as the Cropland Capture game, we have asked volunteers to look at these high-resolution images and classify the ground cover as cropland or not cropland. The efforts have been quite successful, but our new project will take this even further.

How will the new project expand on what you’ve already done in Geo-Wiki?
The big addition is to go on the ground. Most of the exercises we currently do are based on the desktop or the phones, or tablets, asking volunteers to classify imagery that they see on a screen.

What this project aims to do is to improve the data you collect on the ground, known as in-situ data. You can use photography and GPS sensors, but also the knowledge you have about what you see. We will use volunteers to collect basic land cover data such as tree cover, cropland, and wetlands, but also much more detailed land-use information. With this type of data we can document what crops are grown where, whether they are irrigated, if the fields are fertilized, what exact type of crops are growing, and other crop management information which you cannot see in satellite imagery. And there are some things you can’t even see when you’re on the ground, so you need to ask the farmer or recruit the farmer as a data provider. That’s an additional element this project will bring: we will work closely with farmers and people on the ground.

For the study, you have chosen Austria and Kenya. Why these two countries?
In Austria we have much better in situ data. For example, the Land Use/Cover Area frame Survey (LUCAS) in Europe collects in situ data according to a consistent protocol. But this program is very expensive, and the agency that runs it, Eurostat, is discussing how to reduce costs. Additionally, the survey is only repeated every three years, so fast changes are not immediately recorded. Some countries are not in favor of LUCAS and prefer to undertake their own surveys. Then, however, you lose the overall consistency and there is no Europe-wide harmonized database which allows for comparison between countries. Our plan is to use gaming, social incentives, and also small financial incentives to conduct a crowdsourced LUCAS survey. Then we will examine what results you get when you pay volunteers or trained volunteers compared to the data collected by experts.

In Kenya, the idea is similar, but in general in the developing world we have very limited information, and the resources are not there for major surveys like in Europe. In order to remedy that the idea is again to use crowdsourcing and use a “bounded crowd” which means people who have a certain level of expertise, and know about land cover and land use, for example people with a surveyor background, university students, or interested citizens who can be trained. But in developing countries in particular it’s important to use financial incentives. Financial incentives, even small ones, could probably help to collect much larger amounts of data. Kenya is a good choice also because it has quite a good internet connection, a 3G network, and a lot of new technologies evolving around mobile phones and smartphone technology.

What will happen with the data you collect during this project?
First, we will analyze the data in terms of quality. One of our research questions is how good the data collected by volunteers are compared to data collected by experts. Another research question is how large but imperfect data sets collected by volunteers can be filtered and combined so that they become useful and fulfill the scientific accuracy requirements.

Then we will use these data and integrate them into currently existing land use and land cover data, and find ways to make better use of it. For example, in order to make projections about future land-use and to better quantify current yield gaps it is crucial to get accurate current information on land-use, including spatially explicit information on crop types, crop management information and other data.

Once we have done some quality checks we will also make these data available for other researchers or interested groups of people.
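The first research question, how volunteer data compare with expert data, can be illustrated with a very simple measure: the share of locations where a volunteer's label matches the expert's. This is only a sketch; the interview does not describe the project's actual quality metrics, and the function name, location identifiers, and labels below are hypothetical.

```python
def agreement_rate(volunteer, expert):
    """Share of jointly visited locations where volunteer and expert labels match.

    Both arguments map a location identifier to a land cover label.
    Locations visited by only one of the two are ignored.
    """
    shared = volunteer.keys() & expert.keys()
    if not shared:
        raise ValueError("no locations in common")
    matches = sum(volunteer[loc] == expert[loc] for loc in shared)
    return matches / len(shared)


if __name__ == "__main__":
    volunteer = {"p1": "cropland", "p2": "forest", "p3": "cropland", "p4": "wetland"}
    expert = {"p1": "cropland", "p2": "cropland", "p3": "cropland", "p5": "forest"}
    # Three locations are shared (p1, p2, p3); the labels match at two of them.
    print(agreement_rate(volunteer, expert))
```

A single agreement rate hides which classes are confused with which, so a fuller analysis would likely break the comparison down by land cover type.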

Crowdsourcing for land cover is in its infancy. There have been lots of crowdsourcing projects in astronomy, archaeology, and biology, for example, but there hasn’t been much on land use, and there is huge potential there. We need to not only better understand the quality of the data we collect, but also expand the network of institutions who are working on this topic.

How games can help science: Introducing Cropland Capture

By Linda See, Research Scholar, IIASA Ecosystems Services and Management Program

Researchers estimate we spend 3 billion hours a week on game playing. CC Image courtesy TheErin on Flickr

On a recent rush hour train ride in London I looked around to see just about everybody absorbed in their mobile phone or tablet. This in itself is not that unusual. But when I snooped over a few shoulders, what really surprised me was that most of those people were playing games. I hope this bodes well for our new game, Cropland Capture, introduced last week.

Cropland Capture is a game version of our citizen science project Geo-Wiki, which has a growing network of interested experts and volunteers who regularly help us in validating land cover through our competitions. By turning the idea into a game, we hope to reach a much wider audience.

Playing Cropland Capture is simple: look at a satellite image and tell us if you see any evidence of cropland. This will help us build a better map of where cropland is globally, something that is surprisingly uncertain at the moment. This sort of data is crucial for global food security, identifying where the big gaps in crop yields are, and monitoring crops affected by droughts, amongst many other applications.

Gamification and citizen science
The idea of Cropland Capture is not entirely unique. There are an astonishingly large number of games available for high tech gaming consoles, PCs and, increasingly, mobile devices. While the majority of these games are pure entertainment, some are part of an emerging genre known as “serious games” or “games with a purpose.” These are games that either have an educational element or allow you, through the process of playing them, to help scientists in doing their research. One of the most successful examples is the game FoldIt, where teams of players work together to decode protein structures. This is not an easy task for a computer to do, but some people are exceptionally talented at seeing these patterns. The results have even led to new scientific discoveries that have been published in high level journals such as Nature.

Jane McGonigal, in her book Reality is Broken (Why Games Make us Better and How They Can Change the World), estimates that we spend 3 billion hours a week alone on game playing, and that the average young person spends more time gaming by the end of their school career than they have actually spent in school. Although these figures may seem alarming, McGonigal argues that there are many positive benefits associated with gaming, including the development of problem-solving skills, the ability to cope better with problems such as depression or chronic pain, and even the possibility that we might live ten years longer if we played games. If people spent just a fraction of this time on “serious games” like FoldIt and Cropland Capture, imagine how much could be achieved.

Since the game started last Friday, 185 players have validated 119,777 square kilometers of land (more than twice the land area of Denmark).


Cropland Capture is easy to play – simply swipe the picture left or right to say whether there is cropland or not.

Get in the game
You can play Cropland Capture on a tablet (iPad or Android) or mobile phone (iPhone or Android). Download the game from Apple’s App Store or the Google Play Store. For those who prefer an online version, you can also play the game at: http://www.geo-wiki.org/games/croplandcapture/. For more information about the game, check out our videos at: http://www.geo-wiki.org/games/instructions-videos/. During the next six months, we will be providing regular updates on Twitter (@CropCapture) and Facebook.

The game will run for six months, and the top scorer each week will be crowned the weekly winner. The 25 weekly winners will then be entered into a draw at the end of the competition to win three big prizes: an Amazon Kindle, a smartphone, and a tablet. The game was launched only last week, so there is plenty of time to get involved and help scientific research.