Quantitative models are an important part of environmental and economic research and policymaking. For instance, IIASA models such as GLOBIOM and GAINS have long assisted the European Commission in impact assessment and policy analysis2; and the energy policies in the US have long been guided by a national energy systems model (NEMS)3.
Despite such successful modelling applications, model criticisms often make the headlines. Either in scientific literature or in popular media, some critiques highlight that models are used as if they are precise predictors and that they don’t deal with uncertainties adequately4,5,6, whereas others accuse models of not accurately replicating reality7. Still more criticize models for extrapolating historical data as if it is a good estimate of the future8, and for their limited scopes that omit relevant and important processes9,10.
Validation is the modeling step employed to deal with such criticism and to ensure that a model is credible. However, validation means different things in different modelling fields, to different practitioners and to different decision makers. Some consider validity as an accurate representation of reality, based either on the processes included in the model scope or on the match between the model output and empirical data. According to others, an accurate representation is impossible; therefore, a model’s validity depends on how useful it is to understand the complexity and to test different assumptions.
Given this variety of views, we conducted a text-mining analysis on a large body of academic literature to understand the prevalent views and approaches in the model validation practice. We then complemented this analysis with an online survey among modeling practitioners. The purpose of the survey was to investigate the practitioners’ perspectives and how they depend on background factors.
According to our results, published recently in Eker et al. (2018)1, data and prediction are the most prevalent themes in the model validation literature in all main areas of sustainability science such as energy, hydrology and ecosystems. As Figure 1 below shows, the largest fraction of practitioners (41%) think that a match between the past data and model output is a strong indicator of a model’s predictive power (Question 3). Around one third of the respondents disagree that a model is valid if it replicates the past since multiple models can achieve this, while another one third agree (Question 4). A large majority (69%) disagrees with Question 5, that models cannot provide accurate projections, implying that they support using models for prediction purposes. Overall, there is no strong consensus among the practitioners about the role of historical data in model validation. Still, objections to relying on data-oriented validation have not been widely reflected in practice.
Figure 1: Survey responses to the key issues in model validation. Source: Eker et al. (2018)
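To make the objection behind Question 4 concrete, here is a minimal sketch in Python. The two models and the "historical" record are invented for illustration and are not taken from the survey or the paper. Both models reproduce the past perfectly, yet they disagree strongly about the future.

```python
import math

def rmse(model, times, observed):
    """Root-mean-square error between model output and observations."""
    squared = [(model(t) - y) ** 2 for t, y in zip(times, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical historical record: five past time steps.
past_times = [0, 1, 2, 3, 4]
past_data = [0.0, 1.0, 2.0, 3.0, 4.0]

def model_a(t):
    # Assumes simple linear growth.
    return float(t)

def model_b(t):
    # Adds a term that is exactly zero at every past time step.
    return t + 0.01 * t * (t - 1) * (t - 2) * (t - 3) * (t - 4)

fit_a = rmse(model_a, past_times, past_data)
fit_b = rmse(model_b, past_times, past_data)
forecast_a = model_a(10)
forecast_b = model_b(10)
print(fit_a, fit_b, forecast_a, forecast_b)
```

In this toy case a perfect historical fit cannot tell the two models apart, which is the reason some respondents do not accept replication of the past as sufficient evidence of validity.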
According to most practitioners who participated in the survey, decision-makers find a model credible if it replicates the historical data (Question 6), and if the assumptions and uncertainties are communicated clearly (Question 8). Therefore, practitioners think that decision makers demand that models match historical data. They also acknowledge the calls for a clear communication of uncertainties and assumptions, which is increasingly considered best practice in modeling.
One intriguing finding is that the acknowledgement of uncertainties and assumptions depends on experience level. The practitioners with a very low experience level (0-2 years) or with very long experience (more than 10 years) tend to agree more with the importance of clarifying uncertainties and assumptions. Could it be because a longer engagement in modeling and a longer interaction with decision makers help to acknowledge the necessity of communicating uncertainties and assumptions? Would inexperienced modelers favor uncertainty communication due to their recent training in best practice and their understanding of the methods to deal with uncertainty? Would the employment conditions of modelers play a role in this finding?
As a modeler myself, I am surprised by the variety of views on validation and their differences from my prior view. With such findings and questions raised, I think this paper can provide model developers and users with reflections on and insights into their practice. It can also facilitate communication in the interface between modelling and decision-making, so that the two parties can elaborate on what makes their models valid and how it can contribute to decision-making.
Model validation is a heated topic that will inevitably remain contentious. Still, one consensus to reach is that a model is a representation of reality, not the reality itself, just like the disclaimer of René Magritte that his perfectly curved and brightly polished pipe is not a pipe.
Eker S, Rovenskaya E, Obersteiner M, Langan S. Practice and perspectives in the validation of resource management models. Nature Communications 2018, 9(1): 5359. DOI: 10.1038/s41467-018-07811-9 [pure.iiasa.ac.at/id/eprint/15646/]
Nuccitelli D. Climate scientists just debunked deniers’ favorite argument. The Guardian. 2017. https://www.theguardian.com/environment/climate-consensus-97-per-cent/2017/jun/28/climate-scientists-just-debunked-deniers-favorite-argument
by Melina Filzinger, IIASA Science Communication Fellow
Ecosystems worldwide are changed by the influence of humans, often leading to the extinction of species, for example due to climate change or loss of natural habitat. But it doesn’t stop there: as the different species in an ecosystem feed on each other and are thereby interconnected, the loss of one species might lead to the extinction of others, which can even destabilize the whole system. “In nature, everything is connected in a complex way, so at first glance you cannot be sure what will happen if one species disappears from an ecosystem,” says IIASA postdoc Mateusz Iskrzyński.
This is why the IIASA Evolution and Ecology (EEP) and Advanced Systems Analysis (ASA) programs are employing food-web modeling to find out which properties make ecosystems particularly vulnerable to species extinction. Food webs are stylized networks that represent the feeding relationships in an ecosystem. Their nodes are given by species or groups of species, and their links indicate how biomass cycles through the system by means of eating and being eaten. “This type of network analysis has a surprising power to uncover general patterns in complex relationships,” explains Iskrzyński.
Every one of these food webs is the result of years of intense research that involves both data collection to assess the abundance of species in an area, and reconstructing the links of the network from existing knowledge about the diets of different species. The largest of the currently available webs contain about 100 nodes and 1,000 weighted links. Here, “weighted” means that each link is characterized by the biomass flow between the nodes it connects.
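As a rough illustration of what such a weighted network looks like in code, the sketch below stores a tiny, made-up food web as a nested dictionary and computes its size and connectance. The species, the flow values and the choice of "links divided by nodes squared" as the connectance measure are assumptions made for this example, not details taken from the IIASA database.

```python
# Hypothetical toy food web: flows[prey][predator] = biomass flow (arbitrary units).
flows = {
    "algae": {"zooplankton": 50.0, "small_fish": 10.0},
    "zooplankton": {"small_fish": 20.0},
    "small_fish": {"large_fish": 5.0},
    "large_fish": {},
}

# Collect every species that appears as prey or as predator.
nodes = set(flows) | {pred for targets in flows.values() for pred in targets}
num_nodes = len(nodes)
num_links = sum(len(targets) for targets in flows.values())

# Directed connectance: realized links over all possible links (self-links included).
connectance = num_links / num_nodes ** 2

# Total biomass flowing through the web.
total_flow = sum(w for targets in flows.values() for w in targets.values())
print(num_nodes, num_links, connectance, total_flow)
```

A real web with about 100 nodes and 1,000 links would have a connectance of roughly 0.1 under this definition.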
Usually, food webs are published and considered individually, but recently efforts have been stepped up to collect them and analyze them together. Now, the ASA and EEP programs have collected 220 food webs from all over the world in the largest database assembled so far. This involved unifying the parametrization of the data and reconstructing missing links.
The researchers use this database to find out how different ecosystems react to the ongoing human-made species loss, and which ones are most at risk. This is done by removing a single node from a food web, which corresponds to the extinction of one group of species, and modeling how the populations of the remaining species change as a result. The main question is how these changes in the food web depend on its structural properties, like its size and the degree of connectedness between the nodes.
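The node-removal idea can be sketched with a much simpler rule than the population modeling described above: a consumer is counted as lost once every species it feeds on has disappeared. This purely structural rule, the species names and the diets are assumptions for illustration only, and the researchers' own simulations track population changes rather than applying this shortcut.

```python
def secondary_extinctions(diets, removed):
    """Return the species lost after `removed` disappears.

    diets maps each species to the set of species it eats.
    Producers have an empty diet and never starve in this toy rule.
    """
    extinct = {removed}
    changed = True
    while changed:
        changed = False
        for species, prey in diets.items():
            if species in extinct or not prey:
                continue
            if prey <= extinct:  # every food source is gone
                extinct.add(species)
                changed = True
    return extinct - {removed}

# Hypothetical four-species web.
diets = {
    "algae": set(),
    "zooplankton": {"algae"},
    "small_fish": {"algae", "zooplankton"},
    "large_fish": {"small_fish"},
}

lost_without_zooplankton = secondary_extinctions(diets, "zooplankton")
lost_without_small_fish = secondary_extinctions(diets, "small_fish")
lost_without_algae = secondary_extinctions(diets, "algae")
print(lost_without_zooplankton, lost_without_small_fish, lost_without_algae)
```

Even in this tiny example the outcome depends on which node is removed: losing zooplankton costs nothing further, losing small fish takes the large fish with it, and losing algae collapses the whole web.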
From the preliminary results obtained so far, it seems that small and highly connected food webs are particularly vulnerable to the indirect effects of species extinction. This means that in these webs the extinction of one species is especially likely to lead to large disruptive change affecting many other organisms. “Understanding the factors that cause such high vulnerability is crucial for the sustainable management and conservation of ecosystems,” says Iskrzyński. He hopes that this research will encourage more, and more precise, empirical ecosystems studies, as reliable data is still missing from many places in the world.
As a next step, the scientists in the two programs are planning to understand which factors determine the impact that the disappearance of a particular group of organisms has. They are going to make the software they use for their simulations publicly available, together with the database they developed.
Note: This article gives the views of the author, and not the position of the Nexus blog, nor of the International Institute for Applied Systems Analysis.
by Melina Filzinger, IIASA Science Communication Fellow
Having just finished tenth grade, Lillian Petersen from New Mexico, USA is currently spending the summer at IIASA, working with researchers from both the Ecosystems Services and Management (ESM), and Risk and Resilience (RISK) programs on developing risk models for all African countries.
At a talk Petersen gave at the Los Alamos Nature Center/Pajarito Environmental Education Center, her method for predicting food shortages in Africa from satellite images caught the attention of Molly Jahn from the University of Wisconsin-Madison. Jahn, who is collaborating with the ESM and RISK programs at IIASA, was so impressed with Petersen’s work that she added her to her research group and connected her to IIASA researchers for a joint project.
Knowing which areas are at risk for disasters like conflict, disease outbreak, or famine is often an important first step for preventing their occurrence. In developed countries, there is already a lot of work being done to estimate these risks. In developing countries, however, a lack of data often hinders risk modeling, even though these countries are often most at risk for disasters.
Many humanitarian crises, like famine, are closely connected to poverty. However, high resolution poverty estimates are only available for a few African countries. This is why Petersen and her colleagues are developing methods to obtain those poverty estimates for all of Africa using freely available data, like maps showing major roads and cities, as well as high-resolution satellite images. Information about poverty in a certain region can be extracted from this data by considering several indicators. For example, areas that are close to major roads or cities, or those that have a large amount of lighting at night, meaning that electricity is available, are usually less poor than those without these features. The researchers are also analyzing the trading potential with neighboring countries, the land cover type, and distance to major shipping routes, such as waterways.
As no single one of these indicators can perfectly predict poverty, the scientists combine them. They “train” their model using the countries for which poverty data exists: A comparison of the model’s output and the real data helps to reveal which combination of indicators gives a reliable estimate of poverty. Following this, they plan to apply that knowledge in order to accurately predict poverty with high spatial resolution over the entire African continent.
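The "training" step can be sketched as fitting a weighted combination of indicators to regions where poverty is known, and then applying those weights elsewhere. The example below uses plain least squares on synthetic numbers. The two indicators, the data and the linear form are all assumptions for illustration, since the article does not say which model the team uses.

```python
def solve(a, b):
    """Solve a square linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(features, targets):
    """Ordinary least squares via the normal equations, with an intercept."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)

def predict(weights, feature):
    """Apply the fitted intercept and indicator weights to one region."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], feature))

# Synthetic training regions: (distance to nearest major road in km, night-light index).
train_x = [(1.0, 0.9), (5.0, 0.6), (10.0, 0.7), (20.0, 0.1), (15.0, 0.2), (3.0, 0.3)]
# Synthetic poverty rates generated from an assumed linear rule, so the fit can be checked.
train_y = [0.3 + 0.02 * d - 0.2 * light for d, light in train_x]

weights = fit(train_x, train_y)
# Estimate poverty for a region that has indicators but no survey data.
estimate = predict(weights, (8.0, 0.5))
print(weights, estimate)
```

Because the synthetic targets follow the assumed rule exactly, the fit recovers it. With real data the comparison between model output and survey values would show how much each indicator actually helps.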
Once these estimates exist, Petersen and her colleagues will apply risk models to find out which areas are particularly vulnerable to disease outbreaks, famine, and conflicts. “I hope that this research will inform policymakers about which populations are most at risk for humanitarian crises, so that they can target these populations systematically in aid programs,” says Petersen, adding that preventing a disaster is generally cheaper than dealing with its aftermath.
The skills Petersen is using for her research are largely self-taught. After learning computer programming with the help of a book when she was in fifth grade, Petersen conducted her first research project on the effect of El Niño on the winter weather in the US when she was in seventh grade. “It was a small project, but I was pretty excited to obtain scientific results from raw data,” she says. Since this first success, she has been building up her skills every year by competing at science fairs across the US with her research projects.
Her internship at IIASA gives Petersen access to the resources she needs to take her research to the next level. “Getting feedback from some of the top scientists in the field here at IIASA is definitely improving my work,” she says. Petersen is hoping to publish a paper about her project next year, and wants to major in applied mathematics after she finishes high school.
by Melina Filzinger, IIASA Science Communication Fellow
Strategic board games are staple entertainment for families all over the world, but what many do not know is that games can also be a valuable research tool. As her project for the Young Scientists Summer Program (YSSP), Sara Turner is piloting an experiment that uses a game called the Forest Game, developed by IIASA and the Centre for Systems Solutions, to find out how policy decisions are made and how they change over time. “Games let you abstract from the specifics of a real-world case, but are more human-centric than, for example, computer simulations,” says Turner.
In the Forest Game, a group of five to ten players is asked to make decisions about the management of a forest together. Harvesting trees yields returns for the players, while harvesting too many of them might destroy the forest or increase the risk of flooding. There are some uncertainties in the game – for example, the players do not know exactly how resilient the forest is. The goal of the research project is to run multiple iterations of the game with different players and starting conditions, and trace how group discussions and the resulting decisions change over time. This helps to generate hypotheses about the ways in which individuals interact to generate policy outcomes. Each game takes about an hour to play.
Even though the Forest Game deals with forest management, this is only one example of a broader class of decision-making dilemma: when a resource is limited, and it is costly to prevent access, people will tend to over-exploit the resource. This in turn leads to a wide range of problems, from over-fishing to air pollution. Although games cannot capture the complexity of real situations, they can still help us understand the core dynamics of the problem and develop ideas and strategies that are relevant to solving it. “The game is not designed to be directly applicable to real life, but it helps to come up with hypotheses that you can then compare to real-life cases,” explains Turner.
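The over-exploitation dynamic can be illustrated with a few lines of simulation. This is not the Forest Game itself: the regrowth rule, the number of rounds and all numbers below are assumptions chosen only to show how a shared, slowly regrowing resource responds to restrained and to heavy harvesting.

```python
def play(harvest_per_player, players=5, trees=100.0, capacity=100.0,
         regrowth=0.3, rounds=40):
    """Simulate a shared forest with logistic regrowth.

    Returns (trees left at the end, total harvest over all rounds).
    """
    total_harvest = 0.0
    for _ in range(rounds):
        demand = harvest_per_player * players
        cut = min(demand, trees)          # cannot cut more than is standing
        trees -= cut
        total_harvest += cut
        # Regrowth is fastest at intermediate stock and zero for an empty forest.
        trees += regrowth * trees * (1 - trees / capacity)
    return trees, total_harvest

restrained = play(harvest_per_player=1.0)
heavy = play(harvest_per_player=5.0)
print(restrained, heavy)
```

With these made-up parameters the heavy harvesters clear the forest within a handful of rounds and end up with less in total than the restrained group, whose forest is still standing after 40 rounds.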
Questions about the sustainable management of resources have been studied for decades, but not a lot is known about the role values play in shaping group decision making and the stability of the implemented policies. To investigate this, each participant is asked to fill out a short ten-minute survey assessing their core values and beliefs, after which they are put into a group with people who either have a very similar or very different worldview from them. “It is really interesting to put a person in a decision-making context with other people and get some insight into how they work through that problem,” says Turner.
For example, if you are a person that strongly values equality, in the game you might be likely to argue in favor of a policy where all participants obtain the same amount of returns, regardless of the number of trees the individual player chooses to harvest. If many players in the group share your belief, that policy might be more likely to be implemented than in a very diverse group.
Another interesting question whenever you run a game for research purposes is, “Who are the right players?” Some games are targeted at real-world policymakers, but often games can also be educational for the broader public. “People learn a lot during games, because of the way that information is processed and experienced,” says Turner. That is why many participants, although they might not see a connection between the game and their life at first, find themselves relying on the insights they gained while playing when faced with similar situations in the future.
In this case, the goal is to study group decision-making processes in general, so the details of who is playing are not particularly important. However, to obtain groups of players with heterogeneous worldviews, a high degree of diversity is preferable.
While the game has previously mainly been played by YSSP participants and students of the University of Vienna, Turner is currently trying to recruit a more diverse set of players from both within and outside of IIASA. “It would be ideal to have a pool of participants who come from a wide variety of educational and cultural backgrounds,” she says.
If you are interested in participating in the Forest Game, you can e-mail Sara Turner at firstname.lastname@example.org.
By Dilek Yildiz, Wittgenstein Center for Demography and Global Human Capital (IIASA, VID/ÖAW and WU), Vienna Institute of Demography, Austrian Academy of Sciences, International Institute for Applied Systems Analysis
Social media offers a promising source of data for social science research that could provide insights into attitudes, behavior, social linkages and interactions between individuals. As of the third quarter of 2017, Twitter alone had on average 330 million active users per month. The magnitude and the richness of this data attract social scientists working in many different fields, with topics studied ranging from extracting quantitative measures such as migration and unemployment, to more qualitative work such as looking at the footprint of second demographic transition (i.e., the shift from high to low fertility) and gender revolution. Although the use of social media data for scientific research has increased rapidly in recent years, several questions remain unanswered. In a recent publication with Jo Munson, Agnese Vitali and Ramine Tinati from the University of Southampton, and Jennifer Holland from Erasmus University, Rotterdam, we investigated to what extent findings obtained with social media data are generalizable to broader populations, and what constitutes best practice for estimating demographic information from Twitter data.
A key issue when using this data source is that a sample selected from a social media platform differs from a sample used in standard statistical analysis. Usually, a sample is randomly selected according to a survey design so that information gathered from this sample can be used to make inferences about a general population (e.g., people living in Austria). However, despite the huge number of users, the information gathered from Twitter and the estimates produced are subject to bias due to its non-random, non-representative nature. Consistent with previous research conducted in the United States, we found that Twitter users are more likely than the general population to be young and male, and that Twitter penetration is highest in urban areas. In addition, the demographic characteristics of users, such as age and gender, are not always readily available. Consequently, despite its potential, deriving the demographic characteristics of social media users and dealing with the non-random, non-representative populations from which they are drawn represent challenges for social scientists.
Although previous research has explored methods for conducting demographic research using non-representative internet data, few studies mention or account for the bias and measurement error inherent in social media data. To fill this gap, we investigated best practice for estimating demographic information from Twitter users, and then attempted to reduce selection bias by calibrating the non-representative sample of Twitter users with a more reliable source.
We gathered information from 979,992 geo-located Tweets sent by 22,356 unique users in South-East England and estimated their demographic characteristics using the crowd-sourcing platform CrowdFlower and the image-recognition software Face++. Our results show that CrowdFlower estimates age more accurately than Face++, while both tools are highly reliable for estimating the sex of Twitter users.
To evaluate and reduce the selection bias, we ran a series of models and calibrated the non-representative sample of Twitter users with mid-year population estimates for South-East England from the UK Office for National Statistics. We then corrected the bias in age-, sex-, and location-specific population counts. This bias correction exercise shows promise for unbiased inference when using social media data and can be used to further reduce selection bias by including other sociodemographic variables of social media users such as ethnicity. By extending the modeling framework slightly to include an additional variable, which is only available through social media data, it is also possible to make unbiased inferences for broader populations by, for example, extracting the variable of interest from Tweets via text mining. Lastly, our methodology lends itself for use in the calculation of sample weights for Twitter users or Tweets. This means that a Twitter sample can be treated as an individual-level dataset for micro-level analysis (e.g., for measuring associations between variables obtained from Twitter data).
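The idea of sample weights can be illustrated with simple post-stratification: each user is weighted by how under- or over-represented their age and sex group is relative to a reference population. This is a simplified stand-in for the models used in the study, and the sample, the population shares and the variable of interest below are all invented.

```python
from collections import Counter

# Hypothetical sample of users: (age group, sex, holds_view), where holds_view
# is an illustrative binary variable of interest extracted from their posts.
sample = (
    [("young", "male", 1)] * 40 + [("young", "male", 0)] * 10
    + [("young", "female", 1)] * 15 + [("young", "female", 0)] * 5
    + [("old", "male", 1)] * 5 + [("old", "male", 0)] * 15
    + [("old", "female", 1)] * 2 + [("old", "female", 0)] * 8
)

# Hypothetical reference population shares per stratum (they sum to 1).
population_share = {
    ("young", "male"): 0.25, ("young", "female"): 0.25,
    ("old", "male"): 0.25, ("old", "female"): 0.25,
}

counts = Counter((age, sex) for age, sex, _ in sample)
n = len(sample)

# Weight = population share / sample share, so over-represented strata count less.
weights = {s: population_share[s] / (counts[s] / n) for s in counts}

raw_mean = sum(v for _, _, v in sample) / n
weighted_mean = (
    sum(weights[(a, s)] * v for a, s, v in sample)
    / sum(weights[(a, s)] for a, s, _ in sample)
)
print(raw_mean, weighted_mean)
```

In this made-up sample, young men are heavily over-represented and more likely to hold the view, so the unweighted share (62%) overstates the population-level share implied by the weights (50%).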
By Valeria Javalera Rincón, IIASA CONACYT Postdoctoral Fellow in the Ecosystems Services and Management and Advanced Systems Analysis programs.
What is more important: water, energy, or food?
If you work in the water, energy or agriculture sector we can guess what your answer might be! But if you are a policy or decision maker trying to balance all three, then you know that it is getting more and more difficult to meet the growing demand for water, energy, and food with the natural resources available. The need for this balance was confirmed by the 17 Sustainable Development Goals, agreed by 193 countries, and the Paris climate agreement. But how to achieve it? Intelligent cooperation is the key.
The thing is that water, energy, and food are all related in such a way that they rely on each other for production or distribution. This is the so-called Water-Energy-Food nexus. In many cases, you need water to produce energy, you need energy to pump water, and you need water and energy to produce, distribute, and conserve food.
Many scientists have tried to relate or to link models for water, agriculture, land, and energy to study these synergic relationships. In general, so far, there are two ways that this has been solved: One is integrating models with “hard linkages” like this:
In the picture there are six models (let’s say water, land use, hydro energy, gas, coal, food production models) that are then integrated into just one. The resulting integrated model then preserves the relationships but is complex, and in order to make it work with our current computer power you often have to sacrifice details.
Another way to link them is to use so-called “soft linkages,” where the output of one model is the input of the next one, like this:
In the picture, each person is a model and the input is the amount of water left. These models all refer to a common resource (the water) and are connected using “soft linkages.” These linkages are based on sequential interaction, so there is no feedback, and no real synergy.
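In code, a soft linkage is little more than a chain of function calls. The sketch below uses three invented sector models that each take whatever water is left and pass the remainder on. The sector names, demands and ordering are assumptions for illustration.

```python
def make_user(name, demand, log):
    """Build a toy sector model: takes the water available, returns what is left."""
    def model(water_in):
        used = min(demand, water_in)
        log[name] = used
        return water_in - used
    return model

usage = {}
# Hypothetical sector models, run strictly in this order.
chain = [
    make_user("energy", 40, usage),
    make_user("agriculture", 50, usage),
    make_user("households", 30, usage),
]

water = 100
for model in chain:
    water = model(water)  # the output of one model is the input of the next

print(usage, water)
```

Because information only flows one way, the last model in the chain receives 10 of the 30 units it needs, and the upstream models never learn about the shortfall. That missing feedback is the limitation described above.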
The intelligent linker agent
But what if we could have the relations and synergies between the models? It would mean much more accurate findings and helpful policy advice. Well, now we can. The secret is to link through an intelligent linker agent.
I developed a methodology in which an intelligent linker agent is used as a “negotiator” between models that can communicate with each other. This negotiator applies a machine-learning algorithm that gives it the capability to learn from the interactions with the models. Through these interactions, the intelligent linker can advise on globally optimal actions.
When I came to IIASA, I was asked to apply this approach to optimize trading between cities in the Shanxi region of China. I used a set of previously developed models which aimed to distribute the water and land available for each city in order to produce food (eight types of crops) and coal for energy. The intelligent linker agent optimizes trading between cities in order to satisfy demand at the lowest cost for each city.
The purpose of this exercise was to compare the solutions with those from “hard linkages” – like those in the first picture. We found that the intelligent linker is flexible enough to find the optimal solution to questions such as: How much of each of these products should each city export/import to satisfy global demand at a global lower economic and ecological cost? What actions are optimal when the total production is insufficient to meet the total demand? Under what conditions is it preferable to stop imports/exports when production is insufficient to supply the demand of each city?
The answers to these questions can be calculated through interaction with the models of each city, simply by interfacing with the intelligent linker agent. This means that no major changes in the models of each city were needed. We also found that, under the same conditions, the solutions using the intelligent linker agent were in agreement with those found when hard linking was used.
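To give a flavor of how a linker can learn from model responses without looking inside the models, here is a deliberately tiny sketch. It is not the methodology from the publications listed below: the two city cost functions, the candidate trade volumes and the epsilon-greedy value-learning rule are all assumptions chosen to keep the example short.

```python
import random

def city_a_cost(export):
    # Hypothetical exporter model: cheapest when it exports 3 units.
    return (export - 3) ** 2

def city_b_cost(imported):
    # Hypothetical importer model: penalized for missing its demand of 4 units.
    return 3 * abs(4 - imported)

def run_linker(steps=500, epsilon=0.1, alpha=0.5, seed=0):
    """Toy linker: learns the value of each trade volume from model responses."""
    rng = random.Random(seed)
    actions = list(range(6))           # candidate trade volumes 0..5
    q = {a: 0.0 for a in actions}      # learned value estimate per volume
    for _ in range(steps):
        if rng.random() < epsilon:
            trade = rng.choice(actions)                # explore
        else:
            trade = max(actions, key=lambda a: q[a])   # exploit current knowledge
        # The linker only sees the cost each model reports, not its internals.
        reward = -(city_a_cost(trade) + city_b_cost(trade))
        q[trade] += alpha * (reward - q[trade])
    return max(actions, key=lambda a: q[a]), q

best_trade, values = run_linker()

# A "hard-linked" check: search the combined cost directly.
direct_best = min(range(6), key=lambda t: city_a_cost(t) + city_b_cost(t))
print(best_trade, direct_best)
```

In this toy setting the learned trade volume matches the one found by searching the combined cost directly, and the city models themselves did not have to be changed or merged.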
My next challenge is to build a prototype of a “distributed computer platform,” which will allow us to link models on different computers in different parts of the world—so that we in Austria could link to a model built by colleagues in Brazil, for example. I also want to link models of different sectors and regions of the globe, in order to prove that intelligent cooperation is the key to improving global welfare.
Javalera V, Morcego B, & Puig V, Negotiation and Learning in distributed MPC of Large Scale Systems, Proceedings of the 2010 American Control Conference, Baltimore, MD, 2010, pp. 3168-3173. doi: 10.1109/ACC.2010.5530986
Javalera V, Morcego B, & Puig V, Distributed MPC for Large Scale Systems using Agent-based Reinforcement Learning, In IFAC Proceedings Volumes, Volume 43, Issue 8, 2010, Pages 597-602, ISSN 1474-6670, ISBN 9783902661913, https://doi.org/10.3182/20100712-3-FR-2020.00097.
Morcego B, Javalera V, Puig V, & Vito R (2014). Distributed MPC Using Reinforcement Learning Based Negotiation: Application to Large Scale Systems. In: Maestre J., Negenborn R. (eds) Distributed Model Predictive Control Made Easy. Intelligent Systems, Control and automation: Science and Engineering, vol 69. Springer, Dordrecht
Javalera Rincón V, Distributed large scale systems: a multi-agent RL-MPC architecture, Universitat Politècnica de Catalunya. Institut d’Organització i Control de Sistemes Industrials,Doctoral thesis. 2016. http://upcommons.upc.edu/handle/2117/96332