Plan S: Promoting full and immediate Open Access publishing

By Luke Kirwan, IIASA Repository and Open Access Manager

IIASA Repository and Open Access Manager Luke Kirwan explains the ins and outs of the Plan S policy for full and immediate Open Access publishing.

With Plan S, which has been in effect since 1 January 2021, new Open Access requirements apply to project participants. These requirements are intended to accelerate the transition to full and immediate Open Access, and they have implications for researchers who obtain funding from funders supporting Plan S, such as the Austrian Science Fund (FWF) or Formas (a Swedish Research Council for Sustainable Development).

What exactly is Plan S?

Plan S is an initiative that aims to make research immediately open access, without embargo periods or restrictions. It requires that, from 2021, scientific publications resulting from research funded by public grants must be published in compliant Open Access journals or platforms. A number of national and international research bodies, including the FWF and the European Research Council (ERC), are working jointly on the implementation of Plan S and the promotion of open access research publication. A list of these funding bodies can be found here, and more detailed information on the implementation of Plan S is available here.

What you need to know

Starting from 1 January 2021, publications derived from research funded by Plan S research organizations must be made openly accessible immediately upon publication without any embargo period. This applies only to projects submitted after 1 January 2021. Furthermore, this material must be made available under a Creative Commons Attribution license (CC-BY). In some instances, a more restrictive license can be applied, but this must be discussed with the funding body.

Further guidelines are currently being developed for publications that are not journal articles such as books and edited volumes. From 2021 onwards, it is important to closely check the requirements of research funders to ensure that projects are compliant with any open access requirements they may have.

Papers published under Plan S funding must include an appropriate acknowledgement. For FWF-funded research, for example, it must follow this format:

‘This research was funded in whole, or in part, by the Austrian Science Fund (FWF) [Grant number]. For the purpose of open access, the author has applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission.’

Authors of papers published under Plan S funding will retain the copyright of their work and will provide journals with a license to publish their material, rather than fully transferring copyright to them. Publishers that require a license to publish must allow the authors to make either the published version or the accepted version immediately available under an open license. No embargo period is permitted.

Routes to compliance

  • Publish in an open access journal
  • Make the accepted manuscript immediately available in an open access repository (like PURE) under a CC-BY license
  • Publish in a subscription journal where IIASA has an open access agreement (For a list of IIASA’s current agreements please see here)

cOAlition S has provided a Journal Checker Tool so that you can check a journal's compliance with the Plan S requirements.

The FWF’s statement and guidelines for Plan S can be found here. The operation and success of Plan S will be reviewed by the end of 2024. For any further information or assistance, please contact the library.

Related links:

Science family of journals announces change to open-access policy (Jan 2021)

Nature journals reveal terms of landmark open-access option (Nov 2020)

Plan S toolkit (cOAlition S website)

Note: This article gives the views of the author, and not the position of the Nexus blog, nor of the International Institute for Applied Systems Analysis.

Mapping mines from satellite images

By Victor Maus, alumnus of the IIASA Ecosystems Services and Management Program and researcher at the Vienna University of Economics and Business

The mining of coal, metals, and other minerals causes loss of natural habitats across the entire globe. However, available data is insufficient to measure the extent of these impacts. IIASA alumnus Victor Maus and his colleagues mapped more than 57,000 km² of mining areas over the whole world using satellite images.

 

© Pix569 | Dreamstime.com

Our modern lifestyles and consumption patterns cause environmental and social impacts that are geographically displaced to production sites thousands of kilometres away, where the raw materials are extracted. Complex supply chains connecting mineral mining regions to consumers often obscure these impacts. Our team at the Vienna University of Economics and Business is investigating these connections and their associated impacts on a global scale (www.fineprint.global).

However, some mining impacts are not well documented across the globe. For example, where and how much area is used to extract metals, coal, and other essential minerals is unknown. This information is necessary to assess the environmental implications of mining activities, such as the associated forest and biodiversity loss. To fill this data gap, we analyzed satellite images of more than 6,000 known mining regions all around the world.

Visually identifying such a large number of mines in these images is not an easy task. Imagine watching from the window of a plane: how many objects on the Earth's surface can you identify, and how fast? Using satellite images, we searched for and mapped mines over the whole globe. It was a very time-consuming and exhausting task, but we also learned a lot about what is happening on the ground. It was also very interesting to virtually visit a vast range of mining places across the globe and to see the large variety of ecosystems that are affected by our increasing demand for nature's resources.

The result of our adventure is a global data set covering more than 21,000 mapped areas adding up to around 57,000 km² (about the size of Croatia or Togo). These mapped areas cover open cuts, tailings dams, piles of rocks, buildings, and other infrastructure related to mining activities, some of them extending to almost 10 km (see figure below). We also learned that around 50% of the mapped mining area is concentrated in only five countries: China, Australia, the United States, Russia, and Chile.

Examples of mines viewed from Google Satellite images. (a) Carajás iron ore mine in Brazil, (b) Batu Hijau copper-gold mine in Indonesia, and (c) Super Pit gold mine in Australia. In purple is the data collected for these mines (Figure source: www.nature.com/articles/s41597-020-00624-w).

Using these data, we can improve the calculation of environmental indicators of global mineral extraction and thus support the development of less harmful ways to extract natural resources. Further, linking these impacts to supply chains can help answer questions related to our consumption of goods. For example, which impacts does the extraction of minerals used in our smartphones cause, and where on the planet do they occur? We hope that many others will use the mining areas data for their own research and applications, which is why the data is fully open to everyone. You can explore the global mining areas using our visualization tool at www.fineprint.global/viewer or download the full data set from doi.pangaea.de/10.1594/PANGAEA.910894. The complete description of the data and methods is available in our paper at www.nature.com/articles/s41597-020-00624-w.

This blog post first appeared on the Springer Nature “Behind the paper” website. Read the original post here.

Note: This article gives the views of the authors, and not the position of the Nexus blog, nor of the International Institute for Applied Systems Analysis.

Open science has to go beyond open source

By Daniel Huppmann, research scholar in the IIASA Energy Program

Daniel Huppmann sheds light on how open-source scientific software and FAIR data can bring us one step closer to a community of open science.

© VectorMine | Dreamstime.com

Over the past decade, the open-source movement (e.g., the Free Software Foundation (FSF) and the Open Source Initiative (OSI)) has had a tremendous impact on the modeling of energy systems and climate change mitigation policies. It is now widely expected – in particular by and of early-career researchers – that data, software code, and tools supporting scientific analysis are published for transparency and reproducibility. Many journals actually require that authors make the underlying data available in line with the FAIR principles – this acronym stands for findable, accessible, interoperable, and reusable. The principles postulate best-practice guidance for scientific data stewardship. Initiatives such as Plan S, requiring all manuscripts from projects funded by the signatories to be released as open-access publications, lend further support to the push for open science.

Alas, the energy and climate modeling community has so far failed to realize and implement the full potential of the broader movement towards collaborative work and best practice of scientific software development. To live up to the expectation of truly open science, the research community needs to move beyond “only” open-source.

Until now, the main focus of the call for open and transparent research has been on releasing the final status of scientific work under an open-source license – giving others the right to inspect, reuse, modify, and share the original work. In practice, this often means simply uploading the data and source code for generating results or analysis to a service like Zenodo. This is obviously an improvement compared to the previously common “available upon reasonable request” approach. Unfortunately, the data and source code are still all too often poorly documented and do not follow best practice of scientific software development or data curation. While the research is therefore formally “open”, it is often not easily intelligible or reusable with reasonable effort by other researchers.

What do I mean by “best practice”? Imagine I implement a particular feature in a model or write a script to answer a specific research question. I then add a second feature – which inadvertently changes the behavior of the first feature. You might think that this could be easily identified and corrected. Unfortunately, given the complexity and size to which scientific software projects tend to quickly evolve, one often fails to spot the altered behavior immediately.

One solution to this risk is “continuous integration” and automated testing. This is a practice common in software development: for each new feature, we write specific tests in an as-simple-as-possible example at the same time as implementing the function or feature itself. These tests are then executed every time that a new feature is added to the model, toolbox, or software package, ensuring that existing features continue to work as expected when adding a new functionality.
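To make this concrete, here is a minimal sketch of what such a test might look like in Python with pytest. The function, its arguments, and the numbers are purely illustrative and not taken from MESSAGEix or any other IIASA model; the point is only to show how a test written alongside a new feature protects the behavior of an existing one.

```python
# test_emissions.py -- a self-contained, illustrative example (not actual IIASA code).
# Running `pytest` executes both tests; a continuous-integration service can do this
# automatically on every change to the repository.
import pytest


def total_emissions(activity, emission_factor, ccs_share=0.0):
    """Emissions of a technology: activity times emission factor, reduced by a CCS share.

    The `ccs_share` argument is the "second feature" added later; the first test below
    guards against it inadvertently changing the original behavior.
    """
    return activity * emission_factor * (1.0 - ccs_share)


def test_original_behavior_unchanged():
    # Regression test for the first feature: without CCS, the result must stay the same.
    assert total_emissions(activity=100.0, emission_factor=0.5) == 50.0


def test_new_ccs_feature():
    # Test written at the same time as the new feature itself.
    assert total_emissions(100.0, 0.5, ccs_share=0.4) == pytest.approx(30.0)
```

In a continuous-integration setup (for example, a service that runs pytest on every commit), an accidental change to the first feature is flagged immediately, rather than being discovered shortly before a submission deadline.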

Other practices that modelers and all researchers using numerical methods should follow include using version control and writing documentation throughout the development of scientific software rather than leaving this until the end. In addition, not just the manuscript and results of scientific work should be scrutinized (aka “peer review”); such appraisal should also apply to the scientific software code written to process data and analyze model results. Like the mentoring of early-career researchers, this kind of review should not just come at the end of a project, but should be a continuous process throughout the development of the manuscript and the related analysis scripts.

In the course that I teach at TU Wien, as well as in my work on the MESSAGEix model, the Intergovernmental Panel on Climate Change Special Report on Global Warming of 1.5°C scenario ensemble, and other projects at the IIASA Energy Program, I try to explain to students and junior researchers that following such best-practice steps is in their own best interest. This is true even when it is just a master’s thesis or some coursework assignment. However, I always struggle to find the best way to convince them that following best practice is not just a noble ideal in itself, but actually helps in doing research more effectively. Only when one has experienced the panic and stress caused by a model not solving or a script not running shortly before a submission deadline can a researcher fully appreciate the benefits of well-structured code, explicit dependencies, continuous integration, tests, and good documentation.

A common trope says that your worst collaborator is yourself from six months ago, because you didn’t write enough explanatory comments in your code and you don’t respond to emails. So even though it sounds paradoxical at first, spending a bit more time following best practice of scientific software development can actually give you more time for interesting research. Moreover, when you then release your code and data under an open-source license, it is more likely that other researchers can efficiently build on your work – bringing us one step closer to a community of open science!

Note: This article gives the views of the authors, and not the position of the Nexus blog, nor of the International Institute for Applied Systems Analysis.

What did we learn from COVID-19 models?

By Sibel Eker, researcher in the IIASA Energy Program

IIASA researcher Sibel Eker explores the usefulness and reliability of COVID-19 models for informing decision making about the extent of the epidemic and the healthcare problem.

© zack Ng 99 | Dreamstime.com

In the early days of the COVID-19 pandemic, when facts were uncertain, decisions were urgent, and stakes were very high, both the public and policymakers turned not to oracles, but to mathematical modelers to ask how many people could be infected and how the pandemic would evolve. The response was a plethora of hypothetical models shared on online platforms and numerous better calibrated scientific models published in online repositories. A few such models were announced to support governments’ decision-making processes in countries like Austria, the UK, and the US.

With this announcement, a heated debate began about the accuracy of model projections and their reliability. In the UK, for instance, the model developed by the MRC Centre for Global Infectious Disease Analysis at Imperial College London projected around 500,000 and 20,000 deaths without and with strict measures, respectively. These different policy scenarios were misinterpreted by the media as a drastic variation in the model assumptions, and hence a lack of reliability. In the US, projections of the model developed by the University of Washington’s Institute for Health Metrics and Evaluation (IHME) changed as new data were fed into the model, sparking further debate about its accuracy.

This discussion about the accuracy and reliability of COVID-19 models led me to rethink model validity and validation. In a previous study, my colleagues and I showed, based on a vast scientific literature on model validation and practitioners’ views, that validity is often equated with how well a model represents reality, which in turn is often measured by how accurately the model replicates observed data. However, representativeness does not always imply usefulness. A commentary following that study emphasized the tradeoff between representativeness and the propagation error it causes, cautioning against an exaggerated focus on extending model boundaries, which can create a modeling hubris.

Following these previous studies, in my latest commentary in Humanities and Social Sciences Communications, I briefly reviewed the COVID-19 models used in public policymaking in Austria, the UK, and the US in terms of how they capture the complexity of reality, how they report their validation, and how they communicate their assumptions and uncertainties. I concluded that the three models are undeniably useful for informing the public and policy debate about the extent of the epidemic and the healthcare problem. They serve the purpose of synthesizing the best available knowledge and data, and they provide a testbed for altering our assumptions and creating a variety of “what-if” scenarios. However, they cannot be seen as accurate prediction tools, not only because no model is able to do this, but also because, according to their reports in late March, these models lacked thorough formal validation. While it may be true that media misinterpretation triggered the debate about accuracy, there are expressions of overconfidence in the reporting of these models, while the communication of uncertainties and assumptions is not fully clear.

© Jaka Vukotič | Dreamstime.com

The uncertainty and urgency associated with pandemic decision-making are familiar from many other policymaking situations, from climate change mitigation to sustainable resource management. The lessons learned from the use of COVID-19 models can therefore resonate in other disciplines. Post-crisis research can analyze the usefulness of these models in the discourse and decision making, so that we can better prepare for the next outbreak and make better use of policy models in any situation. Until then, we should treat the prediction claims of any model with caution, focus on the scenario analysis capability of models, and remind ourselves one more time that a model is a representation of reality, not reality itself – just as René Magritte noted that his perfectly curved and brightly polished pipe is not a pipe.

References

Eker S (2020). Validity and usefulness of COVID-19 models. Humanities and Social Sciences Communications 7 (1) [pure.iiasa.ac.at/16614]

Note: This article gives the views of the author, and not the position of the Nexus blog, nor of the International Institute for Applied Systems Analysis.

Mapping habitats in support of biodiversity research

By Martin Jung, postdoctoral research scholar in the IIASA Ecosystems Services and Management Program.

IIASA postdoc Martin Jung discusses how a newly developed map can help provide a detailed view of important species habitats, contribute to ongoing ecosystem threat assessments, and assist in biodiversity modeling efforts.

Biodiversity is not evenly distributed across our planet. To determine which areas potentially harbor the greatest number of species, we need to understand how habitats valuable to species are distributed globally. In our new study, published in Nature Scientific Data, we mapped the distribution of habitats globally. The habitats we used are based on the International Union for Conservation of Nature (IUCN) Red List habitat classification scheme, one of the most widely used systems to assign species to habitats and assess their extinction risk. The latest map (2015) is openly available for download here. We also built an online viewer on the Google Earth Engine platform, where the map can be explored interactively: simply click on the map to find out which habitat class has been mapped at a particular location.

Figure 1: View on the habitat map with focus on Europe and Africa. For a global view and description of the current classes mapped, please read Jung et al. 2020 or have a look at the online interactive interface.

The habitat map was created as an intersection of various best-available layers on land cover, climate, and land use (Figure 1). Specifically, we created a decision tree that determines, for each area on the globe, the likely presence of one of the currently 47 mapped habitats. For example, by combining data on tropical climate zones, mountain regions, and forest cover, we were able to estimate the distribution of subtropical/tropical moist mountainous rain forests, one of the most biodiverse ecosystems. The habitat map also considers the best available land use data to map human-modified or artificial habitats such as rural gardens or urban sites. Notably, and as a first, our map also integrates upcoming new data on the global distribution of plantation forests.
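To illustrate the logic of such a rule (not the actual published ruleset, which is implemented on Google Earth Engine and documented in Jung et al. 2020), here is a highly simplified sketch in Python, using NumPy arrays as stand-ins for the global raster layers. The layer names, thresholds, and class code are assumptions made only for illustration.

```python
# Illustrative sketch of one decision-tree rule; not the Jung et al. (2020) ruleset.
import numpy as np


def flag_tropical_moist_montane_forest(climate_zone, elevation_m, tree_cover_pct, habitat_map):
    """Assign the IUCN habitat class 1.9 (forest, subtropical/tropical moist montane)
    wherever a pixel lies in a tropical climate zone, above a mountain threshold,
    and has high tree cover. All codes and thresholds are illustrative assumptions."""
    TROPICAL_ZONES = [1, 2, 3]       # assumed codes for tropical climate zones
    MONTANE_THRESHOLD_M = 1000       # assumed elevation threshold for "mountain"
    FOREST_THRESHOLD_PCT = 60        # assumed tree-cover threshold for "forest"
    CLASS_CODE = 109                 # assumed numeric code for habitat class 1.9

    rule = (
        np.isin(climate_zone, TROPICAL_ZONES)
        & (elevation_m >= MONTANE_THRESHOLD_M)
        & (tree_cover_pct >= FOREST_THRESHOLD_PCT)
    )
    out = habitat_map.copy()
    out[rule] = CLASS_CODE
    return out
```

In the published workflow, a whole hierarchy of such rules is evaluated per pixel on Google Earth Engine, which is what makes it possible to regenerate the full global map within a couple of hours whenever input layers are updated.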

What makes this map so useful for biodiversity assessments? It can provide a detailed view on the remaining coverage of important species habitats, contribute to ongoing ecosystem threat assessments, and assist in global and national biodiversity modeling efforts. Since the thematic legend of the map – in other words the colors, symbols, and styles used in the map – follows the same system as that used by the IUCN for assessing species extinction risk, we can easily refine known distributions of species (Figure 2). Up to now, such refinements were based on crosswalks between land cover products (Figure 2b), but with the additional data integrated into the habitat map, such refinements can be much more precise (Figure 2c). We have for instance conducted such range refinements as part of the Nature Map project, which ultimately helped to identify global priority areas of importance for biodiversity and ecosystem services.

Figure 2: The range of the endangered Siamang (Symphalangus syndactylus) in Indonesia and Malaysia according to the IUCN Red List. Up to now refinements of its range were conducted based on land cover crosswalks (b), while the habitat map allows a more complete refinement (c).

As with other global maps, this new map is certainly not without errors. Even though validation has shown good accuracy at high resolution for many classes, we stress that – given the global extent and uncertainty – there are likely fine-scale errors that propagate from some of the input data. Some inputs, such as the global distribution of pastures, are currently clearly insufficient, with existing global products being either outdated or not finely resolved enough to be useful. Luckily, with the decision tree being implemented on Google Earth Engine, a new version of the map can be created within just two hours.

In the future, we plan to further update the habitat map and ruleset as improved or newer data becomes available. For instance, the underlying land cover data from the European Copernicus Program is currently only available for 2015; however, new annual versions up to 2018 are already being produced. Incorporating these new data would allow us to create time series of the distribution of habitats. There are also plans to map currently missing classes such as the IUCN marine habitats – think, for example, of the distribution of coral reefs or deep-sea volcanoes – as well as to improve the mapped wetland classes.

Lastly, if you, dear reader, want to update the ruleset or create your own habitat type map, this is also possible. All input data, the ruleset, and the code to fully reproduce the map in Google Earth Engine are publicly available. Currently the map is at version 003, but we have no doubt that the ruleset and map can continue to be improved in the future and form a truly living map.

Reference:

Jung M, Raj Dahal P, Butchart SHM, Donald PF, De Lamo X, Lesiv M, Kapos V, Rondinini C, & Visconti P (2020). A global map of terrestrial habitat types. Nature Scientific Data DOI: 10.1038/s41597-020-00599-8

Note: This article gives the views of the author, and not the position of the Nexus blog, nor of the International Institute for Applied Systems Analysis.

How citizen science can fill data gaps for the SDGs

By Dilek Fraisl, researcher in the IIASA Ecosystems Services and Management Program and chair of the WeObserve SDGs and Citizen Science Community of Practice.

How can we address the data gaps for achieving the United Nations’ Sustainable Development Goals (SDGs)? What is the potential of citizen science to track progress on the SDGs as a new source of data? How can we harness citizen science data effectively for evidence-based policymaking and SDG achievement?

These were just some of the questions we had in mind when we started research into the contributions of citizen science to SDG monitoring at the Sustainable Development Solutions Network (SDSN) Thematic Research Network on Data and Statistics (TReNDS). We were aware that citizen science has a role to play, but we didn’t know what the extent of that role would be. We wanted to show where exactly the real potential of citizen science lies in the global SDG indicator framework, and also to understand what we can do to bring all the key players together to fully realize this potential.

This research led to our paper “Mapping Citizen Science Contributions to the UN Sustainable Development Goals”, which was recently published in the journal Sustainability Science.

© Litter Intelligence by Sustainable Coastlines

Our most remarkable finding was that citizen science could contribute to the achievement of all 17 SDGs by providing data for 33% of all SDG indicators. There are currently 247 SDG indicators, defined in an evolving framework that comprises 17 goals and 169 targets. This represents huge potential.

We first investigated the metadata and work plans of all the SDG indicators and then searched for citizen science initiatives at global, national, and even local scales that could potentially contribute data to the monitoring of these indicators. This work was carried out with volunteer members of the SDGs and Citizen Science Community of Practice (SDGs CoP), which was launched a year and a half ago as part of the WeObserve project.

We also looked at the overlap between contributions from citizen science and earth observations in our study. Of the 29 indicators identified in the mapping exercise undertaken by the Group on Earth Observations (GEO), citizen science could support 24. This shows great potential for citizen science and earth observation approaches to complement each other. One example is Picture Pile – a flexible tool that ingests imagery from satellites, unmanned aerial vehicles (UAVs), or geotagged photos for rapid assessment and classification.

In Picture Pile, volunteers are shown a pair of images taken at different times and asked whether they see tree loss (to identify deforestation), damaged buildings after a disaster (for post-disaster damage assessment), or marine plastics (to understand the extent of the plastics problem), or they are asked to assess levels of poverty (to map poverty), among other tasks. Picture Pile thus combines earth observation and citizen science approaches that could be used for monitoring several SDG indicators. To name but a few: 1.5.2 Direct economic loss attributed to disasters in relation to global gross domestic product (GDP); 11.1.1 Proportion of urban population living in slums, informal settlements, or inadequate housing; 14.1.1b Floating plastic debris density; and 15.1.1 Forest area as a proportion of total land area. Exploring and realizing this potential of citizen science and earth observation is one of our priorities at the GEO Community Activity on Citizen Science (GEO-CITSCI).

Thanks to this study, we now know which initiatives could be leveraged to contribute to SDG monitoring, and we have the groundwork to show to project teams, National Statistical Offices (NSOs), and custodian agencies to start discussions around how to fully realize this potential.

The SDG indicators where citizen science projects are “already contributing” (in green), “could contribute” (in yellow) or where there is “no alignment” (in grey). The overall citizen science contributions to each SDG are summarized as pie charts. Black borders around indicators show the overlap between citizen science and EO, as identified by GEO (2017).

The Picture Pile application (both online and for mobile devices) is designed to be a generic and flexible tool for ingesting imagery that can then be rapidly classified by volunteers. Picture Pile, IIASA.

Another important finding of our work was that the greatest potential for citizen science – when existing and potential future contributions are combined – lies in SDG 15 (Life on Land), SDG 11 (Sustainable Cities and Communities), SDG 3 (Good Health and Wellbeing), and SDG 6 (Clean Water and Sanitation), in that order. This shows that citizen science has the greatest potential for input to the environmental SDG indicators.

Of the 93 environmental indicators in the SDG indicator framework identified by the United Nations Environment Programme (UNEP), citizen science could provide inputs for 37 (around 40%). Given that 68% of these environmental SDG indicators lack data, as also identified by UNEP, and that we only have 10 years left to achieve the SDGs, we need to start thinking about how to leverage this potential of citizen science for SDG monitoring.

In order to effectively monitor and ultimately achieve the SDGs, traditional ways of data collection such as censuses or household surveys will not be sufficient. They will also be too expensive to cover the wide range of the SDGs, with their 169 targets and 247 indicators, on a regular basis. We urgently need to act on the results of this study and utilize the potential of new ways of data collection such as citizen science if we are to achieve the SDGs by 2030 – but how? Where do we start?

We need to keep working on demonstrating the value of citizen science in the global data ecosystem through initiatives such as the WeObserve SDGs CoP, building partnerships around citizen science data involving all the stakeholders, and encouraging investment to leverage the use of citizen science data for the SDGs. We should develop case studies and success stories about the use of citizen science by NSOs, and design citizen science initiatives together with NSOs and other government agencies to ensure that their data quality requirements are met.

I believe it is important to mention that citizen science is not only a source of data that could fill gaps; by engaging the public in scientific research, it is also a great way to mobilize action and get everyone on board to play their part in addressing the world’s greatest challenges. Working together, we can harness the potential of citizen science to achieve the SDGs.

This post first appeared on the Group on Earth Observations (GEO) blog.

Note: This article gives the views of the author, and not the position of the Nexus blog, nor of the International Institute for Applied Systems Analysis.