Open science has to go beyond open source

By Daniel Huppmann, research scholar in the IIASA Energy Program

Daniel Huppmann sheds light on how open-source scientific software and FAIR data can bring us one step closer to a community of open science.


Over the past decade, the open-source movement (e.g., the Free Software Foundation (FSF) and the Open Source Initiative (OSI)) has had a tremendous impact on the modeling of energy systems and climate change mitigation policies. It is now widely expected – in particular by and of early-career researchers – that data, software code, and tools supporting scientific analysis are published for transparency and reproducibility. Many journals actually require that authors make the underlying data available in line with the FAIR principles – this acronym stands for findable, accessible, interoperable, and reusable. The principles postulate best-practice guidance for scientific data stewardship. Initiatives such as Plan S, requiring all manuscripts from projects funded by the signatories to be released as open-access publications, lend further support to the push for open science.

Alas, the energy and climate modeling community has so far failed to realize and implement the full potential of the broader movement towards collaborative work and best practice of scientific software development. To live up to the expectation of truly open science, the research community needs to move beyond “only” open-source.

Until now, the main focus of the call for open and transparent research has been on releasing the final status of scientific work under an open-source license – giving others the right to inspect, reuse, modify, and share the original work. In practice, this often means simply uploading the data and source code for generating results or analysis to a service like Zenodo. This is obviously an improvement compared to the previously common “available upon reasonable request” approach. Unfortunately, the data and source code are still all too often poorly documented and do not follow best practice of scientific software development or data curation. While the research is therefore formally “open”, it is often not easily intelligible or reusable with reasonable effort by other researchers.

What do I mean by “best practice”? Imagine I implement a particular feature in a model or write a script to answer a specific research question. I then add a second feature – which inadvertently changes the behavior of the first feature. You might think that this could be easily identified and corrected. Unfortunately, given the complexity and size to which scientific software projects tend to quickly evolve, one often fails to spot the altered behavior immediately.

One solution to this risk is “continuous integration” and automated testing. This is a practice common in software development: for each new feature, we write specific tests in an as-simple-as-possible example at the same time as implementing the function or feature itself. These tests are then executed every time that a new feature is added to the model, toolbox, or software package, ensuring that existing features continue to work as expected when adding a new functionality.
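A minimal sketch of what such tests can look like in Python is given below. The two functions and all numbers are hypothetical illustrations and are not taken from MESSAGEix or any other tool mentioned here. The first function stands for an existing feature, the second for a feature added later, and the tests check that the new feature has not changed the behavior of the old one.

```python
def emissions_from_fuel(fuel_use_gj, emission_factor_kg_per_gj):
    """First feature (hypothetical): emissions in kg from fuel use in GJ."""
    if fuel_use_gj < 0:
        raise ValueError("fuel use must be non-negative")
    return fuel_use_gj * emission_factor_kg_per_gj


def emissions_with_capture(fuel_use_gj, emission_factor_kg_per_gj, capture_rate=0.0):
    """Second feature (hypothetical): reduce emissions by a capture rate in [0, 1]."""
    if not 0.0 <= capture_rate <= 1.0:
        raise ValueError("capture rate must be between 0 and 1")
    gross = emissions_from_fuel(fuel_use_gj, emission_factor_kg_per_gj)
    return gross * (1.0 - capture_rate)


def test_emissions_from_fuel():
    # Written together with the first feature, on the simplest possible example.
    assert emissions_from_fuel(10.0, 2.0) == 20.0


def test_capture_default_keeps_first_feature_unchanged():
    # Guards against the second feature silently altering the first one.
    assert emissions_with_capture(10.0, 2.0) == emissions_from_fuel(10.0, 2.0)


def test_capture_reduces_emissions():
    assert emissions_with_capture(10.0, 2.0, capture_rate=0.5) == 10.0
```

A test runner such as pytest collects and runs functions whose names start with `test_`. A continuous-integration service can be configured to run them on every change, so a broken feature is noticed when it breaks rather than shortly before a deadline.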

Other practices that modelers and all researchers using numerical methods should follow include using version control and writing documentation throughout the development of scientific software rather than leaving this until the end. In addition, not just the manuscript and results of scientific work should be scrutinized (aka “peer review”), but such appraisal should also apply to the scientific software code written to process data and analyze model results. Moreover, like the mentoring of early-career researchers, such a review should not just come at the end of a project but should be a continuous process throughout the development of the manuscript and the related analysis scripts.

In the course that I teach at TU Wien, as well as in my work on the MESSAGEix model, the Intergovernmental Panel on Climate Change Special Report on Global Warming of 1.5°C scenario ensemble, and other projects at the IIASA Energy Program, I try to explain to students and junior researchers that following such best-practice steps is in their own best interest. This is true even when it is just a master’s thesis or some coursework assignment. However, I always struggle to find the best way to convince them that following best practice is not just a noble ideal in itself, but actually helps in doing research more effectively. Only when one has experienced the panic and stress caused by a model not solving or a script not running shortly before a submission deadline can a researcher fully appreciate the benefits of well-structured code, explicit dependencies, continuous integration, tests, and good documentation.

A common trope says that your worst collaborator is yourself from six months ago, because you didn’t write enough explanatory comments in your code and you don’t respond to emails. So even though it sounds paradoxical at first, spending a bit more time following best practice of scientific software development can actually give you more time for interesting research. Moreover, when you then release your code and data under an open-source license, it is more likely that other researchers can efficiently build on your work – bringing us one step closer to a community of open science!

Note: This article gives the views of the author, and not the position of the Nexus blog, nor of the International Institute for Applied Systems Analysis.

Mapping habitats in support of biodiversity research

By Martin Jung, postdoctoral research scholar in the IIASA Ecosystems Services and Management Program.

IIASA postdoc Martin Jung discusses how a newly developed map can help provide a detailed view of important species habitats, contribute to ongoing ecosystem threat assessments, and assist in biodiversity modeling efforts.

Biodiversity is not evenly distributed across our planet. To determine which areas potentially harbor the greatest number of species, we need to understand how habitats valuable to species are distributed globally. In our new study, published in Nature Scientific Data, we mapped the distribution of habitats globally. The habitats we used are based on the International Union for Conservation of Nature (IUCN) Red List habitat classification scheme, one of the most widely used systems to assign species to habitats and assess their extinction risk. The latest map (2015) is openly available for download here. We also built an online viewer using the Google Earth Engine platform where the map can be visually explored and interacted with by simply clicking on the map to find out which class of habitat has been mapped in a particular location.

Figure 1: View of the habitat map with a focus on Europe and Africa. For a global view and a description of the classes currently mapped, please read Jung et al. 2020 or have a look at the online interactive interface.

The habitat map was created as an intersection of various best-available layers on land cover, climate, and land use (Figure 1). Specifically, we created a decision tree that determines, for each area on the globe, the likely presence of one of the 47 habitats currently mapped. For example, by combining data on tropical climate zones, mountain regions, and forest cover, we were able to estimate the distribution of subtropical/tropical moist mountainous rain forests, one of the most biodiverse ecosystems. The habitat map also considers the best available land use data to map human-modified or artificial habitats such as rural gardens or urban sites. Notably, and as a first, our map also integrates upcoming new data on the global distribution of plantation forests.
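The decision-tree idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_habitat`, the input categories, the rule order, and the class labels are hypothetical simplifications. The actual ruleset runs on Google Earth Engine and distinguishes 47 habitat classes.

```python
def classify_habitat(climate_zone, is_mountain, land_cover, land_use=None):
    """Assign one habitat label to a single grid cell (illustrative rules only)."""
    # Assumption: human land use takes precedence over natural habitat classes.
    if land_use == "urban":
        return "Artificial: urban areas"
    if land_use == "plantation":
        return "Artificial: plantations"

    # Natural classes: combine land cover, climate zone, and a mountain mask.
    if land_cover == "forest":
        if climate_zone == "tropical":
            if is_mountain:
                return "Forest: subtropical/tropical moist montane"
            return "Forest: subtropical/tropical moist lowland"
        return "Forest: other"

    # Cells matched by no rule stay unclassified in this sketch.
    return "Unclassified"


# Three hypothetical grid cells described by their input layers.
cells = [
    {"climate_zone": "tropical", "is_mountain": True, "land_cover": "forest"},
    {"climate_zone": "tropical", "is_mountain": False, "land_cover": "forest",
     "land_use": "plantation"},
    {"climate_zone": "temperate", "is_mountain": False, "land_cover": "grassland"},
]
labels = [classify_habitat(**cell) for cell in cells]
for label in labels:
    print(label)
```

Because every cell is classified by the same explicit rules, swapping in a newer input layer only requires re-running the rules, not redesigning the map.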

What makes this map so useful for biodiversity assessments? It can provide a detailed view of the remaining coverage of important species habitats, contribute to ongoing ecosystem threat assessments, and assist in global and national biodiversity modeling efforts. Since the thematic legend of the map – in other words the colors, symbols, and styles used in the map – follows the same system as that used by the IUCN for assessing species extinction risk, we can easily refine known distributions of species (Figure 2). Up to now, such refinements were based on crosswalks between land cover products (Figure 2b), but with the additional data integrated into the habitat map, such refinements can be much more precise (Figure 2c). We have, for instance, conducted such range refinements as part of the Nature Map project, which ultimately helped to identify global priority areas of importance for biodiversity and ecosystem services.

Figure 2: The range of the endangered Siamang (Symphalangus syndactylus) in Indonesia and Malaysia according to the IUCN Red List. Up to now, refinements of its range were conducted based on land cover crosswalks (b), while the habitat map allows a more complete refinement (c).
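A range refinement of this kind can be thought of as a simple filter: keep only those cells of a species' range whose mapped habitat class is one the species is known to use. The sketch below assumes hypothetical grid-cell identifiers, habitat labels, and a function named `refine_range`. None of these come from the study's code.

```python
def refine_range(range_cells, habitat_by_cell, suitable_habitats):
    """Return the cells of a species range that fall in suitable habitat."""
    return {
        cell
        for cell in range_cells
        # Cells without a mapped habitat return None and are dropped.
        if habitat_by_cell.get(cell) in suitable_habitats
    }


# Hypothetical species range made up of four grid cells.
range_cells = {"a1", "a2", "b1", "b2"}

# Hypothetical habitat class mapped for each cell.
habitat_by_cell = {
    "a1": "forest_moist_lowland",
    "a2": "forest_moist_montane",
    "b1": "plantation",
    "b2": "urban",
}

# Hypothetical habitat preferences of the species.
suitable = {"forest_moist_lowland", "forest_moist_montane"}

refined = refine_range(range_cells, habitat_by_cell, suitable)
print(sorted(refined))
```

In this toy example the plantation and urban cells are removed from the range, which mirrors the extra precision that the added land use layers are meant to provide.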

As with other global maps, this new map is certainly not without errors. Even though validation has shown good accuracy at high resolution for many classes, we stress that – given the global extent and uncertainty – there are likely fine-scale errors that propagate from some of the input data. Some inputs, such as the global distribution of pastures, are currently clearly insufficient, with existing global products being either outdated or not sufficiently highly resolved to be useful. Luckily, with the decision tree being implemented on Google Earth Engine, a new version of the map can be created within just two hours.

In the future, we plan to further update the habitat map and ruleset as improved or newer data become available. For instance, the underlying land cover data from the European Copernicus Program are currently only available for 2015; however, new annual versions up to 2018 are already being produced. Incorporating these new data would allow us to create time series of the distribution of habitats. There are also already plans to map currently missing classes such as the IUCN marine habitats – think for example of the distribution of coral reefs or deep-sea volcanoes – as well as to improve the mapped wetland classes.

Lastly, if you, dear reader, want to update the ruleset or create your own habitat type map, then this is also possible. All input data, the ruleset, and the code to fully reproduce the map in Google Earth Engine are publicly available. Currently the map is at version 003, but we have no doubt that the ruleset and map can continue to be improved in the future and form a truly living map.

Reference:

Jung M, Raj Dahal P, Butchart SHM, Donald PF, De Lamo X, Lesiv M, Kapos V, Rondinini C, & Visconti P (2020). A global map of terrestrial habitat types. Nature Scientific Data DOI: 10.1038/s41597-020-00599-8


EGU2020 – a virtual experience of a first-time conference visit

Jarmo Kikstra, a research assistant in the IIASA Energy Program, shares his experience at EGU2020: Sharing Geoscience Online.


When our abstract for the 2020 General Assembly of the European Geosciences Union (EGU2020) was accepted, I was very excited as this would be my first scientific conference. EGU2020 was a few weeks ago and took place completely virtually for the first time due to COVID-19. Let’s reflect upon what this experience was like.

As an early-career researcher, I was very much looking forward to presenting the research I have worked on for many months. While I had presented preliminary results of this and other research before, both at my university and at my research department, these presentations had been internal or small-scale. EGU2020 was the first time I presented my research to the public, with experts from various fields being able to see the work and provide their input. It felt to me like a first step into the public academic debate, an important step towards becoming part of a research community.

But clearly, with the ongoing COVID pandemic, the conference was quite different to what I had expected my first conference to be like. EGU2020 became “EGU2020: Sharing Geoscience Online”. With 16,273 scientists participating last year, moving such an event online clearly took a big effort, and with 26,219 individual online registrations in the online chat system, it seems to have been a success. But of course, not all registrations are equal, and participation numbers are not the only thing that counts. So, how was this virtual EGU2020 experience for me?


First, my experience was much shorter than originally envisaged. While this is largely a matter of choice, I, and many other participants, participated in fewer sessions than we would have done during a physical conference. The simple fact of not being ‘out of office’ contributed to me continuing to work on other ongoing tasks for large parts of the week.

However, the chat session and oral presentation session I joined were surprisingly intense. Many presentations (that would normally have been poster presentations) were discussed in a plenary chat or oral session, and there was little time (~6 min) available for each presentation, meaning that the content was very dense and discussed at breakneck speed. In this way, a snapshot of the current state of research in my field was provided openly, with everyone seeing all comments and all presentations in the session. One thing that was missing, and that could have usefully complemented the main chat box, was a separate channel for each presentation. This could have made follow-up discussions in the chat sessions easier without interrupting the main discussion of the current presentation, thereby stimulating one of the most important parts of conferences – feedback on the work you have presented.

Proponents of virtual events will argue that moving conferences online will greatly reduce the environmental footprint of science, as (air) travel accounts for the biggest share of the greenhouse gas (GHG) emissions of many scientists. In fact, a central debate at EGU2020 discussed this topic, and the first question of the Cercedilla Manifesto reads: “Is a physical meeting necessary?”. Opponents, however, point out that it is currently impossible to replace the benefits of meeting in person, including higher engagement, building an academic network, unexpected (group) discussions, social encounters and events, and the possibility of live feedback. Especially for early-career scientists, it is often said that attending conferences is very beneficial.

Networking virtually will never be exactly the same as in person, and I don’t think this is something to aim for. Networking can happen in many different formats; however, it is clear to me now that we can still take quite a few steps towards increasing the effectiveness of virtual networking during such events. For instance, I did not ‘meet’ new people, whereas that would surely have happened during a physical meeting, even if I had not actively made an effort. So perhaps when organizers are putting together a virtual event, it may pay off to be creative in providing virtual networking opportunities.

Many argue that an online event is also much more conducive to opening up science, with an enormous potential for increasing the accessibility to science and scientific discussions and stimulating the development of knowledge. The great success of EGU2020 is probably already in its name: “Sharing Geoscience Online”. On the EGU2020 website you can find thousands of presentations, on all topics that are related to geosciences, with many contributions from IIASA. In other words, a lot of research content has been uploaded to one place, open to everyone; thereby turning this scientific event into a great resource for sharing, learning, asking questions and providing feedback. Discussions on this platform will be ongoing until the end of the month. So, take advantage of this opportunity and have a look!


Is India’s Ujjwala cooking gas program a success or failure?

By Abhishek Kar, Postdoctoral Research Scientist at Columbia University, USA, and IIASA Young Scientists Summer Program (YSSP) alumnus.

Abhishek Kar shares his thoughts on the Indian government’s Ujjwala program, which aims to scale up household access to Liquefied Petroleum Gas (LPG) for clean cooking.


About 2.9 billion people depend on burning traditional fuels like firewood rather than modern cooking fuels like gas and electricity to cook their daily meals. The household air pollution caused when these fuels are burned, along with the resultant exposure to kitchen smoke, causes several respiratory and other diseases. It is estimated that between 2 and 3.6 million people die every year due to lack of access to clean cooking fuels. Burning traditional fuels also has severe environmental effects like forest degradation and contributes to climate change. To address these challenges, the Indian Government launched a massive program called Pradhan Mantri Ujjwala Yojana (PMUY, or Ujjwala) in May 2016 to scale up household access to Liquefied Petroleum Gas (LPG).

My IIASA Young Scientists Summer Program (YSSP) project under Shonali Pachauri’s supervision was about analyzing consumption patterns of LPG in rural India. We looked at whether there were any differences in consumption patterns between the Ujjwala beneficiaries and general consumers. The analysis formed part of my PhD research and was eventually published as the cover story for the September 2019 issue of the journal Nature Energy. The journal also invited us to write a policy brief, which was published in January 2020. The study’s findings received widespread media attention, especially in India. When I talk to journalists, they often ask whether the Ujjwala program is a success or a failure. I would like to use this opportunity to clear up common misconceptions and share my thoughts.

The Ujjwala program’s original mandate was to tackle the challenge of “lack of access to clean fuel” and to make LPG affordable for poor women. The program provided capital subsidies to this end. Unfortunately, the policy document neither discussed usage of LPG as an exclusive or primary cooking fuel, nor did it provide any incentive for regular use (barring the universal LPG cylinder subsidy that is provided to everyone). The program was ambitious in terms of both scale and timeline, and fulfilled its original aim of providing LPG connections for millions of poor women.

Current debates around the program’s failure to result in smokeless kitchens are happening only because Ujjwala succeeded in fulfilling its original mandate of ensuring physical access. In my opinion, it is truly a remarkable achievement to have reached out to 80 million poor women within 40 months. The process not only involved massive awareness generation and community mobilization, but also ramping up the supply chain to meet increased demand. While I have a lot to say about how Ujjwala can be improved, I think it would be unfair to call it a failure. Access is the first step towards transition to clean fuels, and at least in this respect, it was an extraordinary success, making it a model of energy access for developing countries.

Our research shows that Ujjwala was able to attract new consumers rapidly, but those consumers did not start using LPG on a regular basis. Based on the literature and my own experience, there are five reasons why regular LPG use is a challenge for Ujjwala consumers, and the scheme did not have any specific provisions to effectively address them.

First, rural communities generally have easy access to free firewood, crop residues, cattle dung, etc. So why would they start paying for commercial fuel, when free fuel is readily available for cooking?

Secondly, Ujjwala (bravely) targeted poor women, who generally have limited disposable cash and seasonal, agriculture-linked fluctuations in income. If there is no additional income, what costs would a poor family on an already tight budget have to cut to afford such a regular additional expense? While the program has made a 5 kg cylinder option available in response to this issue, the impact on LPG sales is still unknown.

Thirdly, home delivery of LPG cylinders is a challenge in most rural areas, as the cost of delivery for LPG distributors often outweighs the commission they receive. If there is no delivery option, poor rural families who often don’t have access to transport would need to arrange for a cylinder to be picked up from a far-off retail outlet. Oil Marketing Companies have vigorously been pushing for home delivery, but unless there are explicit incentives for this, the situation is unlikely to improve.


Fourthly, gender dynamics make the situation even more complicated. Men are often the financial decision makers who have to make budget cuts, while women are the primary beneficiaries of LPG in terms of a quick and smokeless cooking experience, with the side benefit of avoiding the drudgery of fuelwood collection. The laudable effort of the LPG panchayat platform, where women share their success stories and strategies to overcome opposition within their homes, is a step in the right direction, but it is unlikely that this will be sufficient to tackle a deep-rooted societal problem.

Lastly, and perhaps most importantly, people will have to stop using mud stoves and start using LPG stoves, which may involve real (or perceived) changes in the taste, texture, look, and size of food items. As a student of habit change literature, I am surprised that anyone expected that such a switch would not need to be accompanied by behavior change interventions.

Ultimately, the Ujjwala scheme provided incentives to reduce the burden of the capital cost of LPG connections, and poor female consumers responded to it positively. This is a successful first step towards a clean cooking energy transition. However, there were no scheme incentives to promote use, except general LPG subsidies, which are available to all, including the urban middle class. Consumers simply decided that the transition to LPG through regular purchase of LPG refills was not worth it, and did not take the next step. I would, however, not call this a failure of Ujjwala, as that was never the original program objective.

We have to acknowledge that Ujjwala’s phenomenal success in providing access to clean fuel has put the spotlight on its ineffectiveness in ensuring sustained regular use. If you ask me, this is a classic case of the glass half-full or half-empty scenario. Or, as my PhD supervisor at the University of British Columbia, Hisham Zerriffi, puts it: “It depends!”

References:

[1] Kar A, Pachauri S, Bailis R, & Zerriffi H (2019). Using sales data to assess cooking gas adoption and the impact of India’s Ujjwala program in rural Karnataka. Nature Energy DOI: 10.1038/s41560-019-0429-8 [pure.iiasa.ac.at/15994]


Cost effective solutions to manage nutrient pollution in the Yangtze

By Maryna Strokal, Department of Environmental Sciences, Water Systems and Global Change, Wageningen University and Research, The Netherlands

Maryna Strokal discusses a new integrated approach to finding cost-effective solutions for nutrient pollution and coastal eutrophication developed with IIASA colleagues.


Have you ever wondered why the water in some rivers appears to be green? The green tinge you see is due to eutrophication, which means that too many nutrients – specifically nitrogen and phosphorus – are present in the water. This happens because rivers receive these nutrients from various land-based activities like run-off from agricultural fields and sewage effluents from cities. Rivers in turn export many of these nutrients to coastal waters, where they serve as food for algae. Too many nutrients, however, cause the algae and their blooms to grow more than normal. Because algae consume a lot of oxygen, this lowers the available oxygen supply in the water, killing off fish and other marine life. Some algae can also be toxic to people when they eat seafood that has been exposed to, or has fed on, them. Polluted river water, on the other hand, is unfit for direct use as drinking water, or for cooking, showering, or any of our other daily needs. Before we can use this water, it needs to be treated, which of course costs money.

To better understand and address these issues, I worked with colleagues from IIASA, Wageningen University, and China to develop an integrated approach to identify cost-effective (read: cheapest) solutions to reduce river pollution and thus coastal eutrophication. Our integrated approach takes into account human activities on land, land use, the economy, the climate, and hydrology. We implemented the new approach for the Yangtze Basin in China.

The Yangtze is the third longest river in the world and exports nutrients from ten sub-basins to the East China Sea, where the coast often experiences severe eutrophication problems that may increase in the coming years. The Chinese government has called for effective actions to ensure clean water for both people and nature.

In our paper on this work, which was recently published in the journal Resources, Conservation and Recycling, my colleagues and I conclude that reducing more than 80% of nutrient pollution in the Yangtze will cost US$ 1–3 billion in 2050. This cost might seem high, but it is actually far below 10% of the income level in the Yangtze basin. We also identified an opportunity in the negative or zero cost range, which would result in a reduction of less than 80% in nutrient export by the Yangtze. This negative or zero cost alternative involves options to recycle manure on land and reduce the use of chemical fertilizers (Figure 1). More recycling means that farmers will buy less chemical fertilizer, and the potential savings can then compensate for the expenses related to recycling the manure. We also illustrated the costs that would be involved for ten sub-basins to reduce their nutrient export to coastal waters.

Figure 1. Summarized illustration of eutrophication causes and cost-effective solutions for reducing nutrient export by Yangtze and thus coastal eutrophication in the East China Sea in 2050.

Recycling manure on cropland is an important and cost-effective solution for agriculture in the sub-basins of the Yangtze River (Figure 1). Manure is rich in the nutrients that crops need, and opting for this alternative instead of chemical fertilizers avoids loss of nutrients to rivers, and thus ultimately to coastal waters. Current practices are however still far from ideal, with manure – and especially liquid manure – often being discharged into water because crop and livestock farms are far away from each other, which makes it practically and economically difficult to transport manure to where it is needed. Another reason is the historical practice of farmers using chemical fertilizers on their crops – it is simply how they are used to doing things. Unfortunately, the amounts of fertilizers that farmers apply are often far above what crops actually need, thus leading to river pollution.

The Chinese government is investing in combining crop and livestock production; in other words, it is creating an agricultural sector where crops are used to feed animals and manure from the animals is in turn used to fertilize crops. Chinese scientists are working with farmers to implement these solutions.

In our paper, we showed that these solutions are not only sustainable, but also cost-effective in terms of avoiding coastal eutrophication. We invite you to read our paper for more details.

References

Strokal M, Kahil T, Wada Y, Albiac J, Bai Z, Ermolieva T, Langan S, Ma L, et al. (2020). Cost-effective management of coastal eutrophication: A case study for the Yangtze River basin. Resources, Conservation and Recycling 154: e104635. https://doi.org/10.1016/j.resconrec.2019.104635.
