Category Archives: General

Report from the EGU General Assembly 2025

Anyone who thinks geosciences and sociology are light years apart should take a look at the European Geosciences Union (EGU) General Assembly 2025. With around 20,000 participants from over 100 countries, this event is one of the largest platforms worldwide for interdisciplinary scientific exchange, and it impressively demonstrates how closely natural, social, and economic sciences are already intertwined. From societal reactions to climate change to digital infrastructures: the dialogue is lively, complex, and forward-looking.

WZB in the Thick of It: Peter Löwe Connects Science & Informatics

The WZB was represented by Peter Löwe (IT & eScience), who serves on the EGU Programme Committee in the “Earth and Space Science Informatics” (ESSI) division. Since 2023, he has supported emerging conference topics on “Data, Software and Computing Infrastructures” as a Programme Convener, connecting data-driven infrastructures with emerging scientific best practices.


What Does the Conference Offer from a Sociological Perspective?

1. Software as Both Scientific Object and Tool: Changing Citation Culture
Peter’s analysis of the growing use of Persistent Identifiers (DOIs) for scientific software reflects the societal recognition and acceptance of software within the scientific system, among publishers, and in software developer communities. The debate with international publishers underscores that a stable scientific culture requires clear standards, coordinated processes, and training to acknowledge software as a “publication”: an exciting field for ongoing, empirically informed reflection on (disciplinary) practice. This work builds on the code citation best practices introduced in the chapter “Open Source – GIS” of the Springer Handbook of Geographic Information (2023, 2nd ed.), which was compiled under WZB guidance.

2. “Repository Crisis Scorecards”: Quality Assurance in the Digital Research Space
Objective evaluations of research data repositories are a current attempt to make the information infrastructure more transparent and resilient. Such scorecards not only assist with quality assurance but are also likely to strengthen the trust of researchers, funding bodies, and the public in digital knowledge repositories, a topic that is directly reflected in WZB’s upcoming town hall meeting on Dataverse.

3. GitLab & GitHub: The Social Organization of Technical Tools
Experiences from DLR and the Australian Commonwealth Scientific and Industrial Research Organisation (CSIRO) highlight that implementing software development tools is less about technology itself and more about governance, role distribution, and organizational cultures. This insight invites us to understand implementation processes as socio-technical practices that also shape our understanding of collaboration and responsibility within research institutions.

4. Open Science Education: Continuing to Learn Scientific Practice
The relaunch of the extended “Open Science 101” course by NASA emphasizes that good scientific practice cannot be taken for granted but remains an ongoing educational task. For sociologically interested actors, this is an exciting opportunity to help shape participatory learning formats for reflecting on knowledge norms and ethics.

5. International Division of Labor in Open Science Initiatives
WZB’s invitation to collaborate in the Research Data Alliance (https://www.rd-alliance.org/) implementation group on Complex Citations shows that research is increasingly coordinated globally—normative questions related to software citation and data practices are part of this governance. Here, technical and social standards are negotiated that carry significance far beyond individual disciplines.

6. National Research Infrastructures as Social Structures
Participation in infrastructure-focused projects of the German Nationale Forschungsdateninfrastruktur (NFDI) federation, such as NFDIbase, NFDIearth, and NFDIxsw, demonstrates how digital research infrastructure makes national coalitions between institutions, domains, and funding policies visible: a rewarding field for research on power relations, resource distribution, and knowledge production.


The Big Debate: Language Models Between Fascination and Reflection

One of the highlights was the controversial discussion about the rapid development of AI language models. These developments follow an exponential trajectory comparable to Moore’s Law, with performance and energy consumption nearly doubling every year. For the WZB, this entails not only technological challenges but also ethical, societal, and scientific questions: What should responsible research with such tools look like? What are the impacts on transparency, inclusion, and epistemic justice? This debate exemplifies the tensions inherent in the digital transformation of science.


Conclusion: Between Rock and Byte – New Horizons for Science and Society in the Face of Climate Change

The EGU General Assembly 2025 offers far more than geo-technical insights: it opens a window onto the complex interweaving of natural and social sciences, technological infrastructures, and societal demands. The WZB is actively involved in shaping this dynamic process, from data culture to digital collaboration to open science and climate change. For those interested in sociology, this means that research remains a multifaceted project that must continuously bring technical innovation and societal reflection together.

Stay curious—more exciting reports will follow on the WZB’s eScience blog!



Some thoughts about the use of cloud services and web APIs in social science research

In recent weeks I’ve collaborated on the online book APIs for social scientists and added two chapters: one about the genderize.io API and one about the GitHub API. The book seeks to provide an overview of web and cloud services and their APIs that might be useful for social scientists, covering a wide range from text translation to social media APIs, complete with code examples in R. By harnessing the GitHub workflow model, the book itself is also a nice example of fruitful collaboration via work organization methods that were initially developed in the open source software community.

While working on the two chapters and playing around with the APIs, I once again noticed the double-edged nature of using web APIs in research. They can greatly improve research or even enable research that was not possible before. At the same time, data collected from these APIs can introduce bias, and their use may cause issues with research transparency and replicability. I noted some of these issues in the respective book chapters and have written about them before together with Jonas Wiedner (see this article in WZB Mitteilungen, in German only), but the two APIs that I covered for the book provide some very practical examples of the main issues when working with web APIs, and I wanted to point them out in this blog post.
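Purely as an illustration of the kind of request such a chapter revolves around, here is a minimal sketch in Python (the book’s examples are in R); the endpoint and response fields follow the public genderize.io documentation, but treat this as an assumption-laden sketch rather than code from the book. Because the service infers gender from a first name alone and returns only a probability, it is easy to see where bias can enter an analysis.

    import requests

    def guess_gender(first_name):
        """Query the genderize.io API for a single first name.

        Returns the raw JSON response, roughly of the form
        {"name": "peter", "gender": "male", "probability": 0.99, "count": ...}.
        """
        resp = requests.get("https://api.genderize.io/",
                            params={"name": first_name}, timeout=10)
        resp.raise_for_status()
        return resp.json()

    if __name__ == "__main__":
        # The API only sees the first name, so the reported probability
        # reflects name frequencies in its database, not the person in
        # question -- a typical entry point for bias in downstream analyses.
        print(guess_gender("peter"))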

Read More →

Spatially weighted averages in R with sf

Spatial joins make it possible to augment one spatial dataset with information from another spatial dataset by linking overlapping features. In this post I will provide an example showing how to augment a dataset of school locations with socioeconomic data from the surrounding statistical regions using R and the package sf (Pebesma 2018). This approach has the drawback that the surrounding statistical region doesn’t reflect the actual catchment area of a school. I will therefore present an alternative approach in which the overlaps of the schools’ catchment areas with the statistical regions are used to calculate an area-weighted average of the socioeconomic statistics. If we have no data about the actual catchment areas of the schools, we may resort to approximating them as circular regions or as Voronoi regions around the schools.
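The post’s worked example uses R and sf; purely to illustrate the area-weighting idea, here is a minimal sketch in Python with geopandas. The file names, the column names (school_id, median_income), and the 2 km circular catchment are assumptions made up for this sketch, not taken from the post.

    import geopandas as gpd

    # Hypothetical inputs: a point layer of schools and a polygon layer of
    # statistical regions carrying a numeric statistic such as median income.
    schools = gpd.read_file("schools.gpkg")
    regions = gpd.read_file("regions.gpkg")

    # Approximate each catchment area as a 2 km circle around the school
    # (assumes a projected CRS with metres as units).
    catchments = schools.copy()
    catchments["geometry"] = schools.geometry.buffer(2000)

    # Intersect catchments with regions; the overlap areas become the weights
    # for an area-weighted average of the statistic per school.
    pieces = gpd.overlay(catchments, regions, how="intersection")
    pieces["w"] = pieces.geometry.area
    pieces["wx"] = pieces["w"] * pieces["median_income"]

    sums = pieces.groupby("school_id")[["wx", "w"]].sum()
    sums["median_income_weighted"] = sums["wx"] / sums["w"]
    print(sums["median_income_weighted"].head())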

Read More →

A tip for the impatient: Simple caching with Python pickle and decorators

During testing and development, it is sometimes necessary to rerun tasks that take quite a long time. One option is to drink coffee in the meantime; the other is to use caching, i.e. saving results to disk once they have been calculated and loading them from there when they are needed again. The Python module pickle is perfect for caching, since it allows you to store and read whole Python objects with two simple functions. I already showed in another article that it’s very useful to store a fully trained POS tagger and load it directly from disk without needing to retrain it, which saves a lot of time.
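As a rough sketch of the idea (not necessarily the post’s actual implementation), a small decorator can wrap the pickle.load / pickle.dump steps so that an expensive function is computed once and read from disk afterwards. The cache file name and the training function below are placeholders.

    import os
    import pickle
    from functools import wraps

    def cached(cache_file):
        """Cache a function's return value on disk via pickle.

        Minimal sketch: the cache is keyed only by the file name and
        ignores the function's arguments.
        """
        def decorator(fn):
            @wraps(fn)
            def wrapper(*args, **kwargs):
                if os.path.exists(cache_file):
                    with open(cache_file, "rb") as f:
                        return pickle.load(f)
                result = fn(*args, **kwargs)
                with open(cache_file, "wb") as f:
                    pickle.dump(result, f)
                return result
            return wrapper
        return decorator

    @cached("tagger.pickle")
    def train_tagger():
        ...  # stand-in for the expensive training step
        return "trained tagger object"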

Read More →

About the WZB Data Science Blog

This blog collects some experiences from my daily work in the field of data science at the WZB. The posts will focus on the following topics:

  • Data extraction / data mining
  • Data visualization
  • Data analysis

Read More →