Author Archives: Markus Konrad

Interactive visualization of geospatial data with R Shiny

As a supplement to a recently published study by Marcel Helbig and Katja Salomo (available only in German) on socioeconomic inequalities among children in seven German cities, I’ve created an interactive web visualization with R Shiny. I want to share a few things I learned during development, mainly about interactive visualization of geospatial data and custom UI elements. Below is a link to an example showing the social welfare support rate among children and several environmental characteristics in Saarbrücken.


Simplifying geospatial features in R with sf and rmapshaper

When working with geospatial data, memory consumption and computation time can become a real problem, since these datasets are often very large. You may have very granular, high-resolution data even though your use case doesn’t require it, for example when plotting small-scale maps or when applying calculations at a spatial level for which lower granularity is sufficient. In such scenarios, it’s best to first simplify the geospatial features (the sets of points, lines and polygons that represent geographic entities) in your data. By simplifying I mean reducing the amount of information used to represent these features: we remove complete features (e.g. small islands), we join features (e.g. we combine several overlapping or adjacent features) or we reduce the complexity of features (e.g. we remove vertices or holes in a feature). Since these operations come with information loss, you should be very careful about how much you simplify and whether this biases your results in some way.

In R, we can apply this sort of simplification with a few functions from the packages sf and, for some special cases explained below, rmapshaper. In the following, I will show how to apply them and what the effects of their parameters are. The data and source code are available on GitHub.
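To make the idea of "removing vertices" concrete, here is a minimal, language-neutral sketch in Python rather than R. It implements the classic Douglas-Peucker line simplification algorithm, which is one well-known way to reduce vertices; it is not meant to reproduce what sf or rmapshaper do internally. The function names `point_segment_distance` and `simplify_line` and the sample coordinates are made up for illustration, and planar (projected) coordinates are assumed.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (2D, planar coordinates)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment: a and b are the same point.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify_line(points, tolerance):
    """Douglas-Peucker sketch: drop vertices that deviate at most `tolerance`."""
    if len(points) < 3:
        return list(points)
    # Find the interior vertex farthest from the segment joining the end points.
    distances = [
        point_segment_distance(p, points[0], points[-1]) for p in points[1:-1]
    ]
    max_dist = max(distances)
    index = distances.index(max_dist) + 1
    if max_dist <= tolerance:
        # Every interior vertex is close enough: keep only the end points.
        return [points[0], points[-1]]
    # Otherwise keep the farthest vertex and simplify both halves recursively.
    left = simplify_line(points[: index + 1], tolerance)
    right = simplify_line(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical sample line with eight vertices.
line = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)]
simplified = simplify_line(line, tolerance=1.0)
print(len(line), "->", len(simplified), "vertices")
```

A larger tolerance removes more vertices, which mirrors the trade-off described above: less memory and computation at the cost of more information loss.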


Linkdump #136

R
Python
Other interesting articles, projects and news

Robust web scraping or web API based data collection

There are thousands of articles on the web about web scraping and accessing web APIs. Most of them show you how to extract information from specific elements on a web page or how to communicate with a specific API in order to collect data. For smaller data collection projects, this knowledge may be sufficient, but large-scale data collection that must run reliably over days or even weeks raises additional problems, mainly concerning the robustness of the data collection process. I will try to tackle some of these problems in this post. I will use examples in Python, but the basic concepts can easily be translated to R or other programming languages.
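As a small illustration of what "robustness" can mean here, the following is a minimal sketch of retrying a failing request with exponential backoff. This is an assumption about one typical measure, not a summary of the techniques the post itself covers. The names `fetch_with_retries` and `flaky_fetch` are hypothetical, and the request is simulated by a plain function so that the example runs without network access.

```python
import time


def fetch_with_retries(fetch, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts is reached.

    Waits base_delay * 2 ** (attempt - 1) seconds between attempts.
    All names are hypothetical and chosen for illustration only.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts:
                raise  # give up and let the caller log or handle the failure
            sleep(base_delay * 2 ** (attempt - 1))


# Stand-in for a real request: fails twice, then succeeds.
calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return "payload"


# Record the delays instead of actually sleeping, to keep the example fast.
delays = []
result = fetch_with_retries(flaky_fetch, base_delay=0.5, sleep=delays.append)
print(result, delays)
```

In a real long-running job you would probably catch only the exception types that signal temporary problems, and cap the maximum delay, so that permanent errors fail fast instead of being retried.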


Linkdump #135

R
Python
Other interesting articles, projects and news

Spiegel Online news topics and COVID-19 – a topic modeling approach

I created a project to showcase topic modeling with the tmtoolkit Python package: I use a corpus of articles from the German online news website Spiegel Online (SPON) to create a topic model covering the time before and during the COVID-19 pandemic. This topic model is then used to analyze the volume of media coverage regarding the pandemic and how it changed over time.

National daily infection numbers clearly drive the volume of media coverage on COVID-19 on SPON during the observation period (January 2020 to the end of August 2020), which is probably not very surprising. Even though infection rates increased dramatically worldwide in summer 2020 (e.g. in Brazil, India and the USA), media coverage first decreased and then stayed at a moderate level, indicating that SPON responds far less to rising infection rates at the international level.

You can have a look at the report here. All scripts are available in the GitHub repository.

Linkdump #134

R
Python
Other interesting articles, projects and news

Linkdump #133

R
Python
Other interesting articles, projects and news

Linkdump #132

R
Python
Other interesting articles, projects and news

Linkdump #131

R
Python
Other interesting articles, projects and news