As a supplement to a recently published study by Marcel Helbig and Katja Salomo (available only in German) about socioeconomic inequalities for children in seven German cities, I’ve created an interactive web visualization with R Shiny, and I want to share a few things I learned during development. This post is mainly about interactive visualization of geospatial data and custom UI elements. Below is a link to an example showing the social welfare support rate among children and several environmental characteristics in Saarbrücken.
Using Google Places data to analyze changes in mobility during the COVID-19 pandemic
During the COVID-19 pandemic, it’s apparent that location data gathered by private IT companies and telcos is a primary source for many studies about the effect of mobility restrictions on people’s behaviors and movements. In this blog post, I’d like to have a look at the “popular times” data provided by Google Places. I explain the limitations of this data, show how to gather it and provide some results from data that I fetched during March and April.
A Twitter network of members of the 19th German Bundestag – part II
This is the second part about my project that deals with the Twitter network of members of the Bundestag. After getting the necessary data, which was explained in part 1, we will now focus on creating a network graph with links between the representatives’ Twitter accounts for exploratory network analysis.
A Twitter network of members of the 19th German Bundestag – part I
For the R tutorial that I gave at the WZB in the previous semester, I gave an introduction to querying web APIs – specifically the Twitter API – and to automated data extraction from websites (i.e. web scraping). I showed an example that combined both of these techniques with the goal of getting data about the Twitter activities of members of the current (19th) German Bundestag, which is the federal German parliament. The focus was especially on the question “who follows whom” on Twitter. I thought it’s a nice little project showing how to use the Twitter API, do web scraping, combine the collected data and do some exploratory network analysis – all within the R environment. So I decided to polish the code a little bit, put it on GitHub and write two blog posts. The first part, i.e. this part, is all about getting the data.
Zooming in on maps with sf and ggplot2
When working with geo-spatial data in R, I usually use the sf package for manipulating spatial data as Simple Features objects and ggplot2 with geom_sf for visualizing these data. One thing that comes up regularly is “zooming in” on a certain region of interest, i.e. displaying a certain map detail. There are several ways to do so. Three common options are:
- selecting only certain areas of interest from the spatial dataset (e.g. only certain countries / continent(s) / etc.)
- cropping the geometries in the spatial dataset using st_crop()
- restricting the display window via coord_sf()
I will show the advantages and disadvantages of these options and especially focus on how to zoom in on a certain point of interest at a specific “zoom level”. We will see how to calculate the coordinates of the display window or “bounding box” around this zoom point.
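As a rough sketch of the last idea, the display window around a zoom point could be computed as follows. The helper `zoom_bounds()` and the zoom-level convention (level 0 spans the whole world, each further level halves the visible span) are assumptions made for illustration, not necessarily the exact approach taken in the full post. The example uses the `nc` dataset that ships with sf and a made-up zoom point near Raleigh.

```r
# Hypothetical helper: display window around a zoom point given in
# longitude/latitude degrees.
# Assumed convention: zoom level 0 shows 360 x 180 degrees and each
# further level halves the visible span in both directions.
zoom_bounds <- function(center, zoom_level) {
  stopifnot(length(center) == 2, zoom_level >= 0)
  lon_span <- 360 / 2^zoom_level
  lat_span <- 180 / 2^zoom_level
  list(
    xlim = c(center[1] - lon_span / 2, center[1] + lon_span / 2),
    ylim = c(center[2] - lat_span / 2, center[2] + lat_span / 2)
  )
}

zoom_to <- c(-78.64, 35.78)   # lon, lat of an example zoom point (Raleigh)
b <- zoom_bounds(zoom_to, zoom_level = 6)

# Optional plotting part, only run when sf and ggplot2 are installed.
if (requireNamespace("sf", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # North Carolina counties, a sample dataset bundled with sf
  nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
  p <- ggplot(nc) +
    geom_sf() +
    # restrict only the display window; the geometries stay untouched
    coord_sf(xlim = b$xlim, ylim = b$ylim, expand = FALSE)
  # print(p) draws the zoomed-in map
}
```

Because coord_sf() only limits what is displayed, the underlying data is left unchanged, whereas st_crop() would actually cut the geometries at the bounding box.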
Lab report: Development of school sites in eastern Germany
I wanted to share a small lab report on a project about the development of school sites in eastern Germany since 1992. Rita Nikolai (HU Berlin), Marcel Helbig (WZB) and I published our results a few months ago (see this WZB Discussion Paper or this WZBrief), but I’d like to provide some additional information on the (technical) background in this post as this was not the aim of the mentioned papers.
Three ways of visualizing a graph on a map
When visualizing a network with nodes that refer to a geographic place, it is often useful to put these nodes on a map and draw the connections (edges) between them. This way, we can directly see the geographic distribution of nodes and their connections in our network. This is different from a traditional network plot, where the placement of the nodes depends on the layout algorithm that is used (which may for example form clusters of strongly interconnected nodes).
In this blog post, I’ll present three ways of visualizing network graphs on a map using R with the packages igraph, ggplot2 and optionally ggraph. Several properties of our graph should be visualized along with the positions on the map and the connections between them. Specifically, the size of a node on the map should reflect its degree, the width of an edge between two nodes should represent the weight (strength) of this connection (since we can’t use proximity to illustrate the strength of a connection when we place the nodes on a map), and the color of an edge should illustrate the type of connection (some categorical variable, e.g. a type of treaty between two international partners).
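The mapping described above (node size for degree, edge width for weight, edge color for connection type) can be sketched with igraph and ggplot2 alone. This is only a minimal illustration, not necessarily one of the three ways from the full post: the places, weights and categories are made up, the `linewidth` aesthetic assumes ggplot2 3.4.0 or later, and the background map is only drawn if the maps package happens to be installed.

```r
library(igraph)
library(ggplot2)

# Made-up example data: places with coordinates and weighted, typed edges
nodes <- data.frame(
  name = c("Berlin", "Paris", "Rome", "Madrid"),
  lon  = c(13.40, 2.35, 12.50, -3.70),
  lat  = c(52.52, 48.86, 41.90, 40.42)
)
edges <- data.frame(
  from     = c("Berlin", "Berlin", "Berlin", "Paris"),
  to       = c("Paris", "Rome", "Madrid", "Madrid"),
  weight   = c(3, 1, 2, 1),
  category = c("A", "B", "A", "B")
)

# Build the graph and store each node's degree for the point size
g <- graph_from_data_frame(edges, directed = FALSE, vertices = nodes)
nodes$degree <- as.numeric(degree(g)[nodes$name])

# Look up start and end coordinates for every edge
i_from <- match(edges$from, nodes$name)
i_to   <- match(edges$to, nodes$name)
edges$x    <- nodes$lon[i_from]
edges$y    <- nodes$lat[i_from]
edges$xend <- nodes$lon[i_to]
edges$yend <- nodes$lat[i_to]

p <- ggplot()
# Optional background map (map_data() needs the maps package)
if (requireNamespace("maps", quietly = TRUE)) {
  p <- p + geom_polygon(data = map_data("world"),
                        aes(x = long, y = lat, group = group),
                        fill = "gray95", color = "gray70")
}
p <- p +
  # edge width encodes weight, edge color encodes the connection type
  geom_segment(data = edges,
               aes(x = x, y = y, xend = xend, yend = yend,
                   linewidth = weight, color = category),
               alpha = 0.6) +
  # node size encodes degree
  geom_point(data = nodes, aes(x = lon, y = lat, size = degree)) +
  geom_text(data = nodes, aes(x = lon, y = lat, label = name),
            vjust = -1.2, size = 3) +
  coord_quickmap(xlim = c(-10, 20), ylim = c(35, 56))
# print(p) draws the plot
```

Keeping node positions fixed to their coordinates means the layout carries no information about connection strength, which is why weight is moved to the edge width here.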
Visualizing graphs with overlapping node groups
I recently came across some data about multilateral agreements, which needed to be visualized as network plots. This data had some peculiarities that made it more difficult to create a plot that was easy to understand. First, the nodes in the graph were organized in groups, but each node could belong to multiple groups or to no group at all. Second, there was one “super node” that was connected to all other nodes (while “normal” nodes were only connected within their group). This made it difficult to find a layout that showed the connections between the nodes as well as the group memberships. However, after digging a little deeper into the R packages igraph and ggraph, it is possible to get satisfying results in such a scenario.
LATINNO Database online
This week the LATINNO project has published its comprehensive database on democratic innovations in South and Latin America on its official website. 2,400 cases of these innovations have been collected, coded and reviewed and are now publicly available. They can be browsed with the online search tool. Several interactive visualizations have been created to sum up the data.
As reported before, this project, which I have also been working on over the last months, was created with the Django framework using the hvad extension for multilingual support. The visualizations were implemented with d3.js.
LATINNO is an ongoing project and more cases of innovations are expected to be added to the database in the next months.
Interactive Balloon Charts with d3-balloon
As explained before, balloon plots can be a good way to compare many observations with lots of variables. I have now created a small extension for d3.js v4 that makes it quite easy to implement interactive balloon charts. It is available on GitHub and can be seen live in action in a recent news article about segregation in schools on the WZB website.