Category Archives: Gis / Spatial Data

Spatially weighted averages in R with sf

July 1, 2021 3:51 pm , Markus Konrad

Spatial joins allow to augment one spatial dataset with information from another spatial dataset by linking overlapping features. In this post I will provide an example showing how to augment a dataset containing school locations with socioeconomic data of their surrounding statistical region using R and the package sf (Pebesma 2018). This approach has the drawback that the surrounding statistical region doesn’t reflect the actual catchment area of the school. I will present an alternative approach where the overlaps of the schools’ catchment areas with the statistical regions allow to calculate the weighted average of the socioeconomic statistics. If we have no data about the actual catchment areas of the schools, we may resort to approximating these areas as circular regions or as Voronoi regions around schools.

Posted in: General, GIS / spatial data, R , Tagged: r-bloggers

Interactive visualization of geospatial data with R Shiny

April 16, 2021 3:23 pm , Markus Konrad

As a supplement to a recently published study by Marcel Helbig and Katja Salomo (available only in German) about socioeconomic inequalities for children in seven German cities, I’ve created an interactive web visualization with R Shiny and I wanted to share a few experiences that I made during development. This will be mainly about interactive visualization of geospatial data and custom UI elements. Below is a link to an example showing social welfare support rate amongst children and several environmental characteristics in Saarbrucken.

Posted in: GIS / spatial data, R, Shiny, Visualization, Web Development

Simplifying geospatial features in R with sf and rmapshaper

March 15, 2021 3:50 pm , Markus Konrad

When working with geospatial data, memory consumption and computation time can become quite a problem, since these datasets are often very large. You may have very granular, high resolution data although that isn’t really necessary for your use-case, for example when plotting small scale maps or when applying calculations at a spatial level for which lower granularity is sufficient. In such scenarios, it’s best to first simplify the geospatial features (the sets of points, lines and polygons that represent geographic entities) in your data. By simplify I mean that we generally reduce the amount of information that is used to represent these features, i.e. we remove complete features (e.g. small islands), we join features (e.g. we combine several overlapping or adjacent features) or we reduce the complexity of features (e.g. we remove vertices or holes in a feature). Since applying these operations comes with information loss you should be very careful about how much you simplify and if this in some way biases you results.

In R, we can apply this sort of simplification with a few functions from the packages sf and, for some special cases explained below, rmapshaper. In the following, I will show how to apply them and what the effects of their parameters are. The data and source code are available on GitHub.

Posted in: GIS / spatial data, R , Tagged: r-bloggers

Zooming in on maps with sf and ggplot2

April 30, 2019 10:58 am , Markus Konrad

When working with geo-spatial data in R, I usually use the sf package for manipulating spatial data as Simple Features objects and ggplot2 with geom_sf for visualizing these data. One thing that comes up regularly is “zooming in” on a certain region of interest, i.e. displaying a certain map detail. There are several ways to do so. Three common options are:

selecting only certain areas of interest from the spatial dataset (e.g. only certain countries / continent(s) / etc.)
cropping the geometries in the spatial dataset using sf_crop()
restricting the display window via coord_sf()

I will show the advantages and disadvantages of these options and especially focus on how to zoom in on a certain point of interest at a specific “zoom level”. We will see how to calculate the coordinates of the display window or “bounding box” around this zoom point.

Posted in: GIS / spatial data, R, Visualization , Tagged: r-bloggers

Lab report: Development of school sites in eastern Germany

February 14, 2019 5:11 pm , Markus Konrad

I wanted to share a small lab report on a project about the development of school sites in eastern Germany since 1992. Rita Nikolai (HU Berlin), Marcel Helbig (WZB) and I published our results a few months ago (see this WZB Discussion Paper or this WZBrief), but I’d like to provide some additional information on the (technical) background in this post as this was not the aim of the mentioned papers.

Posted in: Data Mining, GIS / spatial data, PDFs, Python, Visualization

Tools and packages for geospatial processing with Python

August 9, 2018 4:04 pm , Markus Konrad

In the social sciences, geospatial data appears quite often. You may have social indicators for different places on earth at different administrative levels, e.g. countries, states or municipalities. Or you may study spatial distribution of hospitals or schools in a given area, or visualize GPS referenced data from an experiment. For such scenarios, there’s fortunately a rich supply of open-source tools and packages. As I’ve worked recently quite a lot with geospatial data, I want to introduce some of this software, especially those available for the Python programming language.

Posted in: GIS / spatial data, Python

Three ways of visualizing a graph on a map

May 31, 2018 2:32 pm , Markus Konrad

When visualizing a network with nodes that refer to a geographic place, it is often useful to put these nodes on a map and draw the connections (edges) between them. By this, we can directly see the geographic distribution of nodes and their connections in our network. This is different to a traditional network plot, where the placement of the nodes depends on the layout algorithm that is used (which may for example form clusters of strongly interconnected nodes).

In this blog post, I’ll present three ways of visualizing network graphs on a map using R with the packages igraph, ggplot2 and optionally ggraph. Several properties of our graph should be visualized along with the positions on the map and the connections between them. Specifically, the size of a node on the map should reflect its degree, the width of an edge between two nodes should represent the weight (strength) of this connection (since we can’t use proximity to illustrate the strength of a connection when we place the nodes on a map), and the color of an edge should illustrate the type of connection (some categorical variable, e.g. a type of treaty between two international partners).

Posted in: GIS / spatial data, Network analysis, R, Visualization , Tagged: r-bloggers

Creating and plotting Voronoi regions for geographic data with geovoronoi

March 16, 2018 1:07 pm , Markus Konrad

Recently, I’ve worked a lot with geospatial data in Python. One thing that we needed for our analysis was generating Voronoi regions (or “cells”) from a given set of coordinates inside certain administrative boundaries (a country, a state, etc.). Such regions are interesting for spatial analysis, because each random point inside a Voronoi region is closest to the cell’s “origin point” (the point the cell was generated from) than to any other cell’s origin. As a practical example: In Melbourne parents can see which is the closest school for their home, by looking at an online map of Voronoi regions of schools.

These regions allow to calculate an estimate of a “coverage”: For each point’s Voronoi region, the area can be calculated, which represents the area theoretically covered by this point. Referring to the Melbourne example: The schools at the edge of the city cover a larger area than those in the city center. This approach of course does not take geographic properties into account. So if there’s a large lake inside a cell, it is also part of the covered area. Still, Voronoi tessellation is useful when looking at how the shape of the Voronoi regions changes over time, for example when new schools open or others close. We could then see for example, if the coverage of schools in the city center becomes better over the years, whereas in the rural areas it gets more sparse.

So all in all, Voronoi regions can be a very useful tool in spatial data analysis. QGIS provides a tool for Voronoi tessellation but we needed a more flexible approach that also fit into our workflow and could be used in our Python scripts. I decided to write a small Python package named geovoronoi that takes a set of points, a boundary object (the geographic shape enclosing the points – e.g. a country boundary) and then calculates the Voronoi regions using SciPy. These regions are then “cut” to the enclosing shape (using the excellent shapely package). The resulting Voronoi cells can then be used for further calculations (areas, distances, unions, etc.) and can also be visualized on a map.

The package geovoronoi is now available on PyPI (install it with pip install geovoronoi[plotting]) and the source is uploaded on the WZB’s GitHub page.

Posted in: GIS / spatial data, Python

Geocoding an address and performing point-polygon tests with GDAL/OGR in Python

August 11, 2017 11:22 am , Markus Konrad

Suppose you have a list of addresses and want to connect them with some kind of location-based information. For example, your addresses might scatter across several neighborhoods and you want to find out to which neighborhood each address belongs, because you have further information (like mean income, percentage of migrants, etc.) about each neighborhood and want to combine it with your data. In many countries, administrative authorities gather such geographical information and provide the data on their websites.

In the given scenario, three steps are necessary in order to combine the addresses with geographical information:

Geocoding the address, i.e. finding out the geographical coordinates (latitude, longitude) for this address
Given a file with geographical information (GIS data) that form several distinct areas as polygons, finding out which of these polygons contains the geocoded address
Obtain necessary information such as a neighborhood identifier from the polygon

This short post shows how to do that with the Python packages googlemaps and GDAL.

Posted in: GIS / spatial data, Python

Category Archives: Gis / Spatial Data

Spatially weighted averages in R with sf

Interactive visualization of geospatial data with R Shiny

Simplifying geospatial features in R with sf and rmapshaper

Zooming in on maps with sf and ggplot2

Lab report: Development of school sites in eastern Germany

Tools and packages for geospatial processing with Python

Three ways of visualizing a graph on a map

Creating and plotting Voronoi regions for geographic data with geovoronoi

Geocoding an address and performing point-polygon tests with GDAL/OGR in Python

Recent posts

Categories

Links

Links

Recent Posts

Recent Comments

Archives

Categories

Meta