What we learned at State of the Map 2022

At State of the Map in Florence, HOT’s Data team, in collaboration with our Tech team, members of the Quality Assurance & Control working group, Training working group, and HeiGIT, organized several sessions focusing on identifying and addressing data quality issues in OpenStreetMap (OSM). The data quality discussion session was attended by 30 participants, representing diverse interests and organizations.

We first asked the group to “define data quality” from their perspective, both to understand their views and to set expectations for the session. The responses in the workshop can roughly be grouped according to four themes:

These raise a number of follow-up questions:

OSM data needs to be fit for purpose

From our side, the definition the Data team has been using to structure its work is that OSM data has to be fit for purpose; it must be able to address and inform the most important and impactful use cases and decision making to aid our (humanitarian) partners and communities’ use of maps and data. What constitutes “good enough” data depends on the context and immediate use cases that the data is being requested for.

For example, consider how different the data needs look across contexts: a sudden-onset disaster such as Cyclone Idai in Mozambique, a humanitarian response to refugee movements such as those caused by the South Sudanese civil war, programs where data is used for rural development focusing on rural electrification and financial inclusion, or urban planning with a specific focus on public transport and flood resilience.

A large community of researchers has already analyzed the quality of OSM data. The work of Senaratne et al. (2015) provides a good overview. It has been acknowledged that OSM data in general is strongly biased, in part because the contributor base is much larger in countries of the Global North, a consequence of socio-economic inequalities and the digital divide.

The HeiGIT team at Heidelberg University has addressed OSM data quality questions in research projects over the past years, e.g. for land use and land cover, and others have investigated the completeness of the road network in OSM. Differences in OSM data quality are also visible at the regional and local scale, as ongoing research on building completeness in cities shows. That is why we need better spatial data quality assessment to find out whether data is “good enough” in a given study area for a given purpose. This would promote the adoption and (right) usage of new sources of data such as OSM and data products based on OSM. The ohsome quality analyst (OQT) software computes and provides some of these regional data quality estimations, such as indicators for the currentness and completeness of buildings and roads.
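To make the idea of regional indicators more concrete, here is a minimal sketch of how completeness and currentness could be expressed as simple ratios. It is not how OQT computes its indicators. The function names, the reference count, and the three-year age threshold are assumptions chosen purely for illustration.

```python
from datetime import date


def completeness(osm_count: int, reference_count: int) -> float:
    """Share of reference features that are present in OSM, capped at 1.0.

    `reference_count` is a hypothetical external estimate (for example,
    buildings detected in imagery); it is not an OQT parameter.
    """
    if reference_count <= 0:
        raise ValueError("reference_count must be positive")
    return min(osm_count / reference_count, 1.0)


def currentness(last_edits: list[date], today: date, max_age_years: int = 3) -> float:
    """Share of features whose last edit is at most `max_age_years` old.

    The three-year default is an illustrative assumption.
    """
    if not last_edits:
        return 0.0
    max_age_days = max_age_years * 365
    recent = sum(1 for d in last_edits if (today - d).days <= max_age_days)
    return recent / len(last_edits)
```

For an area with 80 mapped buildings against a reference estimate of 100, `completeness(80, 100)` gives 0.8. Whether that counts as “good enough” still depends on the use case.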

To bring OSM data quality research into practice and to better understand what “purpose” OSM data should be fit for, HOT has defined its five impact areas. These help us decide what we consider to be the core datasets (and OSM data models) that HOT’s communities and partners use. The impact areas are also the starting points for defining a set of “data use cases” that contribute most to HOT’s overall mission and strategy.

This leads to a set of requirements that indicate when data is suitable to satisfy selected data use cases, along with metrics and measurements of data quality that fall within three major categories: positional accuracy, completeness, and semantic accuracy. We identified these as part of the first version of the “Data Quality top 10”.
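As a rough illustration of what “requirements per use case” could look like in practice, the sketch below compares measured indicator values for an area against minimum thresholds for a use case. The use case, the indicator names, and all threshold values are hypothetical. They are not taken from HOT’s “Data Quality top 10”.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    indicator: str   # e.g. "completeness"
    minimum: float   # lowest acceptable score between 0.0 and 1.0


def unmet_requirements(indicators: dict[str, float],
                       requirements: list[Requirement]) -> list[str]:
    """Return the indicators that fall below their minimum.

    An indicator that was not measured is treated as 0.0.
    """
    return [r.indicator for r in requirements
            if indicators.get(r.indicator, 0.0) < r.minimum]


# Hypothetical requirements for a flood-resilience use case.
flood_resilience = [
    Requirement("positional_accuracy", 0.7),
    Requirement("completeness", 0.8),
    Requirement("semantic_accuracy", 0.6),
]

# Hypothetical measured scores for one study area.
study_area = {
    "positional_accuracy": 0.9,
    "completeness": 0.85,
    "semantic_accuracy": 0.4,
}

gaps = unmet_requirements(study_area, flood_resilience)
print(gaps or "fit for purpose")
```

In this made-up example only the semantic accuracy score falls short, so the data would not yet be fit for this particular purpose, even though it might be sufficient for a use case with looser requirements.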

What happens next

A number of activities aiming to improve data quality are already underway. These focus strongly on areas around remote sensing, digitization, and validation via Tasking Manager. We are expanding on these with a wider engagement around data quality aspects via the following:

Based on requests from the group that attended, we will be keeping those wanting to learn more about HOT’s roadmap and activities informed in the following ways:

This post was originally published on Humanitarian OpenStreetMap Team.

Citations

[1] https://www.tandfonline.com/doi/abs/10.1080/13658816.2016.1189556
[2] IDEAL-VGI: Information Discovery from Big Earth Observation Data Archives by Learning from Volunteered Geographic Information. https://www.geog.uni-heidelberg.de/gis/ideal_en.html
[3] The world’s user-generated road map is more than 80% complete. PLOS ONE. https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0180698
[4] Investigating the digital divide in OpenStreetMap: spatio-temporal analysis of inequalities in global urban building completeness. Research Square. https://www.researchsquare.com/article/rs-1913150/v1
[5] GitHub - GIScience/ohsome-quality-analyst: Data quality estimations for OpenStreetMap. https://github.com/GIScience/ohsome-quality-analyst
[6] https://www.hotosm.org/impact-areas/impact-areas/