Processing Raw OSM Data for Maximum Insights

By David Meertens and Arjan Knibbe

In recent years, raw, unprocessed OpenStreetMap (OSM) data has gained significant traction as an alternative to private data for Geographic Information Systems (GIS) applications. The trend is rooted in several key advantages of raw OSM data over APIs built on private data. Investors in real assets, whose risk, return and uncertainty characteristics depend strongly on location, should pay particular attention.

Data Availability

One of the major challenges in data analysis is obtaining enough data to achieve statistically significant results. With private data APIs, this challenge is compounded by restrictions on the amount and type of data that can be accessed, so the data obtained is often insufficient to draw meaningful conclusions. Raw OSM data, on the other hand, is freely available for download under an open licence (the Open Database License, which asks for attribution and share-alike rather than payment). Users can work with the full data set without worrying about hitting usage caps or incurring additional costs. With the complete data set, they can perform more detailed and comprehensive analyses, reach more accurate conclusions, and discover patterns and relationships that limited data access would have hidden.

Shield Against Price Gouging

Using OSM as a source can also protect against opportunistic behaviour by resellers of private data, for example when prices are raised once the provider detects strong client dependency, not unlike flight prices that differ depending on the URL a booking is made from. A radical increase in data pricing can leave dependent investors with a stark choice: pay up, or redesign their data architecture and, in effect, their investment process.

Data Freshness

Data freshness is another important aspect of using spatial data. The data provided by APIs built on private data is often delayed by a few days or even weeks, which is problematic when the most up-to-date information about a location is needed. In contrast, raw OSM data is updated continuously by its contributors and can be downloaded at any time, so users always have access to the most current version of the map.

Data Customization

Additionally, raw OSM data offers greater flexibility in terms of data customization. Unlike APIs that use private data, which provide pre-processed data that may be limited in terms of attributes and fields, raw OSM data can be processed in a manner that suits the specific needs of the user. This allows for the creation of custom attributes and fields and offers greater control over the data being processed. In the context of machine learning, this flexibility is especially important because the success of a machine learning model relies heavily on the quality of the training data.

For example, when identifying construction sites from satellite imagery, certain attributes, such as the size of the site, the type of construction activity taking place and the construction date, may be more relevant than others. Raw OSM data lets users derive exactly these attributes, whereas the fixed fields of a private API can hinder the development of an accurate machine-learning model.
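As an illustration of deriving custom attributes, the sketch below filters construction sites out of a list of OSM-like elements and computes an approximate site area. It assumes each element is a plain dictionary with hypothetical keys `id`, `tags` and `ring` (a list of latitude/longitude pairs); real OSM extracts need a parser to reach this shape. The tags `landuse=construction` and `building=construction` are common OSM conventions for building sites, but tagging practice varies by region, so the filter should be treated as a starting point.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def polygon_area_m2(ring):
    """Approximate area of a small polygon given as (lat, lon) pairs in degrees."""
    lat0 = math.radians(sum(lat for lat, _ in ring) / len(ring))
    # Equirectangular projection around the ring's mean latitude;
    # adequate for building-site-sized polygons, not for large regions.
    pts = [
        (EARTH_RADIUS_M * math.radians(lon) * math.cos(lat0),
         EARTH_RADIUS_M * math.radians(lat))
        for lat, lon in ring
    ]
    # Shoelace formula; pairing the last point with the first closes the ring.
    twice_area = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1])
    )
    return abs(twice_area) / 2.0


def construction_features(elements):
    """Yield custom attribute records for elements tagged as construction sites."""
    for el in elements:
        tags = el.get("tags", {})
        is_site = (tags.get("landuse") == "construction"
                   or tags.get("building") == "construction")
        if is_site:
            yield {
                "id": el["id"],
                "area_m2": polygon_area_m2(el["ring"]),
                # The `construction` tag often names the intended end use.
                "planned_use": tags.get("construction", "unknown"),
            }
```

Records like these can then be joined with other sources, such as imagery dates, to build training features that a fixed API schema would not expose.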

Speed and Scalability

The raw nature of OSM data also offers improved speed and scalability. APIs that use private data can become slow and unreliable when dealing with large amounts of data or complex queries, whereas raw OSM data can be processed locally using optimized algorithms and data structures. This can result in faster processing times and improved performance when handling large amounts of data.
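One example of such a data structure is a simple grid index, sketched below under the assumption that features are reduced to points. The class name `GridIndex` and the default cell size are illustrative choices, not part of any OSM tooling. Points are bucketed into fixed-size latitude/longitude cells, so a neighbourhood lookup only inspects nine cells instead of scanning the whole data set.

```python
from collections import defaultdict
import math


class GridIndex:
    """Bucket points into fixed-size lat/lon cells for fast neighbourhood lookups."""

    def __init__(self, cell_deg=0.01):
        self.cell_deg = cell_deg          # cell size in degrees (illustrative default)
        self.cells = defaultdict(list)    # (row, col) -> list of (lat, lon, item)

    def _key(self, lat, lon):
        return (math.floor(lat / self.cell_deg), math.floor(lon / self.cell_deg))

    def insert(self, lat, lon, item):
        self.cells[self._key(lat, lon)].append((lat, lon, item))

    def candidates(self, lat, lon):
        """Return entries in the cell containing (lat, lon) and its 8 neighbours."""
        row, col = self._key(lat, lon)
        found = []
        for d_row in (-1, 0, 1):
            for d_col in (-1, 0, 1):
                # .get avoids creating empty buckets in the defaultdict.
                found.extend(self.cells.get((row + d_row, col + d_col), []))
        return found
```

The candidates are a superset of the true neighbours, so an exact distance check is still needed afterwards. For production workloads an established spatial index such as an R-tree is usually the better choice.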

Identifying and Monitoring Future Real Estate Supply

At KR&A we have managed to cleanse, enrich and aggregate building site information and link it to the assets of the funds on our platform. This covers both the portfolios in the public domain (the listed property companies, the REITs and the German open-ended funds) and the private real estate funds of clients. As a result, customers now understand future property supply earlier and much more precisely than others. In addition, they can monitor this supply over time, for their own portfolios and for those of others. This off-market information allows them to better estimate scarcity, rental growth, property valuations and vacancy rates, and gives them the opportunity to adjust portfolios before the market reflects the new information.
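The idea of aggregating building sites to assets can be illustrated with a minimal sketch. This is not the KR&A pipeline, which is not described here; it simply assumes that assets and sites are dictionaries with hypothetical keys `id`, `lat`, `lon` and `area_m2`, and sums the construction area within a chosen radius of each asset using the haversine great-circle distance.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    r = 6_371_000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lam = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lam / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def supply_near_assets(assets, sites, radius_m=1000.0):
    """Count sites and sum their area within radius_m of each asset."""
    result = {}
    for asset in assets:
        nearby = [
            s for s in sites
            if haversine_m(asset["lat"], asset["lon"], s["lat"], s["lon"]) <= radius_m
        ]
        result[asset["id"]] = {
            "site_count": len(nearby),
            "pipeline_area_m2": sum(s["area_m2"] for s in nearby),
        }
    return result
```

Running the same aggregation on successive OSM snapshots would give a time series of nearby supply per asset. The 1,000 m default radius is an arbitrary placeholder; a sensible catchment depends on the property type and market.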

OSM Benefits

In summary, using raw, unprocessed OSM data instead of APIs built on private data offers several notable benefits: better data availability, freshness, customization, speed and scalability, plus the offline access that comes with local processing. Real estate investors can work with the complete data set, free of the limitations and costs of private data, and draw better-founded conclusions. The data can be customized to specific needs and processed locally, which shortens processing times and makes large volumes manageable. Continuous updates through the OSM platform keep the most recent information available for analysis.

At KR&A, we have cleansed, enriched and aggregated building site information for the assets on our platform. This enables clients to gain a more precise understanding of future property supply, monitor it over time and adjust portfolios accordingly, ultimately leading to better estimates of scarcity, rental growth, property valuations and vacancy rates.
