Digital Geography

28. August 2018

Query OpenStreetMap in ArcGIS: OSMQuery

QUickOSM is my weapon of choice when it comes to downloading data from OSM in QGIS. The tool offers an easy way to access tag/key combinations with a designated spatial query. As I was asked how many bus stops Berlin has, I was interested in a similar approach for ArcGIS. So I created my own little tool: OSMQuery.The widely used “ArcGIS Editor for OpenStreetMap” Toolbox for ArcGIS enables user to download and style OpenStreetMap data, create routing networks from them and so on. Yet it always uses a bbox or an osm file. The latter one can be downloaded from the geofabrik server as an example. Yet I think one usecase for OSM is not well covered. The simple question: “Where are all bakeries in Northumberland?” is not easy to answer using the toolbox described above. So I created my own.

Inputs

The tool needs some inputs. The tools uses tags/keys from the QuickOSM plugin as they are quite useful. furthermore we need a String input for a desired region or a bbox field. I was using a Python Toolbox shell for my inputs and logic:

Arcpy Toolbox hull

The inputs are defined as two filter lists, a choice box, a string field for the region and a spatial extent. The outputs are designed as derived outputs without the option to choose a name or something:
    def getParameterInfo(self):
        """Define parameter definitions"""
        ###let's read the config files with Tags and keys###
        param0 = arcpy.Parameter(
            displayName="Tag",
            name="in_tag",
            datatype="GPString",
            parameterType="Required",
            direction="Input"
        )
        param0.filter.list = self.getConfig('all')
        param0.value = param0.filter.list[0]
        param1 = arcpy.Parameter(
            displayName="Key",
            name="in_key",
            datatype="GPString",
            parameterType="Required",
            direction="Input"
        )
        param2 = arcpy.Parameter(
            displayName="Spatial Extent",
            name="in_regMode",
            datatype="GPString",
            parameterType="Required",
            direction="Input"
        )
        param2.filter.list = ["geocode region","define simple bounding box"]
        param2.value = "geocode region"
        param3 = arcpy.Parameter(
            displayName="Region",
            name="in_region",
            datatype="GPString",
            parameterType="Optional",
            direction="Input"
        )
        param4 = arcpy.Parameter(
            displayName="Area Of Interest",
            name="in_bbox",
            datatype="GPExtent",
            parameterType="Optional",
            direction="Input"
        )
        param_out0 = arcpy.Parameter(
            displayName="Layer Containing the Points",
            name="out_nodes",
            datatype="GPFeatureLayer",
            parameterType="Derived",
            direction="Output"
        )
        param_out1 = arcpy.Parameter(
            displayName="Layer Containing the Polylines",
            name="out_ways",
            datatype="GPFeatureLayer",
            parameterType="Derived",
            direction="Output"
        )
        param_out2 = arcpy.Parameter(
            displayName="Layer Containing the Polygons",
            name="out_poly",
            datatype="GPFeatureLayer",
            parameterType="Derived",
            direction="Output"
        )
        params = [param0, param1, param2, param3, param4, param_out0, param_out1, param_out2]

        return params
As you can see, the tag/key pairs are loaded from another function/file:
    def getConfig(self, configItem):
        from os.path import dirname, join, exists, abspath, isfile
        from json import load
        ###load config file
        json_file_config = join(dirname(abspath(__file__)), 'config/tags.json')
        if isfile(json_file_config):
            with open(json_file_config) as f:
                config_json = load(f)
        array = []
        ###select all major tags:
        if configItem == "all":
            for tag in config_json:
                array.append(tag)
        ###select all keys for the desried tag:
        if configItem != "all":
            for key in config_json[configItem]:
                array.append(key)
        return array
The config file is set up as a json dict with a simple tag-key structure:
{
	"admin_level": [
		"1",
		"2",
		...
	],
	"amenity": [
		"animal_boarding",
		"animal_shelter",
		"arts_centre",
		...
	],
	...
}
As we change the tag we also need to reload the list of keys. Therefore we use the updateParameters function:
    def updateParameters(self, parameters):
        #update the parameters of keys according the values of "in_tag"
        parameters[1].filter.list = self.getConfig(parameters[0].value)
        return
Unfortunately we also want to set a parameter enabled or disabled according the choice of the user whether or not to use the bbox/region spatial model. Therefore we enhance the updateParameters function:
    def updateParameters(self, parameters):
        [....]
        if parameters[2].value == "geocode region":
            parameters[3].enabled = True
            parameters[4].enabled = False
        else:
            parameters[3].enabled = False
            parameters[4].enabled = True
        return

Getting the Data via OSMQuery

First of all: we want to send the query to the Overpass API to get results. But as we need the spatial information as well we need to take of this as a first step. If the user checks to use the bbox option we get the coordinates for the query inside the parameter. If the user uses the region option we will need to find the right area. First I geocode the input of the given parameter using Nominatim and convert the resulting relation into an area ID:
        import json     
        import requests   
        if parameters[2].value != "geocode region":
            bboxHead = ''
            bbox = [parameters[4].value.YMin,parameters[4].value.XMin,parameters[4].value.YMax,parameters[4].value.XMax]
            bboxData = '(' + ','.join(str(e) for e in bbox) + ');'
        else:
            ###getting areaID from Nominatim:
            nominatimURL = 'https://nominatim.openstreetmap.org/search?q=' + parameters[3].valueAsText + '&format=json'
            NominatimResponse = requests.get(nominatimURL)
            arcpy.AddMessage("gecode region using the url " + nominatimURL)
            try:
                NominatimData = NominatimResponse.json()

                for result in NominatimData:
                    if result["osm_type"] == "relation":
                        areaID = result['osm_id']
                        try:
                            arcpy.AddMessage("found area " + result['display_name'])
                        except:
                            arcpy.AddMessage("found area " + str(areaID))
                        break
                bboxHead = 'area(' + str(int(areaID) + 3600000000) + ')->.searchArea;'
                bboxData = '(area.searchArea);'
            except:
                arcpy.AddError("no area found!")
                return
Together with the bbox coordinates or the areaID we are now able to query the Overpass API. If you look into the network tab when using the Overpass API you will see the data query in each call:

Overpass API Call

Now we are able to build a comparable query with the tag/ key combination and the result from the Nominatim call. We will use the Requests library:
        url = "http://overpass-api.de/api/interpreter"
        start = "[out:json][timeout:25];("
        nodeData = 'node["' + parameters[0].value + '"="' + parameters[1].value + '"]'
        wayData = 'way["' + parameters[0].value + '"="' + parameters[1].value + '"]'
        relationData = 'relation["' + parameters[0].value + '"="' + parameters[1].value + '"]'
        end = ');(._;>;);out;>;'
        query = start + bboxHead + nodeData + bboxData + wayData + bboxData + relationData + bboxData + end
        arcpy.AddMessage("Overpass API Query:")
        arcpy.AddMessage(query)
        response = requests.get(url,params={'data': query})
        if response.status_code!=200:
            arcpy.AddMessage("server response was " + str(response.status_code) )
            return
        try:
            data = response.json()
        except:
            arcpy.AddMessage("server responded with non JSON data: ")
            arcpy.AddError(response.text)
            return
        if len(data["elements"]) == 0:
            arcpy.AddMessage("No data found!")
            return
        else:
            arcpy.AddMessage("collected " + str(len(data["elements"])) + " objects (incl. reverse objects)")

Dealing with the Response in OSMQuery

The response contains elements which can be nodes, ways or relations. At first hand we need to treat nodes as points. The used query also give elements which are selected as parts of ways. Each way consists of nodes and the nodes define the coordinates of a way. These describing nodes usually don’t have tags assigned to them. Ways normally correspond to a polyline feature. If the nodes contained in a way have the same nodeID in the beginning and end, the way should be treated as a polygon.
Each element could have tags that are not shared by other elements of the same type. But we need to define a datamodel of each feature class using the union of all used tags in all elements of the same type. Therefore the logic tries to determine whether we have point-, line- and/or polygon feature classes and then determines the set of attributes for each feature class. In the end the feature class is filled with the result of your query. This part creates the feature classes:
        ###determine sorts of feature classes:
        def createFieldArray(element, array):
            for tag in element["tags"]:
                if tag not in array:
                    array.append(tag)
            return array
        import time
        time =  int(time.time())
        ########################################################
        ###creating feature classes according to the response###
        ########################################################
        pointsCreated = False
        linesCreated = False
        polygonsCreated = False
        for element in data['elements']:
            if element["type"]=="node" and pointsCreated == 0 and "tags" in element:
                nodesFCName = "Points_" + str(time)
                nodeFields = []
                nodesFC = arcpy.CreateFeatureclass_management(arcpy.env.scratchWorkspace, nodesFCName, "POINT", "", "DISABLED", "DISABLED", arcpy.SpatialReference(4326), "")
                for element in data['elements']:
                    if element["type"]=="node":
                    ###some elements don't have tags.
                        if "tags" in element:
                            nodeFields = createFieldArray(element, nodeFields)
                arcpy.AddField_management(nodesFC,"OSM_ID", "DOUBLE", 12,0, "",  "OSM_ID")
                for tag in nodeFields:
                    try:
                        arcpy.AddField_management(nodesFC, tag.replace(":", ""), "STRING", 255, "", "",  tag.replace(":", "_"), "NULLABLE")
                        arcpy.AddMessage("adding field " + tag + " to point shapefile")
                    except:
                        arcpy.AddMessage("adding field " + tag + " failed")
                pointsCreated = True
                arcpy.AddMessage("point shapefile created")
                rowsNodesFC = arcpy.InsertCursor(nodesFC)
            if (element["type"]=="way" and linesCreated == 0 and (element["nodes"][0] != element["nodes"][len(element["nodes"])-1])) and "tags" in element:
                waysFCName = "Lines_" + str(time)
                wayFields = []
                waysFC = arcpy.CreateFeatureclass_management(arcpy.env.scratchWorkspace, waysFCName, "POLYLINE", "", "DISABLED", "DISABLED", arcpy.SpatialReference(4326), "")
                for element in data['elements']:
                    if element["type"]=="way" and element["nodes"][0] != element["nodes"][len(element["nodes"])-1]:
                        if "tags" in element:
                            wayFields = createFieldArray(element, wayFields)
                arcpy.AddField_management(waysFC,"OSM_ID", "DOUBLE", 12,0, "",  "OSM_ID")
                for tag in wayFields:
                    try:
                        arcpy.AddField_management(waysFC, tag.replace(":", ""), "STRING", 255, "", "",  tag.replace(":", "_"), "NULLABLE")
                        arcpy.AddMessage("adding field " + tag + "  to line shapefile")
                    except:
                        arcpy.AddMessage("adding field " + tag + " failed")
                linesCreated = True
                arcpy.AddMessage("line shapefile created")
                rowsWaysFC = arcpy.InsertCursor(waysFC)
            if (element["type"]=="way" and polygonsCreated == 0 and element["nodes"][0] == element["nodes"][len(element["nodes"])-1]) and "tags" in element:
                polysFCName = "Polygons_" + str(time)
                polyFields = []
                polyFC = arcpy.CreateFeatureclass_management(arcpy.env.scratchWorkspace, polysFCName, "POLYGON", "", "DISABLED", "DISABLED", arcpy.SpatialReference(4326), "")
                for element in data['elements']:
                    if element["type"]=="way" and element["nodes"][0] == element["nodes"][len(element["nodes"])-1]:
                        if "tags" in element:
                            polyFields = createFieldArray(element, polyFields)
                arcpy.AddField_management(polyFC,"OSM_ID", "DOUBLE", 12,0, "",  "OSM_ID")
                for tag in polyFields:
                    try:
                        arcpy.AddField_management(polyFC, tag.replace(":", ""), "STRING", 255, "", "",  tag.replace(":", "_"), "NULLABLE")
                        arcpy.AddMessage("adding field " + tag + "  to polygon shapefile")
                    except:
                        arcpy.AddMessage("adding field " + tag + " failed")
                polygonsCreated = True
                arcpy.AddMessage("polygon shapefile created")
                rowsPolyFC = arcpy.InsertCursor(polyFC)
As we do have the feature classes we can add some features now:
for element in data['elements']:
            ###we deal with nodes first
            if element["type"]=="node" and "tags" in element:
                row = rowsNodesFC.newRow()
                PtGeometry = arcpy.PointGeometry(arcpy.Point(element["lon"], element["lat"]), arcpy.SpatialReference(4326))
                row.setValue("SHAPE", PtGeometry)
                row.setValue("OSM_ID", element["id"])
                for tag in element["tags"]:
                    try:
                        row.setValue(tag.replace(":", ""), element["tags"][tag])
                    except:
                        arcpy.AddMessage("adding value failed")
                rowsNodesFC.insertRow(row)
                del row
            if element["type"]=="way" and "tags" in element:
                ### getting needed Node Geometries:
                nodes = element["nodes"]
                nodeGeoemtry = []
                ### finding nodes in reverse mode
                for node in nodes:
                    for NodeElement in data['elements']:
                        if NodeElement["id"] == node:
                            nodeGeoemtry.append(arcpy.Point(NodeElement["lon"],NodeElement["lat"]))
                            break
                if nodes[0]==nodes[len(nodes)-1]:
                    arcpy.AddMessage("treating way as polygon")
                    row = rowsPolyFC.newRow()
                    pointArray = arcpy.Array(nodeGeoemtry)
                    row.setValue("SHAPE", pointArray)
                    row.setValue("OSM_ID", element["id"])
                    ###now deal with the way tags:
                    if "tags" in element:
                        for tag in element["tags"]:
                            try:
                                row.setValue(tag.replace(":", ""), element["tags"][tag])
                            except:
                                arcpy.AddMessage("adding value failed")
                    rowsPolyFC.insertRow(row)
                    del row
                else: #lines have different start end endnodes:
                    arcpy.AddMessage("treating way as polyline")
                    row = rowsWaysFC.newRow()
                    pointArray = arcpy.Array(nodeGeoemtry)
                    row.setValue("SHAPE", pointArray)
                    row.setValue("OSM_ID", element["id"])
                    ###now deal with the way tags:
                    if "tags" in element:
                        for tag in element["tags"]:
                            try:
                                row.setValue(tag.replace(":", ""), element["tags"][tag])
                            except:
                                arcpy.AddMessage("adding value failed")
                    rowsWaysFC.insertRow(row)
                    del row
In the end we get some nice features using OSMQuery in ArcGIS Pro or ArcMap:
Donwload the OSMQuery tool via GitHub.