Geocoding with PL/JSON
September 17, 2013 at 3:23 pm | Posted in Oracle Developement | 7 CommentsTags: address elements, api, curly brackets, geocoding, google maps, json, pl/json, pl/sql
The Google Maps API geocoding web service makes it easy to geocode addresses. You can pass an address to the web service and receive either an XML or a JSON response back.
In this post I’ll show how to use the PL/JSON package to parse the JSON response from Google, in order to get to the various address elements and the coordinates associated with a given address.
PL/JSON is a library consisting of a PL/SQL package and some PL/SQL objects and types. It is designed to create and parse JSON. The documentation is a bit lacking, especially when it comes to handling a more complicated JSON object like the Google geocode response. Hopefully, this article will help to clarify its usage.
To get started, first download and install PL/JSON into a schema in your Oracle database. I used version 1.0.3. The installation is simply done by extracting the downloaded zip file, and running the install.sql script. Once installed, make sure all objects are valid. You can run a quick test of the package by running the json_test.sql included in the testsuite directory of the PL/JSON installation.
The Google Geocoding Web Service
Version 3 of the Google Maps API, no longer requires a key. So you can make a simple call to the service like this:
The result of the URL is a JSON object containing the address and geocode data.
To make sense of the JSON object you need to understand that curly brackets {} contain objects, and square brackets [] contain arrays. Strings are enclosed with double quotes, numbers are not. Data is represented in name/value pairs separated by colons.
The Google response consists of two objects: results and status. A successful response will have a status of “OK”.
The results object is an array of four other objects:
- address_components – an array of elements such as street number, city, state, etc.
- formatted_address – a single string containing the full address
- geometry – an array of coordinate data elements
- types – an array of strings, which in this case is always “street_address”.
Calling the Web Service
Calling the Google web service from within PL/SQL, I use the ult_http package. Since the JSON response can be quite large, I use the utl_http.request_pieces function to fetch the result. This works around the 2000 byte limit of the utl_http.request function . I simply loop through all the pieces returned, and concatenate them to a CLOB variable inside the procedure.
Note that I make the http request inside a WHILE loop (line 50). This is helpful when you attempt to geocode a number of addresses in rapid succession (e.g. by looping through a table with addresses that need to be geocoded). Google will return an error (OVER_QUERY_LIMIT) when it receives too many requests per second. So the loop waits two seconds before the next attempt. I try a total of three attempts. If I still don’t get an OK status back, I give up. Note that Google limits the free geocoding to 2,500 requests per IP per day. If that limit is reached, you also receive an OVER_QUERY_LIMIT error.
In order to get to the status of the response, I use the json() function to convert the JSON string to a JSON object (lines 64 and 67).
Parsing the JSON response
Now that we have our geocoded address stored in a JSON object, I can now use PL/JSON to extract the various elements we need.
Since the results object consist of an array, I pass this array to a JSON_LIST object (l_results) (line 96). This array, however, consists only of a single element, so I pass this to a variable of type JSON, named l_tempobj (line 99).
The variable l_tempobj now contains the JSON of the four elements listed above (address_components, formatted_address, geometry, and status).
Since the address_components is yet another array, I pass it to another JSON_LIST object named l_addr_comps (line 102). Now I can loop through the address components and test the types array element to find out what type of address component I’m dealing with. I use the CASE statement (line 117) to test the types and to assign the address elements to their individual PL/SQL variables.
Once the loop is done, I have all my address components I looked for, but I still want to get to the coordinates. Those are stored in the third array element (geometry) of l_tempobj. I pass the location element to the l_loc variable (line 140) of type JSON., then extract the lat/lng strings from it, and pass them to PL/SQL varchar2 variables (lines 146 and 147).
At this point the procedure is done, and I print the results.
Notes
The JSON and JSON_LIST data types are created on installation of PL/JSON. When parsing through a JSON object, you need to be keenly aware of what you are processing: Whether it is a nested JSON object or a string or number. Notice that I had to use the get_number function to extract lat/lng, while using get_string to get the status.
Use json_list() to pass the value of an element that consists of an array:
"address_components" : [{ "name 1" : value} ,{"name 2" : "value"},{"name 3" : "value"}]
Use json_ext.get_string (or get_number) to get the value of a non-array element:
"status" : "OK"
Other errors from the Google API:
ZERO_RESULTS – no address data was fund
REQUEST_DENIED – most likely the sensor parameter was not specified correctly. Check the error_message text included in the response.
I’ve also included a few commented out <obj>.print statements for you to see the contents of various objects.
Let me know how you fare with this. Comments are always appreciated.
CREATE OR REPLACE PROCEDURE geocode_address(p_address IN VARCHAR2) IS l_obj json; l_results json_list; l_tempobj json; l_addr_comps json_list; l_addr json; l_typesarr json_list; l_geom_obj json; l_loc json; l_status VARCHAR2(255); l_types VARCHAR2(30); l_short_name VARCHAR2(200); l_street_number VARCHAR2(200); l_street VARCHAR2(200); l_city VARCHAR2(200); l_state VARCHAR2(30); l_zip VARCHAR2(10); l_county VARCHAR2(30); l_country VARCHAR2(10); l_tries PLS_INTEGER := 0; l_address VARCHAR2(4000); l_request VARCHAR2(4000); l_pieces utl_http.html_pieces; l_json CLOB; l_lat VARCHAR2(40); l_lng VARCHAR2(40); e_query_limit EXCEPTION; e_zero_results EXCEPTION; l_prc CONSTANT VARCHAR2(16) := 'GEOCODE ADDRESS'; BEGIN dbms_application_info.set_module(module_name => l_prc, action_name => 'START'); -- replace empty spaces with + l_address := REPLACE(TRIM(p_address), ' ', '+'); BEGIN l_request := 'http://maps.googleapis.com/maps/api/geocode/json?address=' || l_address || chr(38) || 'sensor=false'; l_tries := 0; l_status := NULL; l_json := NULL; -- Try three gets in case of OVER_QUERY_LIMIT errors WHILE nvl(l_status, '*') != 'OK' AND l_tries < 3 LOOP dbms_application_info.set_client_info(client_info => 'HTTP REQUEST try ' || l_tries); -- JSON object may be long, so split it into pieces l_pieces := utl_http.request_pieces(l_request); FOR i IN 1 .. l_pieces.count LOOP l_json := l_json || l_pieces(i); END LOOP; -- Create json object from google response l_obj := json(l_json); -- Check the response status. If it is OK then proceede with parsing. l_status := json_ext.get_string(l_obj, 'status'); IF l_status != 'OK' THEN dbms_output.put_line(l_prc || ' status [' || l_status || ']'); IF l_status = 'OVER_QUERY_LIMIT' AND l_tries >= 3 THEN dbms_output.put_line(l_prc || ': query limit reached. Exiting'); RAISE e_query_limit; ELSIF l_status = 'ZERO_RESULTS' THEN RAISE e_zero_results; ELSE l_tries := l_tries + 1; dbms_lock.sleep(2); END IF; END IF; --dbms_output.put_line(l_json); l_json := NULL; END LOOP; BEGIN dbms_application_info.set_client_info(client_info => 'PARSE JSON'); -- The overall JSON is an array named results l_results := json_list(l_obj.get('results')); -- There is only a single element in the results array, so get the first (and last) one l_tempobj := json(l_results.get(1)); -- The next level contains an array named address_components l_addr_comps := json_list(l_tempobj.get(1)); -- loop through the address components and test the types array for address elements FOR i IN 1 .. l_addr_comps.count LOOP l_addr := json(l_addr_comps.get(i)); --l_addr.print; l_typesarr := json_list(l_addr.get('types')); -- Types is not a json array, but a string array so we have to get -- the first element using the types[1] syntax l_types := json_ext.get_string(l_addr, 'types[1]'); l_short_name := json_ext.get_string(l_addr, 'short_name'); CASE l_types WHEN 'street_number' THEN l_street_number := l_short_name; WHEN 'route' THEN l_street := l_short_name; WHEN 'locality' THEN l_city := l_short_name; WHEN 'administrative_area_level_1' THEN l_state := l_short_name; WHEN 'administrative_area_level_2' THEN l_county := l_short_name; WHEN 'postal_code' THEN l_zip := l_short_name; WHEN 'country' THEN l_country := l_short_name; ELSE NULL; END CASE; END LOOP; -- now get lat/lng l_geom_obj := json(l_tempobj.get(3)); --l_geom_obj.print; l_loc := json_ext.get_json(l_geom_obj, 'location'); --l_loc.print; l_lat := to_char(json_ext.get_number(l_loc, 'lat')); l_lng := to_char(json_ext.get_number(l_loc, 'lng')); dbms_output.put_line('Address: ' || l_street_number || ' ' || l_street || ' ' || l_city || ' ' || l_state || ' ' || l_zip || ' ' || l_county || ' ' || l_country); dbms_output.put_line('lat/lng: ' || l_lat || ' ' || l_lng); EXCEPTION WHEN e_query_limit THEN raise_application_error(-20001, 'GOOGLE QUERY LIMIT REACHED'); WHEN OTHERS THEN dbms_output.put_line(l_prc || ': ' || SQLERRM); END; EXCEPTION WHEN e_zero_results THEN dbms_output.put_line(l_prc || ': Zero Results. Getting next address'); END; COMMIT; dbms_application_info.set_client_info(client_info => 'DONE'); EXCEPTION WHEN OTHERS THEN RAISE; END geocode_address;