Geocoding with PL/JSON

September 17, 2013 at 3:23 pm | Posted in Oracle Developement | 7 Comments
Tags: , , , , , , ,

The Google Maps API geocoding web service makes it easy to geocode addresses. You can pass an address to the web service and receive either an XML or a JSON response back.

In this post I’ll show how to use the PL/JSON package to parse the JSON response from Google, in order to get to the various address elements and the coordinates associated with a given address.

PL/JSON is a library consisting of a PL/SQL package and some PL/SQL objects and types. It is designed to create and parse JSON. The documentation is a bit lacking, especially when it comes to handling a more complicated JSON object like the Google geocode response. Hopefully, this article will help to clarify its usage.

To get started, first download and install PL/JSON into a schema in your Oracle database. I used version 1.0.3. The installation is simply done by extracting the downloaded zip file, and running the install.sql script. Once installed, make sure all objects are valid. You can run a quick test of the package by running the json_test.sql included in the testsuite directory of the PL/JSON installation.

The Google Geocoding Web Service

Version 3 of the Google Maps API, no longer requires a key. So you can make a simple call to the service like this:

http://maps.googleapis.com/maps/api/geocode/json?address=5605+n+macarthur+blvd+irving+tx&sensor=false

The result of the URL is a JSON object containing the address and geocode data.

To make sense of the JSON object you need to understand that curly brackets {} contain objects, and square brackets [] contain arrays. Strings are enclosed with double quotes, numbers are not. Data is represented in name/value pairs separated by colons.

The Google response consists of two objects: results and status. A successful response will have a status of “OK”.

The results object is an array of four other objects:

  1. address_components – an array of elements such as street number, city, state, etc.
  2. formatted_address – a single string containing the full address
  3. geometry – an array of coordinate data elements
  4. types – an array of strings, which in this case is always “street_address”.

Calling the Web Service

Calling the Google web service from within PL/SQL, I use the ult_http package.  Since the JSON response can be quite large, I use the utl_http.request_pieces function to fetch the result. This works around the 2000 byte limit of the utl_http.request function .  I simply loop through all the pieces returned, and concatenate them to a CLOB variable inside the procedure.

Note that I make the http request inside a WHILE loop (line 50). This is helpful when you attempt to geocode a number of addresses in rapid succession (e.g. by looping through a table with addresses that need to be geocoded). Google will return an error (OVER_QUERY_LIMIT) when it receives too many requests per second. So the loop waits two seconds before the next attempt. I try a total of three attempts. If I still don’t get an OK status back, I give up. Note that Google limits the free geocoding to 2,500 requests per IP per day. If that limit is reached, you also receive an OVER_QUERY_LIMIT error.

In order to get to the status of the response, I use the json() function to convert the JSON string to a JSON object (lines 64 and 67).

Parsing the JSON response

Now that we have our geocoded address stored in a JSON object, I can now use PL/JSON to extract the various elements we need.

Since the results object consist of an array, I pass this array to a JSON_LIST object (l_results) (line 96). This array, however, consists only of a single element, so I pass this to a variable of type JSON, named l_tempobj (line 99).

The variable l_tempobj now contains the JSON of the four elements listed above (address_components, formatted_address, geometry, and status).

Since the address_components is yet another array, I pass it to another JSON_LIST object named l_addr_comps (line 102). Now I can loop through the address components and test the types array element to find out what type of address component I’m dealing with. I use the CASE statement (line 117) to test the types and to assign the address elements to their individual PL/SQL variables.

Once the loop is done, I have all my address components I looked for, but I still want to get to the coordinates. Those are stored in the third array element (geometry) of l_tempobj.  I pass the location element to the l_loc variable (line 140) of type JSON., then extract the lat/lng strings from it, and pass them to PL/SQL varchar2 variables (lines 146 and 147).

At this point the procedure is done, and I print the results.

Notes

The JSON and JSON_LIST data types are created on installation of PL/JSON. When parsing through a JSON object, you need to be keenly aware of what you are processing: Whether it is a nested JSON object or a string or number. Notice that I had to use the get_number function to extract lat/lng, while using get_string to get the status.

Use json_list() to pass the value of an element that consists of an array:

"address_components" :  [{ "name 1" : value} ,{"name 2" : "value"},{"name 3" : "value"}]

Use json_ext.get_string (or get_number) to get the value of a non-array element:

"status" : "OK"

Other errors from the Google API:

ZERO_RESULTS – no address data was fund

REQUEST_DENIED – most likely the sensor parameter was not specified correctly. Check the error_message text included in the response.

I’ve also included a few commented out <obj>.print statements for you to see the contents of various objects.

Let me know how you fare with this. Comments are always appreciated.

CREATE OR REPLACE PROCEDURE geocode_address(p_address IN VARCHAR2) IS
  l_obj           json;
  l_results       json_list;
  l_tempobj       json;
  l_addr_comps    json_list;
  l_addr          json;
  l_typesarr      json_list;
  l_geom_obj      json;
  l_loc           json;

  l_status        VARCHAR2(255);
  l_types         VARCHAR2(30);
  l_short_name    VARCHAR2(200);
  l_street_number VARCHAR2(200);
  l_street        VARCHAR2(200);
  l_city          VARCHAR2(200);
  l_state         VARCHAR2(30);
  l_zip           VARCHAR2(10);
  l_county        VARCHAR2(30);
  l_country       VARCHAR2(10);
  l_tries         PLS_INTEGER := 0;
  l_address       VARCHAR2(4000);
  l_request       VARCHAR2(4000);
  l_pieces        utl_http.html_pieces;
  l_json          CLOB;
  l_lat           VARCHAR2(40);
  l_lng           VARCHAR2(40);

  e_query_limit  EXCEPTION;
  e_zero_results EXCEPTION;

  l_prc CONSTANT VARCHAR2(16) := 'GEOCODE ADDRESS';

BEGIN
  dbms_application_info.set_module(module_name => l_prc, action_name => 'START');

  -- replace empty spaces with +
  l_address := REPLACE(TRIM(p_address), ' ', '+');

  BEGIN

    l_request := 'http://maps.googleapis.com/maps/api/geocode/json?address=' ||
                 l_address || chr(38) || 'sensor=false';

    l_tries  := 0;
    l_status := NULL;
    l_json   := NULL;

    -- Try three gets in case of OVER_QUERY_LIMIT errors
    WHILE nvl(l_status, '*') != 'OK' AND l_tries < 3
    LOOP
      dbms_application_info.set_client_info(client_info => 'HTTP REQUEST try ' ||
                                                           l_tries);

      -- JSON object may be long, so split it into pieces
      l_pieces := utl_http.request_pieces(l_request);

      FOR i IN 1 .. l_pieces.count
      LOOP
        l_json := l_json || l_pieces(i);
      END LOOP;

      -- Create json object from google response
      l_obj    := json(l_json);

      -- Check the response status. If it is OK then proceede with parsing.
      l_status := json_ext.get_string(l_obj, 'status');

      IF l_status != 'OK'
      THEN
        dbms_output.put_line(l_prc || '  status [' || l_status || ']');
        IF l_status = 'OVER_QUERY_LIMIT' AND l_tries >= 3
        THEN
          dbms_output.put_line(l_prc || ': query limit reached. Exiting');
          RAISE e_query_limit;
        ELSIF l_status = 'ZERO_RESULTS'
        THEN
          RAISE e_zero_results;
        ELSE
          l_tries := l_tries + 1;
          dbms_lock.sleep(2);
        END IF;

      END IF;

      --dbms_output.put_line(l_json);
      l_json := NULL;

    END LOOP;

    BEGIN

      dbms_application_info.set_client_info(client_info => 'PARSE JSON');

      -- The overall JSON is an array named results
      l_results := json_list(l_obj.get('results'));

      -- There is only a single element in the results array, so get the first (and last) one
      l_tempobj := json(l_results.get(1));

      -- The next level contains an array named address_components
      l_addr_comps := json_list(l_tempobj.get(1));

      -- loop through the address components and test the types array for address elements
      FOR i IN 1 .. l_addr_comps.count
      LOOP
        l_addr := json(l_addr_comps.get(i));
        --l_addr.print;

        l_typesarr := json_list(l_addr.get('types'));

        -- Types is not a json array, but a string array so we have to get
        -- the first element using the types[1] syntax
        l_types      := json_ext.get_string(l_addr, 'types[1]');
        l_short_name := json_ext.get_string(l_addr, 'short_name');

        CASE l_types
          WHEN 'street_number' THEN
            l_street_number := l_short_name;
          WHEN 'route' THEN
            l_street := l_short_name;
          WHEN 'locality' THEN
            l_city := l_short_name;
          WHEN 'administrative_area_level_1' THEN
            l_state := l_short_name;
          WHEN 'administrative_area_level_2' THEN
            l_county := l_short_name;
          WHEN 'postal_code' THEN
            l_zip := l_short_name;
          WHEN 'country' THEN
            l_country := l_short_name;
          ELSE
            NULL;
        END CASE;

      END LOOP;

      -- now get lat/lng
      l_geom_obj := json(l_tempobj.get(3));
      --l_geom_obj.print;

      l_loc := json_ext.get_json(l_geom_obj, 'location');
      --l_loc.print;

      l_lat := to_char(json_ext.get_number(l_loc, 'lat'));
      l_lng := to_char(json_ext.get_number(l_loc, 'lng'));

      dbms_output.put_line('Address: ' || l_street_number || ' ' ||
                           l_street || ' ' || l_city || ' ' || l_state || ' ' ||
                           l_zip || ' ' || l_county || ' ' || l_country);
      dbms_output.put_line('lat/lng: ' || l_lat || ' ' || l_lng);

    EXCEPTION
      WHEN e_query_limit THEN
        raise_application_error(-20001, 'GOOGLE QUERY LIMIT REACHED');
      WHEN OTHERS THEN
        dbms_output.put_line(l_prc || ': ' || SQLERRM);
    END;
  EXCEPTION
    WHEN e_zero_results THEN
      dbms_output.put_line(l_prc || ': Zero Results. Getting next address');
  END;

  COMMIT;

  dbms_application_info.set_client_info(client_info => 'DONE');
EXCEPTION
  WHEN OTHERS THEN
    RAISE;

END geocode_address;


Entries and comments feeds.