Saturday, August 2, 2014

Built-in geometry validation.

Post origin.

The idea for this post comes from a tweet by an FME user (thanks Richard!), and since it is cucumber time (the slow summer season) I thought: what the heck, why not try it out myself?

Geometry validation in FME.

The GeometryValidator is the transformer for geometry validations in FME. To be able to validate the data you need to read it into the workspace.

Database formats that can validate geometry themselves allow validation and tagging to take place while reading.
In Oracle, that can be done with the geometry-related PL/SQL subprograms in the SDO_GEOM package.
In this case I am using the sdo_geom.validate_geometry_with_context subprogram to locate and tag geometry errors.

Built-in geometry validation in the Oracle reader.

To validate the data while reading, the SQL statement should be used in the select statement of the feature type's parameters (not to be confused with the reader's select statement).
When configured correctly, both valid and invalid features are returned.
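As a rough illustration of what such a select statement could look like, here is a minimal Python sketch that only builds the SQL text. The table name PARCELS, the geometry column GEOM, the output column VALIDATION_RESULT and the tolerance of 0.005 are all made up for the example; only the SDO_GEOM.VALIDATE_GEOMETRY_WITH_CONTEXT(geometry, tolerance) call comes from Oracle itself.

```python
def build_validation_select(table, geom_column="GEOM", tolerance=0.005):
    """Return a SELECT that adds a validation result column to every row.

    All names and the tolerance are hypothetical defaults; adjust them
    to the actual table before pasting the statement into the feature type.
    """
    return (
        "SELECT t.*, "
        f"SDO_GEOM.VALIDATE_GEOMETRY_WITH_CONTEXT(t.{geom_column}, {tolerance}) "
        "AS VALIDATION_RESULT "
        f"FROM {table} t"
    )


# Hypothetical table name, for illustration only.
sql = build_validation_select("PARCELS")
print(sql)
```

Because the function is evaluated for every row and nothing is filtered out, valid and invalid features both come through, each carrying its own validation result as an attribute.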

As you all know, databases return errors as a number and an error location, and there are internal ways to use existing database capabilities to make the errors more human readable.
But since FME is great at reaching out to the web, grabbing data and making use of it, what can be better (and more fun) than parsing the web pages with the error descriptions and adding them to the features?
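Before a description can be looked up, the error number has to be separated from the error location. The sketch below shows one way to do that in Python, assuming the result looks like the strings Oracle documents for this subprogram: TRUE for a valid geometry, otherwise an error number followed by bracketed context such as [Element <1>]. The function name and the dictionary layout are my own invention.

```python
import re


def parse_validation_result(result):
    """Split a validation result string into error number and location."""
    result = (result or "").strip()
    if result.upper() == "TRUE":
        return {"valid": True, "error_code": None, "context": {}}

    # A leading run of digits is taken as the Oracle error number;
    # results without one (for example FALSE) get error_code None.
    match = re.match(r"(\d+)", result)
    error_code = match.group(1) if match else None

    # Collect every "[Name <n>]" pair, keeping repeats such as two edges.
    context = {}
    for name, number in re.findall(r"\[(\w+)\s*<(\d+)>\]", result):
        context.setdefault(name, []).append(int(number))

    return {"valid": False, "error_code": error_code, "context": context}


parsed = parse_validation_result("13356 [Element <1>] [Coordinate <5>][Ring <1>]")
print(parsed["error_code"], parsed["context"])
```

Values are kept in lists because a single result can name the same kind of item more than once, for example two edges that cross each other.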

Adding error descriptions.

Since I am using Oracle, the logical place to search for error descriptions is the Oracle on-line documentation. Grabbing the web page is done with the HTTPFetcher, parsing it with the XMLFragmenter, and then it's a matter of testing for the correct string. To finish it up I have created a custom transformer that does exactly that.
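For readers without FME at hand, the same lookup can be sketched in plain Python. This is not the custom transformer, just a stand-in for the idea: strip the markup from a page and test for the string that starts with the error number. The sample page below is a made-up fragment, not the real layout of the Oracle documentation, and the "Cause:" marker used to cut off the description is an assumption about how such pages are laid out.

```python
import re
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect the text nodes of a page and ignore the markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def find_error_description(html, error_code):
    """Return the text that follows 'ORA-<code>:' on the page, or None."""
    collector = _TextCollector()
    collector.feed(html)
    # Collapse all whitespace so the page becomes one searchable line.
    text = " ".join(" ".join(collector.chunks).split())
    code = re.escape(str(error_code))
    # Stop at the next message, at an assumed "Cause:" marker, or at the end.
    pattern = rf"ORA-{code}:\s*(.+?)(?=\s+(?:Cause:|ORA-\d+:)|$)"
    match = re.search(pattern, text)
    return match.group(1).strip() if match else None


# Hypothetical page fragment, for illustration only.
SAMPLE_PAGE = """
<dl>
  <dt>ORA-13349: polygon boundary crosses itself</dt>
  <dt>ORA-13356: adjacent points in a geometry are redundant</dt>
</dl>
"""

description = find_error_description(SAMPLE_PAGE, "13356")
print(description)
```

In a real run the page text would come from an HTTP request (urllib.request in the standard library would do) rather than from a string in the script.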



As in most things FME, there are many ways to achieve the same result, and it's a matter of personal preference and experience how one approaches a problem. That said, a problem should not be the sole reason for using FME. It can also be just for fun.
So is it useful to have built-in geometry validation and tagging? Well, I guess it is, since otherwise why would anybody try?
Some other advantages might be:
  • Awareness of geometry errors (do you assume the data is always geometrically correct?)
  • The option to act upon that awareness.
  • Reducing the number of features that need repair. In translations that involve a high volume of data and long computations, repairing only the features tagged as invalid can make the workspace more efficient.