Wednesday, October 22, 2014

Heat maps and FME.

Heat map.

Heat map.

According to Wikipedia a heat map is a graphical representation of data where the individual values contained in a matrix are represented as colors.
In the past heat maps were mostly used in other sectors (biology, statistics, etc.) than the geospatial sector, where maps are the obvious way of data representation.
Nowadays there are plenty of resources to transform your data into a spatial heat map representation.


Google heat map.

The Google Developers site provides a multitude of resources and samples on how to used and incorporate Google's products. 
The Google Maps JavaScript demonstrates how a spatial heat map is created via a simple JavaScript.
Without going into too much details, the script's components include location data, a map center point and visualizations options (colors, gradients and additional functions)

JavaScript in FME.

If you mention JavaScript to an FME user, he will probably think you mean GeoJSON since that is the most common way for spatially representing Java objects (JSON or 'XML's Baby brother' )
There are dedicated readers and writers for JSON in FME and plenty of resources on the subject to be found at FMEpedia.
Since the script is essentially plain text, FME can be used to manipulate the script with a simple text writer.


highway location marker.
To demonstrate how essentially any spatial data can be represented by a heat map via FME, I made use of the national roads dataset (NWB) freely available via the Dutch SDI (PDOK)
The features used are highway location markers (point features) but also line and polygon features can be potentially represented via heat maps.


Actually it is a very simple workspace in which I am extracting the point coordinates into attributes, reprojecting them and concatenating them into the predefined order.
To extract the map center point a BoundingBoxAccumulator, CenterPointReplacer are used on the national border. Finally the CsmapReprojector transformer is used to bring it into the desired coordinate system (LL84).


The result is a html file that can be viewed with most browsers.
In this case I have only used FME on 3 script components and added some images into the header.
Potentially other components such as gradient colors and styling can also be directly manipulated.

The Netherlands - highway location markers heat map.

Wednesday, October 1, 2014

Georeferencing evaluation with FME.

FME and data evaluation.

FME is a great tool to validate and evaluate data (next to the many things you can do with FME)
There are plenty of resources available on the subject demonstrating FME's data validation and Q&A capabilities.

Data evaluation can involve different aspects and have many forms.
For this post I choose to evaluate how well a publicly available data set can be georeferenced (if you can add value to it and put it on a map, why shouldn't you...)
For any serious conclusions, you'll have to work it out yourself, since my main intention is to demonstrate FME capabilities (and not bad mouth anybody particularly...)

Data source.

The Dutch government publishes many data sets openly and the numbers are increasing all the time.
I choose to use the data set of the national education registry since it is highly dynamic and it contains addresses, which makes it possible to potentially georeference the features.

The data used is available in csv format, which can easily be accessed online via the CSV reader (just point it to the url). For limiting sorting and filtering the incoming data, see my previous post: Where clause on text.
This results in a continuously updated data source, which is great to have but poses a challenge when displaying the results.

 Georeferencing the data.

    For georeferencing the source data I am using the BAG Geocoding service, available via the National SDI.
    An easy way in FME to access the service is by a HTTPFetcher transformer.
    By constructing the URL in the transformer's text editor and making use of attributes values, a very flexible solution is created.

    BAG Geocoding service results.

    The BAG Geocoding service returns the location(s) in an XML snippet that translates into geometry and attributes. In case of ambiguity or lack of sufficient input, the service returns an aggregate geometry.
    Somewhere in the aggregate geometry the corresponding location and attributes can be found (well most of the times...)
    Using the total count of both georeferenced and failed features, simple statistics (percentage of correct georeferencing) can be gathered and used for display.


    Interpreting the results.

    Some of the 'failed' to georeference features do actually exist (BAG Web) and can be correctly geocoded by slightly changing the address used, see for example georeferenced (note the street tag) and not georeferenced (note URL used =  input address)

    Displaying the results.

    I am using Google Fusion Tables to display  the results since it is an easy way to share geographical information (article is in dutch)
    Also non-spatial data can be shared this way and the failed features are saved into a non-spatial Google Fusion Table. Needless to say FME supports both spatial and non spatial reading and writing of this format.
    Some limitations of this format are the number of features supported and that it is still considered an experimental format, something that unfortunately makes it less reliable.
    As mentioned before the input source data is updated frequently and in contrast the displayed results are static and present a moment in time.

    Map of results, created 1-10-2014.

    Findings and (possible) future developments.

    A way to keep the displayed results up to date would be to use FME Cloud, something I still have not got around to try. I imagine that using FME Cloud to run this workspace would not require almost any resources or adaptations, since most of the data is on-line.
    Some of the finding are:

    • Saving the source csv data is necessary due to memory issues (something that is easily done in the HTTPFetcher)
    • Another curious issue found is that using the postcode in the request string actually results in less features georeferenced.
    Don't forget that by the time you read this post the output might look very different.