Monday, December 23, 2013

XML the xfMap path

Most of the people I meet in the geo-information sector know FME as a tool for translating data from one format to another.

That is what FME was originally built for (20 years ago), and it is still one of the jobs where FME transcends all the other available tools.

But FME is much more than a format translation tool. It is as good as any GIS software available, and in my opinion it blows most of them away.


I constantly get amazed looks from customers when I explain to them that most of their requirements can be met with FME, and in most cases in a single workspace.

The ability to dazzle with FME is not restricted to the spatial domain. FME's support for non-spatial formats (for example the Excel writer, or non-spatial shapefiles) makes it possible for non-spatial people to benefit from FME when a serious user is around.

XML is another format where FME can be of great benefit, since almost any imaginable operation on XML can be done in FME. The combination of spatial and non-spatial makes FME a truly versatile tool.


Nowadays you can find XML almost everywhere: from OGC service styling to cadastral data sources, everything is served as XML.
The main difference between these sources is the schema (data model), which can range from simple to extremely complex.

In FME the advised way of approaching XML data is by using feature paths. This makes XML handling very easy and does not require any in-depth knowledge of the data schema.
However, in some cases feature paths don't (and sometimes can't) handle the XML as well as an xfMap.
xfMaps are the basis of XML handling in FME, and if you know your xfMap then handling XML becomes super easy. The main drawback is that you do need in-depth knowledge of the data schema.

The big advantage of using xfMap over feature paths is the ability to read very large XML files. I suspect that with feature paths the whole file is read into the computer's memory, which can be an issue when dealing with huge volumes of data, even if your machine is supercharged and all of your FME settings are in place.

With an xfMap, the same file that crashed your machine when you tried to handle it with feature paths will now load easily.

I have tested this with the new BRK (new cadastral registration) data, available from the Dutch Kadaster.
Trying to parse the XML (260MB) with feature paths, just crashed my machine (and I do have a reasonable machine with the FME settings set properly).
With a very simple (and powerful) xfMap the same file loaded within 10 seconds and was ready to be transformed.
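
To illustrate the general idea behind that suspicion (this is not FME or xfMap code, and says nothing about how FME works internally), here is a minimal Python sketch of streaming through an XML source instead of building the whole tree in memory. The function name and the `parcel` tag are made up for the example.

```python
import io
import xml.etree.ElementTree as ET


def count_records(source, record_tag):
    """Stream through an XML source and count elements named record_tag."""
    count = 0
    # iterparse yields each element as soon as its closing tag has been read
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == record_tag:
            count += 1
            # Discard the record's children and text once it has been handled
            elem.clear()
    return count


# Tiny stand-in document; "parcel" is a hypothetical record tag
sample = io.BytesIO(
    b"<parcels><parcel id='1'/><parcel id='2'/><parcel id='3'/></parcels>"
)
print(count_records(sample, "parcel"))
```

Because each record is thrown away right after it is processed, the content of earlier records does not pile up, which is the kind of behaviour you want for a file of a few hundred megabytes.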

So to wrap it up, use xfMap if:
  1. The data volume is large and the feature paths configuration cannot cope with it.
  2. The data schema is static.
  3. You have too much time on your hands.

This is the last post for 2013. If you have already had some glimpses of what to expect from FME 2014, then I am sure you are as excited as I am!

A prosperous 2014 to us all.

Itay




Tuesday, November 5, 2013

Bye Bye Universal Viewer



What is the first thing they teach you in an FME course? To have a look at the data!

All the way back to my first careful steps with FME, the Universal Viewer was there to help visualize and query the data I was working with.

A great tool and without it, it would have been unthinkable to work with any type of data.

 FME Universal Viewer
Since the first appearance of the next generation viewing application of FME (Data Inspector), I have been hesitant to make the transition from the known and familiar Universal Viewer to the new Data Inspector.

Sure, there have been moments when I thought "OK, now I am finally going over to the Inspector". I mean, it's fancy and you can view in 3D (that was back in the days when 3D was hot). But somehow I never got around to taking the final step.
Since FME 2013 was released I have been using only the Data Inspector (I remember that somewhere in the beginning of 2013 the FME Evangelist himself made the transition and naturally recommended it to us all).
Throughout 2013 a number of features have been added to the Data Inspector (background maps, table view) that made the Universal Viewer really seem old fashioned.
The only Universal Viewer functionality I still miss in the Data Inspector is the ability to save the viewed features.

So bye bye, good old Universal Viewer. I suspect I won't see much of you anymore (wasn't FME 2013 also the last release with the Universal Viewer?).

And for those of you who still cling to the Universal Viewer: take the final step, it's worth it!

It's something unpredictable, but in the end is right, I hope you had the time of your life.
- Green Day

Friday, November 1, 2013

Rotten Apple


As an ETL specialist you are mostly involved in moving data. It doesn't matter what you do with the data, you are always moving it in some way.

There is always a starting point (E) and an end point (L) to the transformation (T). This time I would like to share some of my experience concerning the ETL process.

This will not be so much about how to do this or that, but more about the delicate and sometimes awkward situations that can arise in the process.

Whether you are doing custom transformation work or it is just part of your daily work, the hardest part of the ETL process can sometimes be the process around the technicalities.

Since for most people (and managers especially) you are the one responsible for the 'black box', you are bound to get the blame if it goes wrong.

 Lesson 1: make it clear from the start what your responsibilities are!

Make sure that the parameters and expected results of your job are well defined before you even open FME!

This can take some time and you will be seen as a pain in the .... but it can save you a lot of pain and it will make your goals clear and well defined.

In most cases involving a large and complicated database schema and design, you depend on others to give you insight into the database schema or, even better, the queries to perform on the database.

Most of the time you get a reaction like "I thought you were the ETL specialist..." or "Don't you know how to do it yourself?". Sound familiar?

Lesson 2: clearly define the input and expected results (especially if large and complicated database schema is involved)

OK, you got over the first hurdles and created your workspace, and it works, or at least you think it does. But wait a minute: after some time it turns out that it is not providing the expected results (oops, embarrassing moment). But hey, we are all human and we make mistakes.

So you rack your brain trying to find where the mistake was made. Eventually you find no fault in your work, and the workspace does what it is supposed to do.

In fact you have delivered what was possible based on the information you got, the problem arises when you are not supplied the correct information.

Lesson 3: document the defined input, expected results, responsibilities and ETL process.

Despite it all, it can always go bad. There is always a rotten apple... somewhere in between it all.





Friday, October 11, 2013

FME Template




The ability to save workspaces as templates was added in FME 2011.
If you keep performing the same actions, for example setting a reader's bounding box, then this is something you should consider.

A template is basically a saved workspace, in which you can have as many transformers and readers as you like, or none at all.
None at all? Well, yes: the one I use the most doesn't have any transformers (yet), only one reader and an inspector.

If you are a heavy database user then consider the following:
  • You constantly need to access data from different services with different users/passwords, and you want to spatially select the data and choose the tables the data comes from.

You can create a workspace from scratch each time, but if you hate repetitive and not particularly efficient tasks (like me) then a template can save you lots of time (which you can spend on the fun part = transformation).

To make the template as flexible as possible I make use of many parameters (private and published).
These parameters let me define the database service, user name, password, feature types (tables), where clause and location (bounding box) when running the workspace (prompt and run).

Prompt window

The only thing I actually have to do is select my translation source parameters and run. Voilà!

Wednesday, October 9, 2013

Search feature

As a seasoned FME user you are probably aware of the fact that a reader's bounding box coordinates settings are an efficient way to read only data which is spatially relevant for your location.

A slightly less known option, native to the FME ESRI database readers (SDE and ArcObjects GDB), is the search feature. This option lets you use the reader (actually the underlying database) to perform a spatial selection based on a feature.

To make use of this functionality requires a little bit of preprocessing if you don't have the search features in the necessary format.
Let's take an example to make this clear.
  • Let's say you have project boundary polygons available (it doesn't matter in which format, since FME is the champion of formats) and you want to use these features as a search feature on data in an SDE or GDB database.
  • You basically need to convert your boundaries into the format accepted by the search feature setting (space-delimited coordinates).
  • This can be easily done with FME (see the CoordinateConcatenator transformer >FME 2013)
What I ended up with is a space delimited csv file:
Search features read by the csv reader
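
Outside FME, the same preprocessing can be sketched in a few lines of Python. This is only an illustration under assumptions: the coordinate order (x y x y ...), the column names `alias` and `value`, and the function names are made up for the example and are not taken from the reader's documentation.

```python
import csv
import io


def to_search_feature(coords):
    """Join (x, y) pairs into one space-delimited coordinate string."""
    return " ".join(f"{x} {y}" for x, y in coords)


def write_lookup(projects, out):
    """Write one row per project: its name and its coordinate string."""
    writer = csv.writer(out)
    # Hypothetical column names, to be mapped to alias and value on import
    writer.writerow(["alias", "value"])
    for name, coords in projects.items():
        writer.writerow([name, to_search_feature(coords)])


# Hypothetical project boundary (a closed square)
projects = {"Project A": [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]}
buffer = io.StringIO()
write_lookup(projects, buffer)
print(buffer.getvalue())
```

Each row then holds a readable name next to the coordinate string, which is the shape needed for a choice-with-alias parameter.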

  • Now it is time to set it up on the database reader: go to Navigator > Parameters > Advanced > Search feature.
  • Create a parameter from the setting (right click > Create user parameter), select choice with alias (multiple).
  • In the parameter's configuration setting select the import option, this will basically open a wizard (much the same as a normal reader) in which you can select the csv file created.
  • Make sure you correctly select the values and alias fields and you will end up with a selection menu for the search feature.

Selection of search features when prompting

As the name of the setting suggests, you can only select one feature. If you intend to select by multiple features, the FeatureReader is the way to go.

You might think 'pfff... I can use the FeatureReader'. Well, you can! As in all things FME, there are many roads that lead to the desired result, but since this functionality exists, why not make use of it?


Tuesday, October 8, 2013

XML



I used to be a bit cautious when dealing with XML in the past, but since I have started using FME with it, it just broke down all the barriers.

I actually find it to be a very exciting data type to work with. It is simple to understand, and with FME (what else) it is simple to update, create and completely transform.

There is a whole bunch of FME transformers for dealing with XML in the transformer gallery.
















I have to admit that I only use a couple of them on a regular basis (XMLTemplater, XMLFormatter and XMLValidator), which, as the names suggest, help me to create XML, validate it and make it human readable.
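
For readers without FME at hand, a rough stand-in for the "check it and make it readable" part can be sketched with the Python standard library. This is not what the FME transformers do internally, and it only checks that the XML is well formed; it does not validate against a schema. The function name is made up for the example.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def pretty_xml(text):
    """Return an indented copy of the XML, or None if it is not well formed."""
    try:
        doc = minidom.parseString(text)
    except ExpatError:
        # The parser rejected the input, so there is nothing to format
        return None
    return doc.toprettyxml(indent="  ")


print(pretty_xml("<parcel><owner>Itay</owner></parcel>"))
print(pretty_xml("<parcel><owner>"))
```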

I guess that Don's love and enthusiasm for XML has affected me too!

Sunday, July 21, 2013

First rumblings


Where to start.....


I guess a bit of free typing won't hurt at this stage. I mean, nobody is actually going to read this, right?
My conclusion is quite easy for me to understand and argue, but for anyone else some clarification is needed.


My initial idea was to create a personal log that one day might take shape into a full-grown blog. This soon took another form: a blog to promote and add exposure to myself.

In short, I am (or am trying to be) a freelance spatial ETL professional. I live in the Netherlands, one of the most advanced geospatial nations in Europe.
Because of that, and the amount of competition in the geospatial market, exposure is critical for a start-up business.

BTW did I mention FME?


More on that later....