ETL

SDE-Trickery and Awesomeness: Data Transformations Behind-The-Scenes

Imagine that you collected geographic data in one schema and you want to provide it in some other schema. For many of us, this only happens pretty much all of the time.

For example, maybe your organization has an enterprise database which stores all the information that you have about recent, active faults. Now you want to provide that data as a WFS service with fields that conform to the ActiveFaults template being used by the AASG Geothermal Data System. Perhaps you store all your geologic map data in such a database and you want to convert it into FeatureClasses that fit into NCGMP09-style geodatabases.

Here's a nice way to convert your data formats in such a way that you only have to do it once. You can continue to make changes to your data just like you always have, and there will always be a featureclass hanging out there in the new schema which conveys the most up-to-date information that you have. The pre-requisites are:

XSLT transformations in Python through the Gnome libxml C parser

This is an example script showing how to run XSLT transformations in Python 2.6 through the Gnome libxml XML C parser and toolkit. The relatively fast C library is available on multiple platforms, but you will also need the Python bindings for libxml2 and libxslt. There is a handy libxml2 and libxslt Python bindings installer for Windows which also includes the C libraries in DLL form.

Import MEF metadata archive into GeoNetwork through Python

This is a Python example script for importing GeoNetwork Metadata Exchange Format 1.1 (MEF) archives through GeoNetwork 2.4.2's mef.import service. The mef.import service requires a multipart/form-data POST, which the script sends through a modified MultipartPostHandler.py library (built on urllib2) that now supports Unicode. It has been tested on Windows XP with Python 2.6.
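To make the multipart/form-data requirement concrete, here is a minimal sketch of how such a request body can be assembled. It is not the script described above: it uses the Python 3 standard library instead of Python 2.6 and MultipartPostHandler.py, and the form field name "mefFile" and the helper names are assumptions that should be checked against your GeoNetwork version.

```python
import urllib.request
import uuid


def encode_multipart(fields, files, boundary=None):
    """Return (body_bytes, content_type) for a multipart/form-data POST.

    fields: dict of form field name -> text value
    files:  dict of form field name -> (filename, content_bytes)
    """
    boundary = boundary or uuid.uuid4().hex
    marker = b"--" + boundary.encode("ascii")
    parts = []
    for name, value in fields.items():
        parts.append(marker)
        parts.append(('Content-Disposition: form-data; name="%s"' % name).encode("utf-8"))
        parts.append(b"")
        parts.append(value.encode("utf-8"))
    for name, (filename, content) in files.items():
        parts.append(marker)
        parts.append(
            ('Content-Disposition: form-data; name="%s"; filename="%s"'
             % (name, filename)).encode("utf-8")
        )
        parts.append(b"Content-Type: application/octet-stream")
        parts.append(b"")
        parts.append(content)
    parts.append(marker + b"--")  # closing boundary
    parts.append(b"")
    return b"\r\n".join(parts), "multipart/form-data; boundary=" + boundary


def build_mef_import_request(url, mef_name, mef_bytes):
    """Build (but do not send) a POST request carrying one MEF archive.

    "mefFile" is an assumed field name, not confirmed by the post above.
    """
    body, content_type = encode_multipart({}, {"mefFile": (mef_name, mef_bytes)})
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Content-Type", content_type)
    return request
```

Sending the request would then be a matter of passing it to urllib.request.urlopen, or to an opener that carries an authenticated session cookie, with a URL such as http://localhost:8080/geonetwork/srv/en/mef.import (a hypothetical local install).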

Create GeoNetwork MEF files from ISO 19139 XML through Python

This is an example Python script that showcases the creation of GeoNetwork Metadata Exchange Format 1.1 (MEF) archives from ISO 19139 metadata XML files. It has been tested on Windows XP with Python 2.6.
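As a rough illustration of what such a script has to produce, the sketch below packs one ISO 19139 record into a MEF-style ZIP archive using only the Python 3 standard library. It is not the script described above. The archive layout (metadata.xml, info.xml, public/ and private/ folders) and the simplified info.xml are written from memory of the MEF documentation and are assumptions to verify; site_id and site_name are placeholder values.

```python
import io
import uuid
import zipfile
from datetime import datetime, timezone
from xml.sax.saxutils import escape

# Simplified info.xml; a real GeoNetwork export carries more fields.
INFO_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<info version="1.1">
  <general>
    <uuid>{uuid}</uuid>
    <createDate>{stamp}</createDate>
    <changeDate>{stamp}</changeDate>
    <siteId>{site_id}</siteId>
    <siteName>{site_name}</siteName>
    <schema>iso19139</schema>
    <format>simple</format>
    <localId></localId>
    <isTemplate>false</isTemplate>
  </general>
  <categories/>
  <privileges/>
  <public/>
  <private/>
</info>
"""


def build_mef(metadata_xml, record_uuid=None,
              site_id="example-site", site_name="Example Site"):
    """Return the bytes of a MEF-style ZIP archive for one metadata record."""
    record_uuid = record_uuid or str(uuid.uuid4())
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")
    info_xml = INFO_TEMPLATE.format(
        uuid=escape(record_uuid),
        stamp=stamp,
        site_id=escape(site_id),
        site_name=escape(site_name),
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as mef:
        mef.writestr("metadata.xml", metadata_xml)
        mef.writestr("info.xml", info_xml)
        mef.writestr("public/", "")   # empty folder entries
        mef.writestr("private/", "")
    return buffer.getvalue()
```

Writing the returned bytes to a file with a .mef extension gives an archive that can be inspected with any ZIP tool before trying an import.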

GeoNetwork authentication and CSW transactions through Python

This is an example Python script that showcases GeoNetwork authentication, session handling, and CSW transactions.
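The three pieces named here (login, a cookie-based session, and a CSW transaction) can be sketched as follows with the Python 3 standard library. This is an illustration under assumptions, not the script described above: the service paths /srv/en/xml.user.login and /srv/en/csw and the helper names are guesses for a GeoNetwork 2.x install and should be checked against your server.

```python
import http.cookiejar
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

CSW_NS = "http://www.opengis.net/cat/csw/2.0.2"


def make_session():
    """Return (opener, jar); the opener stores and resends session cookies."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def login_request(base_url, username, password):
    """Build a form-encoded login POST (service path is an assumption)."""
    data = urllib.parse.urlencode(
        {"username": username, "password": password}
    ).encode("ascii")
    return urllib.request.Request(base_url + "/srv/en/xml.user.login", data=data)


def csw_insert(record_xml):
    """Wrap one metadata record in a CSW 2.0.2 Transaction/Insert document."""
    ET.register_namespace("csw", CSW_NS)
    transaction = ET.Element(
        "{%s}Transaction" % CSW_NS, {"service": "CSW", "version": "2.0.2"}
    )
    insert = ET.SubElement(transaction, "{%s}Insert" % CSW_NS)
    insert.append(ET.fromstring(record_xml))
    return ET.tostring(transaction, encoding="utf-8")


def csw_request(base_url, record_xml):
    """Build the transaction POST (service path is an assumption)."""
    return urllib.request.Request(
        base_url + "/srv/en/csw",
        data=csw_insert(record_xml),
        headers={"Content-Type": "application/xml"},
    )

# Against a live server the calls would be chained through one opener so
# that the login cookie is reused for the transaction:
#   opener, jar = make_session()
#   opener.open(login_request(base_url, user, password))
#   opener.open(csw_request(base_url, record_xml))
```

Reusing a single opener is what provides the session handling: the cookie set by the login response is sent again with the transaction request.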

XSLT to Transform WMS GetCapabilities response to CSW Insert transaction XML

Tested with deegree-csw 2.3pre
Read the XSLT file for more information.

Attached is an example XSLT 1.0 script to transform a WMS GetCapabilities 1.1.1 response to a CSW Insert transaction. The script is based on deegree's wms2iso19119.xsl (http://www.deegree.org/).
Note that it currently supports only WMS version 1.1.1 (<WMT_MS_Capabilities>) responses; it chokes on 1.3.0 (<WMS_Capabilities>) responses.
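Because of that limitation, it can help to check which kind of response you have before running the stylesheet. The sketch below is a hypothetical pre-check in Python 3, not part of the attached XSLT: it looks only at the root element, and it assumes that 1.3.0 responses put WMS_Capabilities in the http://www.opengis.net/wms namespace while 1.1.1 responses use an un-namespaced WMT_MS_Capabilities root.

```python
import xml.etree.ElementTree as ET

WMS_130_NS = "http://www.opengis.net/wms"


def capabilities_flavor(xml_text):
    """Classify a GetCapabilities document by its root element."""
    root = ET.fromstring(xml_text)
    if root.tag == "WMT_MS_Capabilities":
        return "1.1.1"      # the flavor the stylesheet is said to support
    if root.tag == "{%s}WMS_Capabilities" % WMS_130_NS:
        return "1.3.0"      # reported above as unsupported
    return "unknown"


def stylesheet_can_handle(xml_text):
    """True only for the 1.1.1-style root element."""
    return capabilities_flavor(xml_text) == "1.1.1"
```

A harvesting script could call stylesheet_can_handle on each response and skip or log the ones that return False rather than letting the transformation fail.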

XML ETL presentation at Data Preservation Workshop

The following is a presentation on XML Extract-Transform-Load (ETL) that we gave at the Geoscience Data Preservation Techniques Workshop, Indiana Geological Survey, Bloomington, IN, in July 2009.

Although this presentation was geared towards National Geological and Geophysical Data Preservation Program (NGGDPP) metadata, the same tools can be used for other ETL needs.

ETL Debug Blog

A group blog on implementing and debugging Extract-Transform-Load (ETL) efforts.

Post to this group blog about your Extract-Transform-Load (ETL) efforts.
