OGC Schema Issues in Relation to CSW
In order to create metadata for both static datasets and dynamic, online services and for use with CSW, the OGC created an XML schema that merges the schemas for ISO19115 (dataset metadata) and ISO19119 (service metadata) (see section D.1.5, page 105 in OGC 07-045). This was accomplished with a schema located at http://schemas.opengis.net/csw/2.0.2/profiles/apiso/1.0.0/apiso.xsd. The contents of that schema are quite simple -- it looks like this:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema targetNamespace="http://www.isotc211.org/2005/gmd" elementFormDefault="qualified" version="0.1"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:gmd="http://www.isotc211.org/2005/gmd">
  <!-- ================================= Annotation ================================ -->
  <xs:annotation>
    <xs:documentation>ISO Wrapper to include service related type to GMD</xs:documentation>
  </xs:annotation>
  <!-- ================================== Imports & Includes ================================== -->
  <xs:include schemaLocation="../../../../../iso/19139/20060504/gmd/gmd.xsd"/>
  <xs:import namespace="http://www.isotc211.org/2005/srv" schemaLocation="../../../../../iso/19139/20060504/srv/srv.xsd"/>
</xs:schema>
The important parts are the targetNamespace attribute and the two schemaLocation references. The targetNamespace (the GMD namespace) allows us to use this schema to validate our metadata records. Beyond that, the schema simply makes use of the 2006 versions of gmd.xsd and srv.xsd through the "include" and "import" elements.
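To see where those relative schemaLocation values actually point, here is a small Python sketch that resolves them against the URL of apiso.xsd. It uses only the standard library. The trimmed-down schema string and the helper name referenced_schemas are assumptions made for illustration and are not part of any OGC tooling.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XS = "{http://www.w3.org/2001/XMLSchema}"
APISO_URL = "http://schemas.opengis.net/csw/2.0.2/profiles/apiso/1.0.0/apiso.xsd"

# Trimmed copy of the wrapper schema shown above (annotation omitted).
APISO_XSD = """<xs:schema targetNamespace="http://www.isotc211.org/2005/gmd"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:include schemaLocation="../../../../../iso/19139/20060504/gmd/gmd.xsd"/>
  <xs:import namespace="http://www.isotc211.org/2005/srv"
      schemaLocation="../../../../../iso/19139/20060504/srv/srv.xsd"/>
</xs:schema>"""


def referenced_schemas(schema_text, base_url):
    """Return (kind, absolute URL) for each top-level include/import."""
    root = ET.fromstring(schema_text)
    refs = []
    for kind in ("include", "import"):
        for el in root.findall(XS + kind):
            location = el.get("schemaLocation")
            if location:
                # Relative locations are resolved against the schema's own URL.
                refs.append((kind, urljoin(base_url, location)))
    return refs


refs = referenced_schemas(APISO_XSD, APISO_URL)
for kind, url in refs:
    print(kind, url)
```

Both references resolve into the 20060504 ISO directory on schemas.opengis.net, which is where the rest of this discussion picks up.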
Trouble begins to arise because these 2006 schemas reference a "local" cache of GML, versioned at 3.2.0. This GML schema is located at http://schemas.opengis.net/iso/19139/20060504/gml/gml.xsd. The readme document in the 2006 ISO directory states that:
For the sake of convenience, GML 3.2 XML schemas (version 19136 DIS - 2005 november) are (temporarily) provided with the 19139 set of schemas. They were retrieved from http://www.isotc211.org/2005/ . Once these schemas are finalized they will become OGC GML 3.2.1 and ISO/TS 19136.
Now, if we look at the real GML repository, we see that GML went from version 3.1.1 to 3.2.1. The 3.2.0 version only exists in this 2006 ISO cache. It was put there apparently to move forward with the ISO19139 roll out before GML 3.2.1 was finalized. The trouble is, there are problems in the 3.2.0 version that prevent some XML validation tools from validating against the schema. One identified problem is in the 3.2.0 version of coverage.xsd:
<complexType name="GridDomainType">
  <complexContent>
    <restriction base="gml:DomainSetType">
      <sequence minOccurs="0">
        <choice>
          <choice>
            <element ref="gml:Grid"/>
          </choice>
          <choice/>
        </choice>
      </sequence>
      <attributeGroup ref="gml:OwnershipAttributeGroup"/>
      <attributeGroup ref="gml:AssociationAttributeGroup"/>
    </restriction>
  </complexContent>
</complexType>
Visual Studio 2008, for example, will not validate a metadata file against the apiso.xsd schema because it doesn't like the empty <choice /> element in this sequence. It's obvious that there is something screwy here. Even if it is actually valid, I think it is clear that it is not written very well. Why have a choice between no options (the empty <choice />)? Why have the choice above it with only one option to choose (an element reference all by itself wrapped in <choice> tags)? Why try to choose between those two choices? Wouldn't this say exactly the same thing if I removed all of the <choice> tags?
At GML 3.2.1, the above issue does not exist. In fact the 3.2.1 coverage.xsd does not even have any mention of this "GridDomainType". In 2007 the OGC created a new version of the ISO 19139 schemas that references GML 3.2.1. However, there's no mention of the SRV namespace (http://www.isotc211.org/2005/srv) anywhere in this new ISO 19139 version. The SRV namespace is where, in our metadata documents using the 2006 version, we specified all our information about dynamic, online services such as WFS and WMS.
At first, we thought that we could simply write a "new" version of the APISO profile that would reference the new, 2007 version of ISO 19139 and would look something like this:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema targetNamespace="http://www.isotc211.org/2005/gmd" elementFormDefault="qualified" version="0.1"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns:gmd="http://www.isotc211.org/2005/gmd">
  <!-- ================================= Annotation ================================ -->
  <xs:annotation>
    <xs:documentation>ISO Wrapper to include service related type to GMD</xs:documentation>
  </xs:annotation>
  <!-- ================================== Imports & Includes ================================== -->
  <xs:include schemaLocation="http://schemas.opengis.net/iso/19139/20070417/gmd/gmd.xsd"/>
  <xs:import namespace="http://www.isotc211.org/2005/srv" schemaLocation="http://schemas.opengis.net/iso/19139/20060504/srv/srv.xsd"/>
</xs:schema>
Right now, one option that we have is to simply use XML validators that don't choke on the spurious <choice>s in the 2006 coverage.xsd. That is less than ideal, though, especially since the 3.2.0 version of GML has already been superseded by 3.2.1. And the trouble with these 2006 ISO schemas doesn't stop there.
We often wish to validate an XML document that is a CSW transaction, such as an insert. Roughly speaking, those documents look something like this:
<csw:Transaction xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" service="CSW" version="2.0.2">
  <csw:Insert>
    <gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
        xmlns:gco="http://www.isotc211.org/2005/gco"
        xmlns:gml="http://www.opengis.net/gml"
        xmlns:xlink="http://www.w3.org/1999/xlink">
... and on to the rest of the metadata entry. The problem now is that we have no way of validating both the metadata record itself and the CSW transaction elements at the same time. We can try by adjusting what I've shown above to include schemaLocations:
<csw:Transaction xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" service="CSW" version="2.0.2"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-publication.xsd">
  <csw:Insert>
    <gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
        xmlns:gco="http://www.isotc211.org/2005/gco"
        xmlns:gml="http://www.opengis.net/gml"
        xmlns:xlink="http://www.w3.org/1999/xlink"
        xsi:schemaLocation="http://www.isotc211.org/2005/gmd http://schemas.opengis.net/csw/2.0.2/profiles/apiso/1.0.0/apiso.xsd">
... and so on. However, the CSW schemas end up referencing GML 3.1.1, while the APISO schema and its included 2006 ISO schemas reference GML 3.2.0. Because 3.1.1 and 3.2.0 both use a common namespace (http://www.opengis.net/gml), this results in duplicated declarations of a number of GML-related elements, and the document cannot be validated using any validation engine. Again, this was solved at GML 3.2.1, which thoughtfully took on a new namespace (http://www.opengis.net/gml/3.2). Still, I can't use GML 3.2.1 to bypass the problem because there is no ISO 19139 schema that both uses GML 3.2.1 and accounts for service-related metadata elements through the SRV namespace.
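The namespace clash can be illustrated with a short Python sketch. The schema strings below are toy stand-ins, not excerpts from the real GML files, and the helper names global_elements and find_collisions are hypothetical. The sketch only collects the top-level element declarations of each schema and reports any qualified name that is declared more than once.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XS = "{http://www.w3.org/2001/XMLSchema}"

# Toy schemas for illustration only; the element lists are made up.
TEMPLATE = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
    targetNamespace="{ns}">
  <xs:element name="Point"/>
  <xs:element name="{extra}"/>
</xs:schema>"""

GML_311 = TEMPLATE.format(ns="http://www.opengis.net/gml", extra="Envelope")
GML_320 = TEMPLATE.format(ns="http://www.opengis.net/gml", extra="Grid")
GML_321 = TEMPLATE.format(ns="http://www.opengis.net/gml/3.2", extra="Grid")


def global_elements(schema_text):
    """Return the (namespace, name) pairs of top-level element declarations."""
    root = ET.fromstring(schema_text)
    ns = root.get("targetNamespace", "")
    return {(ns, el.get("name")) for el in root.findall(XS + "element")}


def find_collisions(schemas):
    """Map each qualified name declared by more than one schema to its sources."""
    seen = defaultdict(list)
    for label, text in schemas.items():
        for qname in global_elements(text):
            seen[qname].append(label)
    return {q: sorted(labels) for q, labels in seen.items() if len(labels) > 1}


old_pair = find_collisions({"gml-3.1.1": GML_311, "gml-3.2.0": GML_320})
new_pair = find_collisions({"gml-3.1.1": GML_311, "gml-3.2.1": GML_321})
print(old_pair)
print(new_pair)
```

With a shared namespace the two versions collide on every element name they have in common. Once the newer version has its own namespace, the same names no longer clash.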
To conclude, what we need is a version of ISO 19139 that:
- Allows us to use GML 3.2.1, which is significantly different and an improvement over the "cached" GML 3.2.0 that was used in the 2006 version of ISO 19139.
- Contains the critical service-related metadata elements that are encompassed in the SRV namespace (http://www.isotc211.org/2005/srv) of the 2006 version.
We would also like to see an update of the APISO CSW profile to include such improvements.
The following is from communication with Uwe Voges, who is the editor of the APISO specification and chair of the OGC Standards Working Group:
As far as I remember correctly the problem results from different GML schemas imported by OGC Filter (as part of OGC CSW) and ISO19139. In general this should work but they forgot to assign a unique namespace for the different versions (so two different GML versions use the same namespace). We have solved this e.g. by validating the frame (e.g. GetRecordsResponse) and the content (ISO19139) separately...This problem will be fixed in the new version. But the question is when a new version will be available, as we decided to start working on CSW AP ISO 1.1/2.0 when CSW 2.1/3.0 will be finishing (and this is not yet the case)...
Basically, they are aware of the issue. We've been doing the same thing to validate our csw:Transactions. At GML 3.2.1 they've adjusted the namespace, so this problem will go away, as well as the other validation issues we're having with some validation engines. Unfortunately I don't have any way of knowing when these changes will be finalized.
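A minimal sketch of how the content could be pulled out of the frame so that each metadata record can be handed to a validator on its own. The sample transaction and the function name extract_records are made up for illustration, and the schema validation itself is left to whatever validator is in use.

```python
import xml.etree.ElementTree as ET

CSW = "http://www.opengis.net/cat/csw/2.0.2"
GMD = "http://www.isotc211.org/2005/gmd"

# Abbreviated, made-up transaction; a real record carries far more content.
TRANSACTION = """<csw:Transaction xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    service="CSW" version="2.0.2">
  <csw:Insert>
    <gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd">
      <gmd:fileIdentifier/>
    </gmd:MD_Metadata>
  </csw:Insert>
</csw:Transaction>"""


def extract_records(transaction_xml):
    """Return each inserted MD_Metadata element as a standalone XML string."""
    root = ET.fromstring(transaction_xml)
    path = f".//{{{CSW}}}Insert/{{{GMD}}}MD_Metadata"
    return [ET.tostring(rec, encoding="unicode") for rec in root.findall(path)]


records = extract_records(TRANSACTION)
print(len(records), "record(s) extracted")
```

Each returned string is a complete document rooted at MD_Metadata, so it can be validated against the ISO schemas without the CSW schemas (and their GML 3.1.1 references) being loaded in the same pass.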
In the meantime we've found that the French Institut Geographique National has generated their own unofficial version of the entire ISO 19139 schema, including both aspects of 19115 (datasets, GMD namespace) and 19119 (services, SRV namespace) and utilizing GML 3.2.1. We're experimenting with it now -- it solves validation issues within Visual Studio 2008, but still has difficulties with csw:Transactions. The trouble here is not from duplicate declarations of GML elements, but from duplicate references to xlink.xsd... The CSW-publication.xsd leads us to an xlink.xsd at schemas.opengis.net, while the new ISO 19139 schemas reference their own cached version of the same file. This results in multiple declarations of elements from that schema and validation issues...