Harvesting metadata from catalogs
The ability to 'harvest', or transfer, collections of metadata records between catalogs to keep them synchronized is a desirable operation in a federated catalog system. The major reasons are performance and reliability. Although the OGC catalog service specification (CSW) includes provisions for federated catalogs to propagate requests to other servers, most of the people we have talked to who have actually implemented such functionality report that it is too slow and unreliable. If one of the servers to which a request is cascaded is not functioning, it can freeze the entire process; the response is only as fast as the slowest server; and the client must work out how to identify duplicate result records. Harvesting and caching metadata records allows individual metadata registries to specialize in particular kinds of content, and to index records and create stored views that optimize performance for the records held in that registry.
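The harvest-and-cache idea can be illustrated with a minimal sketch. It assumes each remote catalog is wrapped in a callable that returns records as dictionaries with an `id` and an ISO 8601 `modified` date; the function name `refresh_cache` and the record layout are hypothetical, chosen only for illustration. A failing catalog is skipped rather than blocking the refresh, and duplicates are resolved once, at harvest time, by keeping the most recently modified copy.

```python
def refresh_cache(cache, sources):
    """Merge records from several catalogs into a local cache.

    cache   -- dict mapping record id to a record dict (updated in place)
    sources -- dict mapping a catalog name to a callable that returns an
               iterable of record dicts with 'id' and 'modified' keys
               (hypothetical layout, assumed for this sketch)

    Returns the names of the catalogs that could not be harvested.
    """
    failed = []
    for name, fetch_records in sources.items():
        try:
            # Read everything first so a failure part-way through
            # does not leave a half-merged catalog in the cache.
            records = list(fetch_records())
        except Exception:
            # One unavailable catalog must not stall the others.
            failed.append(name)
            continue
        for rec in records:
            current = cache.get(rec["id"])
            # ISO 8601 dates in the same format sort correctly as strings.
            if current is None or rec["modified"] > current["modified"]:
                cache[rec["id"]] = dict(rec, source=name)
    return failed
```

Queries are then answered from `cache` alone, so response time no longer depends on the slowest remote server.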
Discussions with developers of Stratigraphy.net indicate that there are problems using the CSW metadata harvesting operations in the context of geoscience metadata resources. They recommend using the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) for harvesting services. They report that, based on their experience and that of others, a distributed metadata catalog architecture built from a collection of metadata providers and portal servers that harvest and cache metadata records is a more viable design than a real-time distributed query system.
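As a rough illustration of what an OAI-PMH harvest involves, the sketch below issues `ListRecords` requests and follows `resumptionToken` values until the repository reports that the list is complete. It is a minimal sketch, not a production harvester: the function names `fetch_url` and `harvest` are hypothetical, the `fetch` parameter exists only so the network call can be swapped out, and `oai_dc` is used as the default because it is the metadata format every OAI-PMH repository is required to support.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# XML namespace defined by OAI-PMH 2.0 for protocol responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def fetch_url(url):
    """Return the raw bytes of an HTTP GET response."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def harvest(base_url, metadata_prefix="oai_dc", fetch=fetch_url):
    """Yield (identifier, datestamp, record_element) for each record."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    while True:
        url = base_url + "?" + urllib.parse.urlencode(params)
        root = ET.fromstring(fetch(url))

        error = root.find("oai:error", OAI_NS)
        if error is not None:
            if error.get("code") == "noRecordsMatch":
                return  # an empty result set is not a failure
            raise RuntimeError(
                "OAI-PMH error %s: %s" % (error.get("code"), error.text)
            )

        for record in root.iterfind("oai:ListRecords/oai:record", OAI_NS):
            header = record.find("oai:header", OAI_NS)
            yield (
                header.findtext("oai:identifier", namespaces=OAI_NS),
                header.findtext("oai:datestamp", namespaces=OAI_NS),
                record,
            )

        token = root.findtext(
            "oai:ListRecords/oai:resumptionToken",
            default="",
            namespaces=OAI_NS,
        )
        if not token or not token.strip():
            return  # a missing or empty token marks the last page
        # Follow-up requests carry only the verb and the token.
        params = {"verb": "ListRecords", "resumptionToken": token.strip()}
```

A portal server might call `harvest("http://example.org/oai")` on a schedule (the URL is a placeholder) and write each yielded record into its local cache.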