Recommendations for metadata content to describe geoscience resources
Now that we have a better grasp of the IT and informatics challenges of a distributed, interoperable information network, we can turn again to its adoption: how to make it work and be useful on the ground. Our rough chain of thought on metadata was as follows:
- We decided that massive amounts of useful metadata are
necessary to jump-start USGIN. The reason
for pursuing structured metadata is to enable smarter (and spatial)
searching, provide more useful information for resource evaluation, and
to enable automated access to services from clients.
- Frustration with metadata creation and maintenance leads to fewer
and less useful (bad) metadata records; hence, it is a barrier to
cyberinfrastructure adoption and to the creation of better client/server
tools.
- This leads us to believe that a minimum useful metadata record
(not necessarily the same as a formal metadata standard's minimum
requirements) is better than no metadata, and better even than verbose
but crappy metadata.
- To help our data-providing partners create and maintain metadata
records, we need tools that make their lives easier and their
metadata efforts more rewarding.
- Although USGIN uses ISO 19139 in its CSW metadata catalogs,
FGDC and possibly other metadata standards are used or required by
partners.
We decided to prototype a "generic minimum metadata wizard" that stores minimum useful metadata in a format-agnostic data model, from which the conceptual metadata records can be exported (via download or services) into any number of arbitrarily formatted metadata files. The minimum metadata requirements are designed to balance the need for online, interoperable metadata discovery and distribution against the cost of generating digital metadata. To advertise USGIN partners' resources effectively, metadata records must accomplish three major goals:
- Describe the digital or physical resource or service.
- Credit the owner, author, originator, or responsible party of the resource.
- Provide access information to the described resource.
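The idea of storing one conceptual record and exporting it on demand can be sketched roughly as follows. This is only an illustration under stated assumptions, not the wizard's implementation: the record is a plain dictionary, the names `to_dublin_core_xml`, `to_json`, `EXPORTERS`, and `export` are hypothetical, the sample values are invented, and simple Dublin Core stands in for richer targets such as ISO 19139 or FGDC.

```python
import json
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace

# A conceptual record: plain data with no commitment to an output format.
# All values below are invented sample data.
record = {
    "title": "Example geologic map",
    "description": "Illustrative record used to demonstrate export.",
    "originators": ["A. Author", "B. Editor"],
    "publication_date": "2010",
}


def to_dublin_core_xml(rec):
    """Serialize the record as simple Dublin Core XML (hypothetical exporter)."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")
    ET.SubElement(root, f"{{{DC_NS}}}title").text = rec["title"]
    ET.SubElement(root, f"{{{DC_NS}}}description").text = rec["description"]
    for name in rec["originators"]:
        ET.SubElement(root, f"{{{DC_NS}}}creator").text = name
    ET.SubElement(root, f"{{{DC_NS}}}date").text = rec["publication_date"]
    return ET.tostring(root, encoding="unicode")


def to_json(rec):
    """Serialize the same record as JSON (hypothetical exporter)."""
    return json.dumps(rec, sort_keys=True)


# Registry of output formats; supporting a new format means adding one function.
EXPORTERS = {"dc-xml": to_dublin_core_xml, "json": to_json}


def export(rec, fmt):
    """Export a conceptual record in the requested format."""
    return EXPORTERS[fmt](rec)
```

Calling `export(record, "dc-xml")` and `export(record, "json")` yields two differently formatted files from the same stored record, which is the point of keeping the data model independent of any one standard.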
Following is our latest stab at the conceptual metadata model, which is based on Dublin Core and expanded according to practical needs.
Conceptual Minimum Metadata Fields
Key: Groupings; required, conditional,
and optional metadata fields; (number of values that can be specified).
- Citation
- Title (1 entry): Succinct (preferably <250
characters) name of the resource.
- Description (1 entry): Inform the reader about the resource's
content as well as its context.
- Originators (1 to many entries): Authors, editors, or
corporate authors/curators of the resource.
- Publication Date (1 entry): Publication, origination, or update date
(not temporal extent) for the resource. Use a "year" or ISO
8601 date and time format. Alternative date formatting must be machine
readable and consistent across all datasets.
- Keywords (0 to many entries): Thematic, spatial, and temporal free-form subject descriptors for the resource. A keyword may be assigned on metadata import if none are present.
- Resource language (0 to 1 entries): Use the three-letter ISO 639-2 language code (defaults to "eng" for English).
- Resource ID (0 to many entries): Resource identifier(s) following any public or institutional standard.
- Intellectual Originator Contact (0 to 1 entries): The primary party responsible for creating the resource. Organization name, person name, street address, city, state, ZIP code, email, phone, fax.
- Bibliographic Citation (0 to 1 entries): Full bibliographic citation if the resource has been published.
- Geographic Extent – Horizontal (1 entry, point or minimum bounding rectangle): north bounding latitude, south bounding or point latitude, east bounding longitude, west bounding or point longitude. Values are given in decimal degrees using the WGS 84 datum. A minimum bounding rectangle will be created if point coordinates are given.
- Geographic Extent – Vertical (0 to 1 entries): surface elevation, maximum elevation, minimum elevation. Values are given in meters relative to mean sea level (MSL) using the EPSG::5714 geodetic parameter (WGS 84).
- Temporal Extent: Temporal range over which the resource was collected or is valid. If the resource pertains to specific geologic time periods, those terms should be entered as keywords.
- Resource
- Link to the resource (0 to 1 entries): A URL pointing to a resource or resource webpage.
- Access instructions (0 to 1 entries): A sentence or paragraph describing how to access the information.
- Distribution Contact (1 entry): The party to contact about accessing the resource. Organization name, person name, street address, city, state, ZIP code, email, phone, fax, URL.
- Quality statement (0 to 1 entries): Describe the quality of the resource.
- Constraints statement (0 to 1 entries): Describe the resource's legal and usage constraints.
- Lineage statement (0 to 1 entries): Describe the resource's provenance.
- Metadata
- Metadata Date (1 entry): Last metadata update/creation date-time stamp in ISO 8601 date and time format. This may be automatically updated on metadata import if a metadata format conversion is necessary.
- Metadata UUID (0 to 1 entries): A Universally Unique Identifier (UUID) will be assigned during the metadata import process if one is not provided. Unique identification of each metadata record is required to avoid duplicate entries across multiple metadata catalogs. The UUID format provides unique identification without centralized coordination.
- Metadata Contact (1 entry): The party to contact with questions about the metadata itself. Organization name, person name, street address, city, state, ZIP code, email, phone, fax, URL.
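To make the field list above more concrete, here is a minimal sketch of how part of the conceptual model might be represented and checked. It is an illustration under several assumptions, not the wizard's data model: the class and function names (`BoundingBox`, `MinimumMetadata`, `prepare_on_import`) are hypothetical, fields with a minimum of one entry are treated as required, contacts are reduced to a single string, the date check covers only a year or a common subset of ISO 8601 and does not verify calendar validity, and the placeholder keyword is invented.

```python
import re
import uuid
from dataclasses import dataclass, field
from typing import List, Optional

# Matches "YYYY", "YYYY-MM-DD", or "YYYY-MM-DDThh:mm:ss" with optional Z/offset.
# Format check only; it does not reject impossible dates such as month 13.
DATE_RE = re.compile(
    r"^\d{4}(-\d{2}-\d{2}(T\d{2}:\d{2}:\d{2}(Z|[+-]\d{2}:\d{2})?)?)?$"
)


@dataclass
class BoundingBox:
    """Horizontal extent in decimal degrees (WGS 84)."""
    north: float
    south: float
    east: float
    west: float

    @classmethod
    def from_point(cls, lat, lon):
        """Create a degenerate bounding rectangle from point coordinates."""
        return cls(north=lat, south=lat, east=lon, west=lon)

    def validate(self):
        errors = []
        if not (-90 <= self.south <= self.north <= 90):
            errors.append("latitudes must satisfy -90 <= south <= north <= 90")
        # west may exceed east when a box crosses the antimeridian,
        # so only the range of each longitude is checked.
        if not (-180 <= self.west <= 180 and -180 <= self.east <= 180):
            errors.append("longitudes must be within [-180, 180]")
        return errors


@dataclass
class MinimumMetadata:
    # Fields listed with at least one entry (assumed required).
    title: str
    description: str
    originators: List[str]
    publication_date: str
    extent: BoundingBox
    distribution_contact: str  # simplification: contacts are structured in the text
    metadata_contact: str
    metadata_date: str
    # A few of the optional fields, with the defaults described in the text.
    keywords: List[str] = field(default_factory=list)
    language: str = "eng"
    metadata_uuid: Optional[str] = None

    def validate(self):
        """Return a list of problems; an empty list means the record passes."""
        errors = []
        if not self.title.strip():
            errors.append("title is required")
        if not self.description.strip():
            errors.append("description is required")
        if not self.originators:
            errors.append("at least one originator is required")
        if not DATE_RE.match(self.publication_date):
            errors.append("publication_date must be a year or ISO 8601 date")
        if not DATE_RE.match(self.metadata_date):
            errors.append("metadata_date must be an ISO 8601 date")
        if not re.fullmatch(r"[a-z]{3}", self.language):
            errors.append("language must be a three-letter code")
        errors.extend(self.extent.validate())
        return errors


def prepare_on_import(rec):
    """Fill in values that the text says may be assigned during import."""
    if rec.metadata_uuid is None:
        rec.metadata_uuid = str(uuid.uuid4())  # random UUID, no central registry
    if not rec.keywords:
        rec.keywords.append("uncategorized")  # placeholder keyword (assumption)
    return rec
```

A record built from point coordinates with `BoundingBox.from_point(lat, lon)` gets a rectangle whose north and south (and east and west) values coincide, and `prepare_on_import` supplies a UUID and a keyword only when none were provided.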