Creating a Document Repository in Drupal

We just finished putting together a Drupal-based Document Repository site (http://repository.usgin.org) as a testbed and as an alternative to other repository systems such as DSpace. The goal was to fulfill the following requirements in a development period of about two weeks (200+ man-hours):

  • Upload files that have not been put on the web yet and generate metadata for them
  • Add links to resources that are already on the web and generate metadata for them
  • Automatically generate ISO 19139 dataset metadata XML records
  • Offer the easiest possible way to enter minimum ISO 19139 metadata
  • Use Search Engine Optimization to expose content to the web
  • Offer some nice way to search the repository
  • Offer a minimal but intuitive workflow for repository management
  • Find an easy way of submitting spatial extent coordinates
  • Remain flexible to adapt to new/changing input and presentation needs
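
To make the spatial extent and ISO 19139 requirements above a bit more concrete, here is a minimal sketch of how four submitted coordinates could be turned into the geographic bounding box fragment of an ISO 19139 record. It is written in Python purely for illustration and is not the code running on the site; the function names and the sample coordinates are made up for this example. Only the element names and namespaces come from the ISO 19139 schema.

```python
import xml.etree.ElementTree as ET

# Namespaces used by ISO 19139 metadata records.
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def bounding_box_element(west, east, south, north):
    """Build a gmd:EX_GeographicBoundingBox element (hypothetical helper).

    Coordinates are decimal degrees. West may be greater than east for a
    box that crosses the antimeridian, so that case is not rejected.
    """
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitude must be between -180 and 180")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")

    box = ET.Element(f"{{{GMD}}}EX_GeographicBoundingBox")
    # The schema expects the four bounds in this order.
    bounds = (
        ("westBoundLongitude", west),
        ("eastBoundLongitude", east),
        ("southBoundLatitude", south),
        ("northBoundLatitude", north),
    )
    for tag, value in bounds:
        bound = ET.SubElement(box, f"{{{GMD}}}{tag}")
        decimal = ET.SubElement(bound, f"{{{GCO}}}Decimal")
        decimal.text = str(float(value))
    return box


def to_xml(element):
    """Serialize an element to a unicode XML string."""
    return ET.tostring(element, encoding="unicode")


if __name__ == "__main__":
    # Sample extent only; not taken from any real repository entry.
    print(to_xml(bounding_box_element(-114.8, -109.0, 31.3, 37.0)))
```

In a complete record this fragment would sit inside the extent section of the dataset identification, alongside the title, abstract, and contact information collected by the entry form.
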

Initially, we were greatly inspired by FAO's AgriDrupal effort.

Following are some of the things that we still need to address or are at least dreaming of tackling:

  • Bulk loading of repository entries (with and without file uploads)
  • Support more metadata export standards such as FGDC XML and OGC Records (Dublin Core)
  • Expose repository content and metadata through:
    • A very simple CSW service, perhaps only GetCapabilities, GetRecords, and GetRecordById, without any filtering capability
    • OAI-PMH repository service implementation
    • AJAX widgets that can be embedded in other people's sites
    • OpenSearch implementation
  • Replace Drupal Search with Solr and Lucene indexing services
  • Auto-generate preview images from uploaded PDFs
  • ...