Creating ISO 19139 metadata through Drupal Views and Views Bonus Pack

The following describes how we modified the Views Bonus Pack module for Drupal 6 to generate ISO 19139 XML metadata files for the USGIN Document Repository. The same hack can be adapted with relatively little effort to generate other XML metadata (FGDC, CSW records) or most other complex XML. Ideally, it will evolve into a separate module. If you are interested or have questions, please participate in the "Generate ISO 19139 metadata record XML files" issue thread for Views Bonus Pack.

Overview

The goal is to generate minimum ISO 19139 dataset metadata XML files (that conform to the USGIN profile) from the core and CCK fields used in the repository's Collection (ct_collection) and Document-like Information Object (DLIO, ct_dlio) content types.

Drupal's Views module acts much like a database "view": it extracts, arranges, and manipulates node field content into new representations (pages, blocks, feeds, etc.) and styles (unformatted, HTML lists, tables, etc.). One of these representations is an RSS feed, which in practice is a formatted XML page that contains a node's title, teaser, etc. The Views Bonus Pack module extends Views with, among others, the Export sub-module. The Export sub-module adds styles to Views' feed representation that generate CSV, DOC, TXT, XLS, and XML formatted outputs. I added an "ISO 19139" style to the Export sub-module that produces what we needed.

ISO 19139 Export Style

[Screenshot: Views edit page for ISO 19139 hack]

The trick is to use the label of a field in Views as the key for the field's value. The field values are then retrieved through the label key in the ISO 19139 template file (views-bonus-export-iso.tpl.php). This means that one has to add a field to the view for every element value or attribute value that one wants to control in the XML document. This (and the long machine-readable labels) gets cumbersome quite fast and would be handled differently in a custom module.
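As a minimal sketch of the label-as-key idea outside of Drupal: the row array, the labels, and the helper iso_value() below are hypothetical and only illustrate the lookup; the actual template may retrieve values differently.

```php
<?php
// Hypothetical row as a template might receive it: Views field labels are the keys.
$row = array(
  'gmd_fileIdentifier' => 'abc-123',
  'gmd_title' => 'Sample dataset title',
);

/**
 * Hypothetical helper: return the value stored under a Views label,
 * or an empty string if that field was not added to the view.
 */
function iso_value($row, $label) {
  return isset($row[$label]) ? $row[$label] : '';
}

// Populate one XML element from the value found under its label key.
$xml = '<gmd:fileIdentifier><gco:CharacterString>'
  . iso_value($row, 'gmd_fileIdentifier')
  . '</gco:CharacterString></gmd:fileIdentifier>';

print $xml . "\n";
```

Because a missing label yields an empty string, a required element would still be written, just without a value.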

Note that Views (or CCK?) returns CCK multi-value fields as one value with <span>-separated sub-values (or whatever some template specifies). I delimit multi-value fields in Views with a pipe (|) or semicolon (;) and then strip out any HTML/XML tags in views_bonus_export.theme.inc. The template file views-bonus-export-iso.tpl.php is effectively an XML file with PHP code that populates XML element and attribute values. I also included logic to deal with element nesting dependencies and a bit of validation through conditional statements. Required metadata elements will show up empty if a value is missing. Optional elements are pruned entirely if the values they require are missing.
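The splitting and pruning described above could look roughly like the following sketch. The helper names iso_split_values() and iso_keywords() are made up for illustration and are not part of Views Bonus Pack; the sketch also assumes the input contains no XML entities such as &amp;amp;, whose trailing semicolon would be mistaken for a delimiter.

```php
<?php
/**
 * Hypothetical helper: split a delimited multi-value string into clean values.
 * Accepts either a pipe or a semicolon as the delimiter.
 */
function iso_split_values($content) {
  $parts = preg_split('/\s*[|;]\s*/', $content);
  // Drop empty fragments left by trailing delimiters or blank fields.
  return array_values(array_filter(array_map('trim', $parts), 'strlen'));
}

/**
 * Hypothetical helper: render keywords as repeated gmd:keyword elements.
 * The whole optional block is pruned when no keyword is present.
 */
function iso_keywords($content) {
  $keywords = iso_split_values($content);
  if (empty($keywords)) {
    return '';
  }
  $out = "<gmd:MD_Keywords>\n";
  foreach ($keywords as $keyword) {
    $out .= '  <gmd:keyword><gco:CharacterString>' . $keyword
      . "</gco:CharacterString></gmd:keyword>\n";
  }
  return $out . "</gmd:MD_Keywords>\n";
}

print iso_keywords('geology; geothermal | wells');
```

Returning an empty string for the optional block keeps the conditional logic in one place, so the surrounding template can print the result unconditionally.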

You can find a patch file generated in Eclipse for Views Bonus Pack 6.x-dev (CVS trunk) with the discussed modifications at the bottom of the page.

views_bonus/export/views_bonus_export.theme.inc

Added a new preprocessor function to the existing file:

/**
 * Preprocess ISO 19139 XML output template.
 */
function template_preprocess_views_bonus_export_iso(&$vars) {
  _views_bonus_export_shared_preprocess($vars);

  foreach ($vars['themed_rows'] as $num => $row) {
    foreach ($row as $field => $content) {
      // Add a semicolon delimiter between multiple values separated by DIV and SPAN tags.
      $content = str_replace(array('</div><div', '</span><span'), array('</div>; <div', '</span>; <span'), $content);
      // Strip HTML tags (not supported in ISO 19139 XML).
      $content = strip_tags($content);
      // Escape bare ampersands, skipping the entities produced by check_plain()
      // to prevent double encoding.
      $content = preg_replace('/&(?!(amp|quot|#039|lt|gt);)/', '&amp;', $content);
      // Convert any remaining < and > to entities.
      $content = str_replace(array('<', '>'), array('&lt;', '&gt;'), $content);
      $vars['themed_rows'][$num][$field] = trim($content);
    }
  }
}
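To see what this cleanup does to a single field value, the same four steps can be run outside of Drupal. This is only a sketch: the function name iso_clean_field() and the sample markup are made up for the example.

```php
<?php
/**
 * Standalone copy of the per-field cleanup, for experimenting outside Drupal.
 * The name iso_clean_field() is hypothetical.
 */
function iso_clean_field($content) {
  // Insert "; " between adjacent DIV or SPAN wrapped values.
  $content = str_replace(array('</div><div', '</span><span'), array('</div>; <div', '</span>; <span'), $content);
  // Remove all markup.
  $content = strip_tags($content);
  // Escape bare ampersands without touching existing entities.
  $content = preg_replace('/&(?!(amp|quot|#039|lt|gt);)/', '&amp;', $content);
  // Escape any remaining angle brackets.
  $content = str_replace(array('<', '>'), array('&lt;', '&gt;'), $content);
  return trim($content);
}

$sample = '<div class="field-item">Rock &amp; soil</div><div class="field-item">Wells & springs</div>';
print iso_clean_field($sample) . "\n";
// prints: Rock &amp; soil; Wells &amp; springs
```

The already-encoded ampersand in the first value is left alone, while the bare one in the second value is escaped.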

views_bonus/export/views_bonus_export.views.inc

Appended a new array to the existing file. I decided to stick with the existing XML icon.

      'views_iso' => array(
        'title' => t('ISO 19139 XML file'),
        'help' => t('Display the view as an ISO 19139 XML file.'),
        'path' => $path,
        'handler' => 'views_bonus_plugin_style_export_iso',
        'parent' => 'views_bonus_export',
        'theme' => 'views_bonus_export_iso',
        'theme file' => 'views_bonus_export.theme.inc',
        'uses row plugin' => FALSE,
        'uses fields' => TRUE,
        'uses options' => TRUE,
        'type' => 'feed',
        'export headers' => array('Content-Type: text/xml'),
        'export feed type' => 'xml',
      ),

views_bonus/export/views_bonus_plugin_style_export_iso.inc

Added a new style plugin for Views. In keeping with Drupal practice, the PHP tag is not closed in include files.

<?php
// $Id$
/**
 * @file
 * Plugin include file for the ISO 19139 export style plugin.
 */

/**
 * ISO 19139 style plugin, based on the generalized export style plugin.
 *
 * @ingroup views_style_plugins
 */
class views_bonus_plugin_style_export_iso extends views_bonus_plugin_style_export {
  var $feed_text = 'XML';
  var $feed_file = 'view-%view.xml';

  /**
   * Initialize plugin.
   *
   * Set feed image for shared rendering later.
   */
  function init(&$view, &$display, $options = NULL) {
    // Pass $options through unchanged; "$options = NULL" here would discard them.
    parent::init($view, $display, $options);
    $this->feed_image = drupal_get_path('module', 'views_bonus_export') . '/images/xml.png';
  }
}

views_bonus/export/views-bonus-export-iso.tpl.php

Added a new template file for the ISO 19139 XML output - this is where the main action takes place.