Development of USGIN Amazon Machine Image (AMI)

Document Information
Document ID: 
app2009-010

This document outlines the development of the USGIN Amazon Machine Image. The purpose of this image is to provide a means by which any group can implement a web server capable of delivering data in USGIN-endorsed formats at a minimal startup cost. Providing an AMI also allows a user to avoid hardware issues as well as some of the complicated configurations that might prove an obstacle to participation in USGIN. It is our hope that as much as possible, this AMI can work out-of-the-box, making it easy for geologists and geological surveys to focus on what they do best - generating and sharing really good geoscience data.

Initial Creation of the Instance

  1. Log in to the web console located at https://console.aws.amazon.com/ec2/home
  2. Go to the Instances page by clicking the link on the left, and then click the "Launch Instance" button at the top of the window.
  3. Click the "Community AMIs" tab. In the search box, search for ami-ccf615a5.
    1. This is an image created as part of the Alestic.com series.
    2. This runs Ubuntu 9.04 Jaunty Server
    3. More information about the image
  4. Create one small instance
  5. Create a new key-pair for root authentication and save the .pem file someplace safe.
    1. I created a key-pair called "BaseAdmin"
    2. BaseAdmin.pem located at \\malachite\workspace\EC2\newinstance\Keys at present
  6. Set up the Security Group, which is essentially a set of firewall rules.
    1. Used "StandardFirewallRules" group. This allows traffic on:
      1. Port 22 (SSH) from anywhere (0.0.0.0/0)
      2. Port 80 (HTTP) from anywhere (0.0.0.0/0)
      3. Port 3306 (MySQL) from the Tucson AZGS Office (159.87.39.14/24)
      4. Port 8080 (Tomcat) from the Tucson AZGS Office
  7. Start up the Instance. This returns an Instance ID, a Public DNS name and a Private DNS name:
    1. Instance ID: i-5e42c436
    2. Public DNS: ec2-174-129-102-90.compute-1.amazonaws.com
    3. Private DNS: domU-12-31-38-00-3D-B3.compute1.internal
  8. Generate a new Elastic IP address by going to the Elastic IPs window and clicking "Allocate New Address".
  9. Link the IP to the new instance by clicking the new Elastic IP and then the "Associate" button.
    1. Elastic IP for this instance: 174.129.193.63
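
The office rule in step 6 is written as 159.87.39.14/24. With a /24 prefix only the first three octets are significant, so the rule admits every address from 159.87.39.0 to 159.87.39.255 rather than a single host. A minimal sketch of that arithmetic follows; the function name cidr_network is made up for illustration and is not part of any AWS tool.

```shell
# Print the network address that a CIDR rule actually covers.
# cidr_network is a hypothetical helper used only for illustration.
cidr_network() {
    local ip=$1 prefix=$2
    local a b c d
    # Split the dotted quad into its four octets.
    IFS=. read -r a b c d <<EOF
$ip
EOF
    local addr=$(( (a << 24) | (b << 16) | (c << 8) | d ))
    # Keep the top $prefix bits and clear the rest.
    local mask=$(( (4294967295 << (32 - prefix)) & 4294967295 ))
    local net=$(( addr & mask ))
    echo "$(( (net >> 24) & 255 )).$(( (net >> 16) & 255 )).$(( (net >> 8) & 255 )).$(( net & 255 ))"
}

# Prints 159.87.39.0
cidr_network 159.87.39.14 24
```

To restrict a rule to one machine, a /32 prefix would be the usual choice.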

Create and Mount an Elastic Block Data-Store

If an Amazon EC2 instance crashes, everything on it is lost. Your only recourse is to launch a new instance from the last image you made of your machine. For that reason, we want to make sure that any data, logs or dynamic configuration files are located on an Elastic Block Store (EBS) volume. An EBS volume persists independently of the instance, and snapshots of it are stored on Amazon's S3 service, so this data will not be lost should your EC2 instance crash and burn.

Create an Elastic Block Store Volume

  1. Log in to the EC2 Web Console.
  2. Go to the Elastic Block Storage window by clicking the Volumes link under Elastic Block Store on the left side of the page.
  3. Click the Create Volume Button.
  4. You will need to allocate space to the volume. Note that you will pay $0.10 per month for each GB that you allocate, even if the volume is empty. The capacity cannot be increased later, although you can relatively easily create a new, larger volume and transfer information to it.
  5. Make sure that the Availability Zone is the same as the Instance to which you'd like to link the volume.
  6. If you'd like to, specify a snapshot to use to replicate data from a backup.
  7. After the Volume is created, select it from the list and click the "Attach Volume" button to link the volume to your EC2 instance. The attachment point is the device name under which the volume will appear on the instance. I wasn't sure what to pick, so I chose /dev/sdh because that's what was used before.

Create a File System on the Volume

  1. Log in to the Instance as root and type the following command: mkfs -t ext3 /dev/sdh
    1. This creates an ext3 file system. There are other options, but I don't know the drawbacks/benefits.

Mount the Data Store

  1. Log in to the instance as root.
  2. Type the following commands:
    1. mkdir /mnt/data-store: This creates a folder to act as the mount point.
    2. mount /dev/sdh /mnt/data-store: This mounts the volume to the folder you created.
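
The mount steps above can be wrapped so that running them twice does no harm. This is only a sketch: the function names is_mounted and mount_data_store are made up here, and the optional second argument to is_mounted exists purely so the check can be tried against a sample file instead of the live mount table.

```shell
# Return success if something is mounted at the given mount point.
# The optional second argument names the mount table to read;
# it defaults to /proc/mounts.
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Create the mount point if needed and mount the volume only when
# it is not mounted already. Must be run as root.
mount_data_store() {
    local device=$1 mount_point=$2
    if is_mounted "$mount_point"; then
        echo "$mount_point is already mounted"
        return 0
    fi
    mkdir -p "$mount_point" && mount "$device" "$mount_point"
}

# Usage (as root): mount_data_store /dev/sdh /mnt/data-store
```

A plain mount does not survive a reboot, so the function would have to be run again (or an /etc/fstab entry added) after the instance restarts.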

Logging In and Adding New Users

By default, the image that we're using in our instance (ami-ccf615a5) only allows SSH logins with valid certificates. This means that users defined on the instance cannot log in over SSH using usernames and passwords; they must have the appropriate certificates available.

The configuration of SSH access is controlled by the file /etc/ssh/sshd_config

First Time Login

Locate the .pem file that you downloaded when creating the instance. See this post for information on creating an instance. Next, follow these instructions to create a PuTTY Private Key (.ppk) to use to log in using PuTTY.

  1. Run PuTTYgen. It can be downloaded here: http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html 
  2. At the top of the window, go to Conversions > Import Key. Browse to the .pem file
  3. Create a blank .txt document called "authorized_keys.txt". Paste the public key text from the field at the top of the PuTTYgen window into this file, save it and close it.
  4. Remove the .txt extension from the authorized_keys file. You may need this file later - keep it safe!
  5. Back in PuTTYgen, you'll next save the private key in PuTTY's own format. If you'd like, give it a passphrase and a custom comment before clicking "Save Private Key" in the lower right. After that you're done with PuTTYgen.

Now, use your .ppk file to create a login session in PuTTY.

  1. In the tree on the left, go to Connection > SSH > Auth
  2. In the text box in the middle of the window, specify the path to the .ppk file, or browse to it by clicking "Browse..."
  3. In the tree on the left, go back to Session
  4. Specify the Host Name (or IP address), give the session a name in the text box in the middle of the window beneath "Saved Sessions", then click the "Save" button.
  5. Click "Open" on the lower right to connect
  6. The SSH terminal window opens, and prompts you for a username. Type "root" and hit Enter.
    1. This is the OS-level username to which the certificate you are using is bound. If you are trying to log in as a different user, see below.
  7. If you specified a passphrase for your private key, it will prompt you for it. Type it in and hit Enter. Now you're logged in.

Adding a New User, Allowing the User to Log In

  1. First, make sure that a key-pair is created for the user using the EC2 Web Console.
  2. Download the .pem file and follow the above directions to generate a .ppk file for the new key-pair.
  3. Log in to the instance using a different certificate than the one you just made. Make sure you're logging in as a user who has permission to create new users (i.e. root).
  4. Once you're logged in, type adduser <username> where <username> is the name of the new user you wish to create. You may wish to add the user to a group in order to control permission effectively using the useradd command (See Helpful Commands).
  5. Log in to WinSCP using your root certificate. Place the authorized_keys file you generated into /home/<username>/.ssh. You'll have to create the .ssh folder.
  6. Within WinSCP, adjust the permissions as follows
    1. authorized_keys file: 0600
    2. .ssh folder: 0700 and change the group and ownership to <username>.
  7. Back in PuTTY, still logged in as root, change the ownership of the authorized_keys file by typing the following command: 
    1. chown <username>:<username> /home/<username>/.ssh/authorized_keys
  8. Now you can use the .ppk file you generated to log in as the new user.
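
For anyone who prefers the command line to WinSCP, steps 5 through 7 can be sketched as a single function. This is an illustration, not the procedure we actually followed: install_authorized_keys is a made-up name, and the optional third argument (the owner) only works when run as root.

```shell
# Install a public key file as <home>/.ssh/authorized_keys with the
# permissions listed above (0700 on the folder, 0600 on the file).
# install_authorized_keys is a hypothetical helper.
install_authorized_keys() {
    local home_dir=$1 key_file=$2 owner=${3:-}
    local ssh_dir="$home_dir/.ssh"
    mkdir -p "$ssh_dir" || return 1
    cp "$key_file" "$ssh_dir/authorized_keys" || return 1
    chmod 0700 "$ssh_dir"
    chmod 0600 "$ssh_dir/authorized_keys"
    # Hand the folder and file over to the new user when an owner is given.
    if [ -n "$owner" ]; then
        chown -R "$owner:$owner" "$ssh_dir"
    fi
}

# Usage (as root): install_authorized_keys /home/<username> authorized_keys <username>
```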

Create an Amazon Machine Image

Building an AMI is essentially a way for you to backup and share the Amazon EC2 Instance that you've created. It is also useful for creating a fall-back point if an installation or process goes awry, and you'd like to revert to a previous state.

Prepare to Bundle

  1. You'll need to locate a few things:
    1. The private key that identifies your Amazon Web Services account. This is a .pem file.
    2. The public key (X.509 certificate) that identifies your Amazon Web Services account. This is also a .pem file.
    3. Your Amazon Web Services Account number.
    4. Your Amazon Web Services Access Key ID and Secret Access Key.
  2. Put the two .pem files somewhere accessible from your instance. However, you want to put them in a place that will not get bundled into the image itself. The /mnt directory is a great place. You could also put the files into your Elastic Block Store if you've built one.

Bundle the Image

  1. The image I used to spin up the Instance in the first place came with the Amazon Web Services Tools already installed. Otherwise these would have to be installed on the Virtual Machine.
  2. Bundling the instance uses the command ec2-bundle-vol. For more details on the command, you can type ec2-bundle-vol --help. Here is the command line that you'll use to execute the bundling:
    1. ec2-bundle-vol -d <location to put image> -k <path to private key .pem> -c <path to public key .pem> -u <AWS Account Number> -r i386 -p <image name> -s 9000
  3. Next you'll need to upload the image to Amazon S3, which is where bundled images are stored. The -b option names the S3 bucket to upload into:
    1. ec2-upload-bundle -b <image name> -m <path to .xml manifest> -a <Access Key ID> -s <Secret Access Key>
    2. Note: The manifest should be located in the <location to put image> that you specified in the previous step.
  4. Finally, log in to the EC2 Web Console and use the "Register New AMI" button in the AMIs page. The path to your manifest should be <image name>/<image name>.manifest.xml.
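
Because the two command lines share several values, it can help to print them from one place before running anything. The sketch below only echoes the commands, using the same flags as above. The function name bundle_commands, the image name, account number and key file names are all invented for the example, and AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are placeholder variable names.

```shell
# Print (do not run) the bundle and upload commands for one image.
# bundle_commands is a hypothetical helper; every value passed to it
# below is an example, not a real account or key.
bundle_commands() {
    local image=$1 account=$2 key=$3 cert=$4 dest=${5:-/mnt}
    echo "ec2-bundle-vol -d $dest -k $key -c $cert -u $account -r i386 -p $image -s 9000"
    echo "ec2-upload-bundle -b $image -m $dest/$image.manifest.xml -a \$AWS_ACCESS_KEY_ID -s \$AWS_SECRET_ACCESS_KEY"
}

bundle_commands usgin-base 123456789012 /mnt/pk.pem /mnt/cert.pem
```

Reviewing the printed lines first makes it easier to confirm that the manifest path handed to ec2-upload-bundle matches the directory and image name given to ec2-bundle-vol.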

Remaining Issues

  1. This process created a Private Image. I don't know how to create a public one.
  2. There seems to be a little bit of lag-time after you register the image. During this time you may see the following issues:
    1. The resulting Image does not show up as "Owned By Me" in the EC2 Web Console.
    2. The metadata in the manifest seems pretty incomplete.
    3. When trying to Launch an Instance from the Image, it does not allow the launch of a small instance.

Software Installations

The basic USGIN Amazon Machine Image will contain a few applications used to provide geoscience data services. All the software is free and open-source. These chapters provide walkthroughs of the installation process that was used to install these applications on our AMI.

Apache HTTP Server 2.x

Preparation: Updates and Upgrades

	apt-get update
	apt-get upgrade

These commands update your apt system with what is available in the repositories, and upgrade any packages already installed to which upgrades are available. This seems like a good thing to do on a regular basis, or at least before any software installations.

 

Installing Apache

	apt-get install apache2

Simple. This installs the latest version of Apache HTTP Server 2.x. At the time of this writing, that is version 2.2.14. When the process is complete the server will be started, and you should be able to test it by visiting http://<Elastic IP Address>. You should see a message "It Works!".

 

Starting and Stopping the Server

The commands are simple:

	/etc/init.d/apache2 start
	/etc/init.d/apache2 stop
	/etc/init.d/apache2 reload

These commands should be run as root. This might sound like a security hole, but it isn't. From the Apache Documentation:

If the Listen specified in the configuration file is default of 80 (or any other port below 1024), then it is necessary to have root privileges in order to start apache, so that it can bind to this privileged port. Once the server has started and performed a few preliminary activities such as opening its log files, it will launch several child processes which do the work of listening for and answering requests from clients. The main httpd process continues to run as the root user, but the child processes run as a less privileged user.

From another page of the Apache Documentation:

In typical operation, Apache is started by the root user, and it switches to the user defined by the User directive to serve hits.

If you take a look at /etc/apache2/apache2.conf you'll find

	# These need to be set in /etc/apache2/envvars
	User ${APACHE_RUN_USER}
	Group ${APACHE_RUN_GROUP}

And sure enough, if you look at /etc/apache2/envvars you'll find

	export APACHE_RUN_USER=www-data
	export APACHE_RUN_GROUP=www-data

Without you even realizing it, apt created this new user www-data and set up Apache to use it for child processes.

 

Setup A Website

  1. Copy /etc/apache2/sites-available/default to /etc/apache2/sites-available/usgin. Open the copied file for editing and make the following changes:
    1. Change ServerAdmin from webmaster@localhost to ryan.clark@azgs.az.gov.
    2. Change DocumentRoot from /var/www to /mnt/data-store/sites/usgin/www.
    3. Change <Directory /var/www/> to <Directory /mnt/data-store/sites/usgin/www>.
    4. Change ErrorLog to /mnt/data-store/sites/usgin/logs/error.log
    5. Change CustomLog to /mnt/data-store/sites/usgin/logs/access.log
    6. Add the line ServerName vm2.usgin.org
  2. Create the folders that the site will point to and assign appropriate permissions.
    	mkdir /mnt/data-store/sites
    	mkdir /mnt/data-store/sites/usgin
    	mkdir /mnt/data-store/sites/usgin/logs
    	mkdir /mnt/data-store/sites/usgin/www
    	chown root:adm /mnt/data-store/sites/usgin/logs
    	chmod 0750 /mnt/data-store/sites/usgin/logs
    	chmod 0755 /mnt/data-store/sites/usgin/www
  3. Copy the generic .html file from the default site into the new location.
    	cp /var/www/index.html /mnt/data-store/sites/usgin/www
  4. Enable the new website, disable the default one, and reload Apache.
    	a2ensite usgin
    	a2dissite default
    	/etc/init.d/apache2 reload
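
The path edits in step 1 are mechanical, so they can also be applied with sed instead of by hand. This is a rough sketch under some assumptions: rewrite_site is a made-up name, the stock site file is assumed to live in /etc/apache2/sites-available and to use /var/www as its DocumentRoot, and the "combined" log format is assumed for the access log. ServerAdmin and ServerName are left for you to edit by hand.

```shell
# Rewrite a copy of the default site, read from standard input, so that
# it points at a new site root. rewrite_site is a hypothetical helper
# and assumes the stock Ubuntu layout (DocumentRoot /var/www).
rewrite_site() {
    local site_root=$1
    sed -e "s|DocumentRoot /var/www.*|DocumentRoot $site_root/www|" \
        -e "s|<Directory /var/www/>|<Directory $site_root/www>|" \
        -e "s|ErrorLog .*|ErrorLog $site_root/logs/error.log|" \
        -e "s|CustomLog .*|CustomLog $site_root/logs/access.log combined|"
}

# Usage (as root):
#   rewrite_site /mnt/data-store/sites/usgin \
#       < /etc/apache2/sites-available/default \
#       > /etc/apache2/sites-available/usgin
```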

Tomcat 6.x

 

Preparation: Updates and Upgrades

	apt-get update
	apt-get upgrade

 These commands update your apt system with what is available in the repositories, and upgrade any packages already installed to which upgrades are available. This seems like a good thing to do on a regular basis, or at least before any software installations.

 

Install Sun's Java 6 JDK

You could skip this step and jump down to Installing Tomcat 6.x, but the trouble is that the tomcat6 package by default installs an Open-Source JDK instead of that published by Sun Microsystems. This is all fine and good for Tomcat, but GeoServer does not work with the OpenJDK. Since we'll be wanting to use GeoServer on the machine, we need to install Sun's JDK. Fortunately, it's as easy as anything:

	apt-get install sun-java6-jdk

* Note (7/28/2010) -- Ubuntu 10 encourages you to use the OpenJDK by not allowing you to easily install the Sun JDK. See http://www.clickonf5.org/linux/how-install-sun-java-ubuntu-1004-lts/7777 for instructions on how to install the Sun JDK under these circumstances.

Installing Tomcat 6.x

	apt-get install tomcat6

It's just that easy. I also installed the admin package which gives us the tomcat manager and host-manager.

	apt-get install tomcat6-admin

You may also want to install the documentation.

	apt-get install tomcat6-docs

 

Starting and Stopping Tomcat

Use the following commands to start, stop and restart Tomcat

	/etc/init.d/tomcat6 start
	/etc/init.d/tomcat6 stop
	/etc/init.d/tomcat6 restart

These commands should be run with root privileges. The script you're running contains a command specifying the user that should end up running Tomcat itself. This user is called "tomcat6", is unprivileged, and was created when you installed the Tomcat package.

 

Configuring Tomcat 6.x

First, we need to define an admin user who can access the admin webapps that we installed. Open the file /etc/tomcat6/tomcat-users.xml and add the <user> line shown below the commented-out block:

<tomcat-users>
<!--
  <role rolename="tomcat"/>
  <role rolename="role1"/>
  <user username="tomcat" password="tomcat" roles="tomcat"/>
  <user username="both" password="tomcat" roles="tomcat,role1"/>
  <user username="role1" password="tomcat" roles="role1"/>
-->
  <user username="AdminUserName" password="AdminUserPassword" roles="admin,manager"/>
</tomcat-users>

At this point, restart Tomcat using the commands listed above. You can check that it is working by pointing your web browser to http://<Elastic IP Address>:8080. You should see a simple "It Works!" page. You can also point your browser to http://<Elastic IP Address>:8080/manager/html and enter the AdminUserName and AdminUserPassword that you used in the /etc/tomcat6/tomcat-users.xml file.

The next thing to configure is logging. We would like log files to be placed on the Elastic Data-store so that they can be read in the event that the instance crashes and burns. For our purposes, Tomcat logs will reside in /mnt/data-store/tomcat/logs. First we will adjust the paths to the log files in /etc/tomcat6/logging.properties

Change:

1catalina.org.apache.juli.FileHandler.level = FINE
1catalina.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
1catalina.org.apache.juli.FileHandler.prefix = catalina.

2localhost.org.apache.juli.FileHandler.level = FINE
2localhost.org.apache.juli.FileHandler.directory = ${catalina.base}/logs
2localhost.org.apache.juli.FileHandler.prefix = localhost.

to:

1catalina.org.apache.juli.FileHandler.level = FINE
1catalina.org.apache.juli.FileHandler.directory = /mnt/data-store/tomcat/logs
1catalina.org.apache.juli.FileHandler.prefix = catalina.

2localhost.org.apache.juli.FileHandler.level = FINE
2localhost.org.apache.juli.FileHandler.directory = /mnt/data-store/tomcat/logs
2localhost.org.apache.juli.FileHandler.prefix = localhost.

Then let's add the directories, and set permissions appropriately. I copied the permission and ownership from the default log location.

mkdir /mnt/data-store/tomcat
mkdir /mnt/data-store/tomcat/logs
chown tomcat6:adm /mnt/data-store/tomcat/logs
chmod 0750 /mnt/data-store/tomcat/logs
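
The logging.properties change shown earlier touches only the directory lines for the 1catalina and 2localhost handlers, so it can be scripted. The following is a sketch rather than what was actually run on our machine; relocate_tomcat_logs is a made-up name, and a .bak copy of the original file is kept next to it.

```shell
# Point the 1catalina and 2localhost file handlers at a new log
# directory, leaving every other line untouched.
# relocate_tomcat_logs is a hypothetical helper.
relocate_tomcat_logs() {
    local props_file=$1 log_dir=$2
    sed -i.bak \
        -e "/^1catalina\./ s|[\$]{catalina\.base}/logs|$log_dir|" \
        -e "/^2localhost\./ s|[\$]{catalina\.base}/logs|$log_dir|" \
        "$props_file"
}

# Usage (as root):
#   relocate_tomcat_logs /etc/tomcat6/logging.properties /mnt/data-store/tomcat/logs
```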

There also seems to be a problem with the /etc/tomcat6/policy.d/catalina.policy file that prevents any logs from being written. This file specifies what permissions the .jar file that actually does the logging has. Out-of-the-box, the way this file is written prevents log files from being written anywhere. We need to fix it, and make sure that it has read/write permissions to the directory where we want the logs to be. The block below shows the section after our changes; note in particular the two FilePermission entries for /mnt/data-store/tomcat/logs:

// These permissions apply to the logging API
grant codeBase "file:${catalina.home}/bin/tomcat-juli.jar" {
        permission java.util.PropertyPermission "java.util.logging.config.class", "read";
        permission java.util.PropertyPermission "java.util.logging.config.file", "read";
        permission java.lang.RuntimePermission "shutdownHooks";
        permission java.io.FilePermission "${catalina.base}${file.separator}conf${file.separator}logging.properties", "read";
        permission java.util.PropertyPermission "catalina.base", "read";
        permission java.util.logging.LoggingPermission "control";
        permission java.io.FilePermission "/mnt/data-store/tomcat/logs", "read, write";
        permission java.io.FilePermission "/mnt/data-store/tomcat/logs/*", "read, write";
        permission java.lang.RuntimePermission "getClassLoader";
        permission java.lang.RuntimePermission "setContextClassLoader";
        // To enable per context logging configuration, permit read access to the appropriate file.
        // Be sure that the logging configuration is secure before enabling such access
        // eg for the examples web application:
        // permission java.io.FilePermission "${catalina.base}${file.separator}webapps${file.separator}examples${file.separator}WEB-INF${file.separator}classes${file.separator}logging.properties", "read";
};

 

Adjusting Java Memory Allocation

In order for servlets like GeoNetwork and GeoServer to run smoothly, you'll often need to make some adjustments to the memory allocation of the java instance that runs Tomcat. You can do this in a whole bunch of different places, since "Starting Up Tomcat" really means running a whole string of scripts. I made the adjustment by editing /etc/default/tomcat6.

Uncomment the following line and edit it so that it reads:

JAVA_OPTS="-Djava.awt.headless=true -Xms256M -Xmx1024M -XX:MaxPermSize=256m -XX:PermSize=128m"

 

Create a Connector from Apache to Tomcat

I did this on a Windows machine following the instructions I laid out in this post. However, when installing Apache on Ubuntu using the apt system, you end up with a strikingly different Apache configuration than I was used to. I found an incredibly useful walkthrough written by Robert Peters that I'll basically re-write here.

  1. After installing Apache and Tomcat, install the Jk module for Apache:
    apt-get install libapache2-mod-jk
  2. Create a file at /etc/apache2/workers.properties and paste in the following lines:
    #Define 1 real worker using ajp13
    worker.list=worker1
    
    #Set properties for worker1 (ajp13)
    worker.worker1.type=ajp13
    worker.worker1.host=localhost
    worker.worker1.port=8009
  3. Edit your Apache configuration by adding a few lines to /etc/apache2/httpd.conf
    JkWorkersFile /etc/apache2/workers.properties
    JkLogFile /var/log/apache2/mod_jk.log
    JkLogLevel info
    JkLogStampFormat "[%a %b %d %H:%M:%S %Y]"
  4. Next, edit Apache's default site. In my case, this was /etc/apache2/sites-enabled/usgin. If you haven't already messed with the default site, it will be /etc/apache2/sites-enabled/default.
    1. Delete or comment out the line that specifies the DocumentRoot.
    2. Add the following two lines right below the line you just removed:
      JkMount / worker1
      JkMount /* worker1

  5. Enable the "Connector port" 8009 in tomcat by uncommenting the following line in /etc/tomcat6/server.xml:
    <Connector port="8009" protocol="AJP/1.3" redirectPort="8443" />

    Simply remove the <!-- before and the --> after the line to uncomment it.

  6. Now, restart tomcat and then restart apache:
    /etc/init.d/tomcat6 restart
    /etc/init.d/apache2 restart

PostgreSQL and PostGIS

Note: At Ubuntu 10.04 (Lucid), you can install PostgreSQL 8.4 and PostGIS 1.4 using

sudo apt-get install postgresql-8.4-postgis


There are two ways to go about this: The easy way and the hard way. The easy way is to install PostgreSQL 8.3 and PostGIS 1.3. Both of these are out-of-date versions. The hard way installs PostgreSQL 8.4.1 and PostGIS 1.4. I'll outline both ways here. On our machine, I did it the hard way.

PostgreSQL 8.3 and PostGIS 1.3 (The Easy Way)

apt-get install postgresql-8.3-postgis

... and you're done.

 

PostgreSQL 8.4.1 and PostGIS 1.4 (The Hard Way)

First of all -- this walkthrough benefits enormously from blog posts by Mark Feeney and Javier de la Torre.

  1. Update and upgrade your packages:
    apt-get update
    apt-get upgrade

    These commands update your apt system with what is available in the repositories, and upgrade any packages already installed to which upgrades are available. This seems like a good thing to do on a regular basis, or at least before any software installations.

  2. /etc/apt/sources.list is a listing of the repositories used by the apt system. In order to proceed, we need to access some non-standard repositories. Add the following two lines to the file:

    deb http://ppa.launchpad.net/pitti/postgresql/ubuntu jaunty main
    deb-src http://ppa.launchpad.net/pitti/postgresql/ubuntu jaunty main
    
  3. Next, get the key for these new sources. From your command prompt:
    apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 8683D8A2
  4. Now update sources one more time:
    apt-get update
  5. Now you can finally install PostgreSQL 8.4.1!
    apt-get install postgresql-8.4
  6. This starts the PostgreSQL service, but by default it listens on port 5433. To change it to the standard PostgreSQL port (5432), use the following command:
    sed -i.bak -e 's/port = 5433/port = 5432/' /etc/postgresql/8.4/main/postgresql.conf

    Now stop and restart PostgreSQL

    /etc/init.d/postgresql-8.4 stop
    /etc/init.d/postgresql-8.4 start
  7. Now moving on to PostGIS - First we'll need to install the libraries that will be needed to build PostGIS from source:
    apt-get install postgresql-server-dev-8.4 libpq-dev
    apt-get install libgeos-dev
    apt-get install proj
  8. Download and extract the PostGIS tarball:
    wget http://postgis.refractions.net/download/postgis-1.4.0.tar.gz
    tar xvfz postgis-1.4.0.tar.gz 
  9. Now build and install PostGIS:
    cd postgis-1.4.0
    ./configure
    make
    make install
  10. Moving on to configuration... give your postgres user a password at the OS level, and also within PostgreSQL. Enter the new password when passwd prompts for it:
    passwd postgres
    su postgres
    psql -c "ALTER USER postgres WITH PASSWORD '[password]';"
  11. Now create a template PostGIS-enabled database:
    createdb geodb
    createlang -d geodb plpgsql
    psql -d geodb -f /usr/share/postgresql/8.4/contrib/postgis.sql
    psql -d geodb -f /usr/share/postgresql/8.4/contrib/spatial_ref_sys.sql
    psql -d geodb -c "SELECT postgis_lib_version();"

    If the last command returns "1.4.0" then the template database is properly setup.

     

Allowing External TCP/IP Connections to PostgreSQL

Having troubles here right now... can get it to work with SSH tunneling though.
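
For reference, remote access to PostgreSQL normally depends on two settings: listen_addresses in postgresql.conf and a matching "host" line in pg_hba.conf. The sketch below is untested on the instance described here. The function name allow_remote_postgres is made up, and the file locations in the usage comment assume the Ubuntu layout under /etc/postgresql/8.4/main. Note also that the security group described earlier lists ports 22, 80, 3306 and 8080 but not 5432, so a rule for that port would presumably be needed as well.

```shell
# allow_remote_postgres is a hypothetical helper. It edits the two
# files PostgreSQL consults for remote access:
#   $1  postgresql.conf  (listen_addresses)
#   $2  pg_hba.conf      (one "host" line per allowed network)
#   $3  client network in CIDR form
allow_remote_postgres() {
    local conf=$1 hba=$2 cidr=$3
    # Listen on all interfaces instead of only localhost.
    sed -i.bak -e "s|^#\{0,1\}listen_addresses *=.*|listen_addresses = '*'|" "$conf"
    # Allow password (md5) logins from the given network.
    echo "host    all    all    $cidr    md5" >> "$hba"
}

# Usage (as root, then restart PostgreSQL):
#   allow_remote_postgres /etc/postgresql/8.4/main/postgresql.conf \
#       /etc/postgresql/8.4/main/pg_hba.conf 159.87.39.0/24
```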

 

Put Data on the Elastic Volume

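Stop PostgreSQL first so that the files are not changing while they are copied, then copy the existing data directory onto the Elastic volume and set its ownership and permissions:

/etc/init.d/postgresql-8.4 stop
mkdir /mnt/data-store/postgresql
mkdir /mnt/data-store/postgresql/data
cp -R /var/lib/postgresql/8.4/main/* /mnt/data-store/postgresql/data
chown -R postgres:postgres /mnt/data-store/postgresql/data
chmod -R 0700 /mnt/data-store/postgresql/data

Then edit /etc/postgresql/8.4/main/postgresql.conf so that the data directory points at the new location:

data_directory = '/mnt/data-store/postgresql/data'

Finally, start PostgreSQL again:

/etc/init.d/postgresql-8.4 start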

 

Logging - A lot to learn...

And I haven't done anything about it. No changes have been made to the logging configurations.

MySQL 5.1

Install MySQL 5.1

	apt-get install mysql-server-5.1

During installation, specify a password for the root user. I was also prompted to configure Postfix, and selected the "No configuration" option.

 

Allow Remote Access to the MySQL Server

First of all, we have to tell MySQL to listen to traffic coming from places other than 127.0.0.1. This is done by editing the file at /etc/mysql/my.cnf. Adjust the following line:

	bind-address = 0.0.0.0 # this was 127.0.0.1

 

Allow a Specific User Remote Access

MySQL requires that remote access users be specifically appointed. Issue the following command:

	mysql -u root -p

You'll be prompted for the root MySQL user's password. After entering it, hit enter, and you'll be in the mysql console with a "mysql>" prompt. Enter the following two lines:

	grant all privileges on *.* to 'root'@'[the ip address you'll be connecting from]'
	identified by '[password]';

This means that if user "root" connects from the given IP address using the password you've specified, that user will have full privileges on all databases.

GeoServer (for Simple Services)

  1. First, I needed a clean version of GeoServer 2.0 (without sample data). Here is how I created it:
    1. I downloaded the .war file for GeoServer from http://geoserver.org/display/GEOS/Stable.
    2. You get a .zip archive with the geoserver.war inside it. Extract the .war.
    3. Change the .war extension to .zip and extract that to a folder called "geoserver"
    4. Added context snippet, ran this webapp through Tomcat on my local computer. Removed example workspaces, datastores, layers and styles.
    5. Removed sample data from the installation directory.
    6. Adjusted the Administrator name and password. This is configured in geoserver/data/security/users.properties.
    7. Downloaded the AppSchema extension from http://downloads.sourceforge.net/geoserver/geoserver-2.0.0-app-schema-plugin.zip.
    8. Extract this and place the two .jar files in geoserver/WEB-INF/lib.
  2. Using WinSCP, upload the contents of the folder to /mnt/data-store/geoserver/gsvr.
  3. Add a context snippet file that points Tomcat to the Geoserver folder. The file is /etc/tomcat6/Catalina/localhost/gsvr.xml. Here are its contents:
    <Context path="/gsvr" 
    	docBase="/mnt/data-store/geoserver/gsvr" debug="0"
    	reloadable="true" cachingAllowed="false"
    	allowLinking="true"/>
  4. Adjust the Java permissions in /etc/tomcat6/policy.d/04webapps.policy by adding the following line:
    permission java.security.AllPermission;
  5. Restart Tomcat:
    /etc/init.d/tomcat6 restart

GeoNetwork 2.4.2

Setup the MySQL Database for Geonetwork to Use

  1. From the command prompt on the instance, type the following commands:
    mysqladmin create geonetwork -u root -p

    You will be prompted for the password for the root MySQL user.

  2. Define a geonetwork MySQL user with permissions on the new database:
    mysql -u root -p (enter password when prompted)
    grant all privileges on geonetwork.* to 'geonetwork'@'localhost' identified by 'password';
    grant all privileges on geonetwork.* to 'geonetwork'@'159.87.39.14' identified by 'password';
  3. In order to populate the database for GeoNetwork's use, you'll need to have GAST installed on a remote machine at the IP address used in the "grant" line above. Using the GAST tool on that remote machine, you can connect to the MySQL database you just made using the geonetwork user, and use the Setup tool to add the tables and data that are needed. 

Installing GeoNetwork 2.4.2

  1. First, you'll want to make an install script that looks like this:
    <AutomatedInstallation langpack="eng">
        <com.izforge.izpack.panels.HelloPanel/>
        <com.izforge.izpack.panels.HTMLLicencePanel/>
        <com.izforge.izpack.panels.TargetPanel>
            <installpath>/mnt/data-store/geonetwork</installpath>
        </com.izforge.izpack.panels.TargetPanel>
        <com.izforge.izpack.panels.PacksPanel>
            <selected>
                <pack index="0"/>
                <pack index="1"/>
                <pack index="2"/>
                <pack index="3"/>
            </selected>
        </com.izforge.izpack.panels.PacksPanel>
        <com.izforge.izpack.panels.InstallPanel/>
        <com.izforge.izpack.panels.ShortcutPanel/>
        <com.izforge.izpack.panels.HTMLInfoPanel/>
        <com.izforge.izpack.panels.FinishPanel/>
    </AutomatedInstallation>
    
    Note that you can specify the install location. You'll want it to be on the Elastic Data Store somewhere.
  2. Create a directory for GeoNetwork to live in, switch to it and download the executable .jar file that installs it. Upload the install script to this directory as well. The -O option makes wget save the download under the plain .jar name instead of a name that includes the query string.
    mkdir /mnt/data-store/geonetwork
    cd /mnt/data-store/geonetwork
    wget -O geonetwork-install-2.4.2-0.jar "http://downloads.sourceforge.net/project/geonetwork/GeoNetwork_opensource/v2.4.2/geonetwork-install-2.4.2-0.jar?use_mirror=softlayer"

  3. Install Geonetwork with the following command:
    java -DTRACE=true -jar geonetwork-install-2.4.2-0.jar <path to your install script>

 

Point GeoNetwork at the MySQL Backend

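You'll be editing a file located at /mnt/data-store/geonetwork/web/geonetwork/WEB-INF/config.xml. Find the <resources> node and its children. Disable the McKoi resource by setting enabled="false", then enable the MySQL resource by setting enabled="true" and filling in the user, password and url for the database you created: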

	<resources>
		<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
		<!-- mckoi standalone -->
		<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->

		<resource enabled="false">
			<name>main-db</name>
			<provider>jeeves.resources.dbms.DbmsPool</provider>
			<config>
				<user>xRgAPQLl</user>
				<password>X7ByXqvJ</password>
				<driver>com.mckoi.JDBCDriver</driver>
				<url>jdbc:mckoi://localhost:9157/</url>
				<poolSize>10</poolSize>
			</config>
			<activator class="org.fao.geonet.activators.McKoiActivator">
				<configFile>WEB-INF/db/db.conf</configFile>
			</activator>
		</resource>

		<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->
		<!-- mysql -->
		<!-- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -->

		<resource enabled="true">
			<name>main-db</name>
			<provider>jeeves.resources.dbms.DbmsPool</provider>
			<config>
				<user>geonetwork</user>
				<password>password</password>
				<driver>com.mysql.jdbc.Driver</driver>
				<url>jdbc:mysql://localhost:3306/geonetwork</url>
				<poolSize>10</poolSize>
				<reconnectTime>3600</reconnectTime>
			</config>
		</resource>
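
After saving config.xml, it can be useful to confirm which database GeoNetwork will actually use. The sketch below prints the JDBC driver of every resource marked enabled="true". It assumes the opening <resource> tag and the <driver> element each sit on their own line, as in the listing above. The function name enabled_drivers is made up for this example. With the settings shown here, it should report only com.mysql.jdbc.Driver.

```shell
#!/bin/sh
# Print the JDBC driver of every <resource enabled="true"> block.
# Assumes the <resource> and <driver> tags are on separate lines.
enabled_drivers() {
    awk '
        /<resource[ \t]+enabled="true"/  { on = 1 }
        /<resource[ \t]+enabled="false"/ { on = 0 }
        on && /<driver>/ {
            # Strip everything around the driver class name.
            gsub(/.*<driver>|<\/driver>.*/, "")
            print
        }
        /<\/resource>/ { on = 0 }
    ' "$1"
}

CONFIG="${CONFIG:-/mnt/data-store/geonetwork/web/geonetwork/WEB-INF/config.xml}"
if [ -f "$CONFIG" ]; then
    enabled_drivers "$CONFIG"
else
    echo "config.xml not found at $CONFIG" >&2
fi
```

If two drivers are printed, both resources are enabled and one of them still needs enabled="false".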

 

Adjust GeoServer's Data Directory

GeoNetwork comes with a simple GeoServer installation that is used to draw the basemaps in the Intermap application. The default installation does not point GeoServer at the right place to find its data. Make the following change to /mnt/data-store/geonetwork/web/geoserver/WEB-INF/web.xml:

    <context-param>
        <param-name>GEOSERVER_DATA_DIR</param-name>
        <param-value>/mnt/data-store/geonetwork/data/geoserver_data</param-value>
    </context-param>

 

Adjust GeoNetwork Folder Permissions

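There's probably a more elegant way to handle this, but for now, give the tomcat6 user ownership of the whole GeoNetwork directory so that Tomcat can read and write its files: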

chown -R tomcat6:tomcat6 /mnt/data-store/geonetwork
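
To check that the ownership change reached every file, you can list whatever is still owned by someone else. This is a small sketch; the function name not_owned_by is made up for this example, and an empty result means nothing was missed.

```shell
#!/bin/sh
# List everything under directory $2 that is NOT owned by user $1.
not_owned_by() {
    find "$2" ! -user "$1" -print
}

OWNER="tomcat6"
TARGET="/mnt/data-store/geonetwork"
# Only run the check when both the user and the directory exist.
if id "$OWNER" >/dev/null 2>&1 && [ -d "$TARGET" ]; then
    not_owned_by "$OWNER" "$TARGET"
fi
```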

 

Add Context Snippets for Tomcat

In /etc/tomcat6/Catalina/localhost, place three files:

geonetwork.xml

<?xml version="1.0" encoding="UTF-8"?>
<!-- configuration to point tomcat at the geonetwork directory on the data store -->
<Context docBase="/mnt/data-store/geonetwork/web/geonetwork" path="/geonetwork"></Context>

intermap.xml

<?xml version="1.0" encoding="UTF-8"?>
<!-- configuration to point tomcat at the intermap directory on the data store
(map on the web interface for geonetwork; geoserver map client) -->
<Context docBase="/mnt/data-store/geonetwork/web/intermap" path="/intermap"></Context>

geoserver.xml

<?xml version="1.0" encoding="UTF-8"?>
<!-- configuration to point tomcat at the geoserver (WMS service etc.) directory
on the data store -->
<Context docBase="/mnt/data-store/geonetwork/web/geoserver" path="/geoserver"></Context>
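
Since the three files differ only in the application name, they can also be generated in a loop. The following is a sketch under stated assumptions: write_contexts is a made-up helper, and the web root you pass in must match wherever GeoNetwork was actually installed. The default below assumes the /mnt/data-store/geonetwork directory created during installation. The demo writes into a scratch directory so nothing under /etc is touched.

```shell
#!/bin/sh
# Write geonetwork.xml, intermap.xml and geoserver.xml into a directory.
#   $1 = directory that contains the three web applications
#   $2 = directory to write the context files into
write_contexts() {
    web_root="$1"
    conf_dir="$2"
    for app in geonetwork intermap geoserver; do
        cat > "$conf_dir/$app.xml" <<EOF
<?xml version="1.0" encoding="UTF-8"?>
<Context docBase="$web_root/$app" path="/$app"></Context>
EOF
    done
}

# Assumed install location; change it if yours differs.
WEB_ROOT="${WEB_ROOT:-/mnt/data-store/geonetwork/web}"
demo_dir=$(mktemp -d)
write_contexts "$WEB_ROOT" "$demo_dir"
cat "$demo_dir/geonetwork.xml"
```

Once the output looks right, call write_contexts with /etc/tomcat6/Catalina/localhost as the second argument (as root) to write the real files.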

 

Restart Tomcat

/etc/init.d/tomcat6 restart

Appendices

This section contains tips and helpful hints for using the USGIN EC2 Instance.

Common Procedures

 

"Restarting" the Virtual Machine

You can't actually restart it; instead, you essentially delete it and roll back to a prior machine image.

  1. Use the EC2 Console to terminate the instance.
  2. Still in the console, locate the machine image you want to roll back to and launch an instance of that image.
    1. This post may provide some guidance as to which image to launch.
    2. Specify the details of the instance. We usually create one small instance and use the StandardFirewallRules Security Group and the BaseAdmin key-pair.
  3. Attach the appropriate Elastic Block Store to the new instance at /dev/sdh.
  4. Follow the directions below to connect to the instance via SSH.
  5. Follow the directions below to mount the Elastic Block Store.
  6. Follow the directions below to start Apache, then PostgreSQL and MySQL, then Tomcat.

 

Connect to the Instance Using SSH (and PuTTY)

If you already have the .ppk file for the user you wish to log in as, use it. You may want to set up SSH tunneling if you want to use pgAdmin to connect to PostgreSQL.

If you don't have a .ppk file, follow the instructions outlined in this post.

 

Mounting the Elastic Block Store

First, use the EC2 Console to attach the volume to the instance at /dev/sdh. Then create the mount point and mount the volume:

mkdir -p /mnt/data-store
mount /dev/sdh /mnt/data-store
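
If you mount the volume often, a guard against mounting twice is handy. This is a minimal sketch, assuming the mountpoint utility is installed; if it is missing, the check simply fails and the mount commands are issued anyway. DEVICE defaults to /dev/sdh, the attachment point named above, and should be set to whatever device you attached the volume at. The run helper and variable names are made up for this example. By default the script only prints the commands; set DRY_RUN=0 and run as root to execute them.

```shell
#!/bin/sh
# Sketch only: prints each command unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
DEVICE="${DEVICE:-/dev/sdh}"
MOUNT_POINT="${MOUNT_POINT:-/mnt/data-store}"

# Echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

mount_data_store() {
    # mountpoint -q exits 0 when the path is already a mount point.
    if mountpoint -q "$MOUNT_POINT" 2>/dev/null; then
        echo "$MOUNT_POINT is already mounted"
        return 0
    fi
    run mkdir -p "$MOUNT_POINT" || return 1
    run mount "$DEVICE" "$MOUNT_POINT"
}

mount_data_store
```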

 

Starting, Stopping, Restarting Applications

These commands should be executed with root privileges:

/etc/init.d/apache2 start
/etc/init.d/apache2 stop
/etc/init.d/apache2 reload
/etc/init.d/tomcat6 start
/etc/init.d/tomcat6 stop
/etc/init.d/tomcat6 restart
/etc/init.d/postgresql-8.4 start
/etc/init.d/postgresql-8.4 stop
/etc/init.d/postgresql-8.4 restart
/etc/init.d/mysql start
/etc/init.d/mysql stop
/etc/init.d/mysql restart
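
After launching a fresh instance, the services need to come up in the order given in the restart procedure: Apache, then PostgreSQL and MySQL, then Tomcat. The sketch below loops over the init scripts in that order and stops at the first failure. The run helper and start_stack function are made up for this example. By default it only prints the commands; set DRY_RUN=0 and run as root to execute them.

```shell
#!/bin/sh
# Sketch only: prints each command unless DRY_RUN=0 is set.
DRY_RUN="${DRY_RUN:-1}"
# Start order: web server, databases, then Tomcat.
SERVICES="apache2 postgresql-8.4 mysql tomcat6"

# Echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_stack() {
    for svc in $SERVICES; do
        run "/etc/init.d/$svc" start || {
            echo "failed to start $svc" >&2
            return 1
        }
    done
}

start_stack
```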

Helpful Linux Commands

User Administration

  1. adduser <username>: Create a new user named <username>
  2. userdel -r <username>: Delete the user named <username>. The -r option removes the user's home folder located at /home/<username>.
  3. lastlog: List all users and the last time that they logged in.
  4. groupadd <groupname>: Create a group called <groupname>
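  5. useradd -G <groupname1>,<groupname2>... <username>: Create a new user named <username> who belongs to the listed group or groups. To add an existing user to groups, use usermod -a -G <groupname1>,<groupname2>... <username> instead.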
Permissions
  1. chown <username>:<groupname> <file or directory>: Change the owner of a file or directory to <username> and <groupname>.
  2. chmod <permissions code> <file or directory>: Change the permissions on a file or directory to that specified.
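
The permissions code for chmod is easiest to read as three digits: one each for the owner, the group, and everyone else. Each digit is the sum of read (4), write (2), and execute (1). The short sketch below applies a code to a scratch file and shows the result; mode_string is a made-up helper name for this example.

```shell
#!/bin/sh
# Show the first ten characters of ls -l output: file type plus permissions.
mode_string() {
    ls -ld "$1" | cut -c1-10
}

demo=$(mktemp)
# owner: 4+2 = read/write, group: 4 = read only, others: 0 = no access
chmod 640 "$demo"
mode_string "$demo"    # prints -rw-r-----
rm -f "$demo"
```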
"Hardware" Resources
  1. top: Show the processes using the most resources
  2. ps aux: Show all processes
  3. free: Show information about memory allocation
  4. du <file or folder name>: Show the size of a file or folder, recursive for folders.
  5. kill <PID>: Kill a process by its ID. Use kill -9 <PID> to force the kill.
EC2 Tools
  1. ec2-bundle-vol: Bundle an AMI from your running instance. Use --help for syntax information.
  2. ec2-upload-bundle: Upload your bundle to Amazon S3. Again, use --help for syntax information.
Package Management / Software Installation
  1. apt-get install <package name>: Oh my goodness, it's so easy to install anything!
  2. dpkg -L <package name>: Show a list of all the files that were installed with the named package. Useful since Linux seems so silly about where everything ends up...
Debugging
  1. tail -f <log file>: Real-time display of latest log file changes.
  2. find / -name "<file or folder name fragment>*": Recursively search for folders and files by partial name from root down. Quote the pattern so the shell does not expand the * before find sees it.