Backups and Recovery

~~Currently due to some issues with the S3 recovery we've got a bit of an issue with the Fedora restore so this information is current as of 7/5/16.~~

As of 7/21, a test restore using Staging data worked with the S3 storage method, so the S3 procedure below is current again.

Here is the current backup strategy as a diagram:

Recovery Options

Fedora:

S3:

Currently we use the s3 sync tool (akin to rsync for S3) to copy key Fedora data into the chf-hydra-backup bucket. The bucket name is a slight misnomer, since it now holds backups for ArchivesSpace as well. Fedora data is copied into two locations:

FedoraBackup (contains all Fedora data)

TomcatConfig (contains the Tomcat configuration and levelDB)

Both sets are needed to do a full restore.

Note: As a reminder, although S3's visual interface shows folders, those locations are really just the leading part of the key (path) of individually stored objects.

Download both sets of data to the new machine. You'll want to make sure FedoraBackup goes to a disk with enough space to handle all of our data.
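The download step can be sketched roughly as follows. This assumes the AWS CLI is installed with credentials for the bucket, and that FedoraBackup and TomcatConfig sit at the top level of chf-hydra-backup. The `run` helper, the function name, and the destination directory are hypothetical. Commands are only printed unless DRY_RUN=0 is set.

```shell
set -eu

BUCKET="chf-hydra-backup"   # bucket name from this page

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Pull both backup sets into the given directory (hypothetical location;
# choose a disk with enough space for all of the Fedora data).
download_fedora_backups() {
  dest="$1"
  run aws s3 sync "s3://${BUCKET}/FedoraBackup" "${dest}/FedoraBackup"
  run aws s3 sync "s3://${BUCKET}/TomcatConfig" "${dest}/TomcatConfig"
}
```

For example, `download_fedora_backups /mnt/restore` prints the two sync commands for review, and `DRY_RUN=0 download_fedora_backups /mnt/restore` runs them.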

Stop Tomcat

Replace the contents of the default Fedora location (/opt/fedora-data) on the new Ansible-built machine with the data copied from FedoraBackup.

Then go to /var/lib/tomcat7/webapps/ and replace the fedora folder with the fedora folder downloaded from TomcatConfig.

This will migrate the Fedora database. You will need to finish bringing over the Users and Minter state for a full restore.
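A minimal sketch of the file replacement steps above, under several assumptions: the Tomcat service is named tomcat7 (inferred from the webapps path), the downloaded FedoraBackup directory mirrors the contents of /opt/fedora-data, and TomcatConfig contains the fedora folder directly. The `run` helper and function name are hypothetical, and rsync is swapped in here as one way to do the replacement. Commands are only printed unless DRY_RUN=0 is set.

```shell
set -eu

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# $1: directory holding the downloaded FedoraBackup and TomcatConfig sets.
replace_fedora_files() {
  src="$1"
  run sudo service tomcat7 stop
  # --delete removes default files that are not part of the backup.
  run sudo rsync -a --delete "${src}/FedoraBackup/" /opt/fedora-data/
  run sudo rsync -a --delete "${src}/TomcatConfig/fedora/" /var/lib/tomcat7/webapps/fedora/
}
```

Review the printed commands first with `replace_fedora_files /mnt/restore`, since `--delete` discards whatever the fresh machine had in those directories.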

Users:

Go to S3 and download the postgres backup files.

If Tomcat is not stopped, stop Tomcat.

Restart the postgres service. This should drop the connection that Hydra holds to its database while running, so that you can change the database.

In Postgres delete the automatically generated chf_hydra database

Log in via

psql -U postgres

and run

DROP DATABASE chf_hydra;

Then make a blank chf_hydra database:

CREATE DATABASE chf_hydra;

Then import the downloaded database

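Either (pg_restore for an archive-format dump such as one made with pg_dump -Fc, psql for a plain SQL dump):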

pg_restore -d chf_hydra -U postgres chf_hydra.dump
or
psql chf_hydra < chf_hydra_dump.sql

Then set permissions by logging in

psql -U postgres

Once logged in run

GRANT Create,Connect,Temporary ON DATABASE chf_hydra TO chf_pg_hydra;

The postgres account password is in ansible-vault (groupvars/all)

You may now restart postgres and Tomcat.
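The Users restore above can be collected into one function, sketched here under assumptions: the services are named tomcat7 and postgresql, the postgres password is supplied however you normally do it (for example via ~/.pgpass), and a dump ending in .sql is a plain SQL dump while anything else is an archive for pg_restore. The `run` helper and function name are hypothetical. Commands are only printed unless DRY_RUN=0 is set.

```shell
set -eu

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# $1: path to the downloaded database dump.
restore_users_db() {
  dump="$1"
  run sudo service tomcat7 stop
  run sudo service postgresql restart
  run psql -U postgres -c "DROP DATABASE chf_hydra;"
  run psql -U postgres -c "CREATE DATABASE chf_hydra;"
  # Pick the import tool from the file extension (an assumption of this sketch).
  case "$dump" in
    *.sql) run psql -U postgres -d chf_hydra -f "$dump" ;;
    *)     run pg_restore -d chf_hydra -U postgres "$dump" ;;
  esac
  run psql -U postgres -c "GRANT CREATE, CONNECT, TEMPORARY ON DATABASE chf_hydra TO chf_pg_hydra;"
}
```

Calling `restore_users_db chf_hydra.dump` prints the full sequence for review before anything is dropped.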

Minter:

Go to S3 and download the minter-state backup files.

Move the Minter state to its new location (/var/sufia). Note: If /var/sufia does not exist yet, create it with sudo mkdir /var/sufia.

Change ownership for the minter state to the owner (hydep) and the group (deploy)

Command: sudo chown -R hydep:deploy /var/sufia

Redis

Redis keeps a database in memory which handles, as far as we currently know (8/1/16), the transaction record data such as the history of edits on a record. It does not contain the actual data, simply the timeline of changes.

redis-dump.rdb must be copied from S3 over to the machine.

It must be owned by redis: sudo chown -R redis:redis filename

Then you will need to stop the redis server (sudo service redis-server stop)

Move redis-dump.rdb to /var/lib/redis/dump.rdb (overwriting the file there called dump.rdb)

Restart redis (sudo service redis-server start)

On startup, redis reads the .rdb dump file and loads that data back into the in-memory database.
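The Redis steps can be sketched as a single function. The `run` helper and function name are hypothetical, and the sketch stops the server before touching the file so that redis cannot overwrite dump.rdb on shutdown. Commands are only printed unless DRY_RUN=0 is set.

```shell
set -eu

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# $1: path to the redis-dump.rdb file downloaded from S3.
restore_redis_dump() {
  dump="$1"
  run sudo service redis-server stop
  # Overwrites the existing dump.rdb, as described above.
  run sudo mv "$dump" /var/lib/redis/dump.rdb
  run sudo chown redis:redis /var/lib/redis/dump.rdb
  run sudo service redis-server start
}
```

For example, `DRY_RUN=0 restore_redis_dump ./redis-dump.rdb` performs the restore.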

Indexing:

Finally you will need to follow the instructions for Reindexing Solr in Hydra under Application Administration to index all the data so it shows up in Hydra.

Note: I found that a few files with versioning got missed on the first reindex so you may want to run it a second time if some data is missing in Hydra.

Last steps:

Finally, restart Tomcat and Apache.

Snapshots:

If something goes wrong, go to

EC2→Snapshots

Then select the snapshot(s) you want to restore (opt and fed_data) and create a volume from each one. This gives you new opt and Fedora data disks. If the damage is local to the existing machine, you can attach these volumes to it in place of the damaged disks.

If a new machine is needed

Run Ansible and Capistrano to build a fresh machine and, when it comes up, turn it off. Detach the automatically generated /opt disk (/dev/xvdg or /dev/sdg by name).

Attach the restored opt and fedora disk.
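The same snapshot restore can also be done from the AWS CLI instead of the EC2 console. The following is only a sketch: every ID, the availability zone, and the function names are hypothetical, and the `run` helper is illustrative. A volume must be created in the same availability zone as the instance it will be attached to. Commands are only printed unless DRY_RUN=0 is set.

```shell
set -eu

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# $1: snapshot ID, $2: availability zone of the target instance.
volume_from_snapshot() {
  run aws ec2 create-volume --snapshot-id "$1" --availability-zone "$2"
}

# $1: volume ID of the automatically generated disk to remove.
detach_generated_volume() {
  run aws ec2 detach-volume --volume-id "$1"
}

# $1: restored volume ID, $2: instance ID, $3: device name (e.g. /dev/sdg).
attach_restored_volume() {
  run aws ec2 attach-volume --volume-id "$1" --instance-id "$2" --device "$3"
}
```

When run for real, `create-volume` reports the new volume's ID, which is then passed to `attach_restored_volume` once the instance is stopped and the generated disk is detached.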