Our backups consist of 1) Postgres database (metadata) and 2) files on S3 (original files, also derivatives for convenience). That’s it!

Original files and derivatives

...

Overview

The Science History Institute's Digital Collections offer highlights from our library, archives, and museum collections. The purpose of our Digital Collections is to manage, preserve, and provide access to our digital assets all in one location. Although the Digital Collections include only a small portion of the Science History Institute’s entire collection, new material is added every day. (See our About and FAQ pages for more details.)

The Digital Collections consist of:

  1. a set of digital representations, intended for a Web audience, of physical objects, which range from museum objects of all descriptions to books to taped audio interviews to VHS tapes. (For a good idea of the range of materials, go to our search results and limit your search by genre, format, or medium.) In the description below, when we talk about “original files” we are talking about these digital representations, which take the form of computer files. We store the original files in Amazon S3.

  2. descriptions of the files above, which allow us to find them, keep them in order, search them, and describe them to the public. We store the descriptions in a PostgreSQL database hosted and managed by Heroku.

Backup summary

We store the backups for the original files in a separate S3 bucket that automatically mirrors the contents of the originals. Every night, these backups are copied to a local server. Our IT staff is responsible for making regular copies of them to a local disk, and for storing a series of tape copies offsite.

Heroku offers a service allowing us to “roll back” or revert the database to its state at any point in the past four days. In addition, we store nightly database backups in a dedicated S3 bucket. Our IT staff also makes nightly copies of that S3 bucket to local disk. From there, the database backups join the backups of the original files in offsite tape storage.

Backup details

The details below are intended for an internal, technical Science History Institute audience, and discuss how we back up the “original files” (hereinafter “originals”) and the PostgreSQL database (hereinafter “database”).

Original files are stored in S3, and are backed up within S3 by a process managed by AWS. The backups are then copied to long-term storage by SyncBackPro, which is Windows software running on Promethium managed by Chuck and Ponce (see https://www.2brightsparks.com/syncback/sbpro.html ). (None of this will change when we get rid of Ansible.)

See more at S3 Bucket Setup and Architecture and Backups and Recovery (Historical notes): https://sciencehistory.atlassian.net/wiki/pages/createpage.action?spaceKey=HDCSD&title=Backups%20and%20Recovery%20%28Historical%20notes%29

Heroku database backups

We have three backup/restore mechanisms under Heroku:

1. Nightly .dump backups

We use heroku’s built-in postgres backup functionality to make regular backups that are stored in heroku’s system. This is the most convenient backup to restore from, when it is available and meets your needs.

...

  • To verify that we have scheduled backups, run heroku pg:backups:schedules --app scihist-digicoll-production; you should see a 2AM backup every night.

  • List what backups exist by running heroku pg:backups -a scihist-digicoll-production. Note that the first section is “backups” (which may scroll off screen), and the first column is a backup ID, such as a189.

  • With the backup ID, you can restore production to a past backup (e.g. ID a189) with heroku pg:backups:restore a189 -a scihist-digicoll-production

    • Warning: this will overwrite current production data, with the restored backup!

    • Warning: see note below re: --extensions.

  • Maybe instead you want to restore a production backup to staging, to just look at the data, without actually (yet?) restoring to and overwriting current production? You can do this too:

    • heroku pg:backups:restore scihist-digicoll-production::a189 -a scihist-digicoll-staging

  • 💡 Warning: the above command may fail if the database you are restoring from has extensions installed in the public schema (this follows some changes in how Heroku works with extensions). There is a workaround: using the --extensions flag, as in the example below, allows you to run pg:backups:restore from a database that has extensions in public (like the current production DB, as of before Sept 2022).

heroku pg:backups:restore scihist-digicoll-production::a661 DATABASE_URL \
	--extensions 'public.pg_stat_statements,public.pgcrypto' \
	--app scihist-digicoll-staging
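
The restore-to-staging steps above can be wrapped in a small dry-run helper. This is only a sketch, not an existing script of ours: the function name restore_cmd is hypothetical, and so is the assumption that a backup ID is always one lowercase letter followed by digits (as in a189). The helper prints the command for review instead of running it, and it always includes the --extensions workaround shown in the example above.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the command that restores a
# production backup into staging.
# Assumption: a backup ID is one lowercase letter followed by digits (e.g. a189).
restore_cmd() {
  backup_id="$1"
  digits="${backup_id#[a-z]}"   # strip one leading lowercase letter, if present
  case "$digits" in
    "$backup_id" | "" | *[!0-9]*)
      # no leading letter, nothing after the letter, or a non-digit after it
      echo "unexpected backup ID: $backup_id" >&2
      return 1
      ;;
  esac
  printf '%s\n' "heroku pg:backups:restore scihist-digicoll-production::${backup_id} DATABASE_URL --extensions 'public.pg_stat_statements,public.pgcrypto' --app scihist-digicoll-staging"
}
```

For example, restore_cmd a661 prints the same command as the example above on a single line; copy and run it once you have checked the backup ID against the output of heroku pg:backups.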

...

You can also download heroku backups to store them in your own location, and then load your local copies into heroku. See Heroku docs for more info.

2. Preservation (logical) backups to s3

We don’t want to rely solely on backups stored inside heroku’s system. We also would like a postgres backup in the more human-readable and transportable plain .sql format, instead of the postgres -Fc .dump format.

...

The more portable .sql format, stored and backed up outside of heroku, is motivated primarily by preservation, but it can also serve as a last-ditch or alternative means of disaster recovery. It can be restored to heroku using the heroku pg:psql command, which runs arbitrary psql commands on the heroku postgres.
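
As a rough illustration of the pg:psql mechanism only (it does not replace the restore steps documented under the heading below), the following sketch prints a command that feeds a local .sql file to heroku pg:psql on standard input. The function name load_sql_cmd and the file name used in the example are hypothetical, and the sketch assumes the dump has already been downloaded and that its path contains no spaces.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) a command that loads a plain
# .sql dump into a Heroku Postgres database through heroku pg:psql.
# Assumptions: the dump is already on local disk, and its path has no spaces.
load_sql_cmd() {
  app="$1"    # Heroku app name, e.g. scihist-digicoll-staging
  dump="$2"   # local path to the .sql file (hypothetical name in the example)
  if [ -z "$app" ] || [ -z "$dump" ]; then
    echo "usage: load_sql_cmd APP DUMP_FILE" >&2
    return 1
  fi
  printf 'heroku pg:psql --app %s < %s\n' "$app" "$dump"
}
```

For example, load_sql_cmd scihist-digicoll-staging backup.sql prints a command that targets staging, which is a safer place to try a dump than production.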

Restoring from a logical (.sql) database dump.

In the unlikely event you have to restore from a logical backup:

...

Note: This will overwrite your database, and won’t warn/prompt you about that fact first! It will run in your terminal and take a bit of time.

3. Heroku postgres “rollback”

Heroku can roll back a postgres database to an arbitrary moment in time, based on postgres log files. Our current postgres standard-0 plan keeps the past four days of logs. See: https://devcenter.heroku.com/articles/heroku-postgres-rollback , and the section “Common Use Case: Recovery After Critical Data Loss”.
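
Before attempting a rollback, it can help to check that the moment you want falls inside the retention window. The sketch below assumes the four-day figure stated above; the function and variable names are hypothetical, and Heroku's actual cutoff may not match this arithmetic exactly. Times are passed as Unix epoch seconds to avoid relying on non-portable date parsing.

```shell
#!/bin/sh
# Hypothetical check: is a target time inside the rollback window?
# Assumption: the window is four days, per the standard-0 plan note above.
ROLLBACK_WINDOW_DAYS=4

# Usage: within_rollback_window TARGET_EPOCH NOW_EPOCH
# Succeeds (exit 0) if TARGET is not in the future and is at most
# ROLLBACK_WINDOW_DAYS old; fails (exit 1) otherwise.
within_rollback_window() {
  target="$1"
  now="$2"
  window=$((ROLLBACK_WINDOW_DAYS * 24 * 60 * 60))
  age=$((now - target))
  [ "$age" -ge 0 ] && [ "$age" -le "$window" ]
}
```

In practice the second argument would be the output of date +%s, and the first would be the epoch value of the moment just before the data loss.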

...

Doing this requires creating a new postgres “rollback” database, switching the app to use it, and then deleting the old database that is no longer in use. From a terminal with the heroku CLI:

  1. heroku addons:create heroku-postgresql:standard-0 --rollback DATABASE_URL --to '2021-06-02 20:20 America/New_York' --app scihist-digicoll-production

  2. The site remains up. The new database’s name will be printed to the terminal, and you can see it in the Resources section of the Heroku admin. It might be something like postgresql-curly-07169

  3. It might take a few minutes or more for the newly restored database to be ready. You can follow the instructions the command gives you to check progress, such as heroku pg:wait

  4. Once the rollback database – which has been restored to a past moment in time – is ready, you can switch the app to use that new restored database by using the database name:
    heroku pg:promote postgresql-curly-07169 --app scihist-digicoll-production

  5. Make sure you have successfully fixed the problem.

  6. Once all is well, don’t forget to get rid of the extra database(s) you are no longer using. Consider leaving this step for the next day; it will only cost a couple dollars over 24 hours.

    1. How do you know which db is the “old” one? Run heroku addons to see all your heroku-postgresql databases; the one currently used by the app is marked as DATABASE. So the other one is the old no longer used one, which also has an AS name.

    2. To remove it, run e.g. heroku addons:destroy HEROKU_POSTGRESQL_YELLOW --app scihist-digicoll-production. Be careful you are removing the correct one!
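
The create, wait, and promote steps above can be laid out as a reviewable plan. This is a minimal sketch with a hypothetical function name, rollback_plan; it prints the commands in order and executes nothing. The new database name is only known after the first command has actually run, so the operator passes it in. The destroy step is deliberately left out, since it should be done by hand after confirming which database is the old one.

```shell
#!/bin/sh
# Hypothetical helper: print the rollback commands in order for review.
# Nothing is executed here.
rollback_plan() {
  to_time="$1"   # e.g. '2021-06-02 20:20 America/New_York'
  new_db="$2"    # e.g. postgresql-curly-07169, as printed by the create step
  app="scihist-digicoll-production"
  if [ -z "$to_time" ] || [ -z "$new_db" ]; then
    echo "usage: rollback_plan 'YYYY-MM-DD HH:MM Zone' NEW_DB_NAME" >&2
    return 1
  fi
  # Step 1: create the rollback database at the chosen moment in time.
  printf "heroku addons:create heroku-postgresql:standard-0 --rollback DATABASE_URL --to '%s' --app %s\n" "$to_time" "$app"
  # Step 3: wait until the new database is ready.
  printf 'heroku pg:wait --app %s\n' "$app"
  # Step 4: point the app at the restored database.
  printf 'heroku pg:promote %s --app %s\n' "$new_db" "$app"
}
```

For example, rollback_plan '2021-06-02 20:20 America/New_York' postgresql-curly-07169 prints the three commands from the steps above, ready to be run one at a time.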

NOTE: Is it possible to roll back to a past production snapshot, but do it in the staging app first, to see what it looks like without touching production? We need to look into that; it could be a safer way to do it.

Historical notes

Prior to moving off our Ansible-managed servers, our backups were performed by cron jobs installed by Ansible. Backups and Recovery (Historical notes) https://sciencehistory.atlassian.net/wiki/pages/createpage.action?spaceKey=HDCSD&title=Backups%20and%20Recovery%20%28Historical%20notes%29 contains a summary of our pre-Heroku backup infrastructure.

...