...

These are stored in S3, are backed up within S3 by a process managed by AWS, and are then copied to long-term storage by SyncBackPro, which is Windows software running on Promethium managed by Chuck and Ponce (see https://www.2brightsparks.com/syncback/sbpro.html). This is not affected by our move to Heroku.

Heroku database backups

We have three backup mechanisms under Heroku:

...

We supplement the above with a regular nightly physical database backup, scheduled for 2am. These backups are stored by Heroku, and restoring from them is very fast and convenient. The schedule can be checked with:

heroku pg:backups:schedules --app scihist-digicoll-production

...

You can check the metadata on the latest physical backups like this: heroku pg:backups

Restoring from a nightly physical backup

For physical backups retained by Heroku (we retain up to 25) a restore takes about a minute and works like this:

heroku pg:backups:restore --app scihist-digicoll-production
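As a hedged sketch (not part of our documented procedure), the same restore command can be pointed at one specific retained backup instead of the most recent one. The helper name restore_backup is hypothetical, the backup ID is whatever heroku pg:backups lists (for example a006), and DATABASE_URL is assumed to be the config var of the database being restored.

```shell
# Hypothetical helper: show a backup's metadata, then restore from it.
# Usage: restore_backup <backup-id> [app]
restore_backup() {
  backup_id=$1
  app=${2:-scihist-digicoll-production}

  # Print metadata for the chosen backup so it can be double-checked first.
  heroku pg:backups:info "$backup_id" --app "$app" || return 1

  # Restore that backup over the database behind DATABASE_URL (an assumption).
  # Heroku asks for confirmation before overwriting anything.
  heroku pg:backups:restore "$backup_id" DATABASE_URL --app "$app"
}
```

Calling restore_backup a006 would first print the metadata for backup a006 and then start the restore against the production app.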

Downloading a physical backup to a local “.dump” file

heroku pg:backups:download a006 will produce a file like:

...

Note that a physical dump can easily be converted to a garden-variety “logical” .sql database file:

...

$ file logical_database_file.sql
logical_database_file.sql: UTF-8 Unicode text, with very long lines

Restoring from a physical backup stored as a local “.dump” file

If you downloaded a physical backup which is now stored on your local machine, and want to restore from that specific file, you will first need to upload it to s3, create a signed URL for the dump, and then run:

...
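The three steps described above (upload, sign, restore) could be strung together roughly as follows. This is a minimal sketch under stated assumptions, not our documented procedure: the helper name restore_from_local_dump is hypothetical, the bucket name is supplied by the caller, the one-hour expiry is an arbitrary choice, and the AWS CLI is assumed to be installed and configured with access to that bucket.

```shell
# Hypothetical helper: upload a local dump, sign it, restore from the signed URL.
# Usage: restore_from_local_dump <dump-file> <s3-bucket> [app]
restore_from_local_dump() {
  dump_file=$1
  bucket=$2                              # assumption: a bucket you can write to
  app=${3:-scihist-digicoll-production}
  key=$(basename "$dump_file")

  # 1. Upload the physical dump to s3.
  aws s3 cp "$dump_file" "s3://$bucket/$key" || return 1

  # 2. Create a signed URL for the uploaded dump, valid for one hour.
  signed_url=$(aws s3 presign "s3://$bucket/$key" --expires-in 3600) || return 1

  # 3. Restore the database behind DATABASE_URL (an assumption) from that URL.
  heroku pg:backups:restore "$signed_url" DATABASE_URL --app "$app"
}
```

Keeping the expiry short limits how long the signed URL exposes the dump; the uploaded object itself still needs to be deleted separately once the restore has finished.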

3. Preservation (logical) backups to s3

Finally, we maintain a rake task, rake scihist:copy_database_to_s3, which runs regularly on a one-off Heroku dyno, via the scheduler. This uploads a logical (plain vanilla SQL) copy of the database to s3. SyncBackPro then syncs it to tape (this process, again, is managed by Chuck and Ponce).

This workflow serves more for preservation than for disaster recovery: logical .sql files offer portability (they’re UTF8), and are useful in a variety of situations, unlike the physical backups; notably, they can be used to reconstruct the database, even on other machines and other architectures using psql -f db.sql.
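To illustrate the psql -f route mentioned above, here is a minimal sketch of loading a logical backup into a fresh database on another machine. The helper name load_logical_backup and the database name are hypothetical, and the sketch assumes that PostgreSQL's createdb and psql are on the PATH and that the .sql file does not itself create the database.

```shell
# Hypothetical helper: load a logical .sql backup into a new local database.
# Usage: load_logical_backup <sql-file> <new-database-name>
load_logical_backup() {
  sql_file=$1
  db_name=$2

  # Create an empty database to receive the data.
  createdb "$db_name" || return 1

  # Replay the SQL file; ON_ERROR_STOP makes psql stop at the first error
  # instead of carrying on with a partially loaded database.
  psql -v ON_ERROR_STOP=1 -d "$db_name" -f "$sql_file"
}
```

For example, load_logical_backup db.sql digicoll_copy would create a database named digicoll_copy and replay db.sql into it.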

Given the size of the database in late 2020, the entire job (with the overhead of starting up the dyno and tearing it down) takes a bit under a minute. However, if our database grows much larger (20GB or more) we will probably have to get rid of these frequent logical backups.

...