
a) Rake task: Replace the Ansible-managed script postgres-backup.sh with a rake task run regularly on a one-off Heroku dyno. The task would obtain the URL of the latest database dump and push the dump to S3, where it can wait to be harvested by the Dubnium script.

Pro:

  • minimal change from our existing workflow;

  • easy to check on by ensuring the date on the appropriate S3 bucket.

Con:

  • requires a part of our code to have S3 credentials that allow it to write to our backup directory;

  • requires the Heroku CLI to be accessible to the rake task (so it can obtain the URL of the latest dump).
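A minimal sketch of the commands such a rake task would wrap, assuming a hypothetical app name, bucket name, and key layout (none of these are confirmed values from our setup):

```shell
#!/bin/bash
# Hypothetical sketch of option (a): fetch the latest Heroku dump and push it to S3.
# APP_NAME, BUCKET, and the key layout are illustrative assumptions.
set -euo pipefail

APP_NAME="${APP_NAME:-scihist-digicoll}"
BUCKET="${BUCKET:-our-backup-bucket}"

# A date-stamped key makes the "check the date on the S3 bucket" spot check easy.
backup_key() {
  echo "db-backups/scihist-$(date +%F).dump"
}

push_latest_backup() {
  local url tmpfile
  # Needs the Heroku CLI and an authenticated session on the dyno.
  url="$(heroku pg:backups:url --app "$APP_NAME")"
  tmpfile="$(mktemp)"
  curl -sSf "$url" -o "$tmpfile"
  # Needs S3 credentials that can write to the backup prefix.
  aws s3 cp "$tmpfile" "s3://$BUCKET/$(backup_key)"
  rm -f "$tmpfile"
}

# Guarded so the helpers can be exercised without touching Heroku or S3.
if [[ "${RUN_BACKUP:-}" == "1" ]]; then
  push_latest_backup
fi
```

The rake task itself would just shell out to (or port) these three steps; the guard at the bottom keeps the side-effectful part from running unless explicitly requested.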

b) cron job on Dubnium: Dispense with the S3 portion of the workflow entirely, and set up a cron job on Dubnium that obtains the database backup directly from Heroku.

Pro:

  • simpler;

  • does not require the scihist_digicoll code to know anything about the backup S3 setup, and is thus safer.

Con:

  • assumes we trust the Heroku database backup workflow;

  • less transparent: it’s more legwork to log into Dubnium and check that the database backed up there is current (Dubnium is only accessible by logging into Citrix Workspace);

  • Dubnium is not managed by Ansible, and needs to be manually updated;

  • one fewer copy: instead of having copies on the database server, in S3, on Dubnium, and on tape, we would only have copies in Heroku, on Dubnium, and on tape;

  • Dubnium needs to have access to the Heroku CLI and the appropriate credentials.
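A sketch of what the Dubnium-side job could look like, again with an assumed app name, backup directory, retention window, and schedule rather than agreed values:

```shell
#!/bin/bash
# Hypothetical sketch of option (b): a script run from cron on Dubnium.
# Example crontab entry (nightly at 02:00):
#   0 2 * * * /home/backup/heroku-db-backup.sh
# APP_NAME, BACKUP_DIR, and the 14-dump retention are illustrative assumptions.
set -euo pipefail

APP_NAME="${APP_NAME:-scihist-digicoll}"
BACKUP_DIR="${BACKUP_DIR:-/srv/db-backups}"

# Keep only the newest N dumps so the directory doesn't grow without bound.
prune_old_dumps() {
  local keep="$1"
  ls -1t "$BACKUP_DIR"/*.dump 2>/dev/null | tail -n "+$((keep + 1))" | xargs -r rm --
}

fetch_latest_dump() {
  # `heroku pg:backups:download` fetches the most recent backup for the app;
  # this requires the Heroku CLI and credentials on Dubnium (the last Con above).
  heroku pg:backups:download --app "$APP_NAME" \
    --output "$BACKUP_DIR/scihist-$(date +%F).dump"
}

# Guarded so the helpers can be exercised without a Heroku login.
if [[ "${RUN_BACKUP:-}" == "1" ]]; then
  mkdir -p "$BACKUP_DIR"
  fetch_latest_dump
  prune_old_dumps 14
fi
```

Because Dubnium is not Ansible-managed, this script would have to be installed and updated by hand, which is exactly the maintenance cost flagged in the Con list.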