...

  1. A full recovery from S3 rolls the entire system back to the latest backup, which is usually from the prior business day. All changes made since that backup will be lost.
  2. As noted in the prior documentation, this involves restoring the postgres database, reindexing Solr, and then moving files from backup to production.
    1. Only the original files are required, but it may be faster to copy derivative files rather than regenerate them. (This requires testing.)
  3. Currently you can run the Ansible playbook restore_kithe.yml, which will automatically handle these steps.

    Code Block
    title: Ansible Playbook
    ansible-playbook --ask-vault-pass restore_kithe.yml --private-key=~/.ssh/chf_prod.pem
    


  4. Do not run this unless you are prepared to lose all changes made to the system since the last backup (typically the past 24 hours).
  5. If you cannot or do not want to use the playbook, or it does not work, you can perform the following steps manually (logged in as the ubuntu user).
    1. Stop Passenger on the web server; this should end connections to the postgres database. It also lets you avoid Honeybadger errors while the database is dropped and restored.
      1. sudo systemctl stop passenger
    2. On the database server, restart the postgres service. This will terminate any hanging connections.
      1. sudo systemctl restart postgresql.service
    3. Download the most recent postgres backup to the database server; it is in the s3://chf-hydra-backup bucket, under the PGSql key, as digcol_backup.sql.
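
      A minimal sketch of that download with the AWS CLI, assuming the object's key is PGSql/digcol_backup.sql and that the CLI on the database server has credentials for the bucket; the /tmp destination is an arbitrary choice and becomes BACKUP_LOCATION in the import step below.

      Code Block
      language: bash
      title: Download backup (sketch)
      # Assumed object key; verify the exact path in the chf-hydra-backup bucket before running.
      aws s3 cp s3://chf-hydra-backup/PGSql/digcol_backup.sql /tmp/digcol_backup.sql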
    4. On the database server, drop the existing postgres digcol database.
      1. dropdb -U postgres digcol
        -or-
      2. psql -U postgres
        1. DROP DATABASE digcol;
    5. Import the backup database with:
      1. psql -U postgres < BACKUP_LOCATION
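
      A quick, optional check that the import recreated the digcol database, using psql's database listing (this check is a suggestion and not part of the original runbook):

      Code Block
      language: bash
      title: Verify restore (sketch)
      # List databases and confirm digcol is present after the import
      psql -U postgres -l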
    6. You will then need to reindex Solr, which can be done remotely and will run as background jobs on the jobs server.

      Code Block
      language: ruby
      title: Rake tasks
      bundle exec cap production invoke:rake TASK="scihist:solr:reindex scihist:solr:delete_orphans"


    7. You'll need to move over any missing original files with an S3 sync command.

      Code Block
      language: bash
      title: Sync
      aws s3 sync s3://scihist-digicoll-production-originals-backup/  s3://scihist-digicoll-production-originals/ --source-region us-west-2 --region us-east-1
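
      If you want to preview what the sync would copy before running it, the AWS CLI's --dryrun flag can be added (a suggestion beyond the original steps):

      Code Block
      language: bash
      title: Sync preview (sketch)
      # Shows the copy operations without performing them
      aws s3 sync s3://scihist-digicoll-production-originals-backup/  s3://scihist-digicoll-production-originals/ --source-region us-west-2 --region us-east-1 --dryrun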


    8. Then either do the same for the derivative files (see the sync sketch after the rake instructions below) or regenerate them with a rake task.

      1. If you run the rake task, ssh into the jobs server and change to the currently deployed application directory (/opt/scihist_digicoll/current).

      2. Switch to the application's user (digcol) and then run the commands below. Since they take a long time to run, they are best run in a screen or tmux session.
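
        A sketch of that session setup; the exact su and screen invocations are not specified in the original and may differ on your hosts.

        Code Block
        language: bash
        title: Session setup (sketch)
        # Start a named screen session so the long-running tasks survive a dropped SSH connection
        screen -S derivatives
        # Become the application user (assumes the ubuntu account has sudo rights)
        sudo su - digcol
        # Move to the currently deployed application directory
        cd /opt/scihist_digicoll/current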

        Code Block
        language: ruby
        title: Derivative creation
        ./bin/rake kithe:create_derivatives:lazy_defaults
        ./bin/rake scihist:lazy_create_dzi
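
      If you choose to copy the derivative files instead of regenerating them, the same kind of S3 sync can be used. The bucket names below are placeholders, not confirmed names; substitute the actual derivative and derivative-backup buckets before running.

      Code Block
      language: bash
      title: Derivative sync (sketch)
      # DERIVATIVES_BACKUP_BUCKET and DERIVATIVES_BUCKET are placeholders -- substitute the real bucket names.
      # Region flags are copied from the originals sync above; adjust if the derivative buckets live elsewhere.
      aws s3 sync s3://DERIVATIVES_BACKUP_BUCKET/  s3://DERIVATIVES_BUCKET/ --source-region us-west-2 --region us-east-1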


...