Upgrade Steps

Switch the settings for Ruby in the group_vars file(s) you want to use.

Use the build_ami script to make a new Ruby AMI. When a new version of Ruby is needed, make a new AMI with the version you want to upgrade to.

  1. Open the group_vars files ruby_ami and ruby_java_ami with ansible-vault edit.
  2. In the vars files, edit the Ruby version number and get the new sha256 checksum for the tar.gz release from the Ruby site (https://www.ruby-lang.org/en/downloads/releases/).
  3. Also change the AMI name to reflect the new version of Ruby in use.
  4. If it is a major or minor Ruby version change (2 → 3, or 2.5 → 2.6), edit ruby_major_minor_ver with the new version number as well. Patch changes do not need this.
  5. Run the build command: ansible-playbook build_ami.yml --ask-vault-pass --private-key=~/.ssh/chf_prod.pem --extra-vars "build={{ BUILD_VERSION }}", where BUILD_VERSION is either ruby_ami or ruby_java_ami.
  6. Update the group_vars files to use the new AMI in the app, solr, and jobs files.
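
The checksum and build steps above can be sketched as two small shell helpers. This is only a sketch, not part of the existing tooling: the function names verify_sha256 and build_ami_command are hypothetical, and it assumes GNU sha256sum is installed. The playbook command is copied from step 5, and the helper only prints it so it can be reviewed before running.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for steps 2 and 5; the function names are illustrative.

# Succeeds only if the file's sha256 matches the value copied from the Ruby releases page.
# Usage: verify_sha256 <file> <expected_sha256>
verify_sha256() {
  local file="$1" expected="$2" actual
  actual="$(sha256sum "$file" 2>/dev/null | awk '{print $1}')"
  [ -n "$actual" ] && [ "$actual" = "$expected" ]
}

# Prints the build command from step 5 for one of the two known builds.
# Usage: build_ami_command ruby_ami|ruby_java_ami
build_ami_command() {
  case "$1" in
    ruby_ami|ruby_java_ami)
      printf 'ansible-playbook build_ami.yml --ask-vault-pass --private-key=~/.ssh/chf_prod.pem --extra-vars "build=%s"\n' "$1"
      ;;
    *)
      echo "unknown build: $1 (expected ruby_ami or ruby_java_ami)" >&2
      return 1
      ;;
  esac
}
```

For example, verify_sha256 on the downloaded tar.gz confirms the checksum before it goes into the vars file, and build_ami_command ruby_java_ami prints the command to paste into the terminal.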

Switch Box steps

  1. Leave current prod servers up
  2. Build new prod boxes for app, jobs, and solr. (See how to build boxes in Ansible or in the README. You may want to use a new iteration number to keep track of them.)
    1. Once boxes are done, alert staff to stop all edits and ingests in that tier of service (ALWAYS TEST ON STAGING FIRST)
  3. Record the new internal IP addresses for each box
  4. Edit redis_db_ip, postgres_db_ip, and solr_ip with the new values. fedora_ip should not need to change, as fedora won't need to be rebuilt. DO NOT PUSH THIS UPDATE TO BITBUCKET YET.
  5. Make sure new prod boxes are fully connected by running update-box.yml on the new boxes with the new values from the prior step.
  6. Cap deploy to new prod boxes
    1. Make sure cap gives no errors.
  7. For staging, recover from last night's backups in S3 (see Backups and Recovery (Needs editing))
  8. For prod, you will need a current backup
  9. Tell staff to stop editing/ingesting data
  10. Run a backup of solr by going to the old solr box and running the solr backup script as ubuntu (~/bin/solr-backup.sh)
  11. Back up the app's postgres DB and Redis
  12. On new production boxes
    1. In new-app drop the blank chf_hydra db, create a new blank chf_hydra db and then follow the postgres import instructions in Backups and Recovery (Needs editing)
      1. Stop Redis, delete the existing redis.rdb in /var/lib/redis
      2. Copy the backed up redis.rdb over to the folder and make sure it is owned by redis
      3. Restart Redis
    2. In new-solr run the solr restore commands in Application administration (Obsolete)
  13. Edit New-Prod App to allow IP access without directing to actual prod by disabling apache redirect
    1. This is not completely needed, but is a way to be sure there were no problems
    2. Check that New-Prod works properly with solr's index being loaded
    3. Return New-Prod App to normal apache configuration
  14. Move Elastic IP from app-prod to the new app-prod.
  15. Make sure all details in Ansible's group_vars are up to date with any internal IP address changes, then commit the changes with the updated internal networking IPs and merge them into master on Bitbucket.
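
The database reset and Redis restore in step 12 can be sketched as a helper that prints the commands in order. This is a sketch under stated assumptions, not the documented procedure: the function names are hypothetical, the Redis service is assumed to be named redis-server, the data directory is assumed to be /var/lib/redis, and the database commands are assumed to run as the postgres user. The postgres import itself is covered in Backups and Recovery and is not reproduced here.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: they print commands for review instead of running them.

# Prints the Redis restore sequence: stop, delete old file, copy backup, fix owner, start.
# Assumes a service named redis-server, /var/lib/redis, and a backup path without spaces.
# Usage: redis_restore_commands <path-to-backed-up-redis.rdb>
redis_restore_commands() {
  local backup="$1" target="/var/lib/redis/redis.rdb"
  if [ -z "$backup" ]; then
    echo "usage: redis_restore_commands <path-to-backed-up-redis.rdb>" >&2
    return 1
  fi
  printf '%s\n' \
    "sudo service redis-server stop" \
    "sudo rm -f $target" \
    "sudo cp $backup $target" \
    "sudo chown redis $target" \
    "sudo service redis-server start"
}

# Prints the commands that replace the blank chf_hydra database with a new blank one.
# Assumes the commands are run as the postgres system user.
postgres_recreate_commands() {
  printf '%s\n' \
    "sudo -u postgres dropdb chf_hydra" \
    "sudo -u postgres createdb chf_hydra"
}
```

Printing rather than executing keeps the operator in control on a production box: review the output, then run each line by hand.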

Revert Box steps

Assuming the old boxes were not deleted, follow these steps.

  1. Start the old servers up.
  2. Double check their internal IP addresses. They should be the same as before, but always be sure.
  3. Check netdata to be sure they have resque workers running
  4. Tell users to stop adding new items
  5. Run a backup of solr
  6. Back up the app's postgres DB and Redis
  7. On new production boxes
    1. In new-app drop the blank chf_hydra db, create a new blank chf_hydra db and then follow the postgres import instructions in Backups and Recovery (Needs editing)
      1. Stop Redis, delete the existing redis.rdb in /var/lib/redis
      2. Copy the backed up redis.rdb over to the folder and make sure it is owned by redis
      3. Restart Redis
    2. In new-solr run the solr restore commands in Application administration (Obsolete)
  8. Edit New-Prod App to allow IP access without directing to actual prod by disabling apache redirect
    1. This is not completely needed, but is a way to be sure there were no problems
    2. Check that New-Prod works properly with solr's index being loaded
    3. Return New-Prod App to normal apache configuration
  9. Move Elastic IP from app-prod to the new app-prod.
  10. Make sure all details in ansible's group_vars files are up to date with any internal IP address changes and commit and merge with master.
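
Checking a box's internal IP address and moving the Elastic IP can also be done from the command line. The sketch below assumes the AWS CLI is installed and configured and that the Elastic IP is a VPC address with an allocation ID. The function names are hypothetical and the instance and allocation IDs are placeholders to be supplied by the operator. Both helpers only print the command.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: they print AWS CLI commands for review instead of running them.

# Prints the command that shows an instance's private (internal) IP address.
# Usage: private_ip_command <instance-id>
private_ip_command() {
  [ $# -eq 1 ] || { echo "usage: private_ip_command <instance-id>" >&2; return 1; }
  printf "aws ec2 describe-instances --instance-ids %s --query 'Reservations[].Instances[].PrivateIpAddress' --output text\n" "$1"
}

# Prints the command that re-associates an Elastic IP with a different instance.
# Usage: move_elastic_ip_command <allocation-id> <target-instance-id>
move_elastic_ip_command() {
  [ $# -eq 2 ] || { echo "usage: move_elastic_ip_command <allocation-id> <instance-id>" >&2; return 1; }
  printf 'aws ec2 associate-address --allocation-id %s --instance-id %s --allow-reassociation\n' "$1" "$2"
}
```

The --allow-reassociation flag lets the address move even though it is still attached to the other app box, so the switch happens in a single call.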

...