Passenger (web worker) administration

Log into the web server as the digcol user. You can find the current web server IPs with ./bin/cap production list_ec2_servers from an app checkout.
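
For example, assuming direct SSH access as that user (the IP is a placeholder; adjust to however access is actually set up):

$ ssh digcol@WEB_SERVER_IP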

Get good status info on passenger workers:

$ PASSENGER_INSTANCE_REGISTRY_DIR=/opt/scihist_digicoll/shared passenger-status

Restart application without restarting apache

This will reload config files.

$ PASSENGER_INSTANCE_REGISTRY_DIR=/opt/scihist_digicoll/shared passenger-config restart-app

passenger-config can do some other interesting things as well, such as system-metrics
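
For example, to see system CPU and memory metrics:

$ passenger-config system-metrics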

Resque admin panel

If a file doesn't get characterized correctly, the first thing to do is check the Resque admin panel. There you can view failures and restart jobs. If you are logged in as an admin user, you can view the admin panel at `digital.chemheritage.org/admin/queues`

What version of the app is deployed?

$ cat /opt/sufia-project/current/REVISION 

Restart passenger

$ passenger-config restart-app

Reindex all of solr:

Check the README for scihist_digicoll.
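
A reindex invocation typically looks something like the following, run from the app's current deploy directory (the task name here is an assumption; treat the README as authoritative):

RAILS_ENV=production bundle exec rake scihist:solr:reindex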

Rebuild Solr with 0 Downtime tips:

For scihist_digicoll, we can easily build and swap in a new Solr server, but doing so naively results in downtime until the index is remade. While reindexing takes only a minute or two, applying the server change to the jobs and web machines can take a while, so there may be many minutes between when one of them is connected to the new Solr server and the other is not. During that time, we can't reindex.

To minimize downtime during Solr changes, the preferred method is to take a backup of the old Solr index (test first that it can be used with the new Solr version) and then restore that backup on the new Solr server, so that public users will always be able to run searches.

CORENAME is scihist_digicoll
The backup location built by Ansible is /backups/solr-backup
BACKUPNAME can be anything you like

On the old Solr machine, run

Logged in as ubuntu

curl 'http://localhost:8983/solr/CORENAME/replication?command=backup&name=BACKUPNAME&location=/backups/solr-backup'
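
For example, with CORENAME filled in and an illustrative backup name of preswap:

curl 'http://localhost:8983/solr/scihist_digicoll/replication?command=backup&name=preswap&location=/backups/solr-backup'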

To check the status of the backup, run:

curl "http://localhost:8983/solr/CORENAME/replication?command=details"

Then tar it up

tar czf ~/solr-backup.tar.gz /backups/solr-backup/snapshot.BACKUPNAME

Then move/copy the backup tar to the new server via whatever method you care to use, such as scp.
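
For example (assuming SSH access between the machines; the IP is a placeholder):

scp ~/solr-backup.tar.gz ubuntu@NEW_SOLR_PRIVATE_IP:~/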

If you are working with production, it is a good idea to go into Ansible and edit group_var/kithe_production for the environment you are working on, putting in the new private IP address for Solr.

Commit the changes to the staging branch so you can easily merge a PR from staging to master to swap the IP address without waiting for staging to update.
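
A sketch of that workflow in the Ansible repo (branch names as described above; the commit message is illustrative):

git checkout staging
git pull
# edit group_var/kithe_production with the new Solr private IP
git commit -am "Point production at new Solr private IP"
git push
# then open a PR from staging to master when ready to swap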

On the new Solr machine

Logged in as ubuntu

Extract the tar (tar xzf solr-backup.tar.gz) into /backups/solr-backup (or anywhere, as long as the Solr user can access it).

You may need to pull the snapshot directory out of the extracted tree (i.e. if you extract it directly to /backups/solr-backup you may see it at /backups/solr-backup/backups/solr-backup/snapshot.BACKUPNAME and will want to move it to /backups/solr-backup/snapshot.BACKUPNAME).
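
For example, one way to do the extraction and move described above (paths follow the tar created earlier):

cd /backups/solr-backup
sudo tar xzf ~/solr-backup.tar.gz
sudo mv backups/solr-backup/snapshot.BACKUPNAME .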

Make sure all files are owned by unix account and group solr:

sudo chown -R solr:solr /backups/solr-backup

Make sure the scihist_digicoll application code is deployed to the new Solr server and that it is running correctly, because the config files for Solr live in our app repo and are delivered to the Solr server via a Capistrano deploy.
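
From an app checkout, that typically looks like (assuming the standard Capistrano setup used elsewhere on this page):

./bin/cap production deploy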

Run:

curl 'http://localhost:8983/solr/CORENAME/replication?command=restore&name=BACKUPNAME&location=/PATH'

PATH should just be the directory the backup is in (/backups/solr-backup), not the full path of the snapshot folder.
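
For example, with CORENAME filled in and the illustrative backup name from the example above, the snapshot sitting in /backups/solr-backup is restored with:

curl 'http://localhost:8983/solr/scihist_digicoll/replication?command=restore&name=preswap&location=/backups/solr-backup'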

If you want to check the status of the restore, run:

curl "http://localhost:8983/solr/CORENAME/replication?command=restorestatus"

Now you should make the Solr server IP change: either by committing a change to staging if this was a staging swap, or by merging the change you already put in staging into master if this is a production swap.

Now the new machine has a recent backup, and when you update the server IP address users will always get search results. Staff who added or edited items after the backup was taken may notice that those items look off.

Once the servers are switched, run a reindex to catch any changes made during that time.

Clearing out the tmp directory (removes everything older than 8 days.)

This is invoked by a cron job on app prod, but just in case...

find /tmp/* -mtime +8 -delete