On call and responding to an error report? Need some ideas for quick first actions? We got you.

Check on status of heroku dynos

Using heroku CLI, run:

heroku ps -a scihist-digicoll-production
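
To spot unhealthy dynos quickly, the output of that command can be filtered. This is only a minimal sketch: it assumes `heroku ps` prints one line per dyno in the form `web.1: up ...`, and the function name `dynos_not_up` is made up for this page.

```shell
APP="scihist-digicoll-production"

# List dynos whose state is anything other than "up" (e.g. crashed, starting).
# Assumes `heroku ps` prints one line per dyno, like "web.1: up 2024/...".
dynos_not_up() {
  heroku ps -a "$APP" \
    | grep -E '^[A-Za-z0-9_-]+\.[0-9]+: ' \
    | grep -v -E '^[A-Za-z0-9_-]+\.[0-9]+: up( |$)'
}
```

It prints nothing when every dyno reports "up".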

Look at our logs

Consolidated app logs are available from the heroku dashboard: on the “resources” tab, click the “papertrail” add-on at the bottom to get a nice web GUI for our logs that also lets you search.

In addition to the general logs, errors are specifically monitored by http://honeybadger.io (each person has their own individual login).
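
If you are already in a terminal, the heroku CLI can also show recent log lines. A minimal sketch, assuming a quick look at the latest app output is enough (`heroku logs` only returns a limited recent window, so papertrail is still the place for history and search). The function name `recent_errors` is made up for this page.

```shell
APP="scihist-digicoll-production"

# Print the most recent app log lines that mention "error" (case-insensitive).
# --source app limits output to our app's own logs, leaving out heroku router/system lines.
recent_errors() {
  heroku logs -n 500 --source app -a "$APP" | grep -i "error"
}
```

To follow logs live instead, `heroku logs --tail -a scihist-digicoll-production` streams them until you press Ctrl-C.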

Is heroku itself having problems? Or are other platforms we use?
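
For heroku itself, the CLI can report the platform's status. A minimal sketch that shows it next to the state of our dynos; the function name `triage_overview` is made up here, and it does not cover the other platforms we use, which have their own status pages.

```shell
APP="scihist-digicoll-production"

# Show Heroku's own platform status, then the state of our dynos.
triage_overview() {
  echo "--- Heroku platform status ---"
  heroku status
  echo "--- Dynos for $APP ---"
  heroku ps -a "$APP"
}
```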

Restart heroku dynos

From the heroku web GUI, you can restart all dynos: open the “More” menu in the top right navbar and choose “restart all dynos”.

Using the heroku CLI, you can restart only web or only workers, or even a specific dyno.

heroku ps:restart worker -a scihist-digicoll-production
heroku ps:restart web -a scihist-digicoll-production
heroku ps:restart worker.2 -a scihist-digicoll-production

Note: It’s not clear to me how often restarting heroku dynos will actually fix a problem, and in some cases it could leave things in a less stable state, for instance if heroku itself is having problems.

Restart solr on Searchstax

  1. Log in to searchstax

    1. Use shared credentials stored in our credential spot

  2. Click on the instance you want to restart (scihist_digicoll (production), or scihist-digicoll-staging)

  3. At the bottom of the page a single node is listed (our plan only has one node). Click “stop solr”, and then “Start solr”

Note: if the app is up and accessible while solr restarts, it will have some downtime and generate errors until solr is back.

Disable autoscaling

We use http://hirefire.io for autoscaling our worker dynos (maybe in future web dynos). Has it gone crazy and you need to just disable it?

No worries, just log in to http://hirefire.io (we each have our own login), and click the “enable” toggle on or off next to each autoscale worker, right on the initial dashboard. (We may only have one worker.)

Put entire app into maintenance mode

Disable our app: it won’t be accessible to anyone, but they’ll get a nice maintenance message.

In the heroku web GUI, go to the “settings” tab, scroll down to the “Maintenance mode” section, and toggle the switch.

In the heroku CLI, run the first command to turn it on and the second to turn it off:

heroku maintenance:on -a scihist-digicoll-production
heroku maintenance:off -a scihist-digicoll-production

(Note: Right now, this is just a generic heroku maintenance message. It is possible to customize/brand this page; we may get to that eventually. https://github.com/sciencehistory/scihist_digicoll/issues/1201 )
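
It is easy to forget to turn maintenance mode back off after a fix. A minimal sketch of a wrapper that turns it on, runs one command, and always turns it off again, even if the command fails. The function name `with_maintenance` is made up for this page and is not something in our repo.

```shell
APP="scihist-digicoll-production"

# Run a command with the app in maintenance mode, then always turn it back off.
# Usage: with_maintenance <command> [args...]
with_maintenance() {
  heroku maintenance:on -a "$APP" || return 1
  cmd_rc=0
  "$@" || cmd_rc=$?
  heroku maintenance:off -a "$APP"
  # Report the wrapped command's exit status to the caller.
  return "$cmd_rc"
}
```

For example, `with_maintenance heroku run rake scihist:solr:reindex -a "$APP"` would reindex while the public sees the maintenance page. Whether that is worth the outage is a judgment call.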

Disable staff logins

We can effectively make the app “read-only” but still available to the public by disabling staff logins. That way we don’t have a public-facing outage, but if we’re trying to diagnose some kind of data corruption issue, we can ‘freeze’ staff out.

In the heroku config vars, on the heroku dashboard “settings” tab, just set LOGINS_DISABLED to true.
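
The same config var can be set from the heroku CLI. A minimal sketch; the function names are made up for this page. Note that heroku restarts the app's dynos whenever a config var changes. How to re-enable logins is not spelled out here: `heroku config:unset LOGINS_DISABLED` would remove the variable, but that the app treats an unset variable as "logins enabled" is an assumption.

```shell
APP="scihist-digicoll-production"

# Freeze staff out by setting the flag named on this page.
# Changing a config var causes heroku to restart the dynos.
disable_staff_logins() {
  heroku config:set LOGINS_DISABLED=true -a "$APP"
}

# Print the current value of the flag (empty output if it is unset).
staff_logins_flag() {
  heroku config:get LOGINS_DISABLED -a "$APP"
}
```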

Reindex solr

If search is weird, our Solr index may have gotten out of sync. Fortunately, we can (re-)build a new Solr index in only a couple of minutes. Use the heroku CLI to run our rake tasks:

heroku run rake scihist:solr:reindex scihist:solr:delete_orphans -a scihist-digicoll-production

If this results in an error that makes you think the searchstax solr is not properly set up, you could try:

  • heroku run rake scihist:solr_cloud:create_collection -a scihist-digicoll-production (That should not do any harm in any case; it might just complain that the “collection already exists”.)

  • heroku run rake scihist:solr_cloud:sync_configset -a scihist-digicoll-production

See also “Restart solr on Searchstax” above.
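
The reindex step and its fallbacks can be wrapped so that a failure is obvious. A minimal sketch, assuming the CLI's `--exit-code` flag to pass the rake task's exit status back to the local shell. The function name `reindex_solr` is made up for this page. On failure it only prints the follow-up commands listed above rather than running them, since deciding whether solr is misconfigured is a human call.

```shell
APP="scihist-digicoll-production"

# Run the reindex rake tasks; on failure, print the follow-up steps instead of running them.
reindex_solr() {
  if heroku run --exit-code rake scihist:solr:reindex scihist:solr:delete_orphans -a "$APP"; then
    echo "Reindex finished."
  else
    echo "Reindex failed. If solr looks misconfigured, consider:" >&2
    echo "  heroku run rake scihist:solr_cloud:create_collection -a $APP" >&2
    echo "  heroku run rake scihist:solr_cloud:sync_configset -a $APP" >&2
    return 1
  fi
}
```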

Restore postgres database from backups

See separate page.
