Journal of heroku investigations. Most recent entries on top? See also Heroku Consideration

Tuesday Oct 6

For future: Asset delivery

Needs to be investigated: heroku recommends a CDN, which we hadn’t accounted for in cost or complexity of setup. https://github.com/sciencehistory/scihist_digicoll/issues/874

For future: Production vs Staging

While heroku has ways of creating production and staging environments, we aren’t going to worry about that for now. We’re just working on getting a demo app up with a limited staging-like environment, following the piece-by-piece plan from Monday.

For future: backups

Heroku postgres has its own built-in backups, including the ability to easily roll back to a previous point in time. Do we still want to do our own backups of postgres? Probably! But we should wrap our heads around the heroku backups and how they relate to ours, and update our documentation. https://github.com/sciencehistory/scihist_digicoll/issues/876

Software/configuration steps done

By “heroku dashboard” I mean the web GUI.

Install heroku CLI on my Mac
Create scihist-digicoll app in heroku dashboard
Provision heroku postgres add-on. For now we’re going to do a hobby-basic at $9/month. This won’t be enough for production; we plan a standard-0 at $50/month eventually. https://elements.heroku.com/addons/heroku-postgresql
Import database from our staging instance to our heroku db (https://devcenter.heroku.com/articles/heroku-postgres-import-export)
1. Do a new export on staging, since heroku asks for a certain format
  1. Tricky because pg_dump doesn’t live on the staging jobs server! Hmm, how do we do backups?
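
The provisioning and import steps above could be scripted roughly like this. It is a minimal sketch, not a record of what was actually run: the helper function names, the staging database name, host and user, and the dump URL are all hypothetical placeholders. The pg_dump flags and heroku subcommands are the ones described in the heroku import/export doc linked above, as far as I can tell. The sketch only builds and prints the commands; it does not execute them.

```python
import shlex

APP = "scihist-digicoll"  # app name from the dashboard step above


def provision_postgres_cmd(plan="hobby-basic", app=APP):
    """Command to attach a heroku postgres add-on at the given plan."""
    return ["heroku", "addons:create", f"heroku-postgresql:{plan}", "--app", app]


def pg_dump_cmd(dbname, host, user, outfile="staging.dump"):
    """pg_dump in custom format (-Fc), without ACLs or ownership,
    which is the format the heroku import doc asks for."""
    return ["pg_dump", "-Fc", "--no-acl", "--no-owner",
            "-h", host, "-U", user, "-f", outfile, dbname]


def restore_cmd(dump_url, app=APP):
    """Command to restore a dump into the app's database.
    dump_url must be a URL heroku can fetch, e.g. a signed S3 URL."""
    return ["heroku", "pg:backups:restore", dump_url, "DATABASE_URL", "--app", app]


if __name__ == "__main__":
    # Placeholder values only, NOT our real staging settings.
    for cmd in (
        provision_postgres_cmd(),
        pg_dump_cmd("digicoll_staging", "db.example.org", "digicoll"),
        restore_cmd("https://example.org/staging.dump"),
    ):
        print(shlex.join(cmd))
```

Since pg_dump isn’t on the staging jobs server, the dump command would have to run wherever a pg_dump matching the database version is installed.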

Mon Oct 5

Heroku has a LOT of docs, usually well-written. They are also easy to find by googling. Some heroku overview and getting started docs:

Interesting heroku add-on I noticed, rails-autoscale – instead of needing to build out as many dynos as we might need to handle maximum traffic or ingest, we can have the add-on scale up automatically with use. Works for both web dynos (with traffic), and background job dynos (when we do a big ingest, it can scale up more workers!). Does cost money; I think the price is based on how high you want it to be able to scale.
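
To make the idea concrete, here is a toy sketch of threshold-based autoscaling with a cap. This is NOT rails-autoscale’s actual algorithm, which I haven’t looked at. The load metric (a queue wait time in milliseconds), the thresholds, and the function name are all assumptions for illustration. The max_dynos cap stands in for the “how high you want it to be able to scale” setting.

```python
def target_dynos(current, queue_ms, max_dynos, min_dynos=1,
                 upscale_ms=100, downscale_ms=25):
    """Return how many dynos to run next, given a load metric.

    Assumed rule: add one dyno when the queue wait is above upscale_ms,
    remove one when it is below downscale_ms, otherwise hold steady.
    The result is always clamped to [min_dynos, max_dynos].
    """
    if queue_ms > upscale_ms:
        wanted = current + 1
    elif queue_ms < downscale_ms:
        wanted = current - 1
    else:
        wanted = current
    return max(min_dynos, min(max_dynos, wanted))


if __name__ == "__main__":
    # Simulate a big ingest: queue wait spikes, then drains.
    dynos = 1
    for queue_ms in (20, 300, 400, 500, 500, 60, 10, 10, 10):
        dynos = target_dynos(dynos, queue_ms, max_dynos=4)
        print(f"queue={queue_ms}ms -> dynos={dynos}")
```

The same rule would apply separately to web dynos and to background job dynos, each with its own metric and cap.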

I think I will try to get our app on heroku piece by piece…

Get app deployed to heroku with postgres small web dyno only – no bg jobs yet, no solr yet. (Solr functions won’t work!)
Add in bg jobs – including heroku buildpacks with all the software they need (vips, imagemagick, ffmpeg, etc).
Add in solr – not sure whether to start by trying to have it connect to existing staging solr (which would require a heroku add-on for a static outgoing IP via SOCKS, so we could let it through our solr firewall, and/or other solr config changes), OR move right away to a SaaS solr – which would cost money, and we’d have to identify which one we need.
App substantially working at this point, but still lots of little pieces to get in place, such as nightly jobs, and various problem cases (out of memory for PDF generation etc).

For future: Infrastructure as code?

Deploying to Heroku involves configuring some things on the platform. For instance what I know about now includes mainly a list of config variables (such as what we have in our local_env.yml), and add-ons selected and their configuration.

You can do this in the heroku console, but I’m nervous about that living only inside heroku’s system. How do we get it in source code, “infrastructure as code”, as we always tried to do with ansible, having our infrastructure re-runnable from files on disk, not just living in live system? This isn’t something that needs to be solved now, but something I want to attend to as part of this process, ask around for what others are doing.

Looks like one solution might be using terraform with heroku, documented by heroku. To look into more later.
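
Short of terraform, even a small script kept in the repo would get the config variables out of the live system and onto disk. A minimal sketch, assuming the desired values live in a file like our local_env.yml (shown here as a plain dict to avoid a YAML dependency). The variable names and values, the function names, and the drift check are all hypothetical; `heroku config:set` is the real CLI subcommand for setting config vars.

```python
import shlex

# Hypothetical example values. Real ones would be loaded from a checked-in
# file, with secrets kept out of version control.
DESIRED_CONFIG = {
    "RAILS_ENV": "production",
    "SOLR_URL": "https://solr.example.org/solr/digicoll",
}


def config_set_cmd(config, app):
    """Build one `heroku config:set` command for all vars, in sorted order
    so the output is stable and diffs cleanly in version control."""
    pairs = [f"{key}={value}" for key, value in sorted(config.items())]
    return ["heroku", "config:set", *pairs, "--app", app]


def config_drift(desired, actual):
    """Return the keys whose live value is missing or differs from the file."""
    return sorted(k for k, v in desired.items() if actual.get(k) != v)


if __name__ == "__main__":
    print(shlex.join(config_set_cmd(DESIRED_CONFIG, "scihist-digicoll")))
    # Pretend the live app only has RAILS_ENV set so far.
    print(config_drift(DESIRED_CONFIG, {"RAILS_ENV": "production"}))
```

The live values for the drift check could come from the CLI (I believe `heroku config --json` prints them as JSON). Add-ons and their plans are not covered by this sketch; that is where something like terraform would earn its keep.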

https://github.com/sciencehistory/scihist_digicoll/issues/875

jrochkind Heroku Journal