Heroku Overview

https://devcenter.heroku.com/articles/dyno-types

Web workers

Runs what we think of as "our app" proper. The more web workers, the more traffic we can handle. (Much of our traffic may be bots such as googlebot, but those are bots we want to serve.)

Current AWS:

  • Runs on one t2.medium EC2, which has 4GB of RAM and 2 virtual cores

  • Running 10 web workers (passenger). Looking at passenger-status, each worker's RAM usage ranges from 115MB to 250MB
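As a sanity check, even the worst case from those passenger-status numbers fits comfortably in the t2.medium's 4GB. A back-of-envelope sketch (not a measurement):

```python
# Rough headroom check for the current t2.medium web box.
# Per-worker RSS range observed in passenger-status: 115MB to 250MB.
workers = 10
low_mb, high_mb = 115, 250
total_mb = 4096  # t2.medium: 4GB RAM

low_total = workers * low_mb     # best case
high_total = workers * high_mb   # worst case
headroom = total_mb - high_total

print(f"worker RAM: {low_total}-{high_total} MB of {total_mb} MB")  # 1150-2500 MB of 4096 MB
print(f"worst-case headroom: {headroom} MB")                        # 1596 MB
```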

...

Production                            Staging
------------------------------------  ---------------------------
4 standard-2X @ $50/month             1 standard-1X @ $25/month
1GB RAM * 4 == 4GB RAM                512MB RAM (1-2 workers)
4(?) cores each * 4 == 16(?) cores    4(?) cores
$200/month                            $25/month

Background Job Workers

Runs any slower "background" work, currently mainly ingests and expensive on-demand derivative creation.

Current AWS

One t2.large EC2, which has 8GB of RAM and 2 virtual cores. Running 12 separate job workers, some reserved for specialty queues.

...

Production                   Staging
---------------------------  -------------------------------------------
8 standard-2X @ $50/month    1 standard-2X @ $50/month
1GB RAM * 8 == 8GB RAM       1GB RAM (2-4 workers; ingests will be slow)
$400/month                   $50/month

Postgres

(standard relational database, holds all our metadata)

Current AWS

Runs on a server that also runs redis: a t3a.small EC2 with 2GB RAM and 2 virtual CPUs.

SELECT pg_size_pretty( pg_database_size('digcol') ) => 635 MB

Seems to be configured for 100 maximum connections. When I try to count its current connections, I see only 4, which is curious; I'd expect at least one per web worker and jobs worker, i.e. 22. (I am not an experienced postgres admin, though.)

Estimated Heroku

https://elements.heroku.com/addons/heroku-postgresql

Heroku postgres "Standard 0" seems just fine: 4GB RAM (at least 2x current); 64GB storage capacity (roughly 100x our current 635MB); 120 connection limit (slightly more than our current postgres, and should be plenty for our current use); no row limit. It can retain 25 backups and roll back to previous states. DB-level encryption at rest.

Plans smaller than Standard 0 are labelled "Hobby" and have limitations that make them unsuitable for production; I think it makes sense to spring for Standard 0 when it's only $50/month.

However, beware that if we do need to go beyond Standard 0 (because of more data, more traffic, or more bg workers), the next price point is $200/month. It would probably take a major change in our usage patterns (beyond the OH transcript storage), or a significant increase in ingest rate, to run out of capacity within the next 5 years, but we eventually could.
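The 5-year claim can be sanity-checked against the 64GB cap. The growth rate here is a made-up assumption (we haven't measured it), so treat the output as illustrative only:

```python
# Sketch of the Standard 0 storage runway.
# current_mb comes from pg_database_size above; growth_mb_per_year is a
# hypothetical ingest rate, NOT a measured value.
current_mb = 635
capacity_mb = 64 * 1024          # Standard 0: 64GB
growth_mb_per_year = 500         # assumption

years = (capacity_mb - current_mb) / growth_mb_per_year
print(f"~{years:.0f} years until 64GB is full at {growth_mb_per_year} MB/year")
```

Even at ~500MB/year of growth (nearly doubling the database annually) the runway is over a century, so only a drastic usage change would hit the cap within 5 years.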

For this one, it probably makes sense to run the same on staging and production, especially if it’s Standard 0.

Production           Staging
-------------------  -------------------
Postgres Standard 0  Postgres Standard 0
$50/month            $50/month

Redis

A faster store than postgres, generally used for temporary/transitory, less-structured data. Currently we mainly (only?) use it for storing our background job queues, and we make very little use of it.

Another common use would be for caches, including caching rendered HTML, or any other expensive to calculate values that make sense to cache for a certain amount of time. We might want to do that in the future, increasing our redis usage.

Current AWS

Runs on the same server as the DB, but is currently using so few resources that it's barely worth mentioning: used_memory_human == 1.15M.

Considerations

If we start putting a lot more things into the bg job queue (perhaps to replace current long-running processes with many smaller queued jobs; see fixity check), it could significantly increase our redis needs.

For heroku redis plans priced in terms of number of connections, our need is probably somewhere around our anticipated web workers plus bg workers (22-30?). Depending on what happens when the connection limit is exceeded, and on whether the Rails app and bg workers hold persistent connections or only take out a connection for a quick operation, we could perhaps get by with fewer connections. Unsure.
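A rough connection budget, pessimistically assuming one persistent connection per process; the slack figure is an assumption, not a requirement:

```python
# Redis connection budget vs the Premium 0 limit.
web_workers = 10      # current passenger worker count
job_workers = 12      # current bg job worker count
slack = 8             # assumed room for consoles, deploys, growth

needed = web_workers + job_workers + slack
premium0_limit = 40

print(f"estimated connections: {needed} (Premium 0 limit: {premium0_limit})")  # 30 of 40
```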

If we decide to use redis as a cache (not just for bg job queues), we might actually need a SECOND redis instance, with the ability to set the maxmemory-policy (auto-evict when memory capacity is exceeded, which you want for a cache, but not for queue persistence).

Heroku Estimate

There are a variety of vendors offering redis as heroku add-ons; not sure what the differentiating factors are. We'll start by looking at heroku's own in-house redis offering rather than the third parties. It's possible other heroku marketplace vendors could give us a better price for what we need, but heroku's seems fine for now.

Heroku redis has a maxmemory-policy default of “noeviction”, but is configurable.

There is a free level, "Hobby Dev", with 25MB of RAM, which is enough for our current tiny 1.15M usage. However, the 20 connection limit is tight, and it does not offer persistence, which makes it inappropriate for our bg queue usage.

The Premium 0 pricing level seems appropriate with 50MB of RAM, on-disk persistence, 40 connections. Only $15/month.

Production              Staging
----------------------  ----------------------
Heroku redis Premium 0  Heroku redis Premium 0
$15/month               $15/month
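Adding up the estimates so far (Solr and anything further down is not yet included):

```python
# Monthly cost totals for the services estimated above.
production = {"web dynos": 200, "job dynos": 400, "postgres": 50, "redis": 15}
staging = {"web dynos": 25, "job dynos": 50, "postgres": 50, "redis": 15}

prod_total = sum(production.values())
staging_total = sum(staging.values())

print(f"production: ${prod_total}/month")              # $665/month
print(f"staging: ${staging_total}/month")              # $140/month
print(f"total: ${prod_total + staging_total}/month")   # $805/month
```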

Solr