...

The current implementation uses the existing download_large derivatives. I thought it was good to include high-res images suitable for high-quality printing, but this leads to pretty large PDFs and contributes to high RAM usage. Try download_medium and see if that alone lets us do very large PDFs without worrying about it? Probably good enough.

Tried it: uses significantly less RAM, but our biggest works still use too much, so it doesn’t get us all the way there, unless we’re going to limit PDF generation to a 500-page max or something. (Smaller images may be better for users anyway, so we may do this regardless.)

  • 50-image work, qf85nc451. Originally 99MB PDF using 440MB RAM. Smaller images, 20MB PDF using 284MB RAM.

  • 100-image work, fx719n43f. Originally 171MB PDF using 549MB RAM. Smaller images, 35MB PDF using 328MB RAM.

  • 325-image work, 1831ck38c. Originally 386MB PDF using 912MB RAM with out of memory errors. Smaller images, 92MB PDF, 472MB RAM.

  • Ramelli, 694 items. Originally 1.8GB PDF(!); did not measure RAM, but far too much for Heroku. Smaller images, 325MB PDF, RAM usage 987MB, with Heroku out-of-memory errors, so this is around the limit of what we can still fit on Heroku (and we may have a few larger works too).
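
To make the comparison above easier to read at a glance, here is a small Ruby sketch that computes the percentage saved in PDF size and RAM for each measured work. The numbers are copied from the list above; the constant and method names are made up for illustration, and 1.8GB is treated as 1800MB, which is an assumption.

```ruby
# Measurements copied from the list above, in MB: [original, smaller images].
# Ramelli's original RAM was not measured, so it is nil.
# 1.8GB is treated as 1800MB (an assumption).
MEASUREMENTS = [
  { work: "qf85nc451", images: 50,  pdf: [99, 20],    ram: [440, 284] },
  { work: "fx719n43f", images: 100, pdf: [171, 35],   ram: [549, 328] },
  { work: "1831ck38c", images: 325, pdf: [386, 92],   ram: [912, 472] },
  { work: "Ramelli",   images: 694, pdf: [1800, 325], ram: [nil, 987] }
].freeze

# Percent saved going from the original value to the smaller one,
# or nil when either value is missing.
def percent_saved(original, smaller)
  return nil if original.nil? || smaller.nil?

  ((1 - smaller.to_f / original) * 100).round
end

# One summary row per work.
def summarize(measurements)
  measurements.map do |m|
    {
      work: m[:work],
      images: m[:images],
      pdf_saved: percent_saved(*m[:pdf]),
      ram_saved: percent_saved(*m[:ram])
    }
  end
end

summarize(MEASUREMENTS).each do |row|
  ram = row[:ram_saved] ? "#{row[:ram_saved]}%" : "n/a"
  puts format("%-10s %4d images  PDF saved %d%%  RAM saved %s",
              row[:work], row[:images], row[:pdf_saved], ram)
end
```

On these numbers the PDF shrinks by roughly 76 to 82 percent, while RAM only drops by roughly 35 to 48 percent, which matches the observation that smaller images help but don’t get us all the way there.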

Ruby hexapdf instead of prawn

...

https://github.com/rrthomas/pdfjam

Progress? Merge PDFs?

One problem with those command-line tools is that they make it hard to do a progress bar like we’re doing now, if they require downloading all the thumbs in advance and then making the PDF in a single command-line invocation (with no progress reported).

Is there a way to invoke them to “add one more image on end of PDF”, building it up one image at a time? Then we don’t need to have them all downloaded at once, and can report progress.

Or, should/could we use (any) tool to make a bunch of 1-page PDFs, then some other (command-line?) tool to “combine all these 1-page PDFs into one PDF”, which might be a fast and cheap operation?
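
A rough sketch of that last idea, not a tested implementation: make one single-page PDF per image, report progress after each, and only then build a merge command. `build_pages`, `pdftk_merge_command`, and the `make_page` callable are hypothetical names; the page-making step is stubbed out because we haven’t picked a tool for it. The merge uses pdftk’s `cat` operation; whether that merge is actually fast and cheap on RAM for our biggest works is still an open question.

```ruby
require "tmpdir"

# Make one single-page PDF per image and report progress after each one.
# `make_page` is a hypothetical callable taking (image_id, path) that writes
# a 1-page PDF to `path` with whatever tool we end up choosing.
# Returns the page paths in order.
def build_pages(image_ids, workdir, make_page, on_progress: nil)
  image_ids.each_with_index.map do |image_id, index|
    path = File.join(workdir, format("page-%05d.pdf", index + 1))
    make_page.call(image_id, path)
    on_progress&.call(index + 1, image_ids.length)
    path
  end
end

# Argument list for concatenating the pages with pdftk's `cat` operation.
# Nothing is executed here; the caller can pass this to system(*cmd).
def pdftk_merge_command(page_paths, output_path)
  ["pdftk", *page_paths, "cat", "output", output_path]
end

Dir.mktmpdir do |dir|
  # Stand-in page maker that writes placeholder files, not real PDFs.
  fake_page = ->(image_id, path) { File.write(path, "placeholder for #{image_id}") }
  progress = ->(done, total) { puts "#{done}/#{total} pages" }

  pages = build_pages(%w[img1 img2 img3], dir, fake_page, on_progress: progress)
  cmd = pdftk_merge_command(pages, File.join(dir, "combined.pdf"))
  puts cmd.join(" ")
  # With real 1-page PDFs and pdftk installed: system(*cmd)
end
```

Only one image needs to be downloaded and held at a time while the pages are being made, and the progress callback fires per page, so the progress bar keeps working up until the final merge step.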

pdftk

https://www.pdflabs.com/tools/pdftk-server/

(Hmm, I don’t think it can add an image to a PDF, although it can merge PDFs and edit PDF metadata.)

combine_pdf

Yet another Ruby PDF library. It’s one thing I found that lets us edit metadata (i.e. the Info Dictionary) on an existing PDF. Could maybe also do other useful stuff for us.

https://github.com/boazsegev/combine_pdf

Nope: just tried using it to edit metadata on a very large PDF, and it used a ton of RAM.

Uncaching on-demand derivatives

...