Types of original images in the digital collections

  • TIFF

    • Black and white colorspace

    • RGB colorspace

  • PDF

Software needed to generate derivatives:

See Aptfile below.

Diagnostics for derivative generation software

Run the following commands in a Heroku dyno. Your output may differ slightly (for example in version strings or timings), but a missing PDF loader or a thumbnail that is not reported as sRGB should be treated as a red flag.

Code Block
heroku run bash

vips --version

# Normal output:
# vips-8.10.6-Tue Mar 23 20:52:58 UTC 2021

vips -l | grep -o '[a-z_]*pdf[a-z_]*'
# Normal output:
# pdfload_base
# pdfload
# pdf
# pdfload_buffer
# pdfload_source

cd tmp

PROFILE=$(ls ../vendor/bundle/ruby/*/gems/kithe-*/lib/vendor/icc/sRGB2014.icc)
wget https://digital.sciencehistory.org/downloads/m3zcuho -O b_w.tiff
wget https://digital.sciencehistory.org/downloads/1h16b9n -O color.tiff
wget https://digital.sciencehistory.org/downloads/519ucnx -O normal.pdf

vipsthumbnail  color.tiff  --eprofile $PROFILE
vipsthumbnail  b_w.tiff    --eprofile $PROFILE
vipsthumbnail  normal.pdf

identify *.jpg | grep sRGB

# Normal output (ignore warnings):
# tn_b_w.jpg    JPEG 128x108 128x108+0+0 8-bit sRGB 23603B 0.000u 0:00.000
# tn_color.jpg  JPEG 128x96 128x96+0+0   8-bit sRGB 11880B 0.000u 0:00.000
# tn_normal.jpg JPEG 99x128 99x128+0+0   8-bit sRGB  1619B 0.010u 0:00.000
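
The version and PDF-loader checks above could be scripted roughly as follows. This is a minimal sketch: the function names are hypothetical, and the 8.10.0 threshold in the usage comments is an assumed example, not a documented requirement. The functions only parse text, so they can be tried without vips installed.

```shell
# Hypothetical helpers (not part of scihist_digicoll) for checking vips output.

# Print "MAJOR.MINOR.PATCH" from a line such as
# "vips-8.10.6-Tue Mar 23 20:52:58 UTC 2021".
vips_version_number() {
  printf '%s\n' "$1" |
    sed -n 's/^vips-\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Succeed if version $1 is at least version $2, comparing field by field.
version_at_least() {
  awk -v have="$1" -v want="$2" 'BEGIN {
    split(have, h, "."); split(want, w, ".")
    for (i = 1; i <= 3; i++) {
      if (h[i] + 0 > w[i] + 0) exit 0
      if (h[i] + 0 < w[i] + 0) exit 1
    }
    exit 0
  }'
}

# Succeed if the text on stdin (the output of `vips -l`) lists a PDF loader.
has_pdf_loader() {
  grep -q 'pdfload'
}

# Possible use on a dyno (8.10.0 is an assumed threshold):
#   version_at_least "$(vips_version_number "$(vips --version)")" 8.10.0 \
#     || echo "vips is older than expected"
#   vips -l | has_pdf_loader || echo "no PDF loader: poppler support is missing"
```

Keeping the parsing separate from the calls to vips makes it possible to check the helpers against the sample output recorded on this page.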

Software setups

We’ve been through 5 of these since starting to investigate Heroku.


These diagnostics are now automated in a suite of tests you can run with ./bin/rspec system_env_spec. See https://github.com/sciencehistory/scihist_digicoll/blob/master/system_env_spec/README.md .

See the page’s history for how we did this in the past.



#

Aptfile

Buildpack

Results

1

Code Block
libvips-tools
mediainfo
imagemagick
poppler-utils
Code Block
heroku-community/apt
heroku/ruby

vips-8.9.1-Sun Feb 23 08:51:26 UTC 2020

Color TIFFs work

B&W TIFFs do not work

PDFs work

2

Code Block
libvips-tools
mediainfo
imagemagick
poppler-utils
Code Block
heroku-community/apt
heroku/ruby

vips-8.9.1-Sun Feb 23 08:51:26 UTC 2020
PDFs work

All TIFFS work

Removing --eprofile srgb_profile_path from the arguments to vipsthumbnail in the code (docs) avoids the error described in issue 942.

3

Code Block
mediainfo
imagemagick
poppler-utils
Code Block
heroku-community/apt
https://github.com/machinio/heroku-buildpack-vips

heroku/ruby

vips-8.10.2-Mon Oct 12 16:43:59 UTC 2020

At some point in 2021, vips -l | grep -i pdf started returning blank: no poppler support, so PDFs don’t work (see note 1).

All TIFFS work

4

Code Block
mediainfo
imagemagick
libglib2.0-0
libglib2.0-dev
libpoppler-glib8
Code Block
heroku-community/apt
https://github.com/brandoncc/heroku-buildpack-vips
heroku/ruby

vips-8.10.6-Tue Mar 23 20:52:58 UTC 2021

PDFs work

All TIFFS work

Combined audio derivatives don’t work

5

Code Block
mediainfo
imagemagick
libglib2.0-0
libglib2.0-dev
libpoppler-glib8
Code Block
https://github.com/heroku/heroku-buildpack-activestorage-preview
https://buildpack-registry.s3.amazonaws.com/buildpacks/heroku-community/apt.tgz
https://github.com/brandoncc/heroku-buildpack-vips
heroku/ruby

vips-8.10.6-Tue Mar 23 20:52:58 UTC 2021

PDFs work

All TIFFS work

Combined audio derivatives work

Removing --eprofile srgb_profile_path from the arguments to vipsthumbnail in the code (docs) avoids the error described in issue 942 (See note 2)

5

again (see issue 1448)

libpoppler-glib8 in the Aptfile may not be needed (see issue 1455)

6

mediainfo
imagemagick
libglib2.0-0
libglib2.0-dev
libpoppler-glib8
poppler-utils

https://github.com/heroku/heroku-buildpack-activestorage-preview

heroku-community/apt

https://github.com/brandoncc/heroku-buildpack-vips

heroku/ruby

vips-8.10.6-Tue Mar 23 20:52:58 UTC 2021

PDFs work

All TIFFS work

...

Note 1: Eddie believes option 1 did work for much of the evaluation period for Heroku, but as of summer 2021 it didn’t.

...

PDF on-demand stopped working in staging, so we added poppler-utils back into the Aptfile.

7

mediainfo

imagemagick

libglib2.0-0
libglib2.0-dev
libpoppler-glib8
qpdf
tesseract-ocr
tesseract-ocr-eng
tesseract-ocr-deu
tesseract-ocr-fra
tesseract-ocr-spa
tesseract-error-while-loading-shared-libraries-libarchive-so-13-python
libarchive13

heroku/python
https://github.com/heroku/heroku-buildpack-activestorage-preview
https://buildpack-registry.s3.amazonaws.com/buildpacks/heroku-community/apt.tgz
https://github.com/brandoncc/heroku-buildpack-vips
https://github.com/fnando/heroku-buildpack-exiftool
heroku/ruby

OCR.

Note we are removing poppler-utils

  • Note: Row 2 was a band-aid; it violated the rule implicit in the code that every TIFF should have its derivatives encoded as sRGB, including the derivatives of B&W originals. I interpret the documentation as meaning that the ICC profile of an original is reused in its derivatives, but further research is needed.

  • identify -verbose graphics_file.tiff | grep Colorspace reports the colorspace of a file; running it on an original and on its thumbnail shows what vipsthumbnail does to each type of original.
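
The colorspace check in the last bullet could be wrapped in a small helper so that originals and thumbnails can be compared side by side. This is a sketch under assumptions: colorspace_of is a hypothetical name, and the file names in the usage comment are the ones downloaded and generated in the diagnostics section.

```shell
# Hypothetical helper: read `identify -verbose` output on stdin and
# print the value of the first "Colorspace:" line.
colorspace_of() {
  awk '$1 == "Colorspace:" { print $2; exit }'
}

# Possible use on a dyno, after running the diagnostics above:
#   for f in b_w.tiff color.tiff tn_b_w.jpg tn_color.jpg; do
#     printf '%s\t%s\n' "$f" "$(identify -verbose "$f" | colorspace_of)"
#   done
```

Printing one line per file makes it easier to see whether a B&W original and its thumbnail end up in different colorspaces.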