Types of original images in the digital collections
TIFF
Black and white colorspace
RGB colorspace
PDF
Software needed to generate derivatives:
See Aptfile below.
Diagnostics for derivative generation software
Run the following commands in a Heroku dyno. Your results may vary slightly, but treat any large deviation from the normal output shown below as a red flag.
```shell
# Open a shell on a one-off dyno; run the remaining commands inside it.
heroku run bash
vips --version
# Normal output:
# vips-8.10.6-Tue Mar 23 20:52:58 UTC 2021
vips -l | grep -o '[a-z_]*pdf[a-z_]*'
# Normal output:
# pdfload_base
# pdfload
# pdf
# pdfload_buffer
# pdfload_source
cd tmp
# Locate the sRGB ICC profile shipped with the kithe gem.
PROFILE=$(ls ../vendor/bundle/ruby/*/gems/kithe-*/lib/vendor/icc/sRGB2014.icc)
# Download one sample of each type of original.
wget https://digital.sciencehistory.org/downloads/m3zcuho -O b_w.tiff
wget https://digital.sciencehistory.org/downloads/1h16b9n -O color.tiff
wget https://digital.sciencehistory.org/downloads/519ucnx -O normal.pdf
# Generate thumbnails (written as tn_*.jpg).
vipsthumbnail color.tiff --eprofile "$PROFILE"
vipsthumbnail b_w.tiff --eprofile "$PROFILE"
vipsthumbnail normal.pdf
identify *.jpg | grep sRGB
# Normal output (ignore warnings):
# tn_b_w.jpg JPEG 128x108 128x108+0+0 8-bit sRGB 23603B 0.000u 0:00.000
# tn_color.jpg JPEG 128x96 128x96+0+0 8-bit sRGB 11880B 0.000u 0:00.000
# tn_normal.jpg JPEG 99x128 99x128+0+0 8-bit sRGB 1619B 0.010u 0:00.000
```
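The final check above can also be scripted, so that a thumbnail in the wrong colorspace is reported by name rather than spotted by eye. The following is a minimal sketch and not part of the repository: the function name check_srgb is hypothetical, and it assumes the default one-line identify output shown above, where the colorspace is the sixth whitespace-separated field and file names contain no spaces.

```shell
# Hypothetical helper (not from the scihist_digicoll codebase).
# Reads default `identify` output on stdin, prints the name of every file
# whose colorspace field (6th column) is not sRGB, and returns non-zero
# if any such file was found. Blank lines are skipped.
check_srgb() {
  awk 'NF && $6 != "sRGB" { print $1; bad = 1 } END { exit bad + 0 }'
}

# Usage, inside the dyno after running vipsthumbnail:
#   identify tn_*.jpg 2>/dev/null | check_srgb && echo "all thumbnails are sRGB"
```

Warnings from identify go to standard error, so discarding them with 2>/dev/null matches the "ignore warnings" advice above.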
Software setups
We’ve been through 5 of these since starting to investigate Heroku.
...
#
...
Aptfile
...
Buildpack
...
Results
...
1
...
mediainfo
imagemagick
poppler-utils
heroku-community/apt
These checks are now automated in a suite of tests that you can run with ./bin/rspec system_env_spec. See https://github.com/sciencehistory/scihist_digicoll/blob/master/system_env_spec/README.md for details.
See the page’s history for how we did this in the past.
Software setups
# | Aptfile | Buildpack | Results
---|---|---|---
1 | | | Color TIFFs work. PDFs work.
2 | | | All TIFFs work. Removing […]
3 | | | At some point in 2021, […] started returning blank - no […]. All TIFFs work.
4 | imagemagick, poppler-utils | | PDFs work. All TIFFs work. Combined audio derivatives don’t work.
5 | libvips-tools, mediainfo, imagemagick, poppler-utils | | PDFs work. All TIFFs work. Combined audio derivatives work. Removing --eprofile srgb_profile_path from the arguments to vipsthumbnail in the code (docs) avoids the error described in issue 942 (see note 2) again (see issue 1448).
6 | | | vips-8.10.6-Tue Mar 23 20:52:58 UTC 2021. PDFs work. All TIFFs work. PDF on-demand stopped working in staging, so we added poppler-utils back into the Aptfile.
7 | | | OCR. Note we are removing […]

Note 1: Eddie believes option 1 did work for much of the evaluation period for Heroku, but as of summer 2021 it didn’t.
Note 2: Row 2 was a band-aid; it violated the rule implicit in the code that all TIFF derivatives should be encoded as srgb, including the derivatives of B&W originals. I interpret the documentation as meaning that the ICC profile of originals is reused in their derivatives, but further research is needed. identify -verbose graphics_file.tiff | grep Colorspace can be used to elucidate what happens to various types of original after being processed by vipsthumbnail.
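To compare originals and derivatives side by side, the identify command above can be wrapped in a small loop. This is only a sketch under stated assumptions: the function names colorspace_of and report_colorspaces and the IDENTIFY override variable are hypothetical conveniences, and the parsing assumes that identify -verbose prints a line of the form "Colorspace: sRGB".

```shell
# Hypothetical helper: print the colorspace ImageMagick reports for one file.
# IDENTIFY may be set to point at a different binary; it defaults to `identify`.
# For multi-page files only the first Colorspace line is reported.
colorspace_of() {
  "${IDENTIFY:-identify}" -verbose "$1" 2>/dev/null |
    awk '/^[[:space:]]*Colorspace:/ { print $2; exit }'
}

# Print "file: colorspace" for each argument.
report_colorspaces() {
  for f in "$@"; do
    printf '%s: %s\n' "$f" "$(colorspace_of "$f")"
  done
}

# Usage, with the sample files from the diagnostics section:
#   report_colorspaces b_w.tiff tn_b_w.jpg color.tiff tn_color.jpg
```

Running this once with --eprofile and once without it on the same originals would show directly whether the derivatives of B&W originals end up as sRGB or keep the original's colorspace.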