Types of original images in the digital collections
TIFF
Black and white colorspace
RGB colorspace
PDF
Software needed to generate derivatives:
mediainfo
convert
pdfunite
vips
ffmpeg
Diagnostics for derivative generation software
Run the following commands in a Heroku dyno. Version numbers and build details may differ slightly from the output shown, but a missing command or substantially different output should be treated as a red flag.
    heroku run bash

    mediainfo --version
    # Normal output:
    # MediaInfo Command line,
    # MediaInfoLib - v19.09

    convert -version
    # Normal output:
    # Version: ImageMagick 6.9.10-23 Q16 x86_64 20190101 https://imagemagick.org
    # Copyright: © 1999-2019 ImageMagick Studio LLC
    # License: https://imagemagick.org/script/license.php
    # Features: Cipher DPC Modules OpenMP
    # Delegates (built-in): bzlib djvu fftw fontconfig freetype jbig jng jpeg lcms lqr ltdl lzma openexr pangocairo png tiff webp wmf x xml zlib

    pdfunite -v
    # Normal output:
    # pdfunite version 0.86.1

    vips --version
    # Normal output:
    # vips-8.10.6-Tue Mar 23 20:52:58 UTC 2021

    ffmpeg -version
    # Normal output:
    # ffmpeg version 4.2.3 Copyright (c) 2000-2020 the FFmpeg developers
    # built with gcc 5.4.0 (Ubuntu 5.4.0-6ubuntu1~16.04.12) 20160609
    # configuration: --prefix=/home/work/sffmpeg/build --datadir=/home/work/sffmpeg/build/etc --disable-shared --enable-static --enable-pic --pkg-config-flags=--static --enable-gpl --enable-version3 --disable-doc --disable-debug --disable-ffplay --disable-outdevs --enable-runtime-cpudetect --extra-cflags='-I/home/work/sffmpeg/build/include -static' --extra-ldflags=-L/home/work/sffmpeg/build/lib --extra-ldexeflags=-static --extra-libs='-lstdc++ -lexpat -ldl -lm -lpthread' --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libaom --enable-libmp3lame --enable-libspeex --enable-libtheora --enable-libvorbis --enable-libx264 --enable-libx265 --enable-libvpx --enable-libopus --enable-libfreetype --enable-libass --enable-mbedtls
    # libavutil 56. 31.100 / 56. 31.100
    # libavcodec 58. 54.100 / 58. 54.100
    # libavformat 58. 29.100 / 58. 29.100
    # libavdevice 58. 8.100 / 58. 8.100
    # libavfilter 7. 57.100 / 7. 57.100
    # libswscale 5. 5.100 / 5. 5.100
    # libswresample 3. 5.100 / 3. 5.100
    # libpostproc 55. 5.100 / 55. 5.100

    vips -l | grep -o '[a-z_]*pdf[a-z_]*'
    # Normal output:
    # pdfload_base
    # pdfload
    # pdf
    # pdfload_buffer
    # pdfload_source

    cd tmp
    PROFILE=`ls ../vendor/bundle/ruby/*/gems/kithe-*/lib/vendor/icc/sRGB2014.icc`
    wget https://digital.sciencehistory.org/downloads/m3zcuho -O b_w.tiff
    wget https://digital.sciencehistory.org/downloads/1h16b9n -O color.tiff
    wget https://digital.sciencehistory.org/downloads/519ucnx -O normal.pdf
    vipsthumbnail color.tiff --eprofile $PROFILE
    vipsthumbnail b_w.tiff --eprofile $PROFILE
    vipsthumbnail normal.pdf
    identify *.jpg | grep sRGB
    # Normal output (ignore warnings):
    # tn_b_w.jpg JPEG 128x108 128x108+0+0 8-bit sRGB 23603B 0.000u 0:00.000
    # tn_color.jpg JPEG 128x96 128x96+0+0 8-bit sRGB 11880B 0.000u 0:00.000
    # tn_normal.jpg JPEG 99x128 99x128+0+0 8-bit sRGB 1619B 0.010u 0:00.000
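Before comparing version strings by hand, it can help to confirm that every tool is on the PATH at all. The following is a minimal sketch, not part of the project's tooling: the function name missing_tools is hypothetical, and vipsthumbnail and identify are included only because the diagnostic commands above call them.

```shell
#!/bin/sh
# Print the name of each listed command that cannot be found on PATH.
# (missing_tools is a hypothetical helper name used for illustration.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools from the list above, plus the two extra commands the diagnostics use.
missing=$(missing_tools mediainfo convert pdfunite vips vipsthumbnail ffmpeg identify)

if [ -n "$missing" ]; then
  echo "Missing from PATH:"
  echo "$missing"
else
  echo "All derivative tools found on PATH."
fi
```

A tool that shows up as missing here points at the Aptfile or buildpack setup rather than at a version mismatch.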
Software setups
We’ve been through several of these since starting to investigate Heroku.
# | Aptfile | Buildpacks | Results |
---|---|---|---|
1 | libvips-tools mediainfo imagemagick poppler-utils | heroku-community/apt heroku/ruby | Color TIFFs work; PDFs work |
2 | libvips-tools mediainfo imagemagick poppler-utils | heroku-community/apt heroku/ruby | All TIFFs work. Removing |
3 | mediainfo imagemagick poppler-utils | heroku-community/apt https://github.com/machinio/heroku-buildpack-vips heroku/ruby | At some point in 2021, all TIFFs work |
4 | mediainfo imagemagick libglib2.0-0 libglib2.0-dev libpoppler-glib8 | heroku-community/apt https://github.com/brandoncc/heroku-buildpack-vips heroku/ruby | PDFs work; all TIFFs work; combined audio derivatives don’t work |
5 | mediainfo imagemagick libglib2.0-0 libglib2.0-dev libpoppler-glib8 | https://github.com/heroku/heroku-buildpack-activestorage-preview https://buildpack-registry.s3.amazonaws.com/buildpacks/heroku-community/apt.tgz https://github.com/brandoncc/heroku-buildpack-vips heroku/ruby | PDFs work; all TIFFs work; combined audio derivatives work again (see issue 1448) |
6 | | | PDF on-demand stopped working in staging, so we added poppler-utils back into the Aptfile. |
Note: Row 2 was a band-aid; it violated the rule implicit in the code that every TIFF original should have its derivatives encoded as srgb, including the derivatives of B&W originals. I interpret the documentation as meaning that the ICC profile of originals is reused in their derivatives, but further research is needed. The command identify -verbose graphics_file.tiff | grep Colorspace can be used to check what happens to each type of original after it is processed by vipsthumbnail.
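To compare originals and derivatives side by side, the identify check can be wrapped in a small loop. This is a sketch under assumptions: report_colorspace is a hypothetical helper name, and the file names are the ones downloaded and generated by the diagnostics above, so they only exist after those commands have been run in the same directory.

```shell
#!/bin/sh
# Print "<file>: <colorspace>" for each image, using the first Colorspace
# line reported by `identify -verbose`. Prints "unknown" if none is found.
# (report_colorspace is a hypothetical helper name used for illustration.)
report_colorspace() {
  for f in "$@"; do
    cs=$(identify -verbose "$f" 2>/dev/null |
      awk '/^ *Colorspace:/ && !seen { sub(/^ *Colorspace: */, ""); print; seen = 1 }') || true
    printf '%s: %s\n' "$f" "${cs:-unknown}"
  done
}

# Originals and the thumbnails vipsthumbnail derives from them.
report_colorspace b_w.tiff color.tiff tn_b_w.jpg tn_color.jpg
```

If the B&W original and its thumbnail report different colorspaces, that shows whether the export profile was applied to that derivative.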