Overview

The Internet Archive BookReader is a candidate to replace the digital collections' custom-built viewer. See https://sciencehistory.atlassian.net/wiki/spaces/HDC/pages/2206203905/OCR+planning+notes#Search-within--Work for a discussion in the context of other candidates.

See also: https://sciencehistory.atlassian.net/wiki/spaces/HDC/pages/edit-v2/2325807105

A working example: the bird book. Note

A custom-built PHP script serves the images.

Important links and resources

Search within the book

In December 2023, we ran some experiments to see whether we could integrate a modified version of the BookReader demo code with image and text metadata from the digital collections. We were able to get the BookReader to consume not only our images but also our hOCR content, which allowed us to demo a simple version of “search inside the book”.
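To illustrate why hOCR is enough to drive a basic "search inside the book", here is a minimal Python sketch. It is not the code used in the experiment. It assumes the hOCR marks each word with an `ocrx_word` span whose `title` attribute carries a `bbox x0 y0 x1 y1` entry, as the hOCR format describes; the names `HocrWordParser`, `search_hocr`, and `SAMPLE` are hypothetical, and the sample markup is invented for the example. The word boxes it returns are the kind of coordinates a viewer would need in order to highlight matches on the page image.

```python
import re
from html.parser import HTMLParser


class HocrWordParser(HTMLParser):
    """Collect (text, bbox) pairs from hOCR `ocrx_word` spans.

    Assumption: word spans do not contain nested <span> elements.
    """

    def __init__(self):
        super().__init__()
        self.words = []      # list of (word_text, (x0, y0, x1, y1))
        self._bbox = None    # bbox of the word span currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            match = re.search(r"bbox (\d+) (\d+) (\d+) (\d+)", attrs.get("title") or "")
            if match:
                self._bbox = tuple(int(n) for n in match.groups())
                self._text = []

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._bbox is not None:
            self.words.append(("".join(self._text).strip(), self._bbox))
            self._bbox = None


def search_hocr(hocr, query):
    """Return (word, bbox) for every word containing `query`, ignoring case."""
    parser = HocrWordParser()
    parser.feed(hocr)
    parser.close()
    needle = query.casefold()
    return [(text, bbox) for text, bbox in parser.words if needle in text.casefold()]


# Invented sample page, for illustration only.
SAMPLE = """<div class='ocr_page' title='bbox 0 0 1000 1500'>
<span class='ocr_line' title='bbox 100 200 600 240'>
<span class='ocrx_word' title='bbox 100 200 220 240; x_wconf 96'>Scarlet</span>
<span class='ocrx_word' title='bbox 230 200 380 240; x_wconf 93'>Tanager</span>
</span></div>"""

matches = search_hocr(SAMPLE, "tanager")
```

This sketch matches single words only; phrase search would need to walk consecutive words on a line and merge their boxes.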

How to use the demo above:

IIIF

IIIF is a standard that could theoretically allow us to use the BookReader to consume our images and metadata. (The APIs that interest us are the Image API and the Content Search API.) This would notably allow us to offer pan-and-zoom functionality, among other useful features. For this to work:
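As background on the Image API mentioned above: it addresses every image request with a URL of the form `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`, and pan-and-zoom viewers work by requesting different regions and sizes of the same image. The Python sketch below builds such URLs. The helper name `iiif_image_url`, the server `https://iiif.example.org/iiif/3`, and the identifier are assumptions made up for the example; they do not describe any server we run.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="max",
                   rotation="0", quality="default", fmt="jpg"):
    """Build a IIIF Image API request URL.

    The identifier is percent-encoded so that slashes or spaces in it
    do not get mistaken for URL path separators.
    """
    encoded = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Hypothetical server and identifier, for illustration only.
BASE = "https://iiif.example.org/iiif/3"

# The whole page at the largest size the server offers.
full_page = iiif_image_url(BASE, "bird-book/page 1")

# One zoomed-in tile: the 1024x1024 pixel region at the top left of the
# page, scaled down to 512 pixels wide.
tile = iiif_image_url(BASE, "bird-book/page 1", region="0,0,1024,1024", size="512,")
```

A viewer such as OpenSeadragon issues many tile requests like the second one as the user pans and zooms, so the image server has to be able to answer them.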

The Internet Archive and IIIF

This blog post describes the history of the Internet Archive and IIIF. The key sentence seems to be: “By making Internet Archive images and texts IIIF-compatible, they may be opened using any number of compatible IIIF viewer apps, each offering their own advantages and unique features”. Tellingly, the post makes no mention of the BookReader.

The Internet Archive does in fact maintain a IIIF server, but its front end is actually Mirador (which itself includes the OpenSeadragon viewer).

The BookReader and IIIF

Even if I had been able to fix the open issue, the work would have been of little help to us, since both the BookReader and the IIIF standard have evolved too much in the intervening five years of development.

Conclusion

Eddie’s Feb 7 2024 remarks, after experimenting with the IIIF plugin:

I have to conclude that the BookReader is not worth pursuing as a component of the digital collections. While impressive in its current form, it depends on a complex and ill-documented set of interfaces with the Internet Archive’s image and metadata servers, and relies in particular on a home-grown PHP image server script that looks difficult to maintain.

I certainly understand the IA’s desire (which can be inferred from their blog post) to move to a more interoperable standard for serving images, and to get out of the business of maintaining an image viewer altogether.

Jonathan has a different read, I think: while the IIIF plugin is definitely janky and unsupported, other parts of the IIIF reader, though under-documented, appear to me to be more polished and working, and they support configuration rather than depending on the exact APIs of IA servers. I would not personally make any assumptions about the future of the BookReader from the IA’s interest in IIIF (it is possibly the work of another team or project, etc.).