Oral history audio files

As of early 2023, all audio files in the digital collections are associated with the oral history collection.

Archival files outside the digital collections

Note that the COH maintains and preserves archival copies (including at least some large WAV files) outside the digital collections. These are processed and re-encoded as either FLAC or mp3 before they are ingested into the digital collections.

Originals

Oral history audio files ingested into the digital collections by COH staff (we call them “originals” even though they are technically copies) are FLAC or mp3 files. Their technical specifications may vary depending on COH workflow. We do not guarantee that these files are consistent with each other, but a rake task reports their metadata: `bundle exec rake scihist:reports:audio_originals_metadata > report.csv`

Derivatives

We create an aac/m4a derivative for FLAC original assets only; we don’t offer this derivative for mp3 originals. It exists purely as a convenient download option for researchers and is not used in the OH player. See `app/uploaders/asset_uploader.rb` for technical details about these derivatives.

Combined audio derivatives

For use in OHMS and in the oral history player, we create an extra file called the CAD (combined audio derivative). This is the file COH staff synchronize the transcript to, and it’s also the audio file most of our users interact with, so it’s important. The combined audio derivative file is:

  • 64 kbit/s constant bitrate

    • One second of audio is represented in the file by 64,000 bits. (That’s actually not a lot.)

  • Sample rate: unchanged from the original files. Most of the files are either 44,100 or 48,000 samples per second, but there are about fifty outliers with fewer (8,000 or 22,050) or more (96,000).

  • A single mono channel

  • Encoded using aac

  • In an m4a container.
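
To put the 64,000 bits per second figure in perspective, the audio payload size can be estimated from the duration alone. This is a minimal sketch: the method name `approx_audio_bytes` is hypothetical (it is not part of the application), and the estimate ignores m4a container overhead, so real files will be slightly larger.

```ruby
# Bitrate of the combined audio derivative, in bits per second.
BITRATE_BPS = 64_000

# Hypothetical helper: approximate size of the encoded audio in bytes,
# ignoring container overhead.
def approx_audio_bytes(duration_seconds, bitrate_bps = BITRATE_BPS)
  (duration_seconds * bitrate_bps) / 8
end

one_hour = approx_audio_bytes(3600)
puts "#{one_hour} bytes (about #{(one_hour / 1_000_000.0).round(1)} MB)"
# prints "28800000 bytes (about 28.8 MB)"
```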

Note that this is a fairly low quality audio file and would be unsuitable for disseminating music. The goal is to use the smallest and simplest file appropriate for spoken audio, without compromising too much quality.
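
The properties listed above map fairly directly onto ffmpeg options. The following is only a sketch of how such a file could be produced, assuming ffmpeg's built-in AAC encoder; the helper name `cad_ffmpeg_args` and the file names are hypothetical, and the options the application actually uses are the ones in `app/services/combined_audio_derivative_creator.rb`, which may differ.

```ruby
# Hypothetical helper: builds an ffmpeg argument list that approximates
# the combined audio derivative spec (64k, mono, AAC, m4a).
def cad_ffmpeg_args(input_path, output_path)
  [
    "ffmpeg",
    "-i", input_path,
    "-vn",          # drop any embedded cover art or video stream
    "-c:a", "aac",  # ffmpeg's built-in AAC encoder
    "-b:a", "64k",  # target audio bitrate of 64 kbit/s
    "-ac", "1",     # downmix to a single mono channel
    output_path     # a .m4a extension selects the m4a container
  ]
end

# Print the command rather than running it; pass the array to
# Kernel#system to actually invoke ffmpeg.
puts cad_ffmpeg_args("interview.flac", "combined.m4a").join(" ")
```

No `-ar` option is passed, so the sample rate is left as it is in the input, matching the "unchanged from the original files" behavior described above.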

To get the latest data on all combined audio derivatives: `bundle exec rake scihist:reports:combined_audio_derivs > report.csv`

We chose a constant bitrate over a variable bitrate (VBR) because we feel it’s likely to be more reliable for seeking behavior (jumping forward or back to a particular timestamp). Users are going to be jumping back and forth in the audio file a lot, so it makes sense to want every second of the interview to be encoded in the same number of bits.
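
The intuition behind that choice can be illustrated with a little arithmetic: at a constant bitrate, a timestamp corresponds to a predictable position in the audio data. This is an idealized sketch, not a description of how any particular player seeks; in practice players locate positions using the index stored in the m4a container. The method name `approx_byte_offset` is hypothetical.

```ruby
# Hypothetical helper: at a constant bitrate, the position of a timestamp
# within the encoded audio data grows linearly with time.
def approx_byte_offset(seconds, bitrate_bps = 64_000)
  (seconds * bitrate_bps / 8).floor
end

puts approx_byte_offset(90)    # prints 720000
puts approx_byte_offset(3600)  # prints 28800000
```

With a variable bitrate, no such linear relationship holds, because quiet or simple passages use fewer bits than complex ones.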

For more technical details, see `app/services/combined_audio_derivative_creator.rb`.

If you have technical questions about the CAD for a particular oral history, you can take its public URL and analyze it on the command line by typing `ffprobe URL`.
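
If you want to check several files, or compare a file against the spec above, ffprobe can emit JSON that is easy to inspect from Ruby. The sketch below assumes `ffprobe` is installed and on the PATH; the helper names (`ffprobe_command`, `summarize_ffprobe`, `probe_audio`) are hypothetical and not part of the application.

```ruby
require "json"
require "open3"

# Hypothetical helper: ffprobe arguments that report only the fields
# relevant to the CAD spec, for the first audio stream, as JSON.
def ffprobe_command(url)
  [
    "ffprobe", "-v", "error",
    "-select_streams", "a:0",
    "-show_entries", "stream=codec_name,bit_rate,sample_rate,channels",
    "-of", "json",
    url
  ]
end

# Hypothetical helper: reduce ffprobe's JSON output to a small hash.
# ffprobe reports bit_rate and sample_rate as strings, so convert them.
def summarize_ffprobe(json_text)
  stream = JSON.parse(json_text).fetch("streams").first
  {
    codec: stream["codec_name"],
    bit_rate: stream["bit_rate"].to_i,
    sample_rate: stream["sample_rate"].to_i,
    channels: stream["channels"]
  }
end

# Hypothetical helper: run ffprobe against a URL or path and summarize.
def probe_audio(url)
  stdout, status = Open3.capture2(*ffprobe_command(url))
  raise "ffprobe failed for #{url}" unless status.success?
  summarize_ffprobe(stdout)
end
```

Calling `probe_audio` with a CAD's public URL should return a hash whose `:codec`, `:channels`, and `:bit_rate` values can be compared against the spec; the reported bitrate may differ slightly from exactly 64,000.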

Historical note: before May 2022 these CADs were stored as mp3s. The old files were deleted and CADs were recreated en masse.