ArchivesSpace (ASpace for short) is a server whose main purpose is to host a software program that is also named ArchivesSpace. The program is “an open source archives information management application for managing and providing web access to archives, manuscripts and digital objects”. The server also hosts a few auxiliary programs that take the output from ArchivesSpace and convert it into various other formats, which are then made available via an Apache web server on the same machine.

Background

We store digital descriptions of our archival collections in the following six places:

| Location | Format | Number of collections described | Source | Example | Who can see it? |
| --- | --- | --- | --- | --- | --- |
| Shared/P/Othmer Library/Archives/Collections Inventories/Archival Finding Aids and Box Lists | Word documents | Roughly 270, dated 1997 – present. | This is the original collection description. | P/Othmer Library/Archives/Collections Inventories/Archival Finding Aids and Box Lists/Labovsky Collection Finding Aid.doc | Institute staff |
| ArchivesSpace site | MySQL-backed website | Roughly 45 as of 2020 | Entered manually based on the P drive Word files. | https://archives.sciencehistory.org/resources/81#tree::resource_81 | Only logged-in ArchivesSpace users |
| ArchivesSpace Apache front end | EAD (XML format) | Roughly 45 as of 2020 | Generated hourly from ArchivesSpace database | https://archives.sciencehistory.org/ead/scihist-2012-021.xml | Public |
| ArchivesSpace Apache front end | HTML | Roughly 45 as of 2020 | Generated hourly from ArchivesSpace database | https://archives.sciencehistory.org/2012-021.html | Public |
| OPAC | PDF | ? | Exported manually as PDF from the ArchivesSpace site, then attached to the OPAC record for the collection | https://othmerlib.sciencehistory.org/articles/1065801.15134/1.PDF | Public |
| https://guides.othmerlibrary.sciencehistory.org/friendly.php?s=CHFArchives | LibGuide | Most collections, categorized by subject. | ? | Subject: nuclear chemistry | Technically public, but does not appear to be linked from anywhere. |

Workflow

  • Finding aids are stored as Word documents at Shared/P/Othmer Library/Archives/Collections Inventories/Archival Finding Aids and Box Lists.

  • Kent enters the data from them, one by one, into ArchivesSpace, revising them in the process. As of summer 2020, approximately 45 have been entered.

  • Once they are in ArchivesSpace:

  • Finally, the exported EAD files are also ingested by University of Penn Libraries Special Collections and by the Center for the History of Science, Technology, and Medicine (CHSTM), which makes them searchable at its search portal.

...

The current production version of ASpace is 2.7.1.

Terminal access: ssh -i /path/to/production/pem_file.pem ubuntu@50.16.132.240

The ubuntu user owns all the admin scripts.

The relevant Ansible role is: /roles/archivesspace/ in the ansible-inventory codebase.

...

The ArchivesSpace EADs are harvested by:

| Institution | Liaison | Contact |
| --- | --- | --- |
| Center for the History of Science, Technology, and Medicine (CHSTM) | Richard Shrake | shraker13@gmail.com |
| University of Penn Libraries Special Collections | Holly Mengel | hmengel@pobox.upenn.edu |

Both institutions harvest the EADs automatically by scraping https://archives.sciencehistory.org/ead/ and then add them to their aggregated Philly-area EAD search interfaces.
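As an illustration of what such a harvest involves, here is a minimal sketch that pulls the .xml links out of an HTML index page. It assumes the /ead/ URL serves an HTML listing with ordinary `<a href>` links, which this page does not confirm. The names `EadLinkParser` and `list_ead_urls` and the sample HTML are hypothetical, and the tools the harvesting institutions actually use are not documented here.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

EAD_INDEX_URL = "https://archives.sciencehistory.org/ead/"


class EadLinkParser(HTMLParser):
    """Collects href values that end in .xml (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.lower().endswith(".xml"):
                self.hrefs.append(value)


def list_ead_urls(index_html, base_url=EAD_INDEX_URL):
    """Return absolute URLs for every .xml link found in index_html."""
    parser = EadLinkParser()
    parser.feed(index_html)
    return [urljoin(base_url, href) for href in parser.hrefs]


# Made-up listing for illustration only. A real run would fetch
# EAD_INDEX_URL with urllib.request.urlopen and pass the decoded body.
sample_listing = (
    '<a href="../">Parent Directory</a>'
    '<a href="scihist-2012-021.xml">scihist-2012-021.xml</a>'
)
ead_urls = list_ead_urls(sample_listing)
```

Relative links are resolved against the index URL with urljoin, so the parent-directory link and any non-XML files are simply ignored.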

The main export files are located on the server at /home/ubuntu/archivesspace_scripts and are checked into version control at https://github.com/sciencehistory/archivesspace_scripts

Important files:

complete_export.sh

Runs the nightly export (called by cron every night at 9 PM). It calls as_export.py and generate.sh, both described below.

local_settings.cfg

Settings

as_export.py

Extracts XML from ArchivesSpace and saves a series of EADs into /exports/data/ead/*/*.xml

The exported EADs contain links to the actual digital objects.

generate.sh

Transforms the EADs in /exports/data/ead into HTML and saves them into /var/www/html. See, for instance, https://archives.sciencehistory.org/beckman

It relies on files (stylesheets, transformations) in

finding-aid-files
fa-files

xml-validator.sh

Checks that the publicly accessible files in /var/www/html/ead/ are valid.
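What "valid" means for xml-validator.sh is not spelled out here. As a hedged illustration, the sketch below checks only that each file is well-formed XML, which is a weaker test than validating against the EAD schema. The function name `find_malformed_xml` is hypothetical, and this is not the contents of xml-validator.sh.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_malformed_xml(directory):
    """Return (filename, error message) pairs for .xml files that fail to parse."""
    problems = []
    for path in sorted(Path(directory).glob("*.xml")):
        try:
            ET.parse(str(path))
        except ET.ParseError as err:
            # ParseError carries the line and column of the first problem.
            problems.append((path.name, str(err)))
    return problems
```

On the server, calling find_malformed_xml("/var/www/html/ead/") would return an empty list when every file parses.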

Once processed by generate.sh, the XML files are publicly accessible at https://archives.sciencehistory.org/ead/
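For orientation, the export step that as_export.py performs can be approximated with the ArchivesSpace backend REST API. The sketch below is not the contents of as_export.py. The backend URL, repository ID, resource ID and credentials are placeholders, and the choice of include_daos=true is an assumption based on the note that the exported EADs contain links to digital objects.

```python
import json
import urllib.parse
import urllib.request

# Placeholder: 8089 is the default ArchivesSpace backend port, but the
# real value for this server is not stated on this page.
BACKEND_URL = "http://localhost:8089"


def ead_export_url(backend_url, repo_id, resource_id, include_daos=True):
    """Build the URL of the EAD export endpoint for one resource."""
    query = urllib.parse.urlencode({
        "include_daos": str(include_daos).lower(),  # include digital object links
        "include_unpublished": "false",
    })
    return (f"{backend_url}/repositories/{repo_id}"
            f"/resource_descriptions/{resource_id}.xml?{query}")


def login(backend_url, username, password):
    """POST credentials and return the session token from the JSON response."""
    url = f"{backend_url}/users/{urllib.parse.quote(username)}/login"
    data = urllib.parse.urlencode({"password": password}).encode()
    with urllib.request.urlopen(urllib.request.Request(url, data=data)) as resp:
        return json.load(resp)["session"]


def fetch_ead(backend_url, session, repo_id, resource_id):
    """Download one EAD document as bytes, authenticating with the session token."""
    request = urllib.request.Request(
        ead_export_url(backend_url, repo_id, resource_id),
        headers={"X-ArchivesSpace-Session": session},
    )
    with urllib.request.urlopen(request) as resp:
        return resp.read()
```

A caller would log in once, then loop over resource IDs and write each result of fetch_ead to a file under the export directory.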

...

These consist of making backups of the MySQL database used by the ArchivesSpace program.

Place a dump of the MySQL database in /backup

mysql-backup.sh

Dumps the MySQL database to /backup/aspace-backup.sql.
This script is run from the crontab of user ubuntu: 30 17 * * 1-5 /home/ubuntu/archivesspace_scripts/mysql-backup.sh
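The dump step can be pictured as a single mysqldump invocation. The sketch below only assembles that command and, on request, runs it; it is not the contents of mysql-backup.sh. The database name passed in is whatever the caller supplies (the real name is not given on this page), and credentials are assumed to come from a MySQL option file such as ~/.my.cnf.

```python
import subprocess

DUMP_PATH = "/backup/aspace-backup.sql"  # destination named above


def mysqldump_command(database, dump_path=DUMP_PATH):
    """Return the argument list for dumping one database to dump_path."""
    return [
        "mysqldump",
        "--single-transaction",        # consistent snapshot for InnoDB tables
        f"--result-file={dump_path}",  # write the dump directly to this file
        database,
    ]


def run_backup(database):
    """Run the dump; raises CalledProcessError if mysqldump exits non-zero."""
    subprocess.run(mysqldump_command(database), check=True)
```

Keeping the command construction separate from its execution makes it possible to inspect the exact arguments before anything touches the database.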

Sync /backup to an s3 bucket

s3-backup.sh

Runs an aws s3 sync command to place the contents of /backup at https://s3.console.aws.amazon.com/s3/object/chf-hydra-backup/Aspace/aspace-backup.sql?region=us-west-2&tab=overview.

This script is run from the crontab of user ubuntu: 45 17 * * 1-5 /home/ubuntu/archivesspace_scripts/s3-backup.sh
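Both crontab entries use the standard five-field format (minute, hour, day of month, month, day of week). The sketch below is a deliberately small matcher, supporting only `*`, single numbers and `a-b` ranges, that shows how the two schedules read: the dump runs at 17:30 and the S3 sync at 17:45, Monday to Friday, in the server's local time. The function names are hypothetical.

```python
from datetime import datetime


def field_matches(field, value):
    """True if value satisfies one cron field ('*', 'n', or 'a-b' only)."""
    if field == "*":
        return True
    if "-" in field:
        low, high = field.split("-")
        return int(low) <= value <= int(high)
    return int(field) == value


def cron_matches(expression, moment):
    """True if moment falls on the schedule given by a five-field expression.

    Simplification: all five fields must match. Real cron treats the two
    day fields specially when both are restricted, which is not the case here.
    """
    minute, hour, day, month, weekday = expression.split()
    cron_weekday = moment.isoweekday() % 7  # cron numbering: Sunday=0, Monday=1
    return (field_matches(minute, moment.minute)
            and field_matches(hour, moment.hour)
            and field_matches(day, moment.day)
            and field_matches(month, moment.month)
            and field_matches(weekday, cron_weekday))


DUMP_SCHEDULE = "30 17 * * 1-5"
SYNC_SCHEDULE = "45 17 * * 1-5"
```

The 15-minute gap gives the dump time to finish before the sync starts; nothing on this page says the sync waits for the dump, so a dump that takes longer than that could be uploaded incomplete.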

See Backups and Recovery (Needs updating) for a discussion of how the chf-hydra-backup s3 bucket is then copied to Dubnium and in-house storage.

...