ArchivesSpace (or ASpace for short) is a server whose main purpose is to host a software program also named… ArchivesSpace. The program is “an open source archives information management application for managing and providing web access to archives, manuscripts and digital objects”. The server also hosts a few auxiliary programs that take the output from ArchivesSpace and convert it into various other formats, which are then made available via an Apache web server on the same machine.

...

Location	Format	Number of collections described	Source	Example	Who can see it?
`Shared/P/Othmer Library/Archives/Collections Inventories/Archival Finding Aids and Box Lists`	Word documents	Roughly 270, dates 1997 – present.	This is the original collection description.	`P/Othmer Library/Archives/Collections Inventories/Archival Finding Aids and Box Lists/Labovsky Collection Finding Aid.doc`	Institute staff
ArchivesSpace site	MySQL-backed website	Roughly 120 as of 2022	Entered manually based on the P drive Word files.	https://archives.sciencehistory.org/resources/81#tree::resource_81	Only logged in ArchivesSpace users
ArchivesSpace Apache front end	EAD (XML format)	Roughly 120 as of 2022	Generated nightly from ArchivesSpace database	http://ead.sciencehistory.org/ead/scihist-2012-021.xml	Public
ArchivesSpace Apache front end	HTML	Roughly 120 as of 2022	Generated nightly from ArchivesSpace database. These will be replaced by pages in the PUI in summer 2022.	https://archives.sciencehistory.org/2012-021.html	Public
OPAC	PDF	460; see complete list	Exported manually as PDF from the ArchivesSpace site, then attached to the OPAC record for the collection	https://othmerlib.sciencehistory.org/articles/1065801.15134/1.PDF	Public
https://guides.othmerlibrary.sciencehistory.org/friendly.php?s=CHFArchives	LibGuide	Most collections, categorized by subject.	?	Subject: nuclear chemistry	Technically public, but does not appear to be linked from anywhere.

...

Finding aids are stored as Word documents at Shared/P/Othmer Library/Archives/Collections Inventories/Archival Finding Aids and Box Lists.
Kent enters the data in them, one by one, into ArchivesSpace. He revises them in the process. As of summer 2020 approximately 45 have been entered.
Once they are in ArchivesSpace:
- They are automatically exported, by the nightly job described at https://chemheritage.atlassian.net/wiki/spaces/HDCSD/pages/2151514113/export+archivesspace+xml , to EAD files at http://ead.sciencehistory.org/ead/ .
- They are also converted to HTML. Examples: Wotiz; Simon; Fenn; Carbogel; Brody. There is currently no Web page that lists these HTML files, so you have to know the URL beforehand or be directed to them from e.g. Google or the OPAC.
- Kent also exports them to a PDF, which he then sends to Caroline. These are entered into the OPAC (see e.g. https://othmerlib.sciencehistory.org/articles/1065801.15134/1.PDF ).
  - Note: the PDF has to be manually updated in the OPAC every time the metadata in ArchivesSpace changes.
- In certain cases the OPAC record also points at the HTML file at https://archives.sciencehistory.org/ , which is regenerated nightly and so stays current without manual updates.
- Certain works in the Digital Collections also point to these HTML files.
Finally, the exported EAD files are also ingested by University of Penn Libraries Special Collections and the Center for the History of Science, Technology, and Medicine (CHSTM).
- Penn, in turn, processes these EAD files on a nightly basis and adds them to the Philadelphia Area Archives Research Portal (PAARP), a service funded by PACSCL.
  - Example: http://dla.library.upenn.edu/dla/pacscl/detail.html?id=PACSCL_SCIHIST_2012021USpaphchf
  - A conversation with Holly Mengel, the archivist responsible for the process, reassured us that the only thing required for this export to work is for valid EAD files to be publicly accessible in the directory at https://archives.sciencehistory.org/ead/ . This URL could be changed as long as we give Holly plenty of notice and coordinate with her, which raises the possibility of us posting them to e.g. an S3 bucket.
  - Notably, Holly assures us that the apparatus at PAARP / PACSCL does not link back to archival descriptions hosted on any of our domains.
Likewise, CHSTM ingests these EADs and makes them searchable at its search portal.
- Example https://www.chstm.org/collections/search?text=Carbogel
- Attempts to contact our liaison at CHSTM, Richard Shrake, have failed.
Note that external links to our HTML finding aids are rare and can be disregarded. There should be no need to provide redirects to these URLs when we eliminate them.

...

The current production version of ASpace is 3.0.1 .

Terminal access: ssh -i /path/to/production/pem_file.pem ubuntu@50.16.132.240

The ubuntu user owns all the admin scripts.

The relevant Ansible role is: /roles/archivesspace/ in the ansible-inventory codebase.

...

Configuration for the Apache site is at /etc/apache2/sites-available/000-default.conf. It would be a good idea to spend some time drastically simplifying this configuration.

Main users

Kenton Jaenig
Sarah Newhouse
Patrick Shea

Startup

To start ArchivesSpace (as user ubuntu):
- /opt/archivesspace/archivesspace.sh start
- ~~sudo systemctl start archivesspace. You may need to run this several times (just wait 30 seconds between attempts.)~~
You can troubleshoot startup by looking at the start script invoked above: /opt/archivesspace/archivesspace.sh
There may be a short delay as the server re-indexes data.

...

Only the most recent set is used by Jetty, but the old ones accumulate rapidly if the server is restarted nightly.

A cron job removes obsolete ones nightly (some variation on find /tmp -maxdepth 1 -type d -mtime +20 | grep jetty.*war).
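As a rough illustration of the selection logic in that find/grep pipeline, here is a minimal Python sketch. It is not the script that runs on the server. The function name, the 20-day default and the example directory names are assumptions for illustration, and the age test only approximates find's -mtime +20, which counts whole days. The function only lists candidates and deletes nothing.

```python
import re
import time
from pathlib import Path

# Pattern copied from the grep expression quoted on this page.
JETTY_WAR_PATTERN = re.compile(r"jetty.*war")


def find_stale_jetty_dirs(root="/tmp", max_age_days=20, now=None):
    """List top-level directories under root whose names look like Jetty
    temp directories and whose mtime is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for entry in Path(root).iterdir():
        # Only real directories directly under root (like -maxdepth 1 -type d).
        if entry.is_symlink() or not entry.is_dir():
            continue
        if not JETTY_WAR_PATTERN.search(entry.name):
            continue
        if entry.stat().st_mtime < cutoff:
            stale.append(entry)
    return sorted(stale)
```

Calling find_stale_jetty_dirs() and printing the result gives a dry run; whether and how to delete the listed directories is a separate decision.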

Export

The ArchivesSpace EADs are harvested by:

...

Both institutions harvest the EADs by automatically scraping https://archives.sciencehistory.org/ead/ . Once harvested, the EADs are added to their aggregated Philly-area EAD search interfaces.

The main export files are located at /home/ubuntu/archivesspace_scripts . They are kept under version control at https://github.com/sciencehistory/archivesspace_scripts .

Important files:

...

complete_export.sh

...

Runs the nightly export (called by cron every night at 9 PM). This calls as_export.py and generate.sh below.

...

local_settings.cfg

...

Settings

...

as_export.py

...

Extracts XML from ArchivesSpace and saves a series of EADs into /exports/data/ead/*/*.xml .

It exports EADs that contain links to the actual digital objects.
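For orientation, the ArchivesSpace backend API has an EAD export endpoint with an include_daos parameter that controls whether digital object links are included. The sketch below only builds such a request URL; it may not match what as_export.py actually does. The backend address, repository ID and resource ID are assumptions for illustration, and a real request would also need a session token sent in the X-ArchivesSpace-Session header.

```python
from urllib.parse import urlencode

# Assumption: the backend listens on the ArchivesSpace default port.
BACKEND_URL = "http://localhost:8089"


def ead_export_url(repo_id, resource_id, include_daos=True, backend_url=BACKEND_URL):
    """Build the backend URL for the EAD export of one resource."""
    query = urlencode({
        # "true" asks for links to digital objects in the exported EAD.
        "include_daos": str(include_daos).lower(),
        "include_unpublished": "false",
    })
    return (
        f"{backend_url}/repositories/{repo_id}"
        f"/resource_descriptions/{resource_id}.xml?{query}"
    )


# Hypothetical IDs, for illustration only.
print(ead_export_url(2, 81))
```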

...

generate.sh

...

Transforms the EADs in /exports/data/ead into HTML and saves them into /var/www/html. See for instance https://archives.sciencehistory.org/beckman .

It relies on files (stylesheets, transformations) in

finding-aid-files
fa-files

...

xml-validator.sh

...

Checks that the publicly accessible files in /var/www/html/ead/ are valid.

Once processed by generate.sh, the XML files are publicly accessible at https://archives.sciencehistory.org/ead/ via an Apache web server.
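To give a sense of the kind of check involved, here is a minimal Python sketch that reports XML files that fail to parse. It is not a port of xml-validator.sh, whose internals are not described on this page, and it tests well-formedness only, not validity against the EAD schema. The function name is an assumption for illustration.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_malformed_xml(directory):
    """Return (path, error message) pairs for .xml files that fail to parse."""
    problems = []
    for path in sorted(Path(directory).glob("*.xml")):
        try:
            ET.parse(str(path))
        except ET.ParseError as err:
            # ParseError carries the line and column of the first problem.
            problems.append((path, str(err)))
    return problems
```

Run against a directory such as /var/www/html/ead, an empty result means every file parsed; any entries point at files worth inspecting before the next harvest.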

Details about the as_export.py script:

...

This code was adapted from https://github.com/RockefellerArchiveCenter/as_export

...

http://ead.sciencehistory.org/.

Building the server

The server is not yet fully ansible-ized.

...

Place the Mysql database in /backup

mysql-backup.sh

Dumps the MySQL database to /backup/aspace-backup.sql.
This script is run from the crontab of user ubuntu: 30 17 * * 1-5 /home/ubuntu/archivesspace_scripts/mysql-backup.sh

Sync /backup to an s3 bucket

s3-backup.sh

Runs an aws s3 sync command to place the contents of /backup at https://s3.console.aws.amazon.com/s3/object/chf-hydra-backup/Aspace/aspace-backup.sql?region=us-west-2&tab=overview.

This script is run from the crontab of user ubuntu: 45 17 * * 1-5 /home/ubuntu/archivesspace_scripts/s3-backup.sh
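The two crontab entries mean that the dump runs at 17:30 and the sync at 17:45 server time, Monday through Friday, so the sync picks up the dump made fifteen minutes earlier. The sketch below is a deliberately simplified matcher that spells out how the five fields are read. It handles only *, single numbers and ranges; it ignores lists, steps, names and cron's special rule for combining day-of-month with day-of-week. The function names are assumptions for illustration.

```python
from datetime import datetime


def field_matches(field, value):
    """Match one cron field that is '*', a number, or a range like '1-5'."""
    if field == "*":
        return True
    if "-" in field:
        low, high = (int(part) for part in field.split("-"))
        return low <= value <= high
    return int(field) == value


def cron_matches(expression, moment):
    """Return True if moment falls on a simple five-field cron schedule."""
    minute, hour, day, month, weekday = expression.split()
    # cron counts Sunday as 0; datetime.weekday() counts Monday as 0.
    cron_weekday = (moment.weekday() + 1) % 7
    return (
        field_matches(minute, moment.minute)
        and field_matches(hour, moment.hour)
        and field_matches(day, moment.day)
        and field_matches(month, moment.month)
        and field_matches(weekday, cron_weekday)
    )


friday = datetime(2022, 6, 3, 17, 30)
saturday = datetime(2022, 6, 4, 17, 30)
print(cron_matches("30 17 * * 1-5", friday))    # a weekday at 17:30
print(cron_matches("30 17 * * 1-5", saturday))  # a weekend day at 17:30
```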

See Backups and Recovery (Needs updating) for a discussion of how the chf-hydra-backup s3 bucket is then copied to Dubnium and in-house storage.

...
