We’ve received two grants to digitize the Beckman and Bredig collections respectively; we have another one coming soon (the Dow collection). This is a quick overview of the current state of metadata and information architecture setup in ArchivesSpace and the Digital Collections, to make future conversations about these two websites easier.
Tabular vs. hierarchical metadata
The Digital Collections is primarily a table of works, each with complete metadata. Each work has one or more assets, which are computer files stored in S3.
The metadata in ArchivesSpace is hierarchical, not tabular. Each level of description is understood to inherit all the metadata from levels above it, except where that metadata is explicitly overridden.
ArchivesSpace’s primary function is to provide intellectual control over archival collections, but it also keeps track of the physical locations of folders, boxes and so on. (Examples: the inventory of a physical box; the inventory of an archival (intellectual) series.)
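To make the inheritance rule concrete, here is a minimal Python sketch. It is not the ArchivesSpace schema: the Level class, the field names (repository, language, extent) and their values are all hypothetical, chosen only to show how a lower level picks up metadata from the levels above it unless it overrides a value.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Level:
    """One level of description (hypothetical model, not the ArchivesSpace schema)."""
    name: str
    own: dict = field(default_factory=dict)  # metadata stated explicitly at this level
    parent: Optional["Level"] = None

    def effective(self) -> dict:
        """Metadata inherited from all levels above, overridden by this level's own values."""
        inherited = self.parent.effective() if self.parent else {}
        return {**inherited, **self.own}


# Hypothetical values, for illustration only.
collection = Level("collection", {"repository": "Example Archive", "language": "English"})
series = Level("series", {"extent": "10 boxes"}, parent=collection)
file = Level("file", {"language": "German"}, parent=series)

print(file.effective())
```

In a tabular model like the Digital Collections, by contrast, each row would simply carry the full set of values itself.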
Links between the two sites
The Digital Collections has pointers to both physical locations (boxes and folders, e.g.) and more abstract archival entities like series and sub-series (and of course collections).
ArchivesSpace has many pointers to individual works in the Digital Collections, but these are not currently accessible to the public.
Let’s take a look at a letter from the Beckman collection as an example.
Digital collections:
In the digital collections, the letter takes the form of a work: https://digital.sciencehistory.org/admin/works/wm117p03j
To place the work in the context of the collection’s archival arrangement, the D.C. gives you the following clues:
Collection
The letter is part of a collection, the Beckman Collection.
💡 A work can be part of more than one collection, but a collection cannot be part of another collection.
The work is part of a sub-series and a series within the Beckman collection.
Series arrangement
In the digital collections, series and sub-series arrangement is stored as an unordered sequence of strings attached to the work. In this case we have:
Series Arrangement
Series I. Arnold O. Beckman Files
Sub-series 1. Correspondence
Notes re: series arrangement:
Each string concatenates the type of metadata (Series or Sub-series), the identifier (e.g. I.), and the title of the grouping (e.g. Arnold O. Beckman Files). These are stored separately in ArchivesSpace.
The D.C. does not encode the fact that the series contains the sub-series, rather than vice versa.
Likewise, there’s no data in the digital collection that would allow you to order the sub-series within a given series, or the series within a collection.
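As an illustration of what can and cannot be recovered from these strings, the following Python sketch splits one back into its three parts. The pattern and the names Arrangement and parse_arrangement are assumptions made for this example; they are based only on the two strings shown above and are not code from the Digital Collections application.

```python
import re
from typing import NamedTuple


class Arrangement(NamedTuple):
    kind: str        # "Series" or "Sub-series"
    identifier: str  # e.g. "I" or "1", without the trailing period
    title: str       # e.g. "Arnold O. Beckman Files"


# Assumed shape: "<kind> <identifier>. <title>", inferred from the two examples above.
PATTERN = re.compile(r"^(Series|Sub-series)\s+([^.\s]+)\.\s+(.+)$")


def parse_arrangement(s: str) -> Arrangement:
    """Split a series-arrangement string into kind, identifier and title."""
    m = PATTERN.match(s.strip())
    if m is None:
        raise ValueError(f"unrecognized arrangement string: {s!r}")
    return Arrangement(*m.groups())


for s in ["Series I. Arnold O. Beckman Files", "Sub-series 1. Correspondence"]:
    print(parse_arrangement(s))
```

Parsing recovers the type, identifier and title, but nothing in either string says that the sub-series belongs to that particular series, or where it falls among its siblings.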
ASpace Reference Number
The work also has an ASpace Reference Number: 118f36c4c5a373e4b4a81253ebc85fae.
This ASpace Reference Number can tie a work or collection in the D.C. to any level of archival arrangement in ArchivesSpace, as long as that level is an “archival object”. Practically speaking, this means works or collections in the D.C. can be associated with “file”s, sub-series, or series in ArchivesSpace.
In this case the Reference Number refers to a file in ArchivesSpace (see below).
The letter’s physical location within the collection is also recorded in metadata, in the form of a Work::PhysicalContainer.
Physical Location
A work’s physical location is encoded as a set of seven key-value pairs (the keys being box; folder; page; part; volume; shelfmark; and reel). Archival records in the D.C. so far have only used box, folder, and reel. (The others are in use to catalog rare books and museum items.)
For this work, all the keys except box and folder are blank. box is the string 1; folder is the string 29.
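The seven keys can be pictured as a small record type. The sketch below is a Python stand-in written for this overview, not the application's actual implementation; the class layout and the filled helper are assumptions. The values are strings, as described above.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class PhysicalContainer:
    """Illustrative stand-in for a work's physical location; every key is optional."""
    box: Optional[str] = None
    folder: Optional[str] = None
    page: Optional[str] = None
    part: Optional[str] = None
    volume: Optional[str] = None
    shelfmark: Optional[str] = None
    reel: Optional[str] = None

    def filled(self) -> dict:
        """Return only the keys that have a non-blank value."""
        return {key: value for key, value in asdict(self).items() if value}


# The Beckman letter: box "1", folder "29", everything else blank.
letter_location = PhysicalContainer(box="1", folder="29")
print(letter_location.filled())
```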
ArchivesSpace:
Digital object
In ArchivesSpace, the letter takes the form of a digital object.
ArchivesSpace maintains a distinction between a digital object and an archival object.
Like all digital objects, it has been unpublished since 2022.
Title is the same as the D.C. work title.
Metadata contains a link to the work in the digital collections. (The work does not have a link back to the digital object, but the work does have a link to the file the digital object is part of.)
URL: https://sciencehistory.libraryhost.com/admin/digital_objects/247#tree::digital_object_247
Digital objects were not part of the earliest versions of ArchivesSpace (item-level description is uncommon in archival practice as it’s unsustainable at scale).
The digital object is part of a file.
Item
For completeness' sake, note that we do (rarely) describe individual physical objects (e.g. this film reel) as items.
Items are archival objects and thus have Ref IDs. (Technically any archival object can contain another archival object, but in the case of an item this would be very unlikely.)
File
In this case, the file is a digital surrogate for a manila folder (folder 29 in box 1) which contains the letter.
⚠️ Nothing to do with a file in the operating system sense.
Title: Los Angeles Chamber of Commerce - Air Pollution Committee, 1951-1954
URL: https://archives.sciencehistory.org/repositories/3/archival_objects/10615
Is an archival object, as opposed to a digital object.
In theory, any archival object can contain another archival object.
In practice, at least for the Beckman collection, the file is the lowest level of archival description above the digital object.
Items, files, sub-series and series are all considered archival objects. Digital objects and collections are not.
All archival objects have a unique ID called a Ref ID.
Ref ID: 118f36c4c5a373e4b4a81253ebc85fae
Sub-series
Title: Sub-series 1. Correspondence
URL: https://archives.sciencehistory.org/repositories/3/archival_objects/5
Is an archival object.
Ref ID: 66a590971707f99df33fc42be0d0c909
Series
Title: Series I. Arnold O. Beckman Files, 1918-2009, undated
URL: https://archives.sciencehistory.org/repositories/3/archival_objects/1
Is an archival object.
Ref ID: 5575406909262fd92cf89083a49f855b
Collection
Title: Beckman Historical Collection
URL: https://archives.sciencehistory.org/repositories/3/resources/1
The collection is not an archival object, but a resource.
Hence, it does not (and cannot) have a Ref ID.
Has an accession number: 2012-002
Accession numbers are arbitrary strings and might contain digits, spaces, letters and punctuation.
Has an internal ID, like all resources: in this case the integer 1, which is at the end of its URL.
We use the ID as part of the file name at the EAD export page.
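To summarize the ArchivesSpace side, the Python sketch below models the chain from the file up to the collection, using the Ref IDs, accession number and resource ID quoted above. The class names, field names and the path_to_collection helper are assumptions made for illustration; this is not the ArchivesSpace data model or API.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Resource:
    """A collection: has an integer ID and an accession number, but no Ref ID."""
    resource_id: int
    accession_number: str


@dataclass
class ArchivalObject:
    """An item, file, sub-series or series: always has a Ref ID."""
    level: str
    ref_id: str
    parent: Union["ArchivalObject", Resource]


def path_to_collection(obj: ArchivalObject) -> List[str]:
    """Walk upward from an archival object to the resource that contains it."""
    levels = []
    node: Union[ArchivalObject, Resource] = obj
    while isinstance(node, ArchivalObject):
        levels.append(node.level)
        node = node.parent
    levels.append("resource")
    return levels


beckman = Resource(resource_id=1, accession_number="2012-002")
series = ArchivalObject("series", "5575406909262fd92cf89083a49f855b", beckman)
sub_series = ArchivalObject("sub-series", "66a590971707f99df33fc42be0d0c909", series)
file = ArchivalObject("file", "118f36c4c5a373e4b4a81253ebc85fae", sub_series)

# Index by Ref ID, the way a work's ASpace Reference Number would be resolved.
by_ref_id = {o.ref_id: o for o in (series, sub_series, file)}
print(path_to_collection(by_ref_id["118f36c4c5a373e4b4a81253ebc85fae"]))
```

Because the resource has no Ref ID, it cannot appear in the by_ref_id index; it can only be reached by walking up from an archival object or by its own integer ID.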