Presentation to the PREMIS Implementation Fair at iPRES 2016, about how PREMIS in METS metadata is implemented in the Archivematica digital preservation system.
PREMIS in METS in Archivematica
1. PREMIS in METS in Archivematica
Sarah Romkey, Artefactual Systems
PREMIS Implementation Fair
iPRES 2016
2. Archivematica in brief
● Archivematica is a digital preservation system which
bundles together many open-source tools, and
handshakes with others
● Archivematica is not a repository or a place to store
digital objects
● But it is a place to process and package digital objects and
collect technical/rights metadata (in PREMIS, of course!)
● Archivematica creates AIPs and DIPs and stores
metadata in a PREMIS in METS file in the AIP and DIP.
● Archivematica is free and open-source.
3. “The METS file is the AIP.”
- Marco Klindt, Zuse Institute Berlin, speaking at the recent Archivematica Camp
5. PREMIS agents in Archivematica
● Institution
● Logged-in user
● Digital preservation system
● Soon: external agents (more later)
6. PREMIS in METS
<dmdSec> (descriptive metadata)
<amdSec> (administrative metadata)
    <techMD>
        PREMIS: object
    <digiprovMD>
        PREMIS: events
        PREMIS: agents
    <rightsMD>
        PREMIS: rights
<fileSec> (a list of the files and their roles and relationships)
<structMap> (a representation of the physical structure of the objects)
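As a rough illustration of the layout above, the following Python sketch builds an empty METS skeleton with each PREMIS entity wrapped in mdWrap/xmlData. It is not Archivematica's own code: the helper names, the section IDs, and the choice of the PREMIS 2.x namespace are assumptions, and the result is a skeleton only, not a schema-valid METS file. The METS schema expects rightsMD before digiprovMD inside an amdSec, so the sketch emits them in that order.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
PREMIS_NS = "info:lc/xmlns/premis-v2"  # assumption: PREMIS 2.x

ET.register_namespace("mets", METS_NS)
ET.register_namespace("premis", PREMIS_NS)


def mets(tag):
    """Return a namespace-qualified METS tag name."""
    return f"{{{METS_NS}}}{tag}"


def wrap_premis(parent, section, section_id, mdtype, premis_tag):
    """Add a METS metadata section holding one empty PREMIS entity."""
    sec = ET.SubElement(parent, mets(section), ID=section_id)
    md_wrap = ET.SubElement(sec, mets("mdWrap"), MDTYPE=mdtype)
    xml_data = ET.SubElement(md_wrap, mets("xmlData"))
    return ET.SubElement(xml_data, f"{{{PREMIS_NS}}}{premis_tag}")


def build_skeleton():
    """Build the top-level sections; all IDs here are hypothetical."""
    root = ET.Element(mets("mets"))
    ET.SubElement(root, mets("dmdSec"), ID="dmdSec_1")
    amd = ET.SubElement(root, mets("amdSec"), ID="amdSec_1")
    wrap_premis(amd, "techMD", "techMD_1", "PREMIS:OBJECT", "object")
    wrap_premis(amd, "rightsMD", "rightsMD_1", "PREMIS:RIGHTS", "rights")
    wrap_premis(amd, "digiprovMD", "digiprovMD_1", "PREMIS:EVENT", "event")
    wrap_premis(amd, "digiprovMD", "digiprovMD_2", "PREMIS:AGENT", "agent")
    ET.SubElement(root, mets("fileSec"))
    ET.SubElement(root, mets("structMap"), TYPE="physical")
    return root


if __name__ == "__main__":
    print(ET.tostring(build_skeleton(), encoding="unicode"))
```

In a real AIP each file would get its own amdSec, and the fileSec and structMap would point at those sections by ID.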
7. Development in Archivematica
● Institutions propose development projects (usually to
Artefactual Systems)
● New features are rolled into the software when released.
● Most new features include at least a consideration of
how they will affect the PREMIS and/or METS
8. Example 1: AIP re-ingest
● To be released in Archivematica 1.6 (some
implementation in 1.5)
● Ability to take an AIP in archival storage and send it back
to Archivematica to re-do preservation tasks
● PREMIS and METS are versioned to reflect the
re-ingestion event as well as any updated tool output.
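One way to picture this versioning is the minimal Python sketch below. It assumes, without confirmation from the slides, that older technical metadata is kept and marked "superseded" while the new tool output becomes "current" (mirroring the optional STATUS attribute on METS metadata sections), and that the re-ingest is recorded as an event of type "reingestion". The class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TechMD:
    """One version of the technical metadata for a file."""
    id: str
    tool_output: str
    status: str = "current"


@dataclass
class FileRecord:
    """A file in the AIP with its metadata versions and events."""
    name: str
    tech_mds: list = field(default_factory=list)
    events: list = field(default_factory=list)


def reingest(record, new_tool_output, when):
    """Keep earlier metadata as superseded and append a new current version."""
    for md in record.tech_mds:
        md.status = "superseded"
    new_id = f"techMD_{len(record.tech_mds) + 1}"
    record.tech_mds.append(TechMD(new_id, new_tool_output))
    # Assumption: the event type label used for re-ingest.
    record.events.append({"eventType": "reingestion", "eventDateTime": when})
    return record


if __name__ == "__main__":
    rec = FileRecord("image.tif", [TechMD("techMD_1", "first characterization")])
    reingest(rec, "updated characterization", "2016-10-03T12:00:00")
    for md in rec.tech_mds:
        print(md.id, md.status)
```

Nothing is deleted on re-ingest in this sketch, so the history of earlier tool output stays readable alongside the new version.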
11. Example 2: New tool integration
● In an upcoming version of Archivematica, we’ll be
integrating MediaConch as a validation tool.
● New tools mean new tool output, which means new
PREMIS!
● MediaConch will facilitate two kinds of validation events:
○ Validation of the files against the file specification
○ Validation against a local policy
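The two kinds of validation could each be recorded as a separate PREMIS event. The Python sketch below builds two PREMIS 2.x event elements that differ only in their eventDetail and outcome. The detail strings, outcomes, and helper names are illustrative assumptions, not MediaConch output or Archivematica's actual wording.

```python
import uuid
import xml.etree.ElementTree as ET

PREMIS_NS = "info:lc/xmlns/premis-v2"  # assumption: PREMIS 2.x
ET.register_namespace("premis", PREMIS_NS)


def p(tag):
    """Return a namespace-qualified PREMIS tag name."""
    return f"{{{PREMIS_NS}}}{tag}"


def validation_event(detail, outcome, when):
    """Build one PREMIS validation event with a fresh UUID identifier."""
    event = ET.Element(p("event"))
    ident = ET.SubElement(event, p("eventIdentifier"))
    ET.SubElement(ident, p("eventIdentifierType")).text = "UUID"
    ET.SubElement(ident, p("eventIdentifierValue")).text = str(uuid.uuid4())
    ET.SubElement(event, p("eventType")).text = "validation"
    ET.SubElement(event, p("eventDateTime")).text = when
    ET.SubElement(event, p("eventDetail")).text = detail
    info = ET.SubElement(event, p("eventOutcomeInformation"))
    ET.SubElement(info, p("eventOutcome")).text = outcome
    return event


# Hypothetical detail and outcome strings for the two validation kinds.
spec_check = validation_event(
    "checked against the file format specification", "pass", "2016-10-03T12:00:00"
)
policy_check = validation_event(
    "checked against a local policy", "fail", "2016-10-03T12:00:05"
)

if __name__ == "__main__":
    for ev in (spec_check, policy_check):
        print(ET.tostring(ev, encoding="unicode"))
```

Keeping the two checks as separate events lets a file pass the format specification while failing a stricter local policy, with both results preserved.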
14. Example 3: Dataverse as an agent
● We developed a proof of concept integration with the
research data tool Dataverse
● Dataverse creates derivative files out of certain file types
for purposes of analysis and display. The study can then
be exported as a zipped package, which can be ingested
into Archivematica
● We wanted to reflect the fact that some of the files were
created by Dataverse as a software agent.
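A minimal sketch of how such an external software agent might be expressed follows, using PREMIS 2.x element names. The identifier type and value, the "derivation" event type, and the role label are hypothetical placeholders, and the event is deliberately bare (no identifier or date) to keep the focus on the agent link.

```python
import xml.etree.ElementTree as ET

PREMIS_NS = "info:lc/xmlns/premis-v2"  # assumption: PREMIS 2.x
ET.register_namespace("premis", PREMIS_NS)


def p(tag):
    """Return a namespace-qualified PREMIS tag name."""
    return f"{{{PREMIS_NS}}}{tag}"


def software_agent(id_type, id_value, name):
    """Build a PREMIS agent whose agentType is 'software'."""
    agent = ET.Element(p("agent"))
    ident = ET.SubElement(agent, p("agentIdentifier"))
    ET.SubElement(ident, p("agentIdentifierType")).text = id_type
    ET.SubElement(ident, p("agentIdentifierValue")).text = id_value
    ET.SubElement(agent, p("agentName")).text = name
    ET.SubElement(agent, p("agentType")).text = "software"
    return agent


def link_agent(event, agent, role):
    """Point an event at an agent by copying the agent's identifier."""
    ident = agent.find(p("agentIdentifier"))
    link = ET.SubElement(event, p("linkingAgentIdentifier"))
    ET.SubElement(link, p("linkingAgentIdentifierType")).text = ident.findtext(
        p("agentIdentifierType")
    )
    ET.SubElement(link, p("linkingAgentIdentifierValue")).text = ident.findtext(
        p("agentIdentifierValue")
    )
    ET.SubElement(link, p("linkingAgentRole")).text = role
    return link


# Hypothetical identifier values for the external agent.
dataverse = software_agent("local", "dataverse-instance-1", "Dataverse")

# Bare event standing in for the creation of a derivative file.
derivation = ET.Element(p("event"))
ET.SubElement(derivation, p("eventType")).text = "derivation"
link_agent(derivation, dataverse, "executing program")

if __name__ == "__main__":
    print(ET.tostring(dataverse, encoding="unicode"))
    print(ET.tostring(derivation, encoding="unicode"))
```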
16. Example 4: Improving A/V handling
● Currently, Archivematica models objects down to the
level of the file.
● With the introduction of MediaConch, we’re interested
in modeling down to bitstreams to show the different
tracks/streams within an A/V file.
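To make the idea concrete, here is a small Python sketch of a file-level object that carries one bitstream-level object per track. PREMIS distinguishes "file" and "bitstream" object categories, but the class, the identifier scheme, and the example formats are assumptions for illustration rather than a planned Archivematica design.

```python
from dataclasses import dataclass, field


@dataclass
class PremisObject:
    """A simplified PREMIS object: a file, or a bitstream inside a file."""
    identifier: str
    category: str  # "file" or "bitstream"
    format_name: str
    parts: list = field(default_factory=list)


def model_av_file(file_id, container_format, tracks):
    """Model an A/V file with one bitstream object per (kind, codec) track."""
    file_obj = PremisObject(file_id, "file", container_format)
    for index, (kind, codec) in enumerate(tracks, start=1):
        # Hypothetical identifier scheme: file id plus track kind and position.
        track_id = f"{file_id}#{kind}-{index}"
        file_obj.parts.append(PremisObject(track_id, "bitstream", codec))
    return file_obj


if __name__ == "__main__":
    video = model_av_file(
        "interview.mkv", "Matroska", [("video", "FFV1"), ("audio", "PCM")]
    )
    for part in video.parts:
        print(part.identifier, part.category, part.format_name)
```

With tracks modeled separately, a validation event could be attached to a single stream rather than to the whole container file.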