Digital Archiving and Knowledge Organization
Staccioli Digital Archive
The Staccioli Digital Archive is a major cross-departmental, integrated project that demonstrated the ability of the DH Lab group to deliver a specific project across departmental boundaries. The goal was to produce a comprehensive catalog of digitized material and semantic metadata on the life and career of the contemporary Italian sculptor Mauro Staccioli, complementing an exhibition of the artist’s work hosted by the Institute in 2023/24. The project drew on the resources of the Photographic Collection, the Digital Library, the Digitization Group, and the Institute Photographer, and on the Knowledge Graph expertise of the DH Lab core, in liaison with the exhibition curator. Undertaken under the direction of Tristan Weddigen, it produced the digital counterpart to the exhibition in record time. Underpinned by the shared use of open standards and systems, the outcome attests to a fruitful collaboration between art history research institutions, individual scholars, archives, and foundations.
On the one hand, the resulting application is primed to integrate new data from the remaining part of the Staccioli physical archive almost seamlessly; on the other, the same workflow and infrastructure are easily reusable, owing to their compliance with the open standards of which the group has extensive knowledge (see also: Archivio Staccioli).
Design and implementation: Alessandro Adamou, Klaus Werner, Pietro Liuzzo
{KG}2: The Knowledge Graph for Art History
Progress has been made in consolidating the Linked Data infrastructure of the Bibliotheca Hertziana. The project now encompasses seven datasets amounting to approximately 11 million individual statements, or triples, in the standard Resource Description Framework (RDF) format, covering established legacy data sources as well as sources being actively updated. Separate data integration strategies are in place to cover one-time extract-transform-load (ETL) processes (e.g., for Zuccaro) as well as nightly updates and real-time integration of external RDF stores (e.g., Mapping Sacred Spaces), using cutting-edge software libraries for data transformation like X3ML and SPARQL-Anything. Together, the data and infrastructure form what is now called {KG}2, as in “knowledge graph” and “Kunstgeschichte.” It was successfully employed to power the Mauro Staccioli Digital Archive and exhibition, which continue to draw their data dynamically from {KG}2.
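To illustrate what a statement, or triple, is and how a one-time ETL run can feed a graph, the following is a minimal sketch in plain Python. It does not use X3ML or SPARQL-Anything and does not reflect the actual {KG}2 pipeline; the namespace, the predicate names, and the sample records are all hypothetical.

```python
# Hypothetical namespace and predicates; the actual {KG}2 vocabulary is not given in the text.
EX = "https://example.org/kg/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def record_to_triples(record):
    """Transform step: turn one legacy record (a dict) into (subject, predicate, object) triples."""
    subject = EX + "work/" + str(record["id"])
    triples = {
        (subject, RDF_TYPE, EX + "Artwork"),
        (subject, RDFS_LABEL, record["title"]),
    }
    if record.get("creator"):
        triples.add((subject, EX + "createdBy", EX + "person/" + record["creator"]))
    return triples


def load(graph, records):
    """Load step: merge triples into the graph; set semantics make reloading idempotent."""
    for record in records:
        graph |= record_to_triples(record)
    return graph


def match(graph, s=None, p=None, o=None):
    """Return the triples matching a pattern, where None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Made-up sample records standing in for the output of the extract step.
legacy_records = [
    {"id": 1, "title": "Sample work A", "creator": "artist-1"},
    {"id": 2, "title": "Sample work B", "creator": None},
]

graph = load(set(), legacy_records)
graph = load(graph, legacy_records)  # a second run adds no duplicate statements
print(len(graph))
for triple in match(graph, p=EX + "createdBy"):
    print(triple)
```

Because a graph is a set of statements, repeating a load leaves it unchanged, which is one reason the same representation can serve both one-time imports and recurring updates.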
Following the experience with the Staccioli archive (see above), the DH Lab group will continue to work as a team on a dedicated task of cross-departmental interest. Because of the pivotal role played by the Knowledge Graph in the Staccioli project and the further development of similar resources for the projects Mapping Sacred Spaces, Magnetic Margins, and Pharos, the group’s short-term plan is to focus on developing this resource further and making it part of existing services in production. The improved {KG}2 will sit atop the MPCDF data infrastructure (see "Transfer of Data to Max Planck Servers", below), both as a stand-alone service including data from previous projects, and as an aggregator of the information that is being continuously produced by various running workflows and tools, thus making it the stable basis of any further collaborative research project and public service. Initial applications will feature the involvement of the Photographic Collection and the Digital Library in order to seamlessly integrate their respective catalogs.
Lead: Alessandro Adamou
Support: Pietro Liuzzo, Klaus Werner
Resolver
With the proliferation of online resources from all the departments, it has become increasingly important to assign persistent identifiers to many types of resources in order to improve their citability. The final choice fell upon the ePIC framework, which is maintained by the Gesellschaft für wissenschaftliche Datenverarbeitung Göttingen (GWDG). To improve the discoverability of digital resources and of the connections between them, such as those linking library catalog records to the digitizations and transcriptions of the corresponding books, the DH Lab set out to develop a Resolver service. This service will store not only the identifier of each resource but also the connections between resources. The Staccioli project, which itself adopts ePICs, also laid the foundation for defining a set of requirements for the Resolver service. These requirements will allow the service to consistently retrieve aggregated identifiers for existing resources from catalogs and research databases, and to centralize the acquisition and administration of persistent identifiers in a coherent way. The implementation will also help resolve the “identity crisis” of resources, such as artists or artworks that are referenced by multiple projects, each with its own internal identifier. This paves the way for an advanced citation model and aggregation system: users of resources published online by the Institute will be able to reliably cite web-based resources using stable identifiers whose management does not need to be replicated by each project and department.
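The three functions described above (resolving identifiers, storing connections between resources, and reconciling project-internal identifiers) can be sketched as a small in-memory model. This is an illustration under stated assumptions, not the design of the Resolver service: the class, its method names, the relation name, and all identifiers and URLs are hypothetical, and the sketch does not call the ePIC API.

```python
from collections import defaultdict


class Resolver:
    """Toy in-memory resolver; class and method names are hypothetical."""

    def __init__(self):
        self._targets = {}              # persistent identifier -> current URL
        self._links = defaultdict(set)  # persistent identifier -> {(relation, identifier)}
        self._local_ids = {}            # (project, local id) -> persistent identifier

    def register(self, pid, url):
        """Record, or update, the location a persistent identifier points to."""
        self._targets[pid] = url

    def resolve(self, pid):
        """Return the current URL for an identifier; raises KeyError if unknown."""
        return self._targets[pid]

    def link(self, source, relation, target):
        """Store a typed connection between two registered identifiers."""
        for pid in (source, target):
            if pid not in self._targets:
                raise KeyError(pid)
        self._links[source].add((relation, target))

    def related(self, pid, relation=None):
        """List identifiers connected to pid, optionally filtered by relation."""
        return sorted(
            target for rel, target in self._links.get(pid, ())
            if relation is None or rel == relation
        )

    def map_local_id(self, project, local_id, pid):
        """Attach a project-internal identifier to a shared persistent identifier."""
        if pid not in self._targets:
            raise KeyError(pid)
        self._local_ids[(project, local_id)] = pid

    def lookup(self, project, local_id):
        """Return the persistent identifier behind a project-internal one."""
        return self._local_ids[(project, local_id)]


# Placeholder identifiers and URLs, invented for this example.
resolver = Resolver()
resolver.register("pid:example/catalog-1", "https://example.org/catalog/1")
resolver.register("pid:example/scan-1", "https://example.org/scans/1")
resolver.register("pid:example/person-1", "https://example.org/persons/1")
resolver.link("pid:example/catalog-1", "hasDigitization", "pid:example/scan-1")
resolver.map_local_id("project-a", "artist/17", "pid:example/person-1")
resolver.map_local_id("project-b", "P-0042", "pid:example/person-1")

print(resolver.related("pid:example/catalog-1", "hasDigitization"))
print(resolver.lookup("project-a", "artist/17") == resolver.lookup("project-b", "P-0042"))
```

In this model, two projects that refer to the same artist under different internal identifiers both map to a single persistent identifier, which is the kind of reconciliation the "identity crisis" remark points to.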
Coordination: Elisa Bastianello, Pietro Liuzzo
Development: Chris Tomlinson
EasyDB
EasyDB is a digital slide collection software developed and hosted by the company Programmfabrik in Berlin. A license was acquired by the BHMPI in 2019. Its original data structure was updated to match the SARI (Swiss Art Research Infrastructure) data model. The database instance contains “Special Collections” of art and architecture photographs. These comprise photographs from the Department Weddigen of artistic objects and buildings in Argentina, Bolivia, Brazil, Colombia, Mexico, Peru, and the Philippines, as well as image data for two projects of the Department Michalsky, “Conques” and “13th-century Reliquaries”. EasyDB is also intended to serve as an image data repository for younger fellows of the Institute, although it has not yet been used in this capacity.
Responsible adviser: Martin Raspe
Mapping Sacred Spaces - Digital Archive
In a cross-departmental collaboration with the Department Michalsky, the DH Lab has provided continued support for the development of the digital cataloging platform for the “Mapping Sacred Spaces” project, taking over direct development of the platform in the summer of 2024.
The data produced by the project, which cover liturgical furnishings in Southern Italy from the 11th to the 14th centuries, are implemented as Linked Open Data based on version 6 of the CIDOC-CRM ontology. The data are aligned with external datasets and authorities such as the Getty AAT, GeoNames, OpenStreetMap, and Zotero, with the goal of aligning the dataset with those present in {KG}2 (see above), the knowledge graph for art history in development at the BHMPI. An experimental form of federated data integration with {KG}2 is in place.
The user-facing catalog is implemented in ResearchSpace, which offers display-related functionalities, such as geographical visualization and advanced faceted search, to address a wide range of research questions. It also marks the first occasion at the Institute, and one of the few known occasions overall, where ResearchSpace is deployed to manage the entire lifecycle of data, including entry and update by multiple users. The knowledge patterns, entity templates, and third-party service access implemented to that end are designed to be reusable across projects that use structured material culture data for art history, and are intended to serve as a blueprint for future projects at the BHMPI.
Scientific coordination: Elisabetta Scirocco, Ruggero Longo, Manuela Gianandrea
Technological advisory and implementation: Alessandro Adamou, Polina Voronova
Magnetic Margins
Originating in a collaboration between the Visualizing Science in Media Revolutions research group and the Max Planck Institute for the History of Science, the “Rara Magnetica” project was established as a repository of historical sources on pre-modern research into magnetism. In 2022, the DH Lab joined the project to advise on the ontologies and data schemas used in the Magnetic Margins census, which collates data on individual copies of relevant works and their annotations worldwide. The DH Lab has since stepped in as an active developer of the data visualization platform upon which the census sits.
The census was implemented using ResearchSpace atop a Linked Data set of 10,000 annotations on over 1,100 copies of nine individual publications dating from between 1269 and 1640. The dataset amounts to nearly 400,000 RDF triples, represented primarily using CIDOC-CRM, FRBR, and ontologies aligned to them, thus achieving semantic interoperability with {KG}2 and with knowledge graphs in digital humanities worldwide. It marks the first ResearchSpace implementation of an open research data portal by the Bibliotheca Hertziana.
Scientific coordination: Christoph Sander
Technological advisory and implementation at the BHMPI: Alessandro Adamou
Dataset Delivery Service
The DH Lab has started to offer a data delivery service for external data users. Upon receiving requests for specific research datasets, the group works closely with the prospective data providers to deliver a dataset, of varying size and with the necessary metadata, that responds to the research needs of the data users. Producing these datasets requires in-depth knowledge of the data as well as the ability to understand the requests of users, whose expectations of the data available at the Institute are often not well informed. In this way, the service connects the production and the use of the curated assets, at both medium and large scales. Packages are delivered via Edmond, the open data repository of the MPG (https://edmond.mpg.de/).
Lead: Pietro Liuzzo
Transfer of Data to Max Planck Servers
With the increasing number of projects and services that need to be accessed by users outside the local network, and the massive addition of digitized images coming from the Library and the Photographic Collection, it has become increasingly important to store the content on reliable servers. The Max Planck Computing and Data Facility (MPCDF) worked with the Institute’s IT manager to prepare an environment suitable for all the ongoing projects. This reduced both the risk to the internal infrastructure and the load on the internet connection, while also improving the long-term stability and accessibility of all the resources.
Coordination: Alexander Drummer
Data service leads: Alessandro Adamou, Elisa Bastianello, Pietro Liuzzo, Klaus Werner