The mess is the data

Published: 2026-03-05
Updated: 2026-03-05

My research is part of the Partnership for New Uses of Collections in Art Museums, a collaborative initiative bringing together researchers and museum professionals.

Within that framework, the lab called Ouvroir was created to develop four research projects; you can find their descriptions on our website. The Ouvroir is directed by Emmanuel Château-Dutier and Kristine Tanton; I coordinate the projects. We became a lab, holding events and running other research projects with the CRIHN or Forward Linking (Links).

In this context, I am doing my PhD at the Université de Montréal, in art history and museology, under the supervision of Emmanuel. I am in my third year of a four-year doctoral programme in Québec.

But before all of this, I was an art conservator.

I spent ten years working in contemporary art centres and museums in France, including, for example, being responsible for complex artworks for the opening of Luma Arles, linked to the Luma Westbau in Zurich, which some of you may know.

Today I will be presenting some parts of my research and some of the hypotheses I am currently working with.

The Problem

Exhibitions are the primary medium of contemporary art. They are how art circulates, how meaning is made, how institutions define themselves. And yet once dismantled, they almost entirely disappear. What survives is fragmentary: a press release, a few installation photographs, a catalogue if you’re lucky. That information is sometimes on the institution’s website, occasionally linked to collection records if you’re lucky. But it is scattered, heterogeneous, difficult to consult, every document serving as a silo in its own right.

Usually the documentation exists, but it is trapped. Museums hold rich relational databases internally and keep Excel and Word documents on servers: loan records, installation histories, conservation reports, technical riders. All of that information exists somewhere. But when an exhibition is presented to the public (on a website, in a catalogue, in an archive) it reappears as a flat list: a selection of works, some images, a wall text. The relational complexity of the institution’s own knowledge is, most of the time, stripped away.

Museum exhibition archives usually include:

Inventory of exhibited works
Spatial design & scenography
Mediation texts
Technical information about devices
Loans, contracts, budgets

The challenge of archives

Always dispersed across departments
Archives are fragmented, sometimes contradictory, hard to gather & interpret

The digital paradox

Most documents are now born-digital, which seems easier (everything recorded, stored, classified)
But: emails vanish, software becomes obsolete, files are not migrated or are no longer readable…

Let me give you a concrete example. I worked on an exhibition called Feux pâles (1992). Already at that level, the documentation is unreliable: the layout we have is not the final installation, because curators almost always make last-minute changes during the hang. And the work list printed in the catalogue is not the correct one, as catalogues are often printed before the opening of the show. Now add another layer of complexity. In 2014, a version of that exhibition was recreated, or interpreted, I should say, because the curator Paul Bernard does not like the word reconstitution, even though it was very close to the original. That version, titled L’ombre du jaseur d’après Feux pâles, was presented at the MAMCO in Geneva. Different moment, different space, different institutional context.

Paul had to know precisely which works were shown and how they were arranged. The team consulted the CAPC archives: floor plans, work lists, correspondence, installation photographs. And yet there is no single complete record. The sources contradict each other. You have to cross-reference them, interpret gaps, and make hypotheses. You also have to document your own research process, which sources you consulted, what reasoning you followed, what choices you made and why.
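The cross-referencing described above can be sketched in a few lines. This is only an illustration under stated assumptions: the `Claim` structure, the `contested` function, and the work identifiers are all hypothetical, and the sources listed are placeholders rather than actual CAPC documents.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    work_id: str     # hypothetical identifier for a work
    shown: bool      # does this source say the work was shown?
    source: str      # which document makes the claim
    note: str = ""   # the researcher's own reasoning, kept with the claim


def contested(claims):
    """Return the works for which the sources disagree."""
    verdicts_by_work = defaultdict(set)
    for claim in claims:
        verdicts_by_work[claim.work_id].add(claim.shown)
    return sorted(w for w, verdicts in verdicts_by_work.items() if len(verdicts) > 1)


# Placeholder data: none of these entries come from a real archive.
claims = [
    Claim("work-A", True, "catalogue work list"),
    Claim("work-A", True, "installation photograph"),
    Claim("work-B", True, "catalogue work list", "printed before the opening"),
    Claim("work-B", False, "floor plan"),
]

print(contested(claims))
```

The point of keeping `source` and `note` on every claim is that the contradiction itself stays in the record, together with the reasoning used to interpret it.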

This is also why photographs alone are not sufficient. Displays change during an exhibition: what was photographed on opening day may not reflect what visitors saw a month later. And what matters is not only the final result, it is the curatorial process itself: the successive decisions, the adjustments, the works that were considered and refused. The MAMCO, for instance, included works that had not been loaned for the original 1992 show. That is not an error: it is a curatorial choice. But without a model that can express that distinction, it disappears into the record as if it were simply what happened.
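One way to picture a model that keeps that distinction is to attach to each work the basis on which it appears in a given version. The sketch below is an assumption made for illustration, not an existing ontology: the `Basis` categories, the `Inclusion` structure, and the work identifiers are all invented.

```python
from dataclasses import dataclass
from enum import Enum


class Basis(Enum):
    DOCUMENTED_IN_ORIGINAL = "documented in the original hang"
    CURATORIAL_CHOICE = "added deliberately by the later curator"
    UNVERIFIED = "presence not yet confirmed by a source"


@dataclass(frozen=True)
class Inclusion:
    work_id: str   # hypothetical identifier
    basis: Basis   # why the work is part of this version
    note: str = ""


def deliberate_additions(inclusions):
    """List works whose presence is a recorded choice, not a documentary gap."""
    return [i.work_id for i in inclusions if i.basis is Basis.CURATORIAL_CHOICE]


# Placeholder work list for a later version of an exhibition.
version_2014 = [
    Inclusion("work-A", Basis.DOCUMENTED_IN_ORIGINAL),
    Inclusion("work-B", Basis.UNVERIFIED, "appears in one work list only"),
    Inclusion("work-C", Basis.CURATORIAL_CHOICE, "not loaned in 1992"),
]

print(deliberate_additions(version_2014))
```

With the basis recorded, a later reader can tell an addition made on purpose from a work whose presence simply has not been verified.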

So my research tries to identify what an exhibition as data would be, in the conservation and documentation sense.

What gets lost is not just data. It is the web of relations that made the exhibition what it was: who negotiated what with whom, how the hang evolved from first proposal to opening night, what the artist refused, what the conservator flagged, whose voice shaped the interpretation, what happened. That web is what I want to recover, without accumulating vast new quantities of data.

Why This Is a Conservation Problem

I come to this from conservation-restoration, not from digital humanities, curation or mediation.

Conservation asks: what is this thing, what is its identity, what threatens it, what is it composed of and how do we care for it over time? Applied to exhibitions, which are not only objects but events, networks, processes, those questions become very difficult. An exhibition has no single stable form. It exists across time in multiple versions, iterations, and memories. Its identity is distributed.

Conservation has developed sophisticated methods for paintings, sculptures, and photographs, and has done important work on installation art, performance, and time-based media. It has almost never addressed exhibitions systematically. My research is an attempt to do that, and the tools I need are not the ones conservation has traditionally used. I explore open GLAM+, open-source, low-tech digital tools, but from the standpoint of a museum practitioner.

The Current State: Relational Databases, APIs, and Silos

Let me be concrete about what I have learned over the past few years about the technological landscape in museums.

Most museums manage their collections and exhibitions through collection management systems: TMS, MuseumPlus, Adlib, Axiell, or open-source platforms like CollectiveAccess. These are relational databases. They are well-suited to structured records: an artwork has an ID, an artist, a medium, dimensions, a provenance, and an exhibition history. The relations between tables are defined and effectively queryable.

The problem is that relational databases are designed for stable, well-defined entities. An exhibition is neither. It is a temporary configuration of works, spaces, people, and ideas that changes constantly from conception to opening to touring to institutional memory. A relational schema can record that an exhibition happened, which works were shown, and when. It struggles to capture how the exhibition functioned as a system.
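To make the limit concrete, here is a minimal sketch of what such a schema typically holds. The table and column names are invented for the example and do not reproduce the schema of TMS, MuseumPlus, or any other real system; the data is placeholder data.

```python
import sqlite3

# In-memory database with a deliberately simplified, hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE exhibition (id INTEGER PRIMARY KEY, title TEXT, opened TEXT, closed TEXT);
CREATE TABLE artwork (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE exhibition_artwork (
    exhibition_id INTEGER REFERENCES exhibition(id),
    artwork_id INTEGER REFERENCES artwork(id)
);
""")

conn.execute("INSERT INTO exhibition VALUES (1, 'Example show', '2000-01-01', '2000-03-01')")
conn.executemany("INSERT INTO artwork VALUES (?, ?)", [(1, "Work A"), (2, "Work B")])
conn.executemany("INSERT INTO exhibition_artwork VALUES (?, ?)", [(1, 1), (1, 2)])

# The schema answers "which works were shown in exhibition 1?" very well.
rows = conn.execute("""
    SELECT a.title
    FROM artwork AS a
    JOIN exhibition_artwork AS ea ON ea.artwork_id = a.id
    WHERE ea.exhibition_id = 1
    ORDER BY a.title
""").fetchall()
titles = [row[0] for row in rows]
print(titles)
```

Nothing in these three tables can say that the hang changed after opening day, that a work was considered and refused, or which source the work list was taken from. Those facts would need columns or tables nobody planned for.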

This is what I have come to know as the silo problem. Information exists, but it is locked inside proprietary systems, in formats that cannot be linked, queried, or reused across institutions.

You might say: many museums now have APIs. Isn’t that the solution? The Rijksmuseum API, the Cooper Hewitt API, the Harvard Art Museums API, these are real, well-documented, and genuinely useful. But an API is not a semantic layer. It exposes the relational database in a queryable format, but the underlying data model remains the same. You get cleaner access to the same silos. The information is still structured around objects, not around the events, the exhibitions, that gave those objects meaning in context.

What the Semantic Web Offers — and Where the Gaps Are

[The semantic web refers to the technologies and standards (like RDF, OWL, SKOS) that add machine-readable meaning to web data, enabling computers to understand, link, and reason about information, turning the web into a “web of data” rather than just a “web of documents.” For DH, this means richer data integration, interoperability, and the ability to ask complex questions across diverse cultural, historical, or textual datasets.]

The semantic web proposes a genuinely different logic. Instead of storing information in tables and exposing it only through an API, you publish it as linked data — structured descriptions where every entity has a persistent identifier, every relation is named, and the whole forms a graph that can be connected across institutions and queried from outside.
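The graph logic described above can be illustrated without any library. In this sketch every identifier is a placeholder under `https://example.org/`, the predicates `showed` and `curatedBy` are invented, and the two "museums" are fictional; a real project would use an RDF library and a published vocabulary.

```python
from collections import namedtuple

# A statement in the graph: subject, named relation, object.
Triple = namedtuple("Triple", "subject predicate obj")

EX = "https://example.org/"  # placeholder namespace, not a real dataset

# Two institutions publish statements independently, using shared identifiers.
museum_a = [
    Triple(EX + "exhibition/1", EX + "showed", EX + "work/42"),
    Triple(EX + "exhibition/1", EX + "curatedBy", EX + "person/7"),
]
museum_b = [
    Triple(EX + "exhibition/9", EX + "showed", EX + "work/42"),
]

# Because identifiers are shared, merging the graphs is a simple union.
graph = set(museum_a) | set(museum_b)


def subjects(graph, predicate, obj):
    """Find every subject linked to `obj` by `predicate`."""
    return sorted(t.subject for t in graph if t.predicate == predicate and t.obj == obj)


shown_in = subjects(graph, EX + "showed", EX + "work/42")
print(shown_in)
```

The query crosses institutional boundaries only because both sides refer to the same persistent identifier for the work, which is exactly what a table behind an API does not guarantee.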

This is not theoretical. Some institutions have already built a semantic layer on top of their relational databases. ResearchSpace at the British Museum is an ambitious example: a semantic layer built over the museum’s collection data using CIDOC-CRM, exposing a SPARQL endpoint for research queries.

Other museums (SFMOMA, MoMA, Tate) use Wikidata as the most accessible entry point, making cross-institutional queries possible: which exhibitions showed this artist between 1970 and 1990? Which curators worked together across venues?

[Wikidata builds on the principles and technologies of the Semantic Web, such as RDF (Resource Description Framework), SPARQL (query language), and linked data, to enable data to be queried, linked, and reused across the web. This makes Wikidata not just a database, but a key component of the Semantic Web ecosystem, allowing for interoperability and integration with other datasets and ontologies.]

But for my specific problem, documenting exhibitions as complex heritage objects, let me map the ontological landscape more carefully:

CIDOC-CRM is the backbone of almost everything serious in semantic heritage. It is a generic, event-oriented ontology designed to describe the life of cultural objects. It offers a class under which exhibitions can be modelled (E7 Activity), but that class is very general. It can record that an exhibition happened, who participated, and when. It struggles with contingency, with nuanced participation roles, and with the unstable identity of an itinerant show. Nicola Carboni’s work on modelling exhibitions, which builds on CIDOC-CRM and was tested against two major datasets (the Artl@s BasArt catalogue and the MoMA Exhibition Index), is the most complete attempt so far to model exhibitions as events rather than containers for objects. It covers temporal duration, spatial extension, mereological structure, participation, contingency, and knowledge sources. But it is oriented toward diffusion; it does not address the conservation/documentation perspective.

AAAo (Art and Architectural Argumentation Ontology) (big up to SARI next door) attempts to model what George Bruseker calls “hard historical data”: contested, incomplete, multi-sourced information that cannot be reduced to clean triples without losing its meaning. This is crucial for exhibition documentation, where what matters is often precisely what is uncertain or disputed.

Onto-Exhibit, developed by Nuria Rodríguez-Ortega’s team, addresses the discursive dimension of exhibitions — the curatorial arguments, the interpretive frameworks, the social and rhetorical layers that shape what an exhibition means. It is the closest existing model to what I need on the interpretive side.

The Display ontology, developed at the Ouvroir, fills a gap none of these models address: the spatial documentation of exhibition displays, where works were placed, in what configuration, in relation to which other works.

EODEM (Exhibition Object Data Exchange Model) is a data exchange standard developed by the CIDOC Documentation Standards Working Group. It solves a practical logistical problem: museum teams spend days manually copying object data between collection management systems when processing loans. EODEM, built as a LIDO profile, allows that data to be exported and imported automatically. It is genuinely useful, but it is logistical, not documentary. It moves data about objects between institutions; it says nothing about the exhibition as an event, a memory, or a heritage object in its own right. EODEM illustrates precisely the difference between managing an exhibition and documenting one.
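To show what the generic, event-oriented pattern of CIDOC-CRM looks like in practice, here is a rough sketch that describes an exhibition as an E7 Activity and prints it as N-Triples. The class and property names follow the CIDOC-CRM 7.x labels as I understand them and should be checked against the official RDFS; the `https://example.org/` identifiers and the `exhibition_triples` helper are assumptions made for the example.

```python
CRM = "http://www.cidoc-crm.org/cidoc-crm/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "https://example.org/"  # placeholder namespace for illustration only


def exhibition_triples(exh_id, curator_id, place_id, timespan_id, work_ids):
    """Describe an exhibition as a generic activity with actor, place, time, objects."""
    exh = EX + "exhibition/" + exh_id
    triples = [
        (exh, RDF_TYPE, CRM + "E7_Activity"),
        (exh, CRM + "P14_carried_out_by", EX + "actor/" + curator_id),
        (exh, CRM + "P7_took_place_at", EX + "place/" + place_id),
        (exh, CRM + "P4_has_time-span", EX + "timespan/" + timespan_id),
    ]
    for work_id in work_ids:
        triples.append((exh, CRM + "P16_used_specific_object", EX + "work/" + work_id))
    return triples


def to_ntriples(triples):
    """Serialise IRI-only triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = exhibition_triples("1", "7", "3", "1992", ["42", "43"])
print(to_ntriples(triples))
```

The pattern records that the event happened, who carried it out, where, when, and with which objects. It has no obvious place for a rehang, a refused work, or a disputed work list, which is where the more specialised models come in.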

What no one has built yet is a model, or a way of linking several different models, that integrates all these dimensions into a unified framework oriented toward conservation rather than analysis or diffusion alone. A model that asks not only what happened, but what we owe to it, and what we are allowed to let go.

I am not creating more data. The data already exists; it is locked inside museum databases, inaccessible, unconnected, unpublished. The semantic web is not a documentation tool in the accumulation sense. It seems to me more like a way to unlock and connect what institutions already hold without knowing how to use it.

There is a second approach I have been exploring in practice: documentation built into the making of an exhibition, not applied retroactively to its remains, and created along the way through the relational database. Instead of reconstructing after the fact, always partial, always too late, I now have to find how to get decisions, negotiations, and hanging choices recorded as they happen, in a structured format that is already linked data from the start.
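As a thought experiment, recording decisions as they happen could look like the sketch below. It writes each decision as a JSON-LD-style entry with its own identifier. Everything here is hypothetical: the `record_decision` helper, the `https://example.org/vocab/` terms, the actors, and the two sample decisions are invented to illustrate the idea, not taken from a real workflow.

```python
import json
from datetime import datetime, timezone

EX = "https://example.org/"  # placeholder namespace and vocabulary


def record_decision(log, exhibition_id, actor_id, kind, description, when=None):
    """Append one decision to the log as a JSON-LD-style node."""
    when = when or datetime.now(timezone.utc)
    entry = {
        "@id": f"{EX}decision/{len(log) + 1}",
        "@type": EX + "vocab/Decision",
        EX + "vocab/kind": kind,
        EX + "vocab/about": {"@id": f"{EX}exhibition/{exhibition_id}"},
        EX + "vocab/madeBy": {"@id": f"{EX}actor/{actor_id}"},
        EX + "vocab/description": description,
        EX + "vocab/recordedAt": when.isoformat(),
    }
    log.append(entry)
    return entry


# Invented sample entries, with fixed timestamps so the output is reproducible.
log = []
record_decision(log, "1", "curator-1", "rehang", "Moved work-A to the second room",
                when=datetime(2026, 1, 10, 9, 30, tzinfo=timezone.utc))
record_decision(log, "1", "conservator-1", "flag", "Light levels too high for work-B",
                when=datetime(2026, 1, 11, 14, 0, tzinfo=timezone.utc))

print(json.dumps(log, indent=2, ensure_ascii=False))
```

Because each entry already points to identifiers for the exhibition and the actor, the log would not need to be converted later: it could be merged into a graph as it stands.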

Where I Am and What I Need

From what I understand, digital humanities is not only about building tools, it is about critically analysing the tools you use and the choices they encode. That is the spirit in which I approach this research. The semantic web, though, is really hard to understand; I hope my research can offer counsel to museums interested in the adventure.

My approach is grounded in pragmatic modelling, as developed by Ciula, Eide, and Marras (2023): models are context-specific, iterative, and situated. They are adjusted through observation and experimentation rather than imposed from above. They preserve complexity instead of smoothing it away.

I am actively building my model now, working from the ontological landscape I have just described. What I do not yet have is confirmed case studies and I will be honest about why: every institutional partnership I had lined up has fallen through. I did not anticipate how difficult it would be to convince museums to open their exhibition archives to this kind of research. I suspect part of the problem is the nature of what I am asking for. I am not interested in one exhibition in particular, I am interested in their practice, their documentation habits, their internal mess. That is a much harder ask than “can I study your Beuys show.” It requires institutional trust, staff time, and a willingness to expose processes that were never meant to be visible.

In Switzerland, one example that interests me enormously is the Kunsthalle Bern archive: a century of exhibition history, and a carefully designed online interface that visualises exhibitions through document density but is also linked to and fed by visiting researchers. They also have a new Oral Histories project launching in 2026. It is a rare case of an institution that has thought seriously about what an archive is for, not just how to manage it. I am also in conversation with Kunstmuseum Basel and Museum Rietberg.

The mess is the data.