Jo and I recently met with Stuart Jeffrey and Michael Charno at the Archaeology Data Service in York, to discuss a putative third CHALICE use case. The ADS is the main repository for archaeological data in the UK, and thus has many potential crossovers with CHALICE, and faces many comparable issues in terms of delivering the kind of information services its users want.
Much of the ADS’s topographic discovery metadata is based on the National Monuments Record (NMR), and therefore on modern placenames. The ADS’s ArchSearch facility is based on a faceted classification principle: users can come into the system from a national perspective and use the parameters ‘what’, ‘when’ and ‘where’ to pare the data down until they have a result set that matches their interests. The indexing and classification into facets is undertaken by ADS staff during the accession process.
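The faceted principle can be illustrated with a minimal Python sketch. This is not ArchSearch's implementation: the `Record` fields, the sample records and the `narrow` and `facet_counts` helpers are all hypothetical, invented here only to show how 'what', 'when' and 'where' successively pare down a result set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A hypothetical catalogue record carrying three facets."""
    title: str
    what: str
    when: str
    where: str


# Invented sample data, for illustration only.
RECORDS = [
    Record("Example record A", "hillfort", "Iron Age", "North Yorkshire"),
    Record("Example record B", "villa", "Roman", "North Yorkshire"),
    Record("Example record C", "villa", "Roman", "Lincolnshire"),
]


def narrow(records, **facets):
    """Keep only the records whose fields equal every supplied facet value."""
    return [
        r for r in records
        if all(getattr(r, name) == value for name, value in facets.items())
    ]


def facet_counts(records, facet):
    """Count how many of the given records fall under each value of a facet."""
    counts = {}
    for r in records:
        value = getattr(r, facet)
        counts[value] = counts.get(value, 0) + 1
    return counts


# Start from the whole collection, then pare it down one facet at a time.
step1 = narrow(RECORDS, what="villa")
step2 = narrow(step1, where="North Yorkshire")
```

After each step, `facet_counts` can tell the user which values remain available to narrow by, which is what lets them explore from the national level downwards.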
In parallel with this, the ADS has experimented with Natural Language Processing (NLP) algorithms to extract place types (types of monument, types of site, types of feature, etc.) from so-called ‘grey literature’, employing the MIDAS period terms. The principle of using NLP to build metadata is not in itself unproblematic: many depositors prefer to be certain that *they* are responsible for creating, and signing off, the descriptive metadata for their records. As with other organizations that we’ve spoken to, Stuart noted that georeferencing collections according to county > district > parish can create problems due to boundary changes; in addition, many users do not approach administrative units in a systematic way. For example, most people would not, in their searching behaviour, characterize ‘Blackpool’ as a subunit of ‘Lancashire’. This throws up interesting structural parallels with what we heard from the CCED project.
Another good example that the ADS recently encountered is North Lincolnshire, which is described by Wikipedia as “a unitary authority area in the region of Yorkshire and the Humber in England… [and] for ceremonial purposes it is part of Lincolnshire.” This came up while the ADS was creating a Web service for the Heritage Gateway. It was assumed that users would naturally look for North Lincolnshire in Lincolnshire; however, the Heritage Gateway used the official hierarchy, which put North Lincolnshire in Yorkshire and the Humber. They were working on addressing that in the next version of their interface.
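One way to picture the problem is to let a place have a parent in more than one hierarchy. The Python sketch below is only an illustration under that assumption: the `PARENTS` table and the `places_within` function are hypothetical names, and this is not how the Heritage Gateway or the ADS actually store their data.

```python
# Each place records its parent under more than one hierarchy.
# The two parents given for North Lincolnshire follow the Wikipedia
# description quoted above; the data structure itself is an assumption.
PARENTS = {
    "North Lincolnshire": {
        "official": "Yorkshire and the Humber",
        "ceremonial": "Lincolnshire",
    },
}


def places_within(area, hierarchies=("official",)):
    """Return the places whose parent is `area` in any of the named hierarchies."""
    return sorted(
        place
        for place, parents in PARENTS.items()
        if any(parents.get(h) == area for h in hierarchies)
    )


# Consulting the official hierarchy alone misses what users expect to find.
official_only = places_within("Lincolnshire")
# Consulting both hierarchies finds North Lincolnshire under Lincolnshire too.
both = places_within("Lincolnshire", hierarchies=("official", "ceremonial"))
```

Here `official_only` is empty while `both` contains North Lincolnshire, which mirrors the mismatch between the official hierarchy and the way users are assumed to search.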
It was strongly agreed that there is a very good case to be made for using CHALICE to enrich ADS metadata with historical variants, and that those wishing to search the collections via location would benefit from such enrichment. This view of things sits well alongside the CCED case (which focuses on connections of structure and georeferencing) and VCH (which focuses on connections between semantic entities). What is interesting is that all three cases have different implications for the technology, costs and research use: in the next three months or so the project will work on describing and addressing these implications.
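As a rough illustration of what such enrichment might look like in practice, the Python sketch below attaches historical name variants to a record indexed under a modern placename, so that a search on an older form still finds it. Everything here is an assumption made for the example: the `VARIANTS` lookup, the record layout and the `enrich` and `search` functions are hypothetical and do not describe CHALICE's or the ADS's actual data model.

```python
# A hypothetical gazetteer lookup from a modern placename to historical forms.
# The York variants are well-known historical names, used only as sample data.
VARIANTS = {
    "York": ["Eboracum", "Eoforwic", "Jorvik"],
}


def enrich(record, variants=VARIANTS):
    """Return a copy of the record with any known historical variants attached."""
    enriched = dict(record)
    enriched["place_variants"] = list(variants.get(record["place"], []))
    return enriched


def search(records, name):
    """Find records whose modern name or any attached variant matches `name`."""
    wanted = name.casefold()
    return [
        r for r in records
        if wanted in {n.casefold() for n in [r["place"], *r.get("place_variants", [])]}
    ]


# An invented record, indexed by its modern placename only.
records = [{"id": 1, "place": "York"}]
before = search(records, "Jorvik")
after = search([enrich(r) for r in records], "Jorvik")
```

Before enrichment the search on the historical form returns nothing; afterwards it returns the record, without altering the metadata that the depositor originally signed off.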