This is our “final product post” as required by the #jiscexpo project guidelines. The image links had broken; they are fixed now, so please view the post again.
Chalice – Past Places
Chalice is for anyone working with historic material – be that archives of records, objects, or ideas. Everything happens somewhere. We aimed to provide a historic place-name gazetteer covering a thousand years of history, linked to attestations in old texts and maps.
Place-name scholarship is fascinating; by looking at names, a scholar can describe the lay of the land and trace political developments. We would like to pursue further funding to work with the English Place-Name Survey on an expert-crowdsourced service consuming the other 80+ volumes and extracting the detailed information they hold, such as etymologies and field-names.
Linked to other archival sources, the place-name record has the potential to reveal connections between them, and in turn feed into deeper coverage in the place-name survey.
There is a Past Places browser that helps illustrate the data and provides a Linked Data view of it.
Stuart Dunn did a series of interviews and case studies with different archival sources, making suggestions for integration. The report on our use case for the Clergy of the Church of England Database may be found here; and that on our study of the Victoria County History is here. We also had valuable discussions with the Archaeology Data Service, which were reported in a previous post.
Rather than a classical ‘user needs’ approach, targeting groups such as historians, linguists and indeed place-name scholars, it was decided to look in detail at other digital resources containing reference material. This allowed us to start considering various ways in which a digitized, linkable EPNS could be automatically related to such resources. The problems are not only the ones we anticipated, of usability and semantic crossover between the placename variants listed in EPNS and elsewhere, but also ones of data structure, domain terminology and the relationship of secondary references across such corpora. We hope these considerations will help inform future development of placename digitization.
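To make the idea of relating placename variants across resources a little more concrete, here is a minimal Python sketch. It is not the Chalice code: the record layout, the identifier `ex:1`, the place “Exampleton” and its spellings are all invented for illustration, and the matching is a deliberately naive exact match on a normalised spelling.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set
import unicodedata


@dataclass
class Attestation:
    form: str    # historic spelling as found in a source
    year: int    # year of the source (invented values below)
    source: str  # short label for the source (invented values below)


@dataclass
class Place:
    place_id: str      # hypothetical identifier, not a real Chalice URI
    modern_name: str
    attestations: List[Attestation] = field(default_factory=list)


def normalise(name: str) -> str:
    """Lower-case the name and drop accents, spaces and punctuation."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(c for c in decomposed if c.isalpha()).lower()


def build_index(places: List[Place]) -> Dict[str, Set[str]]:
    """Map every normalised spelling (modern or historic) to place ids."""
    index: Dict[str, Set[str]] = {}
    for place in places:
        forms = [place.modern_name] + [a.form for a in place.attestations]
        for form in forms:
            index.setdefault(normalise(form), set()).add(place.place_id)
    return index


def lookup(index: Dict[str, Set[str]], mention: str) -> List[str]:
    """Return the ids of places with a spelling matching the mention."""
    return sorted(index.get(normalise(mention), set()))


# Invented sample record, for illustration only.
places = [
    Place("ex:1", "Exampleton", [
        Attestation("Exempletun", 1086, "source A"),
        Attestation("Exampel-tone", 1260, "source B"),
    ]),
]
index = build_index(places)
matches = lookup(index, "EXEMPLETUN")
```

A mention found in another archive, such as “EXEMPLETUN”, resolves to the same place id as the modern name. A real service would need fuzzier matching and would use the attestation dates to rank candidates, which is exactly where the problems of data structure and terminology described above start to bite.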
This covers the work of the four partners in the project.
CeRch at KCL developed use cases through interviews with maintainers of different historic sources. There are blog descriptions of conversations with:
LTG did some visualisations for these use cases and, more substantially, text mined the semi-structured text of sample volumes of the English Place Name Survey.
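As a rough illustration of what mining semi-structured text can involve, the Python sketch below pulls spelling, year and source out of a comma-separated line. This is not LTG's tooling, and the input layout (“spelling year source”) and the sample line are assumptions made up for this example, not text from the Survey.

```python
import re

# Hypothetical pattern: a spelling, a four-digit year, then a source label.
ATTESTATION = re.compile(
    r"^\s*(?P<form>[^\d,]+?)\s+(?P<year>\d{4})\s+(?P<source>.+?)\s*$"
)


def parse_attestations(entry: str):
    """Split an entry on commas and keep the chunks that fit the pattern."""
    records = []
    for chunk in entry.split(","):
        match = ATTESTATION.match(chunk)
        if match:  # chunks that do not fit the pattern are skipped
            records.append({
                "form": match["form"],
                "year": int(match["year"]),
                "source": match["source"],
            })
    return records


# Invented sample line, for illustration only.
sample = "Exempletun 1086 SrcA, Exampeltone 1260 SrcB, unparsed remainder"
records = parse_attestations(sample)
```

Silently skipping chunks that do not fit keeps the sketch short; in practice those leftovers are the interesting cases and would want logging and review.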
The extraction of corrected text from previously digitised pages was done by CDDA in Belfast. There is a blog report on the final quality of the work; however, the full resulting text is neither open licensed nor distributed through Chalice.
EDINA took care of project management and software development. We used the opportunity to try out a Scrum-style “sprint” way of working with a larger team.
TOC to project blog – here is an Atom feed of all the project blog posts; they should be categorised and describe project partners
Project tag: chaliced
Full project name: Connecting Historical Authorities with Links, Contexts and Entities
Short description: Creating and re-using a linked data historic gazetteer through text mining.
Longer description: Text mining volumes of the English Place Name Survey to produce a Linked Data historic gazetteer for areas of England, which can then be used to improve the quality of georeferencing other archives. The gazetteer is linked to other placename sources on the Linked Data web via geonames.org and Ordnance Survey Open Data. Intensive user engagement with archive projects that can benefit from the open data gazetteer and open source text mining tools.
Key deliverables: Open source tools for text mining archives; Linked open data gazetteer, searchable through JISC’s Unlock service; studies of further integration potential.
Lead Institution: University of Edinburgh
Person responsible for documentation: Jo Walsh
Project Team: EDINA: Jo Walsh (Project Manager), Joe Vernon (Software Developer), Jackie Clark (UI design), David Richmond (Infrastructure); CDDA: Paul Ell (WP1 Coordinator), Elaine Yates (Administration), David Hardy (Technician), Karleigh Kelso (Clerical); LTG: Claire Grover (Senior Researcher), Kate Byrne (Researcher), Richard Tobin (Researcher); CeRch: Stuart Dunn (WP3 Coordinator).
Project partners and roles: Centre for Data Digitisation and Analysis, Belfast – preparing digitised text; Centre for e-Research, King's College London – user engagement and dissemination; Language Technology Group, School of Informatics, Edinburgh – text mining research and tools.
This is the Chalice project blog and you can follow an Atom feed of blog posts (there are more to come).
The code produced during the Chalice project is free software; it is available under the GNU Affero GPL v3 license. You can get the code from our project sourceforge repository. The text mining code is available from LTG – please contact Claire Grover for a distribution…
The Linked Data created by text mining volumes of the English Place Name Survey – mostly covering Cheshire – is available under the Open Database License – a share-alike license for data by Open Data Commons.
The contents of this blog itself are available under a Creative Commons Attribution-ShareAlike 3.0 Unported license.
Link to technical instructional documentation
Project started: July 15th 2010
Project ended: April 30th 2011
Project budget: £68,054
Chalice was supported by JISC as a project in its #jiscexpo programme. See its PIMS project management record for information about where responsibility fits in at JISC.