
    Visiting the English Place Name Survey

    I was in Nottingham for OSGIS at the Centre for Geospatial Sciences on Tuesday; skipped out between lunch and coffee break to visit the English Place Name Survey in the same leafy campus.

    A card file at EPNS

    Met with Paul Cavill, who dropped me right into the heart of the operation – stacks of index cards in shoe boxes. Each major name has a set of annotation cards, describing different related names and their associations and sources – which range from Victorian maps to Anglo-Saxon chronicles.

    The editing process takes the card sets and turns them right into print-ready manuscript. The manuscript then has very consistent layout conventions – capitalisation, indentation. This is going to make our work of structure mining a lot easier.

    Another bonus I wasn’t expecting was the presence of OSGB grid references for all the major names. The task of making links becomes a snap – I was imagining a lot of iterative guesswork based on clustering and closeness to names in other sources. (There are four Waltons in the UK in geonames, and dozens in the EPNS.)
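
    To make the grid reference point concrete, here is a minimal sketch of turning an OSGB grid reference into numeric eastings and northings, which is what would let a name be matched against other gazetteers by position rather than by spelling. The function name and the sample references are my own illustrations, not anything taken from the EPNS volumes, and it assumes references written as two grid letters followed by an even number of digits.

```python
import re


def osgb_to_easting_northing(gridref):
    """Convert an OSGB grid reference like 'SJ 492 125' to (easting, northing).

    Values are in metres and give the south-west corner of the square
    that the reference names. Hypothetical helper, for illustration only.
    """
    ref = gridref.replace(" ", "").upper()
    match = re.fullmatch(r"([HJNOST][A-HJ-Z])(\d{0,10})", ref)
    if match is None or len(match.group(2)) % 2 != 0:
        raise ValueError(f"not a valid OSGB grid reference: {gridref!r}")
    letters, digits = match.groups()

    # Map each letter to 0..24, closing the gap left by 'I',
    # which the National Grid lettering does not use.
    l1 = ord(letters[0]) - ord("A")
    l2 = ord(letters[1]) - ord("A")
    if l1 > 7:
        l1 -= 1
    if l2 > 7:
        l2 -= 1

    # Indices of the 100 km square, counted from the false origin (square SV).
    e100km = ((l1 - 2) % 5) * 5 + (l2 % 5)
    n100km = (19 - (l1 // 5) * 5) - (l2 // 5)

    # Split the digits in half and pad each half out to metres.
    half = len(digits) // 2
    scale = 10 ** (5 - half)
    east_part = int(digits[:half]) * scale if half else 0
    north_part = int(digits[half:]) * scale if half else 0

    return e100km * 100_000 + east_part, n100km * 100_000 + north_part


if __name__ == "__main__":
    print(osgb_to_easting_northing("SJ 492 125"))
```

    With numbers in hand, two Waltons stop being the same string and become two points on the map, so a link to another source can be checked by distance instead of guessed.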

    On this basis I reckon the entity recognition will be a breeze. LTG will hardly have to stretch their muscles, which means we can ask them to work on grammars and machine learning recognisers for parts of other related archives within the scope of CHALICE.

    And we would have freedom in the EDINA team’s time to do more – specifically to look at using the National Map Library of Scotland’s map rectifier tools to correlate the gazetteer with detailed line-drawn maps also created by the late H. D. G. Foxall. Digitisations of these maps live in the Shropshire Records Office. We must talk with them about their plans (the Records Office holds copyright in the scans).

    The eye-opener for me was the index of sources, or rather the bibliography. Each placename variant is marked with a few letters identifying the source of the name. So the index itself provides a key to old maps and gazetteers and archival records. To use Ant Beck’s phrase, the EPNS looks like a “decoupled synthesis” of placename knowledge in all these sources. If we extract its structure, we are recoupling the synthesis and the sources, and now know where to look next to go text mining and digitising.
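
    As a rough sketch of what “recoupling” could look like in code: if each variant line carries a form, a date and a source abbreviation, then looking the abbreviation up in the bibliography ties the variant back to its source. Everything specific here is an assumption for illustration – the line layout, the abbreviations, the placeholder source descriptions and the invented place-name forms are mine, not the real EPNS conventions.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical bibliography key: abbreviation -> source description.
# These entries are placeholders, not real EPNS abbreviations.
SOURCES = {
    "VM": "a Victorian map (placeholder)",
    "ASC": "an Anglo-Saxon chronicle (placeholder)",
}


class Variant(NamedTuple):
    form: str               # the spelling of the name as recorded
    date: str               # the year attached to that spelling
    source_key: str         # the few letters identifying the source
    source: Optional[str]   # the resolved source, or None if unknown


# Assumed layout: "<form> <3-4 digit year> <letters>", e.g. "Exampletone 1086 ASC".
VARIANT_LINE = re.compile(r"^(?P<form>.+?)\s+(?P<date>\d{3,4})\s+(?P<key>[A-Za-z]+)$")


def parse_variant(line, sources):
    """Parse one variant line and resolve its source abbreviation.

    Returns None when the line does not fit the assumed layout.
    """
    match = VARIANT_LINE.match(line.strip())
    if match is None:
        return None
    key = match.group("key")
    return Variant(match.group("form"), match.group("date"), key, sources.get(key))


if __name__ == "__main__":
    for line in ["Exampletone 1086 ASC", "Exampleton 1842 VM", "Exampleton 1900 ZZ"]:
        print(parse_variant(line, SOURCES))
```

    An abbreviation that fails to resolve is kept rather than dropped, since those gaps are exactly the sources we would want to chase up next.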

    So we have the Shropshire Hundreds as a great place to start, as this is what the EPNS are working on now and the volumes are “born digital”. Back at CDDA, Paul Ell has some of the very earliest volumes digitised, and if we find a sample from the middle, we can produce grammar rules that we can be pretty confident will extract the right structure from the whole set, when the time comes to digitise and publish the entire 80+ volume, and growing, set.

    But now I’m fascinated by the use of the EPNS derived data as a concordance to so many associated archives documenting historic social patterns. Towards the end of our chat Paul Cavill was speculating about reconstructing Anglo-Saxon England by means of text mining and georeferencing archives – we could provide a reference map to help archaeologists understand what they are finding, or even help them focus on where to look for interesting archaeology.

    Paul had been visited by the time-travelling Mormons digitising everything a couple of weeks previously, and will hopefully offer an introduction – I would really, really like to meet them.

    One response to “Visiting the English Place Name Survey”

    1. Wow, how many EPNS cards are there? How many triples? Which ontology predicates have you mapped for each item on these index cards? Lots of interesting stuff.

      I like the phrase “decoupled synthesis”, and the majority of what we are doing in linked data is recoupling knowledge that is implicit but needs to be explicit! I’ve also been calling this the mundanely boring but essential job of disambiguation (the killer app of LoD IMHO).

      Great pics, really makes the digital physicality of what you are doing stand out.

      Yep Mormons spend big money on the digitisation stuff, but they do tend to have an additional motivator beyond preservation of knowledge :)