Using large-scale XML corpora in Language and Literature

This one-day workshop will introduce the technologies needed to unlock the potential uses of large-scale XML-encoded language corpora, with a particular focus on the most recent version of the British National Corpus (BNC). The BNC has established itself as a key reference point in such work, and a new XML-encoded version was released in April 2007. Participants will learn how to explore this corpus using a variety of generic XML tools, focussing on (but not limited to) XAIRA, a general-purpose software architecture for the linguistic analysis of large XML corpora. They will explore the kinds of language learning activities and linguistic analyses best supported by such tools, and discuss how usable the tools are for fundamental linguistic and literary research in large textbases. The course will have a strong practical component, and participants will be encouraged to bring samples of their own textual materials so that they can experiment with corpus construction and analysis.