1 Introduction

This document is extracted from the BNC Users Reference Guide issued with the first public release of the British National Corpus. That Reference Guide contains a full description of the design principles underlying the BNC, and detailed information about the way in which it is encoded, stored and distributed. It also contains technical details of use to software developers interested in providing tools for access to the Corpus, in particular full reference information about the SARA package, together with a list giving brief bibliographic details for each text making up the corpus.

The present document covers only the encoding scheme used for the BNC Sampler, which is almost (but not quite) identical to that used for the full BNC. Information about the design and construction of the BNC proper is omitted, but can be found in the original Reference Guide, and is also freely available from the BNC World Wide Web server at http://info.ox.ac.uk/bnc. Background information about the design and construction of the Sampler Corpus is included in the introductory documentation supplied with it.

Full information about the CLAWS part-of-speech tagging applied to the Sampler, including both a description of the CLAWS system itself and the Tagging Manual used at the University of Lancaster, is provided by the following two documents:

