
Exploring the BNC with Xaira - introduction tasks

Below you will find a set of tasks that you can do to become familiar with Xaira and the BNC. Brief instructions are provided for each task. Links are also available from this page to relevant sections on other pages, such as 'Using the BNC XML Edition with Xaira' (available at http://www.natcorp.ox.ac.uk/tools/bncXml_search.xml) and the Xaira Reference Manual, which contains similar information to the Help files (http://www.oucs.ox.ac.uk/rts/xaira/Doc/refman.xml). To access the Xaira Help files, press F1 on your keyboard while you are using Xaira.

Getting started

Before you can start exploring the BNC XML with Xaira, you need to install the software and the corpus. Information about this process is provided on the Installation page (http://www.natcorp.ox.ac.uk/XMLedition/installation.xml).

'Word perfect' - what does it mean?

In this exercise you will run a search, explore the results, and filter out unwanted instances.

  1. Find all instances of ‘word perfect’. How many are there? (use Quick Query or Phrase Query).
  2. Explore the search result – can you identify any particular use of the phrase? Try toggling between Line mode and Page mode display and increase the scope to understand what's going on (more details).
  3. Remove instances where the expression is used in the word processor sense. How many remain? (Filter the solutions using the Thin option.)

NOTE: You could have used the Case Sensitive Phrase Query option to exclude ‘Word Perfect’ (Xaira Reference Guide: Phrase Query).
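
The effect of case sensitivity described in the note can be sketched outside Xaira. The following is a minimal Python illustration, not Xaira's implementation: the sentences are invented, and `count_phrase` is a hypothetical helper name.

```python
import re

# Invented sentences standing in for corpus text (not BNC data).
sentences = [
    "She was word perfect by the first rehearsal.",
    "The file was typed in Word Perfect 5.1.",
    "He had to be word perfect for the audition.",
    "WordPerfect and Word Perfect are both spellings of the product name.",
]

def count_phrase(texts, phrase, case_sensitive=False):
    """Count whole-word occurrences of `phrase` across `texts`."""
    flags = 0 if case_sensitive else re.IGNORECASE
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", flags)
    return sum(len(pattern.findall(text)) for text in texts)

# Case-insensitive search finds both the idiom and the product name.
all_hits = count_phrase(sentences, "word perfect")
# Case-sensitive search keeps only the lower-case idiom.
lowercase_hits = count_phrase(sentences, "word perfect", case_sensitive=True)
```

In this toy data the case-sensitive count is smaller because the capitalised product name no longer matches, which is the same effect as thinning the solutions by hand.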

What did you have for breakfast?

Here you use the Word Query and Phrase Query options and explore the Analysis function. You then try the Query Builder and can experiment with the Partitions feature. Finally you can combine different types of searches in the Query Builder and see what you can do with the Collocations function.

What's the most frequent spelling (and use) of hardboiled/hard-boiled/hard boiled?

  1. Make a separate search for each variant (hardboiled, hard-boiled, hard boiled) and check the frequency.
    • What is the difference between Phrase Query and Word Query? When can you use only one? (Xaira Reference Guide: Word Query / Phrase Query)
  2. Analyse each result by Text Class. What conclusions can you draw? (Use the Analysis function.)
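
As a rough picture of what the two steps above produce, here is a small Python sketch that counts hits per spelling variant and per text class. The hits are invented, and 'Some other class' is a placeholder label rather than a real BNC text class.

```python
from collections import Counter

# One (variant, text class) pair per hit; invented data for illustration.
hits = [
    ("hard-boiled", "Some other class"),
    ("hard-boiled", "Other published"),
    ("hardboiled", "Some other class"),
    ("hard boiled", "Other published"),
    ("hard-boiled", "Some other class"),
]

# Frequency of each spelling variant (step 1).
variant_freq = Counter(variant for variant, _ in hits)

# Frequency of each variant within each text class (step 2).
by_class = Counter(hits)

most_frequent_variant, most_frequent_count = variant_freq.most_common(1)[0]
```

Comparing `by_class` across variants is the kind of evidence the Analysis function gives you when you break a result down by Text Class.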

Can anything apart from eggs be hard boiled?

  1. Make a Query Builder search for all three variants at once (Xaira Reference Guide: Query Builder Query). Illustration. You should find 92 instances in 64 texts.
  2. Sort the concordance on Right 1. (Xaira Reference Guide: Sorting). How many of the instances refer to eggs? What else can be hard boiled?
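
Sorting on Right 1 means ordering the concordance lines by the first word to the right of the search term. A minimal Python sketch of the idea, using invented concordance lines and a hypothetical helper `right1`:

```python
# Each hit is (left context, node, right context); invented examples.
hits = [
    ("two", "hard-boiled", "eggs and toast"),
    ("a", "hard-boiled", "detective novel"),
    ("some", "hard boiled", "sweets from the shop"),
    ("a", "hardboiled", "egg for lunch"),
]

def right1(hit):
    """Return the first word to the right of the node, lower-cased."""
    words = hit[2].split()
    return words[0].lower() if words else ""

# Sort the concordance lines on the first word to the right of the node.
sorted_hits = sorted(hits, key=right1)
first_right_words = [right1(hit) for hit in sorted_hits]
```

After sorting, lines with the same following word sit next to each other, which makes it easy to see how many of them refer to eggs.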

How do cooks use it?

  1. Restrict the query to the 'Other published' Text class. (Use the partitions drop-down boxes: select 'Text class' in the first one and 'Other published' in the second [Illustration]. Click Edit, accept ‘Change the class…’, and click OK in the Query Builder window to run the new query.)
  2. Identify examples from cookery books or recipes (use the Bibliographic data button for information about the source).

What else is boiled?

  1. Make a Query Builder query for 'boiled' immediately followed by 'any noun' (Word/Phrase Query boiled, NEXT, Addkey Query SUBST [Illustration]).
  2. Download 10 random solutions.
  3. To obtain a summary of all 298 cases of nouns following 'boiled', use the Collocations function. (Untick 'Downloads only'. Set the window to 0L 1R.)
  4. Try sorting the collocations alphabetically, by frequency and by Z-score. What is the difference?
  5. Make a query for one collocation and explore the result.
    • How would you save and/or export these? (Listing command).
    • What does the 'save' option do? Where do you find it? (Xaira Reference Guide: Save)
    • What about 'copy'? (Xaira Reference Guide: Copy)
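
To see why sorting by frequency and sorting by Z-score can give different orders, it may help to compute a z-score by hand. The sketch below uses one common textbook formulation, in which the observed co-occurrence count is compared with the count expected by chance. This formula is an assumption: the exact calculation Xaira performs is not described on this page, and all counts below are invented.

```python
import math

def z_score(observed, node_freq, collocate_freq, corpus_size, window):
    """z-score of a collocate under one common formulation (an assumption)."""
    # Chance of meeting the collocate at any position outside the node itself.
    p = collocate_freq / (corpus_size - node_freq)
    # Expected co-occurrences over all window slots around the node.
    expected = p * node_freq * window
    return (observed - expected) / math.sqrt(expected * (1 - p))

# Invented counts: a node word occurring 300 times in a 1,000,000-word
# corpus, with a window of one word to the right.
NODE_FREQ, CORPUS_SIZE, WINDOW = 300, 1_000_000, 1

# collocate -> (co-occurrences with the node, total frequency in the corpus)
collocates = {
    "water": (40, 60_000),
    "potatoes": (35, 2_000),
    "egg": (30, 500),
}

scores = {
    word: z_score(obs, NODE_FREQ, freq, CORPUS_SIZE, WINDOW)
    for word, (obs, freq) in collocates.items()
}

alphabetical = sorted(collocates)
by_frequency = sorted(collocates, key=lambda w: collocates[w][0], reverse=True)
by_z = sorted(collocates, key=lambda w: scores[w], reverse=True)
```

In this toy data the most frequent collocate is also a very common word overall, so it ranks lowest by z-score, while a rarer word that co-occurs almost as often ranks highest.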

A nice time was had by all

In addition to searching for phrases (with or without a lemma) you will also use the Query Builder to look for instances uttered by a speaker with certain (socio-linguistic) features.
  • Make a Simple Query or Phrase Query for 'a nice time'.
  • How much more frequent do you think 'nice time' would be? Check by making a search. (Was that search quicker than the previous one? If so, why?)
  • What forms of 'have' are used with 'nice time'? Is anyone ever having a 'very nice' time? (Use the Query Builder and make a search for have [Word Query, lemma 'have'] followed by nice time [Phrase Query] within a span of five words. Illustration).

Who talks about a 'nice time'? Men or women?

  • Use the Query Builder to find 'nice time' produced by men and women respectively. (In Query Builder, change the left-hand box to XML -> U -> who.sex -> Add from list. More details. Illustration).
  • What about 'a good time'?

NOTE: If you want to find out how often a type of speaker uses HAVE together with "nice time", you will find you can't do it with Query Builder. You can use the CQL query option to design queries with nested scopes - in this case HAVE followed by "nice time" within 5 words within an utterance by a male (or female) speaker. (See CQL Query in the Xaira Reference Guide.)
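
The logic of such a nested query can be sketched in Python. This is not CQL and does not show Xaira's syntax; it only illustrates the two scopes: an outer scope (one utterance, with the speaker's sex) and an inner scope (the phrase starting at most five words after a form of HAVE). The utterances, the values 'm' and 'f', the helper `has_pattern`, and the reading of "within 5 words" are all assumptions made for illustration.

```python
# Each utterance carries the speaker's sex and a list of (word, lemma) pairs.
utterances = [
    {"sex": "m", "tokens": [("I", "i"), ("had", "have"), ("a", "a"),
                            ("nice", "nice"), ("time", "time")]},
    {"sex": "f", "tokens": [("We", "we"), ("are", "be"), ("having", "have"),
                            ("a", "a"), ("very", "very"), ("nice", "nice"),
                            ("time", "time")]},
    {"sex": "f", "tokens": [("Nice", "nice"), ("time", "time"), ("to", "to"),
                            ("have", "have"), ("tea", "tea")]},
]

def has_pattern(tokens, lemma="have", phrase=("nice", "time"), span=5):
    """True if `phrase` starts at most `span` tokens after a token with `lemma`."""
    words = [word.lower() for word, _ in tokens]
    n = len(phrase)
    for i, (_, lem) in enumerate(tokens):
        if lem != lemma:
            continue
        # Candidate start positions for the phrase, kept inside this utterance.
        for j in range(i + 1, min(i + span, len(tokens) - n) + 1):
            if tuple(words[j:j + n]) == phrase:
                return True
    return False

# Outer scope: count matching utterances per speaker sex.
counts = {}
for utterance in utterances:
    if has_pattern(utterance["tokens"]):
        counts[utterance["sex"]] = counts.get(utterance["sex"], 0) + 1
```

The third utterance does not count because 'nice time' comes before 'have', and because each utterance is searched separately a match can never run across an utterance boundary.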

What can you address (apart from an envelope)?

Here you look for a specified lemma and explore its collocations and colligations.
  • Find 'address' in all its verb uses (Word Query) and download 1 solution.
  • Use the Colligation function to find which word class co-occurs most with 'address'. (In the Collocations function: Untick 'Downloads only'. Set 'Window' to L3 R3. Under the 'Colligation' tab, FIRST select 'pos' and then tick 'Colligation'.)
  • What kind of nouns are used with 'address'? Try these two ways to find out:
    • Via the colligations: Click on SUBST in the list of colligations and then on the Query button to retrieve concordances of all instances. Explore these.
    • Via the collocations: Untick 'Colligation' (or make a new collocation search). Under the Lemmata tab, activate 'BNC'. Sort the list of collocates on 'pos' and explore the SUBST ones.
Can you think of another way to find nouns that co-occur with 'address'? What are the differences between the ways to search? Which one do you prefer? Can you think of when one is more suitable than another?
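
A colligation count is a tally of the word classes found in a window around the search term. The Python sketch below illustrates this for a window of three words to the left and right. The sentence and the helper `colligates` are invented, and apart from SUBST, which appears in the tasks above, the tag names are assumed for illustration.

```python
from collections import Counter

def colligates(tagged, node_lemma, node_pos="VERB", left=3, right=3):
    """Count the word classes found in a window around each node occurrence."""
    counts = Counter()
    for i, (_, lemma, pos) in enumerate(tagged):
        if lemma != node_lemma or pos != node_pos:
            continue
        start = max(0, i - left)
        # Tokens to the left and right of the node, excluding the node itself.
        window = tagged[start:i] + tagged[i + 1:i + 1 + right]
        counts.update(tag for _, _, tag in window)
    return counts

# Invented sentence as (word, lemma, pos) triples.
tagged = [
    ("The", "the", "ART"),
    ("minister", "minister", "SUBST"),
    ("addressed", "address", "VERB"),
    ("the", "the", "ART"),
    ("difficult", "difficult", "ADJ"),
    ("problem", "problem", "SUBST"),
]

pos_counts = colligates(tagged, "address")
```

Counting the words themselves instead of their tags in the same window would give the collocations rather than the colligations.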
