BNC Sampler Corpus Statistics

Size in Kbyte 43946
Size in w-units 1993525
Size in s-units 126976

Text type words s-units texts
Spoken demographic 493852 (24.77%) 52144 (41.06%) 47
Spoken context-governed 496852 (24.92%) 24174 (19.03%) 51
Written books and periodicals 888522 (44.57%) 44852 (35.32%) 69
Written-to-be-spoken 18121 (0.90%) 1056 (0.83%) 3
Written miscellaneous 96178 (4.82%) 4750 (3.74%) 14
Total 1993525 126976 184
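The percentages and totals in the table above can be re-derived from the raw counts. The following is a minimal sketch, not the procedure used to produce this page: the helper name `truncated_percent` is hypothetical, and the assumption that percentages are truncated (not rounded) to two decimals is inferred from the figures themselves (for example 18121 of 1993525 words is shown as 0.90%).

```python
# Counts per text type, copied from the "Text type" table above:
# (words, s-units, texts)
rows = {
    "Spoken demographic": (493852, 52144, 47),
    "Spoken context-governed": (496852, 24174, 51),
    "Written books and periodicals": (888522, 44852, 69),
    "Written-to-be-spoken": (18121, 1056, 3),
    "Written miscellaneous": (96178, 4750, 14),
}

# Column totals; these should agree with the corpus sizes given at the top.
total_words = sum(w for w, _, _ in rows.values())
total_sunits = sum(s for _, s, _ in rows.values())
total_texts = sum(t for _, _, t in rows.values())


def truncated_percent(part, whole):
    """Share of part in whole as a percentage, truncated to two decimals.

    Integer arithmetic avoids floating-point surprises. Truncation is an
    assumption inferred from the published figures.
    """
    hundredths = part * 10000 // whole
    return f"{hundredths // 100}.{hundredths % 100:02d}"


for name, (words, sunits, texts) in rows.items():
    print(
        f"{name} {words} ({truncated_percent(words, total_words)}%) "
        f"{sunits} ({truncated_percent(sunits, total_sunits)}%) {texts}"
    )
print(f"Total {total_words} {total_sunits} {total_texts}")
```

The same check applies to every other table: the spoken tables should sum to the spoken rows above, and the written tables to the three written rows combined.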

Spoken Texts

Domain for context-governed spoken material words s-units texts
Educational/Informative 80463 (16.19%) 7322 (30.28%) 9
Business 134275 (27.02%) 5673 (23.46%) 13
Public/Institutional 145508 (29.28%) 4816 (19.92%) 14
Leisure 136606 (27.49%) 6363 (26.32%) 15
Total 496852 24174 51

Age band of demographic respondent words s-units texts
0-14 22387 (4.53%) 1254 (2.40%) 2
15-24 64652 (13.09%) 7471 (14.32%) 6
25-34 135973 (27.53%) 11640 (22.32%) 12
35-44 97834 (19.81%) 13724 (26.31%) 10
45-59 107112 (21.68%) 12619 (24.20%) 11
60+ 65894 (13.34%) 5436 (10.42%) 6
Total 493852 52144 47

Social class of demographic respondent words s-units texts
AB 164933 (33.39%) 13383 (25.66%) 16
C1 98700 (19.98%) 9641 (18.48%) 9
C2 137686 (27.88%) 18619 (35.70%) 14
DE 92533 (18.73%) 10501 (20.13%) 8
Total 493852 52144 47

Sex of demographic respondent words s-units texts
Male 241493 (48.89%) 24183 (46.37%) 23
Female 252359 (51.10%) 27961 (53.62%) 24
Total 493852 52144 47

Spoken interaction type words s-units texts
Monologue 167714 (16.92%) 7196 (9.42%) 18
Dialogue 822990 (83.07%) 69122 (90.57%) 80
Total 990704 76318 98

Region where spoken words s-units texts
Unknown 54129 (5.46%) 1164 (1.52%) 6
South 375312 (37.88%) 27688 (36.27%) 37
Midlands 199666 (20.15%) 14988 (19.63%) 19
North 361597 (36.49%) 32478 (42.55%) 36
Total 990704 76318 98

Written Texts

Author age band words s-units texts
Unknown 935786 (93.31%) 49128 (96.97%) 81
35-44 26550 (2.64%) 0 (0%) 1
45-59 7629 (0.76%) 232 (0.45%) 2
60+ 32856 (3.27%) 1298 (2.56%) 2
Total 1002821 50658 86

Author sex words s-units texts
Unknown 405633 (40.44%) 20091 (39.66%) 39
Male 396786 (39.56%) 22145 (43.71%) 35
Female 195581 (19.50%) 8142 (16.07%) 11
Unknown 4821 (0.48%) 280 (0.55%) 1
Total 1002821 50658 86

Author type words s-units texts
Corporate 79369 (7.91%) 4285 (8.45%) 9
Multiple 368323 (36.72%) 17458 (34.46%) 32
Sole 550136 (54.85%) 28446 (56.15%) 43
Unknown 4993 (0.49%) 469 (0.92%) 2
Total 1002821 50658 86

Audience age words s-units texts
Child 23700 (2.36%) 2326 (4.59%) 3
Teenager 30110 (3.00%) 3673 (7.25%) 4
Adult 946106 (94.34%) 44449 (87.74%) 78
Any 2905 (0.28%) 210 (0.41%) 1
Total 1002821 50658 86

Domain for written texts words s-units texts
Imaginative 233774 (23.31%) 21332 (42.10%) 18
natural & pure science 35456 (3.53%) 774 (1.52%) 5
applied science 106193 (10.58%) 5494 (10.84%) 10
social science 76211 (7.59%) 3438 (6.78%) 10
world affairs 306921 (30.60%) 9201 (18.16%) 23
commerce & finance 60270 (6.01%) 3613 (7.13%) 6
arts 58318 (5.81%) 3049 (6.01%) 3
belief & thought 43626 (4.35%) 1225 (2.41%) 4
leisure 82052 (8.18%) 2532 (4.99%) 7
Total 1002821 50658 86

Audience level words s-units texts
Unknown 9505 (0.94%) 363 (0.71%) 1
Low 172777 (17.22%) 11564 (22.82%) 22
Medium 568876 (56.72%) 29136 (57.51%) 44
High 251663 (25.09%) 9595 (18.94%) 19
Total 1002821 50658 86

Written medium words s-units texts
Book 616213 (61.44%) 31927 (63.02%) 45
Periodical 272309 (27.15%) 12925 (25.51%) 24
Miscellaneous -- published 59145 (5.89%) 3368 (6.64%) 8
Miscellaneous -- unpublished 37033 (3.69%) 1382 (2.72%) 6
To-be-spoken 18121 (1.80%) 1056 (2.08%) 3
Total 1002821 50658 86

Place of publication words s-units texts
Unknown 81999 (8.17%) 3950 (7.79%) 11
UK 258098 (25.73%) 11855 (23.40%) 23
North 8580 (0.85%) 1493 (2.94%) 1
Midland 18749 (1.86%) 0 (0%) 1
South 635395 (63.36%) 33360 (65.85%) 50
Total 1002821 50658 86

Written sample type words s-units texts
Unknown 430623 (42.94%) 22146 (43.71%) 43
Whole text 187955 (18.74%) 7987 (15.76%) 15
Beginning sample 170767 (17.02%) 12078 (23.84%) 14
Middle sample 151063 (15.06%) 7772 (15.34%) 11
End sample 26550 (2.64%) 0 (0%) 1
Composite 35863 (3.57%) 675 (1.33%) 2
Total 1002821 50658 86

Written reception status words s-units texts
Unknown 262460 (26.17%) 12646 (24.96%) 24
Low 226382 (22.57%) 9537 (18.82%) 19
Medium 256448 (25.57%) 13622 (26.89%) 19
High 257531 (25.68%) 14853 (29.32%) 24
Total 1002821 50658 86

Target audience sex words s-units texts
Unknown 280387 (27.95%) 14950 (29.51%) 28
Male 20002 (1.99%) 0 (0%) 1
Female 40288 (4.01%) 2227 (4.39%) 3
Mixed 662144 (66.02%) 33481 (66.09%) 54
Total 1002821 50658 86

Written text time period words s-units texts
1975-1993 1002821 (100%) 50658 (100%) 86
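Every data row on this page follows the same layout: a category label, a word count with its percentage, an s-unit count with its percentage, and a text count. For readers who want to load the figures programmatically, the sketch below shows one possible way to split such a row into fields. The function name `parse_row` and the returned dictionary keys are hypothetical choices for illustration, not part of any BNC tool.

```python
import re

# label, words (pct%), s-units (pct%), texts.
# The label is matched lazily so that labels containing digits or
# spaces (e.g. "35-44", "Miscellaneous -- published") still work.
ROW_PATTERN = re.compile(
    r"^(?P<label>.+?) "
    r"(?P<words>\d+) \((?P<words_pct>[\d.]+)%\) "
    r"(?P<sunits>\d+) \((?P<sunits_pct>[\d.]+)%\) "
    r"(?P<texts>\d+)$"
)


def parse_row(line):
    """Parse one data row; return None for headers and Total rows."""
    match = ROW_PATTERN.match(line.strip())
    if match is None:
        return None
    return {
        "label": match["label"],
        "words": int(match["words"]),
        "words_pct": float(match["words_pct"]),
        "sunits": int(match["sunits"]),
        "sunits_pct": float(match["sunits_pct"]),
        "texts": int(match["texts"]),
    }


example = parse_row("Spoken demographic 493852 (24.77%) 52144 (41.06%) 47")
print(example)
```

Rows without percentages, such as the Total lines, are deliberately not matched, so they can be recomputed from the parsed rows instead of being trusted as given.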