GloWbE (pronounced like "globe") is related to other large corpora that we have created, including the 450-million-word Corpus of Contemporary American English (COCA) and the 400-million-word Corpus of Historical American English (COHA). Together, these three corpora allow researchers to examine variation in English -- by dialect, genre, and over time -- in ways that are not possible with any other large corpora of English.
SIZE: At the most basic level, GloWbE allows you to search through a corpus that is more than four times as large as COCA (and nearly twenty times as large as the British National Corpus, or BNC). This means that where you might only have 10-12 tokens in the BNC and 50-60 in COCA, you might have 250-300 in GloWbE.
DIALECTS: The real power of GloWbE, though, is the ability to see the frequency of any word, phrase, or grammatical construction in each of 20 different countries. You can also compare any feature across two sets of dialects, such as British and American English (in more than 775 million words of text for just these two dialects). Or you could limit your search to one or two countries (e.g. Australia (148 million words), South Africa (45 million), or Singapore (43 million)), and you'll still be searching the largest online corpus for most of these twenty countries.
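Because the country sections differ so much in size, raw hit counts are not directly comparable from one dialect to the next; a common approach is to convert them to a rate per million words. The following is a minimal sketch of that normalization, not the corpus interface itself. The section sizes are the ones quoted above, while the hit counts and the names `per_million` and `hypothetical_hits` are invented purely for illustration.

```python
# Section sizes in millions of words, as quoted in the text above.
SECTION_SIZE_MILLIONS = {
    "Australia": 148,
    "South Africa": 45,
    "Singapore": 43,
}


def per_million(raw_count: int, size_millions: float) -> float:
    """Convert a raw hit count into occurrences per million words."""
    return raw_count / size_millions


# Hypothetical raw hit counts for a single search (invented, not real data).
hypothetical_hits = {
    "Australia": 2960,
    "South Africa": 450,
    "Singapore": 1290,
}

normalized = {
    country: per_million(hits, SECTION_SIZE_MILLIONS[country])
    for country, hits in hypothetical_hits.items()
}

# List the countries from highest to lowest normalized frequency.
for country, rate in sorted(normalized.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{country:<13} {rate:6.1f} per million words")
```

With these made-up counts, Australia has the most raw hits, but Singapore comes out ahead once section size is taken into account, which is why the per-million figure is the one to compare across dialects.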
In terms of searches, with GloWbE you can study an extremely wide range of phenomena (the same as with all of the other corpora from corpus.byu.edu): words, phrases, grammatical constructions, synonyms, customized lists, and collocates (nearby words, which provide insight into meaning and usage). In addition, many of these searches run 5-6 times as fast as they do with other corpus architectures such as Sketch Engine or CQPweb.
To see a number of examples of what you can do with the corpus, feel free to take a quick five-minute tour.
Source: http://corpus.byu.edu