Sketch Engine - English corpora available online

Apr 24, 2011 22:56
I've just found a great web site called "Sketch Engine."
http://the.sketchengine.co.uk/open/

This site is an interface to a great collection of enormous databases (corpora) of real English usage.
It seems that you need to pay to get full access to the corpora (58.60 euro per year), but you can access four open corpora, which are free to use. This site may be intended for professional linguists, and I will probably need a bit of practice to get used to it. But I think it will be useful for the English writing I do in my job.

So what's great about this?

It provides many kinds of data, such as concordances (the left picture), collocations, and frequency lists for each word or phrase, but I found the "Word Sketch" function especially cool.

The Word Sketch results for "information" (the right picture) show lists of words classified by part of speech.
http://the.sketchengine.co.uk/open/corpus/aclarc/ske/wsketch?lemma=information&lpos=-n&minfreq=auto&minscore=0.0&maxitems=25&sort_ws_columns=s&clustercolls=0&minsim=0.15
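
The link above can probably be reused for other words by changing the "lemma" value. Here is a minimal Python sketch that rebuilds the same URL from its parts. The function name `word_sketch_url` is my own, and the comments on what each parameter does are only guesses from the parameter names, not something taken from the Sketch Engine documentation.

```python
from urllib.parse import urlencode

# Base address copied from the Word Sketch link for the ACL Anthology
# Reference Corpus ("aclarc") on the open Sketch Engine site.
BASE = "http://the.sketchengine.co.uk/open/corpus/aclarc/ske/wsketch"


def word_sketch_url(lemma, lpos="-n"):
    """Build a Word Sketch URL for one word (hypothetical helper).

    All parameter values are copied from the original link; the meanings
    in the comments are assumptions based on the names only.
    """
    params = {
        "lemma": lemma,          # the word to look up
        "lpos": lpos,            # "-n" appears to select nouns (assumption)
        "minfreq": "auto",       # presumably a minimum frequency threshold
        "minscore": "0.0",       # presumably a minimum score threshold
        "maxitems": "25",        # presumably the maximum items per list
        "sort_ws_columns": "s",  # presumably "sort by score"
        "clustercolls": "0",
        "minsim": "0.15",
    }
    return BASE + "?" + urlencode(params)


if __name__ == "__main__":
    print(word_sketch_url("information"))
```

Calling `word_sketch_url("information")` gives back exactly the link shown above, so swapping in another noun should open the Word Sketch for that word, as long as the site still accepts these parameters.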

If you look at the "adj_subject_of" section, you can see that the word "available" has the highest score. This means that "available" is the adjective most strongly associated with "information" in this relation, where the noun is the subject of the adjective, as in "information is available".


The "Word Sketch" function of Sketch Engine (for the ACL Anthology Reference Corpus) is now available from the Firefox search box (and also in Internet Explorer and Google Chrome).
Follow one of the links below and click the "OpenSearch プラグイン Word Sketch" ("OpenSearch plugin Word Sketch") button with the Firefox icon (or the icon for your browser).


(Japanese version)
http://p.tl/RBvh


(English version)
http://p.tl/cyj2