Surf these sites:
British National Corpus -- A balanced synchronic text corpus containing 100 million words with morphosyntactic annotation.
EuroWordNet: Building a multilingual database with wordnets for several European languages -- A large lexical database/thesaurus covering 8 languages. Data samples are free to download.
Linguistic Data Consortium -- An open consortium that creates, collects, and distributes speech and text databases, lexicons, and other resources for research and development purposes.
Roget's Thesaurus -- Project Gutenberg's interface to Roget's Thesaurus, an English thesaurus containing around 35,000 lexemes.