Corpora as Digital Humanities Tools for Learning Foreign Languages

Presented by: Iryna Dilay

One of the major advantages of a language corpus as a digital humanities tool is that it stores naturally occurring language, preprocessed for later applications. Sinclair’s claim that “one does not study all of botany by making artificial flowers” (1991, 6) has come to epitomize corpus methodology as a purely empirical enterprise. Corpus resources are particularly beneficial for those who rely on descriptive rather than prescriptive approaches to language.

The current research focuses on the efficiency of electronic corpora for learning foreign languages. The essence of the methodology lies in learners building their own rules and inferences from vast empirical evidence. A number of user-friendly corpus tools, such as concordancers, taggers, parsers and frequency lists, can facilitate, accelerate and validate information searches in a learner-centered, data-driven language classroom as well as in self-study. Annotated language corpora contain valuable samples of both written and spoken language, together with phonetic, semantic, morphological, syntactic and pragmatic information. Among the most problematic language issues that can be tackled with the help of corpora are synonyms, polysemy, collocations, semantic prosody, prepositions, word order, register and genre peculiarities, and language variation. Openly available e-corpora of attested language are especially beneficial for foreign language learners who lack direct exposure to the natural language they study.
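
As an illustration of what a concordancer and a frequency count offer a learner, the following is a minimal sketch in Python. It is not any particular corpus tool: the function names (tokenize, concordance, right_collocates) and the three-sentence sample are invented for this example, and a real corpus would be far larger and usually annotated.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def concordance(tokens, keyword, window=3):
    """Return keyword-in-context lines with `window` tokens on each side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


def right_collocates(tokens, keyword):
    """Count the words that immediately follow the keyword."""
    return Counter(
        tokens[i + 1] for i, tok in enumerate(tokens[:-1]) if tok == keyword
    )


# Hypothetical mini-corpus, made up for illustration only.
sample = (
    "She made a decision quickly. He made a mistake. "
    "They did the homework and made an effort."
)
tokens = tokenize(sample)

for line in concordance(tokens, "made", window=2):
    print(line)
print(right_collocates(tokens, "made").most_common())
```

Reading the aligned context lines, a learner can infer for themselves that "made" combines with "decision", "mistake" and "effort", which is the kind of self-built rule about collocations that the data-driven approach relies on.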

Overall, corpus-driven language learning proves to be a dynamic and largely cognitive method that can significantly enhance learners’ motivation. Nonetheless, the presentation will also address the limitations of using corpora in language learning and the potential pitfalls of their uncritical use.
