BLOG POST III – CLUVI

The CLUVI (Linguistic Corpus of the University of Vigo) is an open collection of parallel text corpora covering specialized registers of the contemporary Galician language. It is developed by the SLI (Computational Linguistics Group of the University of Vigo) and has been publicly available on the group's website since September 2003. The CLUVI Corpus contains over 22 million words. Its main components are the TECTRA Corpus of English-Galician literary texts, the FEGA Corpus of French-Galician literary texts, the LEGA Corpus of Galician-Spanish legal texts, the UNESCO Corpus of English-Galician-French-Spanish scientific and technical popularization texts, the LOGALIZA Corpus of English-Galician software localization, and the CONSUMER Corpus of Spanish-Galician-Catalan-Basque consumer information. The public search and browsing tool designed by the SLI is available at http://sli.uvigo.es/CLUVI/.

This web application supports both simple and very complex searches for single words or sequences of words, and shows the multilingual equivalents of the terms in context, as found in real, referenced translations. A search term can belong to any of the languages of a translation, but it is also possible to carry out true multilingual searches, that is, to search simultaneously for one term in each of the languages of the translation. The number of aligned works and language pairs available on the website increases regularly, since the CLUVI is an ongoing and very active academic research project. At the moment, the CLUVI Parallel Corpus webpage allows searching five major corpora (TECTRA, FEGA, LEGA, UNESCO and LOGALIZA), as well as other minor parallel corpora still in progress. The CLUVI interface also allows browsing the TURIGAL Corpus of Portuguese-English tourism texts and the Legebiduna Corpus of Basque-Spanish administrative texts developed by the DEL group at the University of Deusto.
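
To make the idea of a "true multilingual search" more concrete, here is a minimal Python sketch of how such a query could work over aligned translation units. It is only an illustration under stated assumptions, not the CLUVI implementation: the `TranslationUnit` class, the `search` function, the language codes, and the three toy sentences are all invented for this example and are not taken from the corpus.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TranslationUnit:
    """One aligned segment: the same passage in several languages (hypothetical model)."""
    source: str               # label of the work the segment comes from
    segments: Dict[str, str]  # language code -> text of the segment


def search(units: List[TranslationUnit], terms: Dict[str, str]) -> List[TranslationUnit]:
    """Return the units in which EVERY (language, term) pair matches.

    One entry in `terms` gives a monolingual search; several entries give a
    multilingual search that must match in all the given languages at once.
    Matching is case-insensitive and on whole words or word sequences.
    """
    hits = []
    for unit in units:
        if all(
            re.search(r"\b" + re.escape(term) + r"\b",
                      unit.segments.get(lang, ""), re.IGNORECASE)
            for lang, term in terms.items()
        ):
            hits.append(unit)
    return hits


# Invented toy data, for illustration only (not real CLUVI content).
CORPUS = [
    TranslationUnit("toy-1", {"en": "The house is near the sea.",
                              "gl": "A casa está preto do mar."}),
    TranslationUnit("toy-2", {"en": "He went home late.",
                              "gl": "Foi para a casa tarde."}),
    TranslationUnit("toy-3", {"en": "The house was empty.",
                              "gl": "O fogar estaba baleiro."}),
]

if __name__ == "__main__":
    # Monolingual search: every unit whose English side contains "house".
    for unit in search(CORPUS, {"en": "house"}):
        print(unit.source, unit.segments)
    # Multilingual search: "house" in English AND "casa" in Galician.
    for unit in search(CORPUS, {"en": "house", "gl": "casa"}):
        print(unit.source, unit.segments)
```

In this sketch, adding a second language to the query narrows the results to the segments where a given translation choice was actually made, which is the kind of lookup that makes a parallel corpus useful to translators.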

