In the process of learning a new language, it is very important that learners start to observe the language themselves and discover the regularities that exist in it since textbooks and grammars often lack authentic material. Although there is a lot of information on the internet, much of it is too complicated for language learners. In this respect, students and teachers can benefit from the newly developed pedagogic corpus and learner corpus (representing different levels of language competence), which can be searched independently. Learners of Lithuanian will also find a lot of useful information in the corpus-based lexical database and other tools.

The website provides six digital resources for language learning. 

  1. The Pedagogic Corpus of Lithuanian is an electronic database of texts representing levels A1, A2, B1, and B2; it includes texts which are relevant to language learners and can be understood by them. The corpus can be useful for both Lithuanian language learners and teachers. By examining a search word in the corpus, users will be able to understand how this word is used in current Lithuanian and what it means. The authentic examples of language use obtained from the corpus can be used when preparing teaching materials targeting different learner competencies in the areas of vocabulary and grammar.
  2. The Lithuanian Learner Corpus consists of four components, each representing a different language level (A1, A2, B1, and B2). Such a corpus is especially important for language teachers since it provides evidence as to which categories are more difficult or easier to master for learners with different L1 backgrounds. The data (including concordance lines, frequency information, and frequency lists) can be used for research and teaching purposes. The corpus can be browsed online, and some data can be downloaded (e.g. concordance lines and frequency lists).
  3. The Morphologically Annotated Corpus. Lithuanian is a highly morphologically complex language, and learners often encounter problems with inflectional forms. It is therefore important to have access to a large body of usage examples of a given grammatical category in one place. This morphologically annotated corpus can be searched by grammatical category (e.g. gender, case, tense, or person). Users can also search for combinations of grammatical features (e.g. a specific adjective within two words of a noun), and the search results can be downloaded. This corpus will help users to better understand the grammatical system of Lithuanian.
  4. The Lexical Database of Lithuanian Language Usage is an electronic resource developed on the basis of the written data of the Pedagogic Corpus. The lexical database contains 3,700 lexical items: the 700 most frequent words in the Pedagogic Corpus as well as derivatives and multi-word units related to these words. The lexical and grammatical regularities of the most frequent words are represented in usage patterns and examples, whereas the use of the associated derivatives and fixed expressions is illustrated by examples only. In this database, language learners, teachers, and developers of teaching materials will find information on word pronunciation, stress, and inflectional forms, as well as derivational and lexical interrelations between different items. Usage patterns and examples will be useful in developing learners’ productive skills.
  5. The pronunciation resources consist of four components: (1) the Pronunciation Dictionary of Contemporary Lithuanian, which contains around 150,000 accentuated and transcribed headwords; (2) the Pronunciation Dictionary for Non-native Speakers, which includes almost 6,000 of the most frequent words in the Pedagogic Corpus; (3) the transcriber, which automatically transcribes words, word combinations, or entire clauses; and (4) the sound inventory, where users can listen to the actual pronunciation of some words. All examples are transcribed in the characters of the International Phonetic Alphabet and presented in the Palemonas computer font.
  6. The Automated Accentuation Programme can be used to accentuate words in a coherent text. This is especially important for language teachers preparing reading or other teaching tasks for classroom activities. The tool can also help language learners to read texts correctly or to check word stress. To meet the needs of Lithuanian language teachers, stressed forms of homonymous words (i.e. words which are written identically but pronounced differently) are provided. The programme includes almost all the stress variants recommended by the State Commission of the Lithuanian Language and those represented in the Dictionary of Current Lithuanian.
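
For readers unfamiliar with the terms "concordance lines" and "frequency lists" used above, the following minimal Python sketch shows what these two kinds of corpus output look like. It is only an illustration and not the portal's own implementation: the function names (`tokenize`, `frequency_list`, `concordance`) are hypothetical, and the three sample sentences are invented for this example rather than taken from any of the corpora.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (Unicode letters only)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def frequency_list(tokens, top=5):
    """Return the `top` most frequent tokens with their counts."""
    return Counter(tokens).most_common(top)


def concordance(tokens, keyword, window=2):
    """Return (left context, keyword, right context) for each occurrence."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


# Invented sample text (an assumption for illustration, not corpus data).
sample = (
    "Aš mokausi lietuvių kalbos. Lietuvių kalba yra graži. "
    "Mano draugas taip pat mokosi lietuvių kalbos."
)
tokens = tokenize(sample)

# Frequency list: the most common word forms in the sample.
for word, count in frequency_list(tokens, top=2):
    print(word, count)

# Concordance lines: every occurrence of a search word in its context.
for left, key, right in concordance(tokens, "lietuvių"):
    print(f"{left:>20} [{key}] {right}")
```

Note that this sketch counts word forms as they appear in the text, so `kalba` and `kalbos` are treated as different items; grouping inflected forms under one headword would require the kind of morphological annotation described in item 3.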

This portal was developed within the framework of the project “Lithuanian Academic Scheme for International Cooperation in Baltic Studies” (No. 09.3.1-ESFA-V-709-01-0002).