School of Applied Language and Intercultural Studies
Dublin City University, Dublin, Ireland
sylvie.thouesny2@mail.dcu.ie, francoise.blin@dcu.ie
Abstract
Learner models enable intelligent tutoring systems to observe, record and analyse language learner input. A diagnostic element provides the system with information about the learner’s knowledge state, i.e. the individual’s weaknesses and strengths in the target language. Information on learners’ progress is normally collected via answers to predefined written production types. However, storing information on language learners’ knowledge level from free written productions is more complex, as this requires instruments that can discriminate between errors and mistakes, where errors represent gaps in a learner’s knowledge and mistakes represent occasional lapses in performance.
Following an overview of instruments normally used to “measure” language learners’ weaknesses and strengths, this paper argues that identifying correct as well as incorrect forms provides a better insight into the language learner’s knowledge state at any given time. It describes how a computer-assisted error encoding program, Markin, can be used in conjunction with a probabilistic part-of-speech tagging tool, TreeTagger, to tag a corpus of free texts produced by language learners. It explains how a detailed analysis of such tagged corpora can assist in discriminating between errors and mistakes. The results of a preliminary analysis focusing on morpho-syntactic errors produced by learners of French in a range of free texts are then presented, and the reliability and validity of the instruments used are discussed.