An Unsupervised Method for Detecting Grammatical Errors

We present an unsupervised method for detecting grammatical errors by inferring negative evidence from edited textual corpora. The system was developed and tested using essay-length responses to prompts on the Test of English as a Foreign Language (TOEFL). The error-recognition system, ALEK, performs with about 80% precision and 20% recall. We attempt to identify errors on the basis of context, more specifically a two-word window around the word of interest, from which we consider function words and POS tags. We use a mutual information measure in addition to the raw frequency of n-grams. The grammar feature covers errors such as sentence fragments, verb form errors, and pronoun errors. We utilize mutual information and chi-square statistics to identify typical contexts for a small set of targeted words from a large well-formed corpus.
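The idea of inferring negative evidence from n-gram statistics can be illustrated with a minimal sketch. It is not ALEK's implementation: it assumes sentences are already reduced to sequences of POS tags and function words, scores only adjacent bigrams with pointwise mutual information, and uses hypothetical values for `smoothing`, `threshold`, and `min_count`. The function names and the toy corpus are likewise invented for illustration.

```python
import math
from collections import Counter


def count_ngrams(sentences):
    """Count unigrams and adjacent bigrams over tokenised sentences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def pmi(pair, unigrams, bigrams, smoothing=0.5):
    """Pointwise mutual information of a bigram, in bits.

    Compares the observed bigram probability with the probability
    expected if the two tokens were independent. `smoothing` is an
    assumed constant that keeps unseen bigrams from producing log(0).
    """
    x, y = pair
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    if unigrams[x] == 0 or unigrams[y] == 0 or n_bi == 0:
        return None  # no evidence about at least one token
    p_xy = (bigrams[pair] + smoothing) / n_bi
    p_x = unigrams[x] / n_uni
    p_y = unigrams[y] / n_uni
    return math.log2(p_xy / (p_x * p_y))


def flag_unlikely(tokens, unigrams, bigrams, threshold=-2.0, min_count=5):
    """Return bigrams that occur far less often than chance predicts.

    Both tokens must be frequent in the well-formed corpus (min_count),
    so that a missing bigram counts as negative evidence rather than
    as sparse data. Threshold and min_count are illustrative values.
    """
    flagged = []
    for pair in zip(tokens, tokens[1:]):
        if min(unigrams[pair[0]], unigrams[pair[1]]) < min_count:
            continue
        score = pmi(pair, unigrams, bigrams)
        if score is not None and score < threshold:
            flagged.append(pair)
    return flagged


# Toy "well-formed" corpus: POS tags plus the function word "a".
corpus = [
    ["PRP", "VBZ", "a", "NN"],
    ["PRP", "VBZ", "a", "JJ", "NN"],
    ["a", "NN", "VBZ", "JJ"],
    ["PRP", "VBD", "a", "NN"],
    ["a", "JJ", "NN", "VBD", "a", "NN"],
] * 20

unigrams, bigrams = count_ngrams(corpus)
print(flag_unlikely(["PRP", "VBZ", "a", "VBZ"], unigrams, bigrams))
```

On this toy corpus the only flagged bigram is `("a", "VBZ")`: both tokens are common, yet the determiner is never followed directly by a verb, so the sequence scores well below what independence would predict. A fuller treatment would also use a chi-square test and a wider context window, as the abstract describes.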