Kazan (Volga Region) Federal University, KFU
COMPUTING SYNTACTIC PARAMETERS FOR AUTOMATED TEXT COMPLEXITY ASSESSMENT
Publication type: Articles in international journals and collections
Year of publication: 2019
Language: English
  • Solnyshkina Marina Ivanovna, author
  • Solovyev Valery Dmitrievich, author
  • Bibliographic description in the original language: Solovyev V, Solnyshkina M, Ivanov V, Computing syntactic parameters for automated text complexity assessment // CEUR Workshop Proceedings. - 2019. - Vol.2475, Is.. - P.62-71.
    Abstract: The article focuses on identifying, extracting, and evaluating syntactic parameters that influence the complexity of Russian academic texts. Our ultimate goal is to select a set of text features that effectively measure text complexity and to build an automatic tool able to rank Russian academic texts according to grade levels. We build models based on the most promising features using machine learning methods. The innovative algorithm for designing a predictive model of text complexity is based on a training text corpus and a set of previously proposed and new syntactic features (average sentence length, average number of syllables per word, the number of adjectives, average number of participial constructions, average number of coordinating chains, and path number, i.e. the average number of sub-trees). Our best model achieves an MSE of 1.15. Our experiments indicate that adding the abovementioned syntactic features, namely the average number of participial constructions, the average number of coordinating chains, and the average number of sub-trees, substantially improves the performance of the text complexity model.
    Keywords: reading comprehension
    Journal title: CEUR Workshop Proceedings
    URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85074072027&partnerID=40&md5=b6a5b738cb7fe812903fb259b711dbe4
    Please use this identifier to cite or link to this record: https://repository.kpfu.ru/?p_id=214668
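
    Illustrative sketch: the abstract names two surface features (average sentence length and average number of syllables per word) and reports model quality as MSE. The Python below shows one way such values might be computed. It is not the authors' implementation. The sentence splitter, the vowel-counting syllable heuristic, and all function names are assumptions made for illustration, and the syntactic features (participial constructions, coordinating chains, sub-trees) would need a dependency parser that is not shown here.

```python
import re

# Assumption: one Russian syllable per vowel letter (a common heuristic).
RUSSIAN_VOWELS = set("аеёиоуыэюя")


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize_words(sentence):
    """Return runs of Cyrillic letters; digits and punctuation are ignored."""
    return re.findall(r"[а-яёА-ЯЁ]+", sentence)


def count_syllables(word):
    """Count vowel letters as a proxy for syllables."""
    return sum(1 for ch in word.lower() if ch in RUSSIAN_VOWELS)


def surface_features(text):
    """Compute average sentence length (in words) and syllables per word."""
    sentences = split_sentences(text)
    all_words = [w for s in sentences for w in tokenize_words(s)]
    if not sentences or not all_words:
        return {"avg_sentence_length": 0.0, "avg_syllables_per_word": 0.0}
    total_syllables = sum(count_syllables(w) for w in all_words)
    return {
        "avg_sentence_length": len(all_words) / len(sentences),
        "avg_syllables_per_word": total_syllables / len(all_words),
    }


def mean_squared_error(y_true, y_pred):
    """MSE between true and predicted grade levels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

    In a setup like the one the abstract describes, feature values of this kind would be computed per text and passed to a regression model that predicts grade level, with MSE measuring the gap between predicted and actual grades.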
