Publication type | Articles in foreign journals and collections |
Year of publication | 2017 |
Language | English |
|
Гатауллин Рамиль Раисович, author
Гильмуллин Ринат Абрекович, author
Сулейманов Джавдет Шевкетович, author
Хакимов Булат Эрнстович, author
|
Bibliographic description in the original language |
Gataullin R, Khakimov B, Suleymanov D, Context-Based Rules for Grammatical Disambiguation in the Tatar Language//Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - 2017. - Vol.10449 LNAI, Is.. - P.529-537. |
Abstract |
The paper is dedicated to the problem of grammatical ambiguity in the Tatar National Corpus and describes the methodology and software used for automation of the disambiguation process. Grammatical ambiguity is widely represented in agglutinative languages like Turkic or Finno-Ugric. Disambiguation in the corpus is based on the context-oriented classification of ambiguity types which has been carried out on corpus data in the Tatar language for the first time. In this study the corpus is used as a source for the research and at the same time as a destination for implementing the results. The grammatical ambiguity types are detected automatically using the finite-state morphological analyzer and then classified. In order to build up the grammatically disambiguated subcorpus, a special software module was developed. It searches for ambiguous tokens in the corpus, collects statistical information and allows creating and implementing the formal context-based disambiguation rules. |
Keywords |
Disambiguation, Grammatical Homonymy, Context-based Rules, Linguistic Software, Turkic Languages, Corpus Linguistics |
Journal title |
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
|
URL |
https://www.scopus.com/inward/record.uri?eid=2-s2.0-85030853927&doi=10.1007%2f978-3-319-67077-5_51&partnerID=40&md5=7b27e8905839820c051a4e511460c4a3 |
Please use this identifier to cite or link to this record |
https://repository.kpfu.ru/?p_id=166145 |
Full metadata record |
DC field | Value | Language |
dc.contributor.author | Гатауллин Рамиль Раисович | ru_RU |
dc.contributor.author | Гильмуллин Ринат Абрекович | ru_RU |
dc.contributor.author | Сулейманов Джавдет Шевкетович | ru_RU |
dc.contributor.author | Хакимов Булат Эрнстович | ru_RU |
dc.date.accessioned | 2017-01-01T00:00:00Z | ru_RU |
dc.date.available | 2017-01-01T00:00:00Z | ru_RU |
dc.date.issued | 2017 | ru_RU |
dc.identifier.citation | Gataullin R, Khakimov B, Suleymanov D, Context-Based Rules for Grammatical Disambiguation in the Tatar Language//Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). - 2017. - Vol.10449 LNAI, Is.. - P.529-537. | ru_RU |
dc.identifier.uri | https://repository.kpfu.ru/?p_id=166145 | ru_RU |
dc.description.abstract | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | ru_RU |
dc.description.abstract | The paper is dedicated to the problem of grammatical ambiguity in the Tatar National Corpus and describes the methodology and software used for automation of the disambiguation process. Grammatical ambiguity is widely represented in agglutinative languages like Turkic or Finno-Ugric. Disambiguation in the corpus is based on the context-oriented classification of ambiguity types which has been carried out on corpus data in the Tatar language for the first time. In this study the corpus is used as a source for the research and at the same time as a destination for implementing the results. The grammatical ambiguity types are detected automatically using the finite-state morphological analyzer and then classified. In order to build up the grammatically disambiguated subcorpus, a special software module was developed. It searches for ambiguous tokens in the corpus, collects statistical information and allows creating and implementing the formal context-based disambiguation rules. | ru_RU |
dc.language.iso | ru | ru_RU |
dc.subject | Disambiguation | ru_RU |
dc.subject | Grammatical Homonymy | ru_RU |
dc.subject | Context-based Rules | ru_RU |
dc.subject | Linguistic Software | ru_RU |
dc.subject | Turkic Languages | ru_RU |
dc.subject | Corpus Linguistics | ru_RU |
dc.title | Context-Based Rules for Grammatical Disambiguation in the Tatar Language | ru_RU |
dc.type | Articles in foreign journals and collections | ru_RU |
|