Building Sense Representations in Danish by Combining Word Embeddings with Lexical Resources

Publication: Contribution to book/anthology/report, conference contribution in proceedings, research, peer-reviewed


Our aim is to identify suitable sense representations for NLP in Danish. We investigate sense inventories that correlate with human interpretations of word meaning and ambiguity, as typically described in dictionaries and wordnets, and that are well reflected distributionally, as expressed in word embeddings. To this end, we study a number of highly ambiguous Danish nouns and examine the effectiveness of sense representations constructed by combining vectors from a distributional model with the information from a wordnet. We establish representations based on centroids obtained from wordnet synsets and example sentences, as well as representations established via a clustering approach; these representations are tested in a word sense disambiguation task. We conclude that the more information is extracted from the wordnet entries (example sentence, definition, semantic relations), the more successful the sense representation vector is.
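
The centroid idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional vectors, the sense labels, and the word lists below are invented toy data (not taken from the paper or from any wordnet), and the helper names `centroid`, `cosine`, `sense_vector`, and `disambiguate` are hypothetical. The sketch assumes each sense is represented by the mean of the embeddings of the words associated with it, and that a context is assigned the sense whose centroid has the highest cosine similarity to the context centroid.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def sense_vector(words, embeddings):
    """Centroid of the embeddings of the words that have an embedding."""
    known = [embeddings[w] for w in words if w in embeddings]
    return centroid(known) if known else None


def disambiguate(context, sense_words, embeddings):
    """Pick the sense whose centroid is closest to the context centroid."""
    context_vec = sense_vector(context, embeddings)
    if context_vec is None:
        return None
    scores = {}
    for sense, words in sense_words.items():
        vec = sense_vector(words, embeddings)
        if vec is not None:
            scores[sense] = cosine(context_vec, vec)
    return max(scores, key=scores.get) if scores else None


# Toy data, invented for illustration only (assumption, not from the paper).
EMBEDDINGS = {
    "musik": [0.9, 0.1],
    "sang": [0.8, 0.2],
    "album": [0.85, 0.15],
    "metal": [0.1, 0.9],
    "jern": [0.2, 0.8],
    "tynd": [0.15, 0.85],
}

# Hypothetical sense inventory: words standing in for synset members,
# definition words, or example-sentence words of each sense.
SENSE_WORDS = {
    "plade_record": ["musik", "album"],
    "plade_sheet": ["metal", "jern"],
}

if __name__ == "__main__":
    print(disambiguate(["sang", "musik"], SENSE_WORDS, EMBEDDINGS))
    print(disambiguate(["tynd", "jern"], SENSE_WORDS, EMBEDDINGS))
```

In this toy setting, adding more words per sense (for instance from a definition or from related synsets) simply changes which vectors enter the centroid; whether that helps in practice is the question the paper evaluates.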
Title: Globalex Workshop on Linked Lexicography: LREC 2020 Workshop Language Resources and Evaluation Conference
Number of pages: 7
Place of publication: Marseille, France
Publisher: European Language Resources Association
ISBN (Electronic): 979-10-95546-46-7
Status: Published - 2020


ID: 241359613