Retrieving radio news broadcasts in Danish: accuracy and categorization of unrecognized words
Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed
Standard
Retrieving radio news broadcasts in Danish: accuracy and categorization of unrecognized words. / Hertzum, Morten; Lund, Haakon; Troelsgård, Rasmus.
OzCHI'16: The 28th Australian Conference on Computer-Human Interaction. New York: ACM, 2016. pp. 160-164.
RIS
TY - GEN
T1 - Retrieving radio news broadcasts in Danish
T2 - Australian Conference on Human-Computer Interaction
AU - Hertzum, Morten
AU - Lund, Haakon
AU - Troelsgård, Rasmus
N1 - Conference code: 28
PY - 2016
Y1 - 2016
N2 - Digital archives of radio news broadcasts can possibly be made searchable by combining speech recognition with information retrieval. We explore this possibility for the retrieval of news broadcasts in Danish. An average of 84% of the words in the broadcasts was recognized. Most of the unrecognized words were compounds, names, and other words that appear of value to retrieval. Thus, the set of words describing a broadcast has to be expanded to compensate for the recognition errors. We discuss doing this by exploiting the alternative matches from the speech recognizer and by extracting words from a related corpus.
AB - Digital archives of radio news broadcasts can possibly be made searchable by combining speech recognition with information retrieval. We explore this possibility for the retrieval of news broadcasts in Danish. An average of 84% of the words in the broadcasts was recognized. Most of the unrecognized words were compounds, names, and other words that appear of value to retrieval. Thus, the set of words describing a broadcast has to be expanded to compensate for the recognition errors. We discuss doing this by exploiting the alternative matches from the speech recognizer and by extracting words from a related corpus.
U2 - 10.1145/3010915.3010972
DO - 10.1145/3010915.3010972
M3 - Article in proceedings
SP - 160
EP - 164
BT - OzCHI'16
PB - ACM
CY - New York
Y2 - 29 November 2016 through 2 December 2016
ER -
ID: 168296307