Fixed versus dynamic co-occurrence windows in TextRank term weights for information retrieval

Research output: Chapter in Book/Report/Conference proceeding; Article in proceedings; Research; peer-review

TextRank is a variant of PageRank typically applied to graphs that represent documents, where vertices denote terms and edges denote relations between terms. Quite often the relation between terms is simple term co-occurrence within a fixed window of k terms. Applied iteratively, TextRank outputs a score for each vertex, i.e. a term weight, which can be used for information retrieval (IR) just like conventional term-frequency-based term weights.
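As an illustration of the fixed-window setup described above, the following is a minimal sketch and not the authors' implementation. It assumes an undirected, unweighted co-occurrence graph, a damping factor of 0.85, and a fixed number of iterations; the function names `cooccurrence_graph` and `textrank` are hypothetical.

```python
from collections import defaultdict


def cooccurrence_graph(terms, k=3):
    """Link terms that co-occur within a fixed window of k terms."""
    graph = defaultdict(set)
    for i, term in enumerate(terms):
        # The window holds term i plus the k - 1 terms that follow it.
        for other in terms[i + 1 : i + k]:
            if other != term:
                graph[term].add(other)
                graph[other].add(term)
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Iterate the PageRank-style update; returns one weight per term."""
    scores = {v: 1.0 for v in graph}
    for _ in range(iterations):
        # Each vertex receives an equal share of every neighbour's score.
        scores = {
            v: (1 - damping)
            + damping * sum(scores[u] / len(graph[u]) for u in graph[v])
            for v in graph
        }
    return scores


terms = "the quick brown fox jumps over the lazy dog".split()
weights = textrank(cooccurrence_graph(terms, k=3))
```

The resulting `weights` dictionary maps each distinct term to its score, which could then stand in for a term-frequency weight in a ranking function.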

So far, when computing TextRank term weights over co-occurrence graphs, the window of term co-occurrence has always been fixed. This work departs from that practice and considers dynamically adjusted windows of term co-occurrence that follow the document structure at the sentence and paragraph level. The resulting TextRank term weights are used in a ranking function that re-ranks 1000 initially returned search results in order to improve the precision of the ranking. Experiments with two IR collections show that adjusting the vicinity of term co-occurrence when computing TextRank term weights can lead to gains in early precision.
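To make the contrast between the two window types concrete, here is a small sketch that builds co-occurrence edges in both ways. It is an assumption about how a sentence-level window might be realised, not the construction used in the paper: the regular-expression sentence splitter is deliberately naive, and the function names `sentence_window_edges` and `fixed_window_edges` are hypothetical.

```python
import re
from itertools import combinations


def sentence_window_edges(text):
    """Dynamic window: link every pair of distinct terms sharing a sentence."""
    edges = set()
    for sentence in re.split(r"[.!?]+", text):
        terms = set(re.findall(r"[a-z]+", sentence.lower()))
        edges.update(frozenset(pair) for pair in combinations(sorted(terms), 2))
    return edges


def fixed_window_edges(text, k=3):
    """Fixed window: link terms within k terms, ignoring sentence boundaries."""
    terms = re.findall(r"[a-z]+", text.lower())
    edges = set()
    for i, term in enumerate(terms):
        for other in terms[i + 1 : i + k]:
            if other != term:
                edges.add(frozenset((term, other)))
    return edges


text = "Cats chase mice. Dogs bark."
dynamic = sentence_window_edges(text)
fixed = fixed_window_edges(text, k=3)
```

In this toy text the fixed window links "mice" and "dogs" across the sentence boundary, while the sentence-level window does not. A paragraph-level variant would split on blank lines instead of sentence punctuation.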
Original language: English
Title of host publication: Proceedings of the 35th international ACM SIGIR Conference on Research and Development in Information Retrieval
Number of pages: 2
Publisher: Association for Computing Machinery
Publication date: 2012
Pages: 1079-1080
ISBN (Electronic): 978-1-4503-1472-5
Publication status: Published - 2012
Event: 35th International ACM SIGIR Conference on Research and Development in Information Retrieval - Oregon, United States
Duration: 12 Aug 2012 - 16 Aug 2012
Conference number: 35

Conference

Conference: 35th International ACM SIGIR Conference on Research and Development in Information Retrieval
Number: 35
Country: United States
City: Oregon
Period: 12/08/2012 - 16/08/2012

ID: 38239990