Visual recognition with humans in the loop

Research output: Contribution to journal › Conference article › Research › peer-review

Steve Branson
Catherine Wah
Florian Schroff
Boris Babenko
Peter Welinder
Pietro Perona
Serge Belongie

We present an interactive, hybrid human-computer method for object classification. The method applies to classes of objects that are recognizable by people with appropriate expertise (e.g., animal species or airplane model), but not (in general) by people without such expertise. It can be seen as a visual version of the 20 questions game, where questions based on simple visual attributes are posed interactively. The goal is to identify the true class while minimizing the number of questions asked, using the visual content of the image. We introduce a general framework for incorporating almost any off-the-shelf multi-class object recognition algorithm into the visual 20 questions game, and provide methodologies to account for imperfect user responses and unreliable computer vision algorithms. We evaluate our methods on Birds-200, a difficult dataset of 200 tightly-related bird species, and on the Animals With Attributes dataset. Our results demonstrate that incorporating user input drives up recognition accuracy to levels that are good enough for practical applications, while at the same time, computer vision reduces the amount of human interaction required.
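The question-selection loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes binary (yes/no) attributes, a single fixed user error rate `noise`, and picks each question by expected information gain, which is one common way to minimise the number of questions. All names (`play`, `vision_prior`, `attrs`, `oracle`) and the toy data are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a discrete distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def answer_likelihood(answer, truth, noise):
    """P(user says `answer` | the class truly has attribute value `truth`)."""
    return 1.0 - noise if answer == truth else noise

def update(posterior, attrs, q, answer, noise):
    """Bayes update of the class posterior after one yes/no answer.
    Assumes noise > 0 so the normaliser cannot be zero."""
    unnorm = [p * answer_likelihood(answer, attrs[c][q], noise)
              for c, p in enumerate(posterior)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

def expected_info_gain(posterior, attrs, q, noise):
    """Expected reduction in class entropy from asking question q."""
    h0 = entropy(posterior)
    gain = 0.0
    for answer in (True, False):
        p_ans = sum(p * answer_likelihood(answer, attrs[c][q], noise)
                    for c, p in enumerate(posterior))
        if p_ans > 0:
            h1 = entropy(update(posterior, attrs, q, answer, noise))
            gain += p_ans * (h0 - h1)
    return gain

def play(vision_prior, attrs, oracle, noise=0.1, max_questions=20, stop_at=0.95):
    """Ask questions until one class is confident enough or questions run out.
    `vision_prior` stands in for the class probabilities from any classifier."""
    posterior = list(vision_prior)
    asked = []
    while len(asked) < max_questions and max(posterior) < stop_at:
        remaining = [q for q in range(len(attrs[0])) if q not in asked]
        if not remaining:
            break
        q = max(remaining,
                key=lambda j: expected_info_gain(posterior, attrs, j, noise))
        posterior = update(posterior, attrs, q, oracle(q), noise)
        asked.append(q)
    best = max(range(len(posterior)), key=posterior.__getitem__)
    return best, asked, posterior

# Toy data (hypothetical): 4 classes described by 2 binary attributes.
attrs = [[False, False], [False, True], [True, False], [True, True]]
truthful_user = lambda q: attrs[2][q]  # a user looking at an image of class 2

# An uncertain classifier needs user answers to settle on a class.
best, asked, posterior = play([0.1, 0.2, 0.3, 0.4], attrs, truthful_user)

# A confident classifier already exceeds the stopping threshold.
best_c, asked_c, _ = play([0.01, 0.01, 0.97, 0.01], attrs, truthful_user)
```

In this toy setup the uncertain prior leads to questions being asked, while the confident prior ends the game with none, mirroring the abstract's point that computer vision reduces the amount of human interaction required.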

Original language	English
Journal	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Issue number	PART 4
Pages (from-to)	438-451
Number of pages	14
ISSN	0302-9743
DOIs	https://doi.org/10.1007/978-3-642-15561-1_32
Publication status	Published - 2010
Externally published	Yes
Event	11th European Conference on Computer Vision, ECCV 2010 - Heraklion, Crete, Greece; Duration: 10 Sep 2010 → 11 Sep 2010

Conference

Conference	11th European Conference on Computer Vision, ECCV 2010
Country	Greece
City	Heraklion, Crete
Period	10/09/2010 → 11/09/2010
Sponsor	DAGM, IBM, NICTA

