Learning deep representations for ground-to-aerial geolocalization
Research output: Contribution to journal › Conference article › Research › peer-review
Standard
Learning deep representations for ground-to-aerial geolocalization. / Lin, Tsung-Yi; Cui, Yin; Belongie, Serge; Hays, James.
In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 14.10.2015, p. 5007-5015.
RIS
TY - GEN
T1 - Learning deep representations for ground-to-aerial geolocalization
AU - Lin, Tsung-Yi
AU - Cui, Yin
AU - Belongie, Serge
AU - Hays, James
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/10/14
Y1 - 2015/10/14
N2 - The recent availability of geo-tagged images and rich geospatial data has inspired a number of algorithms for image based geolocalization. Most approaches predict the location of a query image by matching to ground-level images with known locations (e.g., street-view data). However, most of the Earth does not have ground-level reference photos available. Fortunately, more complete coverage is provided by oblique aerial or 'bird's eye' imagery. In this work, we localize a ground-level query image by matching it to a reference database of aerial imagery. We use publicly available data to build a dataset of 78K aligned cross-view image pairs. The primary challenge for this task is that traditional computer vision approaches cannot handle the wide baseline and appearance variation of these cross-view pairs. We use our dataset to learn a feature representation in which matching views are near one another and mismatched views are far apart. Our proposed approach, Where-CNN, is inspired by deep learning success in face verification and achieves significant improvements over traditional hand-crafted features and existing deep features learned from other large-scale databases. We show the effectiveness of Where-CNN in finding matches between street view and aerial view imagery and demonstrate the ability of our learned features to generalize to novel locations.
UR - http://www.scopus.com/inward/record.url?scp=84959245070&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2015.7299135
DO - 10.1109/CVPR.2015.7299135
M3 - Conference article
AN - SCOPUS:84959245070
SP - 5007
EP - 5015
JO - IEEE Conference on Computer Vision and Pattern Recognition. Proceedings
JF - IEEE Conference on Computer Vision and Pattern Recognition. Proceedings
SN - 1063-6919
T2 - IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2015
Y2 - 7 June 2015 through 12 June 2015
ER -
ID: 301829041
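
The abstract describes learning a feature representation in which matched ground-level and aerial views lie close together and mismatched views lie far apart, in the spirit of deep models used for face verification. The following is a minimal illustrative sketch of that pairwise contrastive-embedding idea, not the authors' released code; the encoder architecture, embedding dimension, and margin below are placeholder assumptions.

# Sketch of a cross-view contrastive embedding (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ViewEncoder(nn.Module):
    """Small CNN mapping an image to an L2-normalized embedding."""
    def __init__(self, dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, dim)

    def forward(self, x):
        h = self.features(x).flatten(1)
        return F.normalize(self.fc(h), dim=1)

def contrastive_loss(z_ground, z_aerial, same_place, margin=1.0):
    """Pull matched cross-view pairs together, push mismatched pairs apart."""
    d = F.pairwise_distance(z_ground, z_aerial)
    pos = same_place * d.pow(2)                       # matched pairs: small distance
    neg = (1 - same_place) * F.relu(margin - d).pow(2)  # mismatched pairs: at least `margin` apart
    return (pos + neg).mean()

if __name__ == "__main__":
    enc_ground, enc_aerial = ViewEncoder(), ViewEncoder()  # one encoder per view
    ground = torch.randn(8, 3, 64, 64)
    aerial = torch.randn(8, 3, 64, 64)
    labels = torch.randint(0, 2, (8,)).float()              # 1 = same location, 0 = different
    loss = contrastive_loss(enc_ground(ground), enc_aerial(aerial), labels)
    loss.backward()
    print(loss.item())

At retrieval time, the query ground-level embedding would be compared against precomputed aerial embeddings by nearest-neighbor search in this learned space.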