An algorithm competition for automatic species identification from herbarium specimens
Research output: Contribution to journal › Journal article › Research › peer-review
Premise: Plant biodiversity is threatened, yet many species remain undescribed. It is estimated that >50% of undescribed species have already been collected and are awaiting discovery in herbaria. Robust automatic species identification algorithms using machine learning could accelerate species discovery.

Methods: To encourage the development of an automatic species identification algorithm, we submitted our Herbarium 2019 data set to the Fine-Grained Visual Categorization sub-competition (FGVC6) hosted on the Kaggle platform. We chose to focus on the flowering plant family Melastomataceae because we have a large collection of imaged herbarium specimens (46,469 specimens representing 683 species) and taxonomic expertise in the family. As is common for herbarium collections, some species in this data set are represented by few specimens and others by many.

Results: In less than three months, the FGVC6 Herbarium 2019 Challenge drew 22 teams who entered 254 models for Melastomataceae species identification. The four best algorithms identified species with >88% accuracy.

Discussion: The FGVC competitions provide a unique opportunity for computer vision and machine learning experts to address difficult species-recognition problems. The Herbarium 2019 Challenge brought together a novel combination of collections resources, taxonomic expertise, and collaboration between botanists and computer scientists.
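The abstract reports accuracy on a data set in which some species have few specimens and others many. As an illustration only, the sketch below shows one way such results could be summarized; it is not the competition's scoring code, and the abstract does not say exactly how accuracy was computed. The function names and the toy labels are hypothetical. The sketch compares overall accuracy (every specimen counts equally) with the mean of per-species accuracies (every species counts equally), which can differ noticeably when classes are imbalanced.

```python
from collections import Counter, defaultdict


def overall_accuracy(y_true, y_pred):
    """Fraction of specimens whose predicted species matches the label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("no specimens to score")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_species_accuracy(y_true, y_pred):
    """Accuracy computed separately for each true species."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    totals = Counter(y_true)      # specimens per species
    hits = defaultdict(int)       # correctly identified specimens per species
    for t, p in zip(y_true, y_pred):
        if t == p:
            hits[t] += 1
    return {species: hits[species] / n for species, n in totals.items()}


def mean_per_species_accuracy(y_true, y_pred):
    """Average of per-species accuracies, so rare species weigh as much as common ones."""
    scores = per_species_accuracy(y_true, y_pred)
    if not scores:
        raise ValueError("no specimens to score")
    return sum(scores.values()) / len(scores)


# Toy, made-up labels (hypothetical; not taken from the Herbarium 2019 data set):
# one common species with 8 specimens, one rare species with 2.
y_true = ["common_sp"] * 8 + ["rare_sp"] * 2
y_pred = ["common_sp"] * 8 + ["common_sp", "rare_sp"]

if __name__ == "__main__":
    print(overall_accuracy(y_true, y_pred))
    print(per_species_accuracy(y_true, y_pred))
    print(mean_per_species_accuracy(y_true, y_pred))
```

In this toy case the single misidentified specimen belongs to the rare species, so the per-species mean is lower than the overall figure. Reporting both is one way to see whether a high headline accuracy also holds for sparsely collected species.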
Original language | English |
---|---|
Article number | e11365 |
Journal | Applications in Plant Sciences |
Volume | 8 |
Issue number | 6 |
ISSN | 2168-0450 |
Publication status | Published - 1 Jun 2020 |
Externally published | Yes |
Bibliographical note
Funding Information:
We thank the New York Botanical Garden for support and funding from the National Science Foundation (IAA‐1444192, DEB‐1343612 and DEB‐0818399 to F.A.M.). Special thanks to the staff of the New York Botanical Garden, particularly Kim Watson and Nichole Tiernan for all the specimen digitization work. We also thank the organizers of FGVC, the Kaggle platform, and all the Herbarium 2019 competitors for taking on the challenge of this data set.
Publisher Copyright:
© 2020 Little et al. Applications in Plant Sciences is published by Wiley Periodicals, LLC on behalf of the Botanical Society of America
Research areas
- artificial intelligence, computer vision, FGVC, herbarium specimen, Kaggle, machine learning, Melastomataceae
ID: 301822923