Class-balanced loss based on effective number of samples
Research output: Contribution to journal › Conference article › Research › peer-review
Standard
Class-balanced loss based on effective number of samples. / Cui, Yin; Jia, Menglin; Lin, Tsung Yi; Song, Yang; Belongie, Serge.
In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 06.2019, p. 9260-9269.
RIS
TY - GEN
T1 - Class-balanced loss based on effective number of samples
AU - Cui, Yin
AU - Jia, Menglin
AU - Lin, Tsung Yi
AU - Song, Yang
AU - Belongie, Serge
N1 - Funding Information: Our proposed framework provides a non-parametric means of quantifying data overlap, since we don’t make any assumptions about the data distribution. This makes our loss generally applicable to a wide range of existing models and loss functions. Intuitively, a better estimation of the effective number of samples could be obtained if we know the data distribution. In the future, we plan to extend our framework by incorporating reasonable assumptions on the data distribution or designing learning-based, adaptive methods. Acknowledgment. This work was supported in part by a Google Focused Research Award. Publisher Copyright: © 2019 IEEE.
PY - 2019/6
Y1 - 2019/6
N2 - With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula (1-beta^n)/(1-beta), where n is the number of samples and beta in [0,1) is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
AB - With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula (1-beta^n)/(1-beta), where n is the number of samples and beta in [0,1) is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
KW - Categorization
KW - Computer Vision Theory
KW - Deep Learning
KW - Recognition: Detection
KW - Retrieval
UR - http://www.scopus.com/inward/record.url?scp=85078723293&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2019.00949
DO - 10.1109/CVPR.2019.00949
M3 - Conference article
AN - SCOPUS:85078723293
SP - 9260
EP - 9269
JO - IEEE Conference on Computer Vision and Pattern Recognition. Proceedings
JF - IEEE Conference on Computer Vision and Pattern Recognition. Proceedings
SN - 1063-6919
T2 - 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Y2 - 16 June 2019 through 20 June 2019
ER -
ID: 301824490
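
Reader's note: the abstract's effective-number formula maps directly onto a per-class weighting scheme, so a minimal sketch may be useful alongside the record. The snippet below computes weights as the inverse of the effective number (1-beta^n)/(1-beta), normalized so they sum to the number of classes, as the paper describes. It is not the authors' released implementation; the function name, the beta value, and the example counts are illustrative choices.

```python
import numpy as np

def class_balanced_weights(samples_per_class, beta=0.999):
    """Per-class weights from the effective number of samples.

    Effective number: E_n = (1 - beta**n) / (1 - beta).
    The weight for a class is proportional to 1 / E_n, then normalized
    so the weights sum to the number of classes (the convention stated
    in the paper; this sketch is illustrative, not the official code).
    """
    n = np.asarray(samples_per_class, dtype=np.float64)
    effective_num = (1.0 - np.power(beta, n)) / (1.0 - beta)
    weights = 1.0 / effective_num
    # Normalize so the weights sum to the number of classes.
    return weights * len(n) / weights.sum()

# Hypothetical long-tailed class distribution: rare classes get
# proportionally larger weights in the resulting loss.
counts = [5000, 500, 50, 5]
print(class_balanced_weights(counts, beta=0.99))
```

With beta near 0 every class receives roughly equal weight (no re-balancing), while beta approaching 1 recovers weighting by inverse class frequency; intermediate values interpolate between the two, which is the behavior the abstract attributes to the hyperparameter.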