CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting

Publication: Contribution to book/anthology/report > Conference contribution in proceedings > Research > Peer-reviewed

Standard

CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting. / Li, Lei.

2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). IEEE, 2024. pp. 502-511.


Harvard

Li, L 2024, CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting. in 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). IEEE, pp. 502-511, WACV 2024 - IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, Hawaii, USA, 04/01/2024. https://doi.org/10.1109/WACV57701.2024.00057

APA

Li, L. (2024). CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting. In 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (pp. 502-511). IEEE. https://doi.org/10.1109/WACV57701.2024.00057

Vancouver

Li L. CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting. In 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). IEEE. 2024. p. 502-511. https://doi.org/10.1109/WACV57701.2024.00057

Author

Li, Lei. / CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). IEEE, 2024. pp. 502-511

Bibtex

@inproceedings{3f02def42ef141358cb67de9ee1a1a71,
title = "CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting",
abstract = "Natural scene analysis and remote sensing imagery offer immense potential for advancements in large-scale language-guided context-aware data utilization. This potential is particularly significant for enhancing performance in downstream tasks such as object detection and segmentation with designed language prompting. In light of this, we introduce the CPSeg (Chain-of-Thought Language Prompting for Finer-grained Semantic Segmentation), an innovative framework designed to augment image segmentation performance by integrating a novel {"}Chain-of-Thought{"} process that harnesses textual information associated with images. This groundbreaking approach has been applied to a flood disaster scenario. CPSeg encodes prompt texts derived from various sentences to formulate a coherent chain-of-thought. We use a new vision-language dataset, FloodPrompt, which includes images, semantic masks, and corresponding text information. This not only strengthens the semantic understanding of the scenario but also aids in the key task of semantic segmentation through an interplay of pixel and text matching maps. Our qualitative and quantitative analyses validate the effectiveness of CPSeg.",
author = "Lei Li",
year = "2024",
doi = "10.1109/WACV57701.2024.00057",
language = "English",
pages = "502--511",
booktitle = "2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)",
publisher = "IEEE",
note = "WACV 2024 - IEEE/CVF Winter Conference on Applications of Computer Vision ; Conference date: 04-01-2024 Through 08-01-2024",

}

RIS

TY - GEN

T1 - CPSeg: Finer-grained Image Semantic Segmentation via Chain-of-Thought Language Prompting

T2 - WACV 2024 - IEEE/CVF Winter Conference on Applications of Computer Vision

AU - Li, Lei

PY - 2024

Y1 - 2024

N2 - Natural scene analysis and remote sensing imagery offer immense potential for advancements in large-scale language-guided context-aware data utilization. This potential is particularly significant for enhancing performance in downstream tasks such as object detection and segmentation with designed language prompting. In light of this, we introduce the CPSeg (Chain-of-Thought Language Prompting for Finer-grained Semantic Segmentation), an innovative framework designed to augment image segmentation performance by integrating a novel "Chain-of-Thought" process that harnesses textual information associated with images. This groundbreaking approach has been applied to a flood disaster scenario. CPSeg encodes prompt texts derived from various sentences to formulate a coherent chain-of-thought. We use a new vision-language dataset, FloodPrompt, which includes images, semantic masks, and corresponding text information. This not only strengthens the semantic understanding of the scenario but also aids in the key task of semantic segmentation through an interplay of pixel and text matching maps. Our qualitative and quantitative analyses validate the effectiveness of CPSeg.

AB - Natural scene analysis and remote sensing imagery offer immense potential for advancements in large-scale language-guided context-aware data utilization. This potential is particularly significant for enhancing performance in downstream tasks such as object detection and segmentation with designed language prompting. In light of this, we introduce the CPSeg (Chain-of-Thought Language Prompting for Finer-grained Semantic Segmentation), an innovative framework designed to augment image segmentation performance by integrating a novel "Chain-of-Thought" process that harnesses textual information associated with images. This groundbreaking approach has been applied to a flood disaster scenario. CPSeg encodes prompt texts derived from various sentences to formulate a coherent chain-of-thought. We use a new vision-language dataset, FloodPrompt, which includes images, semantic masks, and corresponding text information. This not only strengthens the semantic understanding of the scenario but also aids in the key task of semantic segmentation through an interplay of pixel and text matching maps. Our qualitative and quantitative analyses validate the effectiveness of CPSeg.

U2 - 10.1109/WACV57701.2024.00057

DO - 10.1109/WACV57701.2024.00057

M3 - Article in proceedings

SP - 502

EP - 511

BT - 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

PB - IEEE

Y2 - 4 January 2024 through 8 January 2024

ER -

ID: 378943255