Machine learning for financial transaction classification across companies using character-level word embeddings of text fields

Publication: Contribution to journal › Journal article › Research › peer-reviewed

Standard

Machine learning for financial transaction classification across companies using character-level word embeddings of text fields. / Jørgensen, Rasmus Kær; Igel, Christian.

In: Intelligent Systems in Accounting, Finance and Management, Vol. 28, No. 3, 2021, p. 159-172.

Harvard

Jørgensen, RK & Igel, C 2021, 'Machine learning for financial transaction classification across companies using character-level word embeddings of text fields', Intelligent Systems in Accounting, Finance and Management, vol. 28, no. 3, pp. 159-172. https://doi.org/10.1002/isaf.1500

APA

Jørgensen, R. K., & Igel, C. (2021). Machine learning for financial transaction classification across companies using character-level word embeddings of text fields. Intelligent Systems in Accounting, Finance and Management, 28(3), 159-172. https://doi.org/10.1002/isaf.1500

Vancouver

Jørgensen RK, Igel C. Machine learning for financial transaction classification across companies using character-level word embeddings of text fields. Intelligent Systems in Accounting, Finance and Management. 2021;28(3):159-172. https://doi.org/10.1002/isaf.1500

Author

Jørgensen, Rasmus Kær ; Igel, Christian. / Machine learning for financial transaction classification across companies using character-level word embeddings of text fields. In: Intelligent Systems in Accounting, Finance and Management. 2021 ; Vol. 28, No. 3. pp. 159-172.

Bibtex

@article{9aa241abc5154e1599591a24e2f501c5,
title = "Machine learning for financial transaction classification across companies using character-level word embeddings of text fields",
abstract = "An important initial step in accounting is mapping financial transfers to the corresponding accounts. We devised machine-learning-based systems that automate this process. They use word embeddings with character-level features to process transaction texts. When considering 473 companies independently, our approach achieved an average top-1 accuracy of 80.50%, outperforming baselines that exclude the transaction texts or rely on a lexical bag-of-words text representation. We extended the approach to generalize across companies and even across different corporate sectors. After standardization of the account structures and careful feature engineering, a single classifier trained on 44 companies from 28 sectors achieved a test accuracy of more than 80%. When trained on 43 companies and tested on the remaining one, the system achieved an average performance of 64.62%. This rate increased to nearly 70% when considering only the largest sector.",
keywords = "accounting, finance, financial transactions, multiclass classification, random forest, word embedding",
author = "J{\o}rgensen, {Rasmus K{\ae}r} and Igel, Christian",
note = "Publisher Copyright: {\textcopyright} 2021 John Wiley \& Sons, Ltd.",
year = "2021",
doi = "10.1002/isaf.1500",
language = "English",
volume = "28",
pages = "159--172",
journal = "Intelligent Systems in Accounting, Finance and Management",
issn = "1550-1949",
publisher = "Wiley",
number = "3",

}

RIS

TY - JOUR

T1 - Machine learning for financial transaction classification across companies using character-level word embeddings of text fields

AU - Jørgensen, Rasmus Kær

AU - Igel, Christian

N1 - Publisher Copyright: © 2021 John Wiley & Sons, Ltd.

PY - 2021

Y1 - 2021

N2 - An important initial step in accounting is mapping financial transfers to the corresponding accounts. We devised machine-learning-based systems that automate this process. They use word embeddings with character-level features to process transaction texts. When considering 473 companies independently, our approach achieved an average top-1 accuracy of 80.50%, outperforming baselines that exclude the transaction texts or rely on a lexical bag-of-words text representation. We extended the approach to generalize across companies and even across different corporate sectors. After standardization of the account structures and careful feature engineering, a single classifier trained on 44 companies from 28 sectors achieved a test accuracy of more than 80%. When trained on 43 companies and tested on the remaining one, the system achieved an average performance of 64.62%. This rate increased to nearly 70% when considering only the largest sector.

AB - An important initial step in accounting is mapping financial transfers to the corresponding accounts. We devised machine-learning-based systems that automate this process. They use word embeddings with character-level features to process transaction texts. When considering 473 companies independently, our approach achieved an average top-1 accuracy of 80.50%, outperforming baselines that exclude the transaction texts or rely on a lexical bag-of-words text representation. We extended the approach to generalize across companies and even across different corporate sectors. After standardization of the account structures and careful feature engineering, a single classifier trained on 44 companies from 28 sectors achieved a test accuracy of more than 80%. When trained on 43 companies and tested on the remaining one, the system achieved an average performance of 64.62%. This rate increased to nearly 70% when considering only the largest sector.

KW - accounting

KW - finance

KW - financial transactions

KW - multiclass classification

KW - random forest

KW - word embedding

U2 - 10.1002/isaf.1500

DO - 10.1002/isaf.1500

M3 - Journal article

AN - SCOPUS:85114320023

VL - 28

SP - 159

EP - 172

JO - Intelligent Systems in Accounting, Finance and Management

JF - Intelligent Systems in Accounting, Finance and Management

SN - 1550-1949

IS - 3

ER -

ID: 280029661