Evaluation of Agreement-Related E-mail Classification Models with Unbalanced Classes

Marcin Hernes; Artur Rot; Ewa Walaszczyk; Janusz Tyburcy; Abigail Hanczyk

European Research Studies Journal, Volume XXIX, Issue 2, 345-356, 2026

DOI: 10.35808/ersj/4363

Abstract:

Purpose: The aim of the research is to evaluate the effectiveness of models that classify agreement-related emails when the classes are severely imbalanced, in order to assess their performance more comprehensively and to better understand their behaviour in practical applications.

Design/Methodology/Approach: Four machine learning classification methods were used: Complement Naive Bayes, Logistic Regression, Random Forest, and Support Vector Machine.

Findings: Random Forest and Support Vector Machine achieve high values of both Accuracy and Balanced Accuracy, demonstrating strong classification performance.

Practical Implications: Random Forest and Support Vector Machine can be implemented in intelligent information systems that support mail dispatching, so that correspondence is automatically routed to the person responsible for handling the inquiry. This speeds up the process and minimises the risk of an inquiry being overlooked or left unanswered.

Originality/Value: Despite a large body of research on email classification, studies focused on specific applications, such as the classification of agreement documents, are still lacking. In particular, different models are rarely examined together and compared using multiple metrics within a single real-world problem.
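The abstract reports both Accuracy and Balanced Accuracy, and the difference between the two matters most when classes are imbalanced. The following minimal sketch uses the standard definitions (Balanced Accuracy as the unweighted mean of per-class recall). The toy labels, the class names "agreement" and "other", and the helper functions are hypothetical illustrations, not the authors' data or code.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of all samples that were labelled correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall, so every class counts equally."""
    support = Counter(y_true)  # number of true samples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    recalls = [hits[c] / support[c] for c in support]
    return sum(recalls) / len(recalls)


# Hypothetical toy data: 95 "other" e-mails and 5 "agreement" e-mails.
y_true = ["other"] * 95 + ["agreement"] * 5

# A degenerate classifier that always predicts the majority class.
y_majority = ["other"] * 100

print(accuracy(y_true, y_majority))           # 0.95
print(balanced_accuracy(y_true, y_majority))  # 0.5
```

In this toy case the majority-class predictor reaches an Accuracy of 0.95 while its Balanced Accuracy is only 0.5, which is why a model that scores high on both metrics can be considered robust to the imbalance.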

EUROPEAN RESEARCH STUDIES JOURNAL ISSN: 1108-2976p / 3057-4331e