Method for combining CNN-based features with geometric facial descriptors in emotion recognition

Date

2025

Publisher

National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute"

Abstract

This study presents a method for combining CNN-based visual features with geometric facial descriptors to improve the accuracy of emotion recognition in static images. The method integrates deep convolutional embeddings extracted from a pre-trained ResNetV2_101 model within the ML.NET framework with handcrafted geometric features computed from facial landmarks. Open-source datasets containing labeled emotional categories were used for experiments. At the first stage, deep image embeddings were obtained through transfer learning. At the second stage, 68 facial landmarks were detected to calculate distances and proportional relationships such as interocular distance, mouth width, eyebrow height, and other geometry-based indicators. These visual and geometric representations were concatenated into a unified feature space and classified using a multiclass linear model. The hybrid method achieved accuracy approximately 4 percentage points higher than the baseline CNN model relying solely on pixel-level features (from about 63% to 67%), confirming that combining heterogeneous features enhances generalization and robustness. The results also highlight that geometric descriptors act as stabilizing factors, compensating for noise, occlusions, and lighting variations that degrade CNN-only models. The developed pipeline demonstrates the feasibility of integrating interpretable geometric cues with deep embeddings directly in C# using ML.NET. The research novelty lies in proposing an interpretable hybrid model for emotion recognition that improves reliability while maintaining compatibility with .NET-based applications. The approach offers an accessible solution for developers working within enterprise .NET ecosystems, enabling direct deployment without cross-language integration. Future research will focus on extending the model toward multimodal emotion analysis that incorporates speech, gesture, and physiological signals to enhance contextual understanding of affective states.
Additionally, the hybrid model can serve as a diagnostic tool for studying emotion dynamics in psychological or behavioral research.
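The geometric descriptors named in the abstract (interocular distance, mouth width, eyebrow height, computed from 68 facial landmarks) reduce to simple distance and ratio calculations, and the fusion step is a plain concatenation with the CNN embedding. The sketch below illustrates these computations only; the landmark indices follow the common iBUG/dlib 68-point convention and the normalization by interocular distance is an assumption, since the abstract does not give exact formulas. The original pipeline is implemented in C# with ML.NET; Python is used here purely for illustration.

```python
import numpy as np

# Landmark index groups, assuming the iBUG/dlib 68-point convention
# (the abstract does not list the exact indices used).
LEFT_EYE = slice(36, 42)
RIGHT_EYE = slice(42, 48)
LEFT_BROW = slice(17, 22)
RIGHT_BROW = slice(22, 27)
MOUTH_LEFT, MOUTH_RIGHT = 48, 54


def geometric_descriptors(landmarks: np.ndarray) -> np.ndarray:
    """Compute distance/ratio features from a (68, 2) landmark array.

    Features are normalized by the interocular distance, making them
    invariant to face scale (an assumed but common normalization).
    """
    left_eye_c = landmarks[LEFT_EYE].mean(axis=0)
    right_eye_c = landmarks[RIGHT_EYE].mean(axis=0)
    interocular = np.linalg.norm(right_eye_c - left_eye_c)

    # Mouth width: distance between the two mouth corners.
    mouth_width = np.linalg.norm(landmarks[MOUTH_RIGHT] - landmarks[MOUTH_LEFT])

    # Eyebrow height: vertical gap between each brow centroid and the
    # corresponding eye centroid, averaged over both sides
    # (image y-axis points down, so eye_y - brow_y is positive).
    brow_height = ((left_eye_c[1] - landmarks[LEFT_BROW].mean(axis=0)[1]) +
                   (right_eye_c[1] - landmarks[RIGHT_BROW].mean(axis=0)[1])) / 2

    return np.array([mouth_width / interocular, brow_height / interocular])


def fuse_features(cnn_embedding: np.ndarray, landmarks: np.ndarray) -> np.ndarray:
    """Concatenate the deep embedding with the geometric descriptors,
    producing the unified feature vector fed to the multiclass linear model."""
    return np.concatenate([cnn_embedding, geometric_descriptors(landmarks)])
```

The fused vector is what a multiclass linear classifier (in the paper, an ML.NET trainer) would consume; additional ratios from the same landmark set can be appended to `geometric_descriptors` in the same way.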

Keywords

emotion recognition, facial landmarks, convolutional neural networks, .NET framework, feature fusion

Bibliographic citation

Zinchenko, L. Method for combining CNN-based features with geometric facial descriptors in emotion recognition / Liudmyla Zinchenko // Information, Computing and Intelligent systems. – 2025. – No. 7. – P. 127-141. – Bibliogr.: 16 ref.