Comparative analysis of modified semi-supervised learning algorithms on a small amount of labeled data
Date
2022
Publisher
Igor Sikorsky Kyiv Polytechnic Institute (КПІ ім. Ігоря Сікорського)
Abstract
The paper is devoted to improving semi-supervised clustering methods
and comparing their accuracy and robustness. The proposed approach extends
a clustering algorithm to use an available set of labels by replacing its
distance function. The new distance function takes into account not only the
spatial data but also the available labels. Moreover, the proposed distance
function can be adapted to work with ordinal variables as labels. An extended
approach is also considered, based on a combination of the unsupervised
k-medoids method, modified to use only labeled data during the medoid
calculation step, the supervised k-nearest-neighbor method, and unsupervised
k-means. The learning algorithm uses information about the nearest points and
the classes' centers of mass. The experimental results demonstrate that even a
small amount of labeled data makes semi-supervised learning feasible, and that
the proposed modifications improve the accuracy and performance of the
algorithms.
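The abstract does not give the form of the modified distance function, so the following is only a minimal sketch of the general idea: a distance that combines spatial separation with label disagreement. The function name `label_aware_distance`, the `UNLABELED` marker, the `weight` parameter, and the additive penalty are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

UNLABELED = -1  # hypothetical marker for points that carry no label


def label_aware_distance(x, y, label_x, label_y, weight=1.0, ordinal=False):
    """Euclidean distance plus an assumed penalty for label disagreement.

    If either point is unlabeled, only the spatial term is used.
    With ordinal=True the penalty grows with the gap between label ranks;
    otherwise it is 1 for different labels and 0 for equal labels.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    spatial = float(np.linalg.norm(x - y))

    # No label information available for this pair: purely spatial distance.
    if label_x == UNLABELED or label_y == UNLABELED:
        return spatial

    if ordinal:
        mismatch = float(abs(label_x - label_y))
    else:
        mismatch = float(label_x != label_y)

    return spatial + weight * mismatch
```

Under these assumptions, two labeled points from different classes are pushed apart relative to their purely spatial distance, while pairs involving an unlabeled point are compared by position alone. A function of this kind could be passed to any clustering routine that accepts a custom pairwise metric.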
Keywords
center of mass, clustering, distance function, medoids, nearest neighbor, semi-supervised learning
Bibliographic citation
Lyubchyk, L. M. Comparative analysis of modified semi-supervised learning algorithms on a small amount of labeled data / L. M. Lyubchyk, K. S. Yamkovyi // Системні дослідження та інформаційні технології : international scientific and technical journal. – 2022. – No. 4. – P. 34-43. – Bibliogr.: 11 titles.