A Formal Model for Constructing Sensitive Data Graphs from Cyber Reports using Large Language Models
| dc.contributor.author | Turskyi, Viktor | |
| dc.date.accessioned | 2026-02-26T09:54:36Z | |
| dc.date.available | 2026-02-26T09:54:36Z | |
| dc.date.issued | 2025 | |
| dc.description.abstract | Unstructured cyber threat intelligence (CTI) reports present major challenges for systematic analysis, particularly when accuracy and reliability are critical. This paper introduces a formal, four-stage mathematical model for constructing canonical knowledge graphs from sensitive textual data. The model integrates the advanced extraction and reasoning capabilities of GPT-5 with deterministic rule-based inference and network analysis to bridge the “formalization gap” between probabilistic large language model (LLM) outputs and verifiable analytical structures. Using a corpus of 204 official CERT-UA incident reports as a test case, the methodology successfully normalized thousands of raw entities, identified central threat actors and high-value targets, and revealed distinct operational ecosystems within Ukraine’s cyber threat landscape. Theoretically, the study contributes a replicable and mathematically defined framework for integrating next-generation LLMs into formalized knowledge graph pipelines. Practically, it provides a scalable and reliable tool for analysts in cybersecurity, national security, and related fields, enabling the transformation of unstructured reports into actionable intelligence. | |
| dc.format.pagerange | P. 98-107 | |
| dc.identifier.citation | Turskyi, V. A Formal Model for Constructing Sensitive Data Graphs from Cyber Reports using Large Language Models / Viktor Turskyi // Theoretical and Applied Cybersecurity: scientific journal. – 2025. – Vol. 7, No. 2. – P. 98-107. – Bibliogr.: 10 ref. | |
| dc.identifier.doi | https://doi.org/10.20535/tacs.2664-29132025.2.338785 | |
| dc.identifier.uri | https://ela.kpi.ua/handle/123456789/79076 | |
| dc.language.iso | en | |
| dc.publisher | Igor Sikorsky Kyiv Polytechnic Institute | |
| dc.publisher.place | Kyiv | |
| dc.relation.ispartof | Theoretical and Applied Cybersecurity: scientific journal, Vol. 7, No. 2 | |
| dc.rights.uri | https://creativecommons.org/licenses/by/4.0/deed.uk | |
| dc.subject | Large Language Models (LLM) | |
| dc.subject | Cyber Threat Intelligence (CTI) | |
| dc.subject | Sensitive Data Analysis | |
| dc.subject | Network Analysis | |
| dc.subject | Entity Resolution | |
| dc.subject | CERT-UA | |
| dc.subject.udc | 004.89 | |
| dc.title | A Formal Model for Constructing Sensitive Data Graphs from Cyber Reports using Large Language Models | |
| dc.type | Article |