student
employee
UDC 336.2
This paper examines the application of the Retrieval-Augmented Generation (RAG) architecture to corporate financial analysis tasks. It describes a prototype system that combines document retrieval with a generative language model. The system operates on a demonstration corpus of financial documents that includes corporate annual reports, regulatory materials, and sector benchmarks. Retrieval is hybrid: it combines keyword-based search with vector representations of text and applies domain-specific prioritization of documents by company and reporting period. The prototype architecture covers document indexing, retrieval of relevant fragments, and generation of analytical responses from the retrieved context. The results indicate that integrating a retrieval component improves interpretability and helps ground the generated analytical conclusions in source documents. The prototype is limited by the small size of the document corpus and by its reliance on heuristic procedures for verifying the factual grounding of generated responses.
artificial intelligence, large language models, Retrieval-Augmented Generation, corporate finance, financial analysis, financial reporting analysis, credit risk, leverage analysis, return on equity, model interpretability, hybrid retrieval, vector embeddings, information retrieval
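The hybrid retrieval step summarized in the abstract can be illustrated with a minimal sketch. It is not the prototype's actual implementation: the names `Doc`, `hybrid_rank`, `alpha`, `company_boost`, and `period_boost` are hypothetical, the weights are arbitrary, and bag-of-words count vectors stand in for the learned vector embeddings a real system would use.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Doc:
    # Hypothetical document record; the field names are assumptions.
    text: str
    company: str
    period: str


def tokenize(text):
    # Naive whitespace tokenizer, lowercased.
    return text.lower().split()


def keyword_score(query, doc):
    # Fraction of distinct query tokens that occur in the document.
    q = set(tokenize(query))
    d = set(tokenize(doc.text))
    return len(q & d) / len(q) if q else 0.0


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_score(query, doc):
    # Assumption: count vectors stand in for dense text embeddings.
    return cosine(Counter(tokenize(query)), Counter(tokenize(doc.text)))


def hybrid_rank(query, docs, company=None, period=None,
                alpha=0.5, company_boost=0.3, period_boost=0.2):
    # Blend keyword and vector scores, then add illustrative boosts
    # for documents matching the requested company and period.
    scored = []
    for doc in docs:
        score = (alpha * keyword_score(query, doc)
                 + (1 - alpha) * vector_score(query, doc))
        if company and doc.company == company:
            score += company_boost
        if period and doc.period == period:
            score += period_boost
        scored.append((score, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored
```

Calling `hybrid_rank("leverage ratio", docs, company="Acme", period="2023")` on a list of `Doc` records returns `(score, doc)` pairs sorted from most to least relevant. Additive boosts keep the text-based score visible in the final value, which makes it easier to see why a given fragment was ranked first.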



