Potential of Artificial Intelligence Tools for Text Evaluation and Feedback Provision
https://doi.org/10.24833/2687-0126-2025-7-1-70-88
Abstract
This article explores the potential and limitations of generative artificial intelligence (AI) for assessing students’ written work and providing feedback on it. To this end, a systematic review of twenty-two original studies was conducted. The selected studies were carried out in both Russian and international contexts, with results published between 2022 and 2025. It was found that the criteria-based assessments made by generative models align with those of instructors, and that generative AI surpasses human evaluators in its ability to assess language and argumentation. However, the reliability of this evaluation is undermined by the instability of sequential assessments, the hallucinations of generative models, and their limited ability to account for contextual nuances. Although feedback from generative AI is detailed and constructive, it is often insufficiently specific and overly verbose, which can hinder student comprehension. Feedback from generative models primarily targets local deficiencies, while human evaluators pay attention to global issues, such as the incomplete alignment of content with the assigned topic. Unlike instructors, generative AI provides template-based feedback and avoids the indirect phrasing and leading questions that contribute to the development of self-regulation skills. Nevertheless, these shortcomings can be addressed through follow-up queries to the generative model. It was also found that students are open to receiving feedback from generative AI; however, they prefer to receive it from instructors and peers. The results are discussed in the context of foreign language instructors using generative models to evaluate written work and formulate feedback.
The conclusion emphasises the necessity of a critical approach to using generative models in the assessment of written work and the importance of training instructors for effective interaction with these technologies.
About the Author
S. V. Bogolepova
Russian Federation
Svetlana V. Bogolepova, Cand. Sci. (Philology), is an Associate Professor
Moscow
Review
For citations:
Bogolepova S.V. Potential of Artificial Intelligence Tools for Text Evaluation and Feedback Provision. Professional Discourse & Communication. 2025;7(1):70-88. (In Russ.) https://doi.org/10.24833/2687-0126-2025-7-1-70-88