Professional Discourse & Communication

Potential of Artificial Intelligence Tools for Text Evaluation and Feedback Provision

https://doi.org/10.24833/2687-0126-2025-7-1-70-88

Abstract

This article explores the possibilities and limitations of generative artificial intelligence (AI) for evaluating students’ written work and providing feedback on it. To this end, a systematic review of twenty-two original studies was conducted. The selected studies were carried out in both Russian and international contexts, and their results were published between 2022 and 2025. The review found that the criteria-based assessments made by generative models align with those of instructors, and that generative AI surpasses human evaluators in its ability to assess language and argumentation. However, the reliability of such evaluation is undermined by the instability of sequential assessments, the hallucinations of generative models, and their limited ability to account for contextual nuances. Although feedback from generative AI is detailed and constructive, it is often insufficiently specific and overly verbose, which can hinder student comprehension. Feedback from generative models primarily targets local deficiencies, whereas human evaluators attend to global issues, such as incomplete alignment of the content with the assigned topic. Unlike instructors, generative AI provides template-based feedback and avoids the indirect phrasing and leading questions that contribute to the development of self-regulation skills. Nevertheless, these shortcomings can be addressed through follow-up queries to the generative model. It was also found that students are open to receiving feedback from generative AI but prefer to receive it from instructors and peers. The results are discussed in the context of foreign language instructors’ use of generative models for evaluating written work and formulating feedback.
The conclusion emphasises the necessity of a critical approach to using generative models in the assessment of written work and the importance of training instructors for effective interaction with these technologies. 

About the Author

S. V. Bogolepova
National Research University Higher School of Economics
Russian Federation

Svetlana V. Bogolepova, Cand. Sci. (Philology), is an Associate Professor

Moscow



For citations:


Bogolepova S.V. Potential of Artificial Intelligence Tools for Text Evaluation and Feedback Provision. Professional Discourse & Communication. 2025;7(1):70-88. (In Russ.) https://doi.org/10.24833/2687-0126-2025-7-1-70-88



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2687-0126 (Online)