Abstract
Recent advances in technology, such as the emergence of generative artificial intelligence (GenAI) tools, warrant careful integration into education. In particular, comparing the feedback and scores generated by human raters and GenAI tools is crucial for assessing feedback alignment and score validity in L2 writing assessment. L2 writing teachers’ agency in collaborating with these tools is also a notable area of research. This mixed-methods study addresses three research questions: the alignment of GenAI and human scores and feedback on the same writing task responses; the justifications given for scoring and feedback; and teachers’ agency in negotiating their roles in GenAI-supported assessment contexts. To that end, fifty essays written in response to a retired IELTS Academic Writing Task 2 prompt were rated by a human rater and by ChatGPT-5 using the IELTS Task 2 criteria. The results showed a strong correlation between human and ChatGPT-5 scores, confirming scoring validity. The human rater was then asked, and ChatGPT-5 was prompted, to justify their scoring decisions. The findings revealed a contrast between the human rater’s justifications and those of ChatGPT-5. These findings were interpreted following Kane’s argument-based approach to validity. Lastly, a thematic analysis of the semi-structured interview exploring teachers’ agency in GenAI-mediated writing assessment was consistent with Priestley’s ecological model of agency. Overall, the findings illustrate the need for a hybrid model: blending GenAI-led surface-level evaluation with human-led cognitive, critical, and contextual evaluation is essential for a comprehensive and valid writing assessment.
| Original language | English |
|---|---|
| Journal | Educational Assessment, Evaluation and Accountability |
| Publication status | Published - 31 Mar 2026 |
Keywords
- Feedback alignment
- Generative artificial intelligence
- L2 writing
- Scoring validity
- Teacher agency
ASJC Scopus subject areas
- Education
- Organizational Behavior and Human Resource Management
Fingerprint
Dive into the research topics of 'Human vs. generative artificial intelligence in writing assessment: investigating feedback alignment, score validity, and teacher agency'. Together they form a unique fingerprint.