Using AI in Examinations
As a teacher, you can benefit from generative AI in your examination work. Below are some examples.
For all examples, as always with AI-generated material, you must review the output, assess its reliability and suitability, and usually adjust it before using it.
Getting suggestions for exam questions/tasks
Formulating good exam questions, additional questions for retakes, and new questions that are not the same as last term’s can take a lot of time. Here, generative AI can help.
Write a prompt asking for a specified number of suggested exam questions or tasks. Specify as precisely as possible the area the questions concern, that they are for university studies, and whether the students are beginners or more experienced. Add other parameters as needed: knowledge areas to exclude (or, conversely, to include), whether specific aids are allowed, a maximum answer length in characters or pages, and so on. Naturally, you can also ask for example answers.
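A prompt along these lines can serve as a starting point (the subject, level, and numbers are placeholders to replace with your own):
“Suggest ten exam questions for a first-semester university course in [subject]. The questions should cover [area A] and [area B] but not [area C], assume no aids other than pen and paper, and each be answerable in at most one page. For each question, also give a brief example answer.”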
Remember that you don’t have to settle for the first suggestion that comes up; you can continue with new prompts in the same conversation and request adjustments, additions, etc., to the suggestions provided.
Getting suggestions for MCQ questions
Building a question bank of multiple-choice questions for use in, for example, online quizzes can be laborious. Here, generative AI can do the groundwork.
Write a prompt asking for a specified number of MCQ questions. Specify as precisely as possible the knowledge area the questions pertain to, that it concerns a university-level course, how many answer options should be available for each question, and whether you want single-choice, multiple-choice, or a mix of these. Also, ask for the correct answers to be marked, and specify any other parameters.
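For example, a prompt of the following kind (adapt the subject and numbers to your course):
“Create 40 multiple-choice questions for an introductory university course in [subject]. Each question should have four answer options. Make 30 of the questions single-choice and 10 multiple-choice, and mark the correct answer or answers for each question.”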
In less than ten minutes, you can have suggestions for 40-50 questions with the specified number of answer options, which you can then sift through and possibly adjust or correct.
Getting suggestions for rubrics
Based on the goals in a course syllabus, generative AI can provide suggestions for rubrics. The result may be insufficiently detailed and will undoubtedly need to be adjusted, adapted, and refined, but it can serve as a good starting point for developing an assessment matrix and save a lot of time.
Write a prompt asking for an assessment matrix with suggested criteria, for each course goal in the syllabus, at all the grade levels you specify. Paste in the course goals from the syllabus (or attach it as a file) and send the request. If needed, ask for further refinement of the suggested criteria before you take over and edit the result yourself.
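A sketch of such a prompt, here assuming a three-level grading scale (adjust to the scale in your syllabus):
“Below are the course goals from a university course syllabus. Create an assessment matrix with suggested criteria for each course goal at the grade levels Fail, Pass, and Pass with Distinction. Present it as a table with one row per course goal and one column per grade level. [Paste the course goals here.]”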
Uncertain ground 1: Getting suggestions for feedback on students’ answers
Good feedback, given reasonably soon after the examination, can be a powerful support for students’ learning. At the same time, qualified written feedback takes time to produce, and with large student groups, feedback on an exam may arrive weeks later - when students are so busy with the next course and its upcoming exam that they no longer give the feedback the attention it deserves.
At some American universities, generative AI is already used to provide feedback to students, and several studies, also in European contexts, have compared the quality of AI feedback with feedback from teachers. Most studies so far suggest that experienced teachers still provide somewhat better feedback than AI, in some respects, but that AI provides better feedback than less experienced teachers.
Write a prompt asking for constructive feedback on the student’s text, which can be pasted in or attached as a file. Specify parameters for how the feedback should be shaped: how long it should be, whether it should start by highlighting the best parts of the text (and explain why they work well), and how the suggestions for improvement should be formulated. Also specify whether the AI should focus on certain things or ignore others entirely, such as spelling errors.
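One possible prompt, where the details are examples to adjust:
“Give constructive written feedback, at most 300 words, on the student text below. Begin by highlighting the two strongest parts of the text and explain why they work well, then give at most three concrete, encouragingly formulated suggestions for improvement. Ignore spelling errors. [Paste the anonymized text here.]”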
Always review the result - you must be able to stand by everything that is sent to the student! Change, correct, remove, and add as needed. Be transparent with the student: state that you have used AI to provide feedback quickly, but that you have reviewed and edited everything and that what they read is your assessment.
Why is it uncertain ground?
For copyright reasons, you likely should not upload students’ texts to a digital tool without their permission. In the long run, this issue may be resolved, as it was for plagiarism checking.
Three final points to consider:
- Use UU’s Copilot - not just any tool - if you upload students’ texts, and remove all identifying information from the text before uploading it.
- Consider possible, unwanted side effects of using AI-generated feedback, even if it has been reviewed. For example, does the student-teacher relationship deteriorate in the long run if practically all feedback is AI-generated? And how will new teachers become skilled at providing feedback if they never have to do the work from scratch?
- Also, consider what you do with the time you save! If you spend it on research or other tasks unrelated to the course, students may perceive it as a loss of teacher contact. But perhaps it is a time-saving measure that can allow for more contact with students, individually and/or in small groups? In that case, the combination of quick feedback and more meetings with teachers could mean a real improvement in teaching.
Uncertain ground 2: Using generative AI to assess exams
As with feedback, researchers and teachers have experimented with letting generative AI also “grade exams”, that is, mark and assess them. Here, too, the results indicate that AI can quickly and competently work through large numbers of student responses, correcting and assessing them, although the results must of course be checked.
Why is it uncertain ground?
Here, too, the uncertainty about uploading students’ texts applies: permission from the students is likely required, although this issue may eventually be resolved.
When it comes to something as sensitive as examination, which in Sweden is also an exercise of public authority, great caution is required when using generative AI for assessment, even if the examiner reviews and approves the assessment. The EU AI Act treats examination as a high-risk area where especially strict demands are placed on those who use AI, and for now it is wise to refrain from letting AI grade exams.