Why High-Performing AI Fails the Human Test

With AI technology (particularly language models) performing increasingly well on traditional measures of expert knowledge, such as medical licensing exams or the assessment of research environments, many are now considering how to deploy it “out in the world” so that it can assist customers, patients, public service users, and so on. If so, there is the potential to move …

How can we get Gen AI users to not fall asleep at the wheel?

Last month, I attended a training session at Sussex on Generative AI and assessment. We had to create a custom GPT for one of our modules and then use it to complete an assessment task for that module, using various levels and types of prompts. Afterwards, we marked the outputs and reflected on how “automatable” the different …