For Kurt Gray, a social psychologist at the University of North Carolina at Chapel Hill, conducting experiments comes with certain chores. Before embarking on any study, his lab must get ethical approval from an institutional review board, which can take weeks or months. Then his team has to recruit online participants—easier than bringing people into the lab, but Gray says the online subjects are often distracted or lazy. Then the researchers spend hours cleaning the data. But earlier this year, Gray stumbled onto an alternative way of doing things.
He was working with computer scientists at the Allen Institute for Artificial Intelligence to see whether they could develop an AI system that made moral judgments like humans. But first they figured they’d see if a system from the startup OpenAI could already do the job. The team asked GPT-3.5, which produces eerily humanlike text, to judge the ethics of 464 scenarios that human subjects had previously rated on a scale from –4 (unethical) to 4 (ethical)—scenarios such as selling your house to fund a program for the needy or having an affair with your best friend’s spouse. The system’s answers, it turned out, tracked the human responses closely, with a correlation coefficient of 0.95.
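The article doesn’t reproduce the study’s code, but the basic recipe is straightforward to picture. The sketch below is not the authors’ implementation: it assumes the openai Python library (v1+), gpt-3.5-turbo as a stand-in for the model they used, and a handful of placeholder scenarios with invented human ratings in place of the real set of 464.

```python
# Minimal sketch (not the study's code): ask an OpenAI chat model to rate the
# ethics of each scenario on the same -4 to 4 scale given to human raters,
# then correlate the model's ratings with the human averages.
# Assumptions: openai>=1.0 and scipy installed, OPENAI_API_KEY set in the
# environment, and illustrative scenarios/ratings standing in for the real data.

from openai import OpenAI
from scipy.stats import pearsonr

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

# (scenario text, mean human rating on the -4..4 scale) -- the ratings here
# are invented placeholders for illustration only.
scenarios = [
    ("Selling your house to fund a program for the needy.", 3.1),
    ("Having an affair with your best friend's spouse.", -3.6),
    ("Returning a lost wallet to its owner.", 3.8),
]

def rate(scenario: str) -> float:
    """Ask the model for a single number from -4 (unethical) to 4 (ethical)."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed model name, not necessarily the one used
        messages=[
            {"role": "system",
             "content": "Rate the morality of the action described on a scale "
                        "from -4 (very unethical) to 4 (very ethical). "
                        "Reply with a single number only."},
            {"role": "user", "content": scenario},
        ],
        temperature=0,  # deterministic-ish ratings
    )
    return float(response.choices[0].message.content.strip())

model_ratings = [rate(text) for text, _ in scenarios]
human_ratings = [mean for _, mean in scenarios]

# Pearson correlation between model and human ratings; the study reports ~0.95
# across its full set of scenarios.
r, _ = pearsonr(model_ratings, human_ratings)
print(f"Correlation between model and human ratings: r = {r:.2f}")
```

With only a few placeholder items the correlation is not meaningful; the point of the sketch is just the shape of the pipeline: prompt for a numeric rating, collect the numbers, and compare them against the human baseline.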
“I was like, ‘Whoa, we need to back up, because this is crazy,’” Gray says. “If you can just ask GPT to make these judgments, and they align, well, why don’t you just ask GPT instead of asking people, at least sometimes?” The results were published this month in Trends in Cognitive Sciences in an article titled “Can AI Language Models Replace Human Participants?”