GPT-4 does not reach Turing threshold

November 2, 2023

On its path to AI superstardom, ChatGPT has been haunted by one question: Has it passed the Turing Test at producing output indistinguishable from human response?

It’s close, but not quite, say two researchers at the University of California San Diego. ChatGPT can be smart, fast and impressive, and it gives a convincing appearance of intelligence. In conversation it sounds like a human being: it can crack jokes, mimic the way young people write, and even pass law school exams.

But it also sometimes serves up completely false information, so-called hallucinations, and it does not reflect on its own output. Cameron Jones, who specializes in language, semantics and machine learning, and Benjamin Bergen, a professor of cognitive science, drew on the work of Alan Turing, who more than 70 years ago devised a test to determine whether a machine could reach a level of intelligence and conversational skill at which it could fool someone into believing it was human.

Their report, titled “Does GPT-4 Pass the Turing Test?”, is available on the arXiv preprint server. The researchers recruited 650 participants and set up 1,400 “games” in which each participant held a short conversation with either another person or a GPT model and was then asked to judge which of the two they had been talking to.

The researchers found that GPT-4 models fooled participants 41 percent of the time, while GPT-3.5 models fooled them only 5 to 14 percent of the time. Interestingly, human witnesses convinced participants that they were not machines in only 63 percent of the trials. “We found no evidence that GPT-4 passes the Turing test,” the researchers concluded. But they noted that the Turing Test still has value as a measure of how effective machine dialogue can be.
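To make the headline numbers concrete, the short sketch below (hypothetical data and names, not the authors’ code) shows how a deception rate such as the 41 percent figure can be tallied from individual games: each record notes which kind of witness the interrogator actually talked to and whether that witness was judged to be human.

    # Illustrative only: made-up game records, not data from the study.
    # Each record is (witness_type, judged_human).
    from collections import defaultdict

    games = [
        ("gpt-4", True), ("gpt-4", False), ("gpt-4", True),
        ("gpt-3.5", False), ("gpt-3.5", True), ("gpt-3.5", False),
        ("human", True), ("human", True), ("human", False),
    ]

    totals = defaultdict(int)   # games played per witness type
    fooled = defaultdict(int)   # games in which the witness was judged human

    for witness, judged_human in games:
        totals[witness] += 1
        if judged_human:
            fooled[witness] += 1

    for witness in sorted(totals):
        rate = fooled[witness] / totals[witness]
        print(f"{witness}: judged human in {rate:.0%} of games")

A real analysis would of course carry far more detail per game, but the counting logic behind a “deception rate” is no more complicated than this.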

“The test remains relevant as a framework for measuring fluent social interaction and deception, and for understanding human strategies for adapting to these devices,” they said.

They warned that, in many cases, chatbots can still communicate convincingly enough to deceive users.

“A 41% success rate suggests that deception by AI models is already likely, especially in contexts where the interlocutors are less alert to the possibility that they are not talking to a human,” they said. “AI models that can reliably mimic humans could have broad social and economic implications.”

The researchers observed that participants who made correct identifications focused on several factors.

Responses that were too formal or too informal raised red flags. Wordiness or terseness, and grammar and punctuation that were either exceptionally good or “unconvincingly” bad, also became important cues for deciding whether participants were dealing with a human or a machine.

Participants were also sensitive to responses that sounded generic.

“LLMs are trained to produce high-probability completions and are fine-tuned to avoid controversial opinions,” the researchers said. “These processes can encourage generic responses that are typical of people as a whole but lack the idiosyncrasies unique to an individual: a kind of ecological fallacy.”

The researchers suggested that such tracking will become increasingly important as AI models grow more fluent and people grow more attuned to their conversational quirks.

“It is becoming increasingly important to identify the factors that lead to deception and the strategies for mitigating it,” they said.

Source: Port Altele
