ChatGPT passes the five-minute Turing test

June 28, 2024

People mistook the ChatGPT-4 bot for a person 54% of the time during a five-minute text conversation. However, the results of the experiment have a number of limitations.


The Turing test is considered a way to assess how closely artificial intelligence resembles a human. The idea is simple: if, during a conversation with a language model, people mistake it for a person or cannot say for certain who their interlocutor is, then the model’s behavior is close to that of a human.

At the same time, the mathematician Alan Turing, who proposed the test in 1950, also pointed out its flaws. He noted that the best strategy for an artificial intelligence trying to imitate a person is not to answer the interlocutor’s questions but to evade them, crack jokes, and ask questions in return. Such behavior seems more natural to humans, yet it does not clearly demonstrate the complexity and versatile thinking of “non-artificial” intelligence.

Nevertheless, the Turing test remains an important tool for assessing the capabilities of artificial intelligence, and passing it is one of the goals set for new language models. Several variants of the test have even emerged, differing in the tasks given to the model and to the experimenters, as well as in duration. According to a study by scientists at the University of California, San Diego (USA), ChatGPT-4 from OpenAI has managed to pass the five-minute Turing test. A preprint with the results of the experiment was published on arxiv.org.

The study involved 402 people with an average age of 35. Each was given five minutes to correspond with a randomly selected interlocutor. It could be another person or one of three language models: ELIZA (created in 1966 and able to mimic the behavior of a psychotherapist; it gave its name to the “ELIZA effect”, the tendency of people to attribute human qualities, such as the ability to think, understand, and empathize, to a text-based program — Ed.), ChatGPT-3.5, or ChatGPT-4. The participants’ task was to identify who they were communicating with and explain their decision, while the language models’ task was to pass as human.
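To make the matching procedure concrete, here is a minimal simulation sketch of the design as the article describes it: each participant is paired at random with a witness, chats, and then issues a verdict. The pass rates plugged in are the figures reported below; the uniform random pairing, the function names, and the one-round-per-participant setup are illustrative assumptions, not details from the preprint.

```python
import random

# "Judged human" rates as reported in the article. The uniform random
# pairing and one round per participant are illustrative assumptions,
# not details taken from the preprint.
PASS_RATES = {"human": 0.67, "ChatGPT-4": 0.54, "ChatGPT-3.5": 0.50, "ELIZA": 0.22}

def run_round(rng: random.Random) -> tuple[str, bool]:
    """Pair a participant with a random witness and return (witness, verdict)."""
    witness = rng.choice(list(PASS_RATES))
    judged_human = rng.random() < PASS_RATES[witness]
    return witness, judged_human

rng = random.Random(0)
tally = {w: [0, 0] for w in PASS_RATES}  # witness -> [judged human, total]
for _ in range(402):  # 402 participants took part in the study
    witness, judged = run_round(rng)
    tally[witness][0] += judged
    tally[witness][1] += 1

for witness, (hits, total) in tally.items():
    print(f"{witness}: judged human {hits}/{total} ({hits / total:.0%})")
```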

While the ChatGPT-4 bot was mistaken for a human 54% of the time, ChatGPT-3.5 scored 50% and ELIZA only 22%. Participants correctly identified a human interlocutor in 67% of cases.
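As a quick plausibility check on these figures, the sketch below asks whether each pass rate is statistically distinguishable from the 50% coin-flip baseline. Only the pass rates come from the article; the per-witness trial count is a round-number assumption (the preprint reports the real sample sizes), so the p-values are purely illustrative.

```python
from scipy.stats import binomtest

# Pass rates from the article; the per-witness trial count is an
# assumed round number for illustration, NOT the preprint's actual
# sample size.
ASSUMED_TRIALS = 100

for witness, rate in [("ChatGPT-4", 0.54), ("ChatGPT-3.5", 0.50), ("ELIZA", 0.22)]:
    judged_human = round(rate * ASSUMED_TRIALS)
    # H0: interrogators guess at chance (p = 0.5), two-sided test.
    result = binomtest(judged_human, ASSUMED_TRIALS, p=0.5)
    print(f"{witness}: {judged_human}/{ASSUMED_TRIALS} judged human, "
          f"p = {result.pvalue:.3g}")
```

Under these assumed counts, a rate near 54% is not distinguishable from guessing, which illustrates why such a result is read as interrogators doing no better than chance, while ELIZA’s 22% stands clearly apart.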

The success of ChatGPT-4 points to significant advances in artificial intelligence and to the potential problems that may arise as machine-generated text becomes harder to distinguish from human writing, the researchers said. On the one hand, it will become possible to delegate part of human work (for example, customer service) to machines; on the other, such technology will make misinformation and fraud easier.

But the results of the latest experiment, the scientists say, not only demonstrate the considerable sophistication and flexibility of modern language models but also point to the limitations of the Turing test itself. Participants judged the “humanity” of their interlocutor far more often by communication style, sense of humor, and other socio-emotional traits than by the completeness and correctness of the answers, even though such traits do not always match traditional ideas about the functions of intelligence. The authors also note that a longer experiment could produce different results.


Source: Port Altele
