An illustration of using AI applications on a mobile phone. Photo: VCG
Can artificial intelligence (AI) recognize and understand things like human beings? By combining behavioral experiments with neuroimaging analysis, Chinese scientific teams have for the first time confirmed that multimodal large language models (LLMs) based on AI technology can spontaneously form an object concept representation system highly similar to that of humans. To put it simply, AI can spontaneously develop human-level cognition, according to the scientists.
The study was conducted by research teams from the Institute of Automation, Chinese Academy of Sciences (CAS), the Institute of Neuroscience, CAS, and other collaborators.
The research paper was published online in Nature Machine Intelligence on June 9. The paper states that the findings advance the understanding of machine intelligence and inform the development of more human-like artificial cognitive systems.
Humans can conceptualize objects in the natural world, and this cognitive ability has long been regarded as the core of human intelligence. When people see a "dog," "car" or "apple," they can not only recognize their physical characteristics (size, color, shape, etc.), but also understand their functions, emotional values and cultural meanings, said Du Changde from the Institute of Automation of CAS, also the first author of the paper, China News reported.
Du said such multidimensional conceptual representations form the cornerstone of human cognition.
In recent years, with the explosive development of LLMs such as ChatGPT, the fundamental question of whether these large language models can develop object concept representations similar to those of humans from language and multimodal data has emerged and attracted widespread attention.
He Huiguang, a researcher at the Institute of Automation of CAS and the corresponding author of the paper, pointed out that traditional AI research has focused on the accuracy of object recognition, but seldom explored whether models truly "understand" the meaning of objects. "Current AI can distinguish images of cats and dogs, but the essential difference between this 'recognition' and human 'understanding' still needs to be clarified," according to a press release sent to the Global Times by the CAS on Tuesday.
The Chinese team combined behavioral experiments and neuroimaging analyses to explore the relationship between object-concept representations in LLMs and human cognition.
They collected 4.7 million triplet judgments (odd-one-out choices among sets of three objects) from LLMs and multimodal LLMs to derive low-dimensional embeddings capturing the similarity structure of 1,854 natural objects.
The resulting 66-dimensional embeddings were stable and predictive, exhibiting semantic clustering similar to human mental representations.
Remarkably, the underlying dimensions were interpretable, suggesting that LLMs and multimodal LLMs develop human-like conceptual representations of objects.
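To make the paradigm concrete, the sketch below shows, in highly simplified form, how a low-dimensional object embedding can be fitted to triplet odd-one-out judgments. All sizes, settings and data here are synthetic placeholders chosen for illustration; they are not the study's actual code, model or dataset (which involved 1,854 objects, 4.7 million judgments and 66 reported dimensions).

```python
# Simplified illustration of fitting a low-dimensional, non-negative object
# embedding so that it predicts which object in each triplet is the odd one out.
# Everything below is synthetic and illustrative, not the study's setup.
import torch

torch.manual_seed(0)
n_objects, n_dims, n_triplets = 50, 10, 5_000

# Hidden "ground truth" embedding used only to simulate choices.
true_emb = torch.rand(n_objects, n_dims)

def odd_one_out(emb, i, j, k):
    """Pick the odd object: the one left out of the most similar pair."""
    sims = torch.stack([emb[j] @ emb[k],   # pair (j, k) similar -> i is odd
                        emb[i] @ emb[k],   # pair (i, k) similar -> j is odd
                        emb[i] @ emb[j]])  # pair (i, j) similar -> k is odd
    return int(torch.argmax(sims))

# Simulate triplet judgments (stand-ins for LLM or human responses).
triplets = torch.stack([torch.randperm(n_objects)[:3] for _ in range(n_triplets)])
choices = torch.tensor([odd_one_out(true_emb, *t) for t in triplets])

# Recover a low-dimensional embedding from the choices alone.
emb = torch.nn.Parameter(torch.rand(n_objects, n_dims))
opt = torch.optim.Adam([emb], lr=0.05)

for step in range(300):
    e = torch.relu(emb)                    # keep dimensions non-negative
    i, j, k = triplets[:, 0], triplets[:, 1], triplets[:, 2]
    # Score for "object x is odd" = similarity of the remaining pair.
    logits = torch.stack([(e[j] * e[k]).sum(-1),
                          (e[i] * e[k]).sum(-1),
                          (e[i] * e[j]).sum(-1)], dim=1)
    loss = torch.nn.functional.cross_entropy(logits, choices)
    opt.zero_grad()
    loss.backward()
    opt.step()

accuracy = float((logits.argmax(dim=1) == choices).float().mean())
print(f"choice accuracy of the fitted embedding: {accuracy:.2%}")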
The team then compared the consistency between LLMs and humans in behavioral selection patterns, and the results showed that the multimodal LLMs were more consistent with human choices. The study also revealed that humans tend to combine visual features and semantic information when making decisions, whereas LLMs rely more on semantic labels and abstract concepts.
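As a rough illustration of one way such consistency can be quantified, the snippet below computes the fraction of triplets on which a model's odd-one-out pick matches the human pick, against the one-in-three chance level. The choice arrays are randomly generated stand-ins, not data from the study.

```python
# Quantifying model-human consistency as choice agreement on shared triplets.
# The arrays here are hypothetical placeholders, not the study's data.
import numpy as np

rng = np.random.default_rng(0)
n_triplets = 1_000

human_choices = rng.integers(0, 3, n_triplets)          # placeholder human picks
model_choices = np.where(rng.random(n_triplets) < 0.6,  # bias the model toward
                         human_choices,                 # agreeing with humans
                         rng.integers(0, 3, n_triplets))

agreement = float(np.mean(model_choices == human_choices))
print(f"model-human choice agreement: {agreement:.1%} (chance level ~33.3%)")
```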
He Huiguang said the study has made the leap from "machine recognition" to "machine understanding." The results show that LLMs are not "stochastic parrots." Instead, these models have an internal understanding of real-world concepts much like humans. The core finding is that, in terms of these "mental dimensions," AI and humans arrive at similar cognitive destinations via different routes.