AI shows signs of cognitive decline similar to humans

A recent study reveals that nearly all leading large language models, or “chatbots,” display signs of mild cognitive impairment in tests designed to detect early dementia. The findings also indicate that “older” chatbot versions, similar to older patients, perform worse on these tests. The authors argue that these results challenge the belief that artificial intelligence will soon replace human doctors.

Almost all of the leading chatbots show signs of mild cognitive impairment on dementia tests, with older versions performing worse, mirroring patterns seen in aging humans.

The results “challenge the assumption that artificial intelligence (AI) will soon replace human doctors,” according to the authors of the study published in The BMJ, a peer-reviewed medical journal.

While chatbots have excelled in medical diagnostics, fueling speculation that AI might eventually surpass human physicians, their vulnerability to human-like impairments, such as cognitive decline, had not been examined until now.

Researchers assessed the cognitive abilities of the most popular publicly available large language models (LLMs), including OpenAI’s GPT-4 and GPT-4o, which power ChatGPT, Anthropic’s Claude 3.5, and Alphabet’s Gemini versions 1.0 and 1.5.

They found that almost all showed signs of mild cognitive impairment, while older versions, like older patients, tended to perform worse on the tests.

The models were examined using the Montreal Cognitive Assessment (MoCA) test. This test detects cognitive impairment and early signs of dementia, usually in older adults.

Through a number of short tasks and questions, it assesses abilities including attention, memory, language, visuospatial skills, and executive functions. The maximum score is 30 points, with a score of 26 or above generally considered normal.

The examined LLMs were given the same instructions for each task as those given to human patients. Scoring followed official guidelines and was evaluated by a practicing neurologist.

GPT-4o achieved the highest score, 26 out of 30, followed by GPT-4 and Claude with 25 out of 30 each. Gemini 1.0 scored the lowest, with 16 out of 30.
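To make the scoring concrete, the short Python sketch below applies the cutoff described above (26 or more out of 30 is generally considered normal) to the four totals reported here. The names `reported_scores`, `is_normal`, and `below_cutoff` are illustrative assumptions, not part of the study, and the cutoff is a screening guideline rather than a diagnosis.

```python
# Illustrative sketch only: names and structure are assumptions, not the study's code.
MOCA_MAX = 30        # maximum MoCA total
NORMAL_CUTOFF = 26   # 26 or above is generally considered normal

# Totals as reported in this article (no figure is quoted here for Gemini 1.5).
reported_scores = {
    "GPT-4o": 26,
    "GPT-4": 25,
    "Claude": 25,
    "Gemini 1.0": 16,
}


def is_normal(score: int, cutoff: int = NORMAL_CUTOFF) -> bool:
    """Return True if a MoCA total meets the screening cutoff."""
    if not 0 <= score <= MOCA_MAX:
        raise ValueError(f"MoCA total must be between 0 and {MOCA_MAX}")
    return score >= cutoff


def below_cutoff(scores: dict[str, int]) -> list[str]:
    """List models scoring under the cutoff, lowest score first, then by name."""
    flagged = [(score, name) for name, score in scores.items() if not is_normal(score)]
    return [name for score, name in sorted(flagged)]


if __name__ == "__main__":
    for name, score in reported_scores.items():
        label = "normal range" if is_normal(score) else "below cutoff"
        print(f"{name}: {score}/{MOCA_MAX} ({label})")
```

Under this cutoff only GPT-4o reaches the normal range, which is consistent with the finding that almost all, rather than all, of the models showed signs of impairment.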

According to the study, all chatbots performed poorly in visuospatial skills and executive tasks, such as the trail-making task, which involves connecting encircled numbers and letters in ascending order.

They also struggled with the clock-drawing test, which requires drawing a clock face showing a specific time, while Gemini models additionally failed the delayed recall task, which tests memory of a five-word sequence.

All chatbots performed well on most other tasks, including naming, attention, language, and abstraction.

While these are observational findings, and the authors acknowledge key differences between the human brain and large language models, they note that the “uniform” failure in tasks requiring visual abstraction and executive function highlights a significant weakness for chatbot use in clinical settings.

“Not only are neurologists unlikely to be replaced by large language models any time soon, but our findings suggest that they may soon find themselves treating new, virtual patients – artificial intelligence models presenting with cognitive impairment,” the authors said.

Nelson Saliu https://techpolyp.com/
