New Study Highlights Gaps in AI Search Accuracy and Reliability

A recent study has raised important concerns about the reliability of AI-powered search tools, showing that many of their responses are not backed by the sources they cite. The findings suggest that while these tools may be fast and convenient, their outputs often require closer scrutiny.

The study, conducted by researchers at Salesforce AI Research, examined the performance of several leading AI-based search platforms—including Perplexity, You.com, Microsoft’s Bing Chat, and OpenAI’s GPT-4.5—using a purpose-built evaluation framework called DeepTRACE. The audit involved more than 300 questions and assessed the tools across eight different criteria.

According to TechXplore, the results showed significant inconsistencies. GPT-4.5, for example, produced unsupported claims in nearly half of its responses (47%), while the rates for the other tools fell between 30% and 40%. In many cases, citations either failed to support the claims they accompanied or pointed to unrelated material altogether.

The audit also tested how these systems handle two types of questions: debate-oriented prompts, which involve contentious or politically sensitive topics, and expertise-driven queries that demand subject-specific knowledge. In debate scenarios, AI responses often presented a one-sided viewpoint, with little consideration for counterarguments. These unbalanced replies were frequently delivered in an authoritative tone, raising concerns about their potential to reinforce user biases and limit exposure to diverse perspectives.

Human reviewers verified the results produced by DeepTRACE to ensure the findings were accurate and reflective of real-world usage.

Beyond identifying flaws, the study also demonstrates a path forward. DeepTRACE provides a practical method for systematically evaluating how AI systems handle information retrieval and citation. The researchers argue that such frameworks are essential as AI tools become more integrated into public and professional information workflows.
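To make the kind of scoring such a framework performs more concrete, here is a minimal sketch of how an unsupported-claim rate could be computed once each statement in a response has been matched against its citations. This is an illustration only, not the DeepTRACE implementation; the `Statement` structure and the `supported` labels are hypothetical stand-ins for judgments that, in the study, were produced automatically and then verified by human reviewers.

```python
from dataclasses import dataclass

@dataclass
class Statement:
    text: str
    citations: list[str]  # sources the answer cites for this statement
    supported: bool       # whether at least one cited source actually backs it

def unsupported_rate(statements: list[Statement]) -> float:
    """Fraction of statements whose citations fail to support them."""
    if not statements:
        return 0.0
    return sum(not s.supported for s in statements) / len(statements)

# Toy three-statement answer in which one claim lacks real support.
answer = [
    Statement("X rose 12% in 2024.", ["source-a"], supported=True),
    Statement("Experts agree the trend will continue.", ["source-b"], supported=False),
    Statement("The policy took effect in March.", ["source-a"], supported=True),
]
print(f"Unsupported-claim rate: {unsupported_rate(answer):.0%}")  # prints 33%
```

A per-response rate like this, averaged over hundreds of questions, is what allows figures such as the 47% reported for GPT-4.5 to be compared across systems.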

While AI continues to offer benefits in automating search and research tasks, the findings serve as a reminder that users should approach outputs critically and verify sources when accuracy matters.

The full findings are available on the arXiv preprint server.