- Microsoft, Amazon, and OpenAI have launched AI health tools, with Copilot handling 50 million daily health questions, indicating massive unmet demand.
- Researchers warn that products are being released before independent experts can evaluate them, a serious risk in a field as high-stakes as healthcare.
- Users lacking medical expertise may not know how to effectively use health chatbots, a gap that lab-based testing might miss.
- The absence of third-party data leaves it unclear if these tools help more than they harm, driving calls for transparency standards.
The artificial intelligence sector is witnessing an unprecedented surge in health-focused tool development. In recent weeks, tech giants including Microsoft, Amazon, and OpenAI have unveiled products designed to deliver medical advice through advanced chatbots. Microsoft introduced Copilot Health, a dedicated space within its app where users can link their medical records and ask specific health questions. Amazon expanded access to Health AI, a large language model-based tool previously limited to members of its One Medical service. These join OpenAI's ChatGPT Health, launched in January, and Anthropic's Claude, which can access user health records with permission.
This matters because AI is reshaping how people access healthcare: without rigorous, independent evaluation, millions of users could come to rely on unproven tools, with direct consequences for public safety.
Demand Drives the Boom
The driving force behind this trend is massive, unmet demand. Microsoft reports that its Copilot platform fields 50 million health-related questions daily, making health the most popular discussion topic on its mobile app. Karan Singhal, who leads OpenAI's Health AI team, confirms a rapid increase in ChatGPT usage for medical queries even before the specialized products debuted. This flood of inquiries reflects an uncomfortable reality: traditional healthcare is difficult to navigate, costly, and, for many populations, effectively out of reach. Girish Nadkarni, chief AI officer at the Mount Sinai Health System, notes that these tools have found their niche precisely because they fill a critical gap in care.
The Independent Evaluation Gap
Despite corporate enthusiasm, the six academic researchers interviewed for this analysis voiced the same concern: products are reaching the public before independent experts can rigorously assess their safety and efficacy. Andrew Bean, a doctoral candidate at the Oxford Internet Institute, argues that while it's plausible models have reached a point worth deploying, the evidence base must be solid. The risk lies in trusting companies to evaluate their own tools in a high-stakes area like health, especially if those evaluations aren't available for external review. Even rigorous internal research, such as that conducted by OpenAI, may have blind spots that the broader scientific community could identify.
Without trusted third-party evaluations, it remains genuinely unclear whether today's AI health tools help more than they harm.
Limitations of Lab Testing
Current studies suggest that real users, who lack medical expertise, may not know how to phrase questions in a way that draws useful answers from health chatbots. This gap between controlled lab conditions and real-world use is exactly the kind of problem that lab-based evaluations can miss. Dominic King, vice president of health at Microsoft AI and a former surgeon, attributes Copilot Health's launch to advances in generative AI's ability to answer health questions, though he acknowledges that demand is the other half of the equation. The ideal vision is that these chatbots improve users' health while easing pressure on the healthcare system, for example by helping with triage and deciding whether a symptom needs urgent medical attention.
Implications and What's Next
The proliferation of AI health tools raises urgent questions about regulation and transparency. No one demands perfection, but the absence of independent data leaves users and regulators unable to gauge the risks. As more companies join the race, pressure will grow to establish standards for testing these tools and publishing the results. The future of AI in health will depend on balancing rapid innovation with the responsibility to ensure these systems are not only popular but also safe and effective for all users.