The National Committee for Quality Assurance (NCQA) has launched an artificial intelligence working group to determine how best to measure the performance of high-risk AI once it has been deployed by health plans and providers.
The 35-year-old organization runs a suite of quality measurement and reporting programs, including health plan accreditation and the Healthcare Effectiveness Data and Information Set (HEDIS) measures used by 90% of health plans, according to the Office of the Assistant Secretary for Planning and Evaluation.
The NCQA has convened more than 30 organizations to share their experiences using AI and help create standards for the technology. Members of its AI working group include the American Academy of Family Physicians, America's Health Insurance Plans, Blue Cross Blue Shield of Tennessee, the Community Care Plan, Covered California, the Kaiser Foundation Health Plan and UnitedHealthcare.
"As the modality of care, as the channels of healthcare delivery continue to evolve, and as we continue to see a very evolving healthcare delivery landscape, we do want to take a very hard look at what additional things we can do to continue putting that lens on quality and putting quality front and center," Vik Wadhwani, chief transformation officer at NCQA, said in an interview.
The NCQA hopes the working group's monthly convenings will help the organization learn about the issues health plans and providers face in post-deployment monitoring of AI. It also hopes that as members share their experiences using responsible yet innovative AI, they will learn from one another.
The group also wants to build consensus on risk-based AI governance and monitoring, though Wadhwani noted that definitions could vary by use case. As a starting point, the group will define high-risk AI use cases.
The NCQA seeks to identify where health plans and providers need help monitoring the effectiveness of AI. The problems they surface will then shape the group's output, which could include policy alignment on transparency requirements, performance management of AI systems or an AI playbook.
The NCQA AI working group will focus on high-impact, high-risk AI at the intersection of quality and patient safety.
“Not all AI methods have the same level of risk,” Wadhwani said. “Large language models, generative AI [have] a different level of risk than very deterministic algorithms that have been in production in many places for a long time.”
Wadhwani is also interested in performance measurement for AI-assisted systems, in which multiple algorithms and agents work together to complete tasks. How an organization might monitor and evaluate the effectiveness of such a complex system is a question the group may consider.
“The industry has spent a lot of time focusing on, rightfully so, the effectiveness of individual AI models, but there's a lot more thinking to be done around AI-assisted systems and platforms. In those use cases, there aren't one, two or three models in a high-risk use case like utilization management, there's actually a series of models with agents orchestrating those models on either side of the stakeholders that are part of the process and value chain here," he said.
The working group will focus on post-deployment measurement, after health plans and providers have already purchased AI technology. It will not create resources for AI vendors related to model training or data use for model testing.
The NCQA has a variety of health plan and provider accreditation, certification and recognition programs. Wadhwani said the AI working group is still determining what the end product of its work will be. It could mirror existing quality measurement programs, be an add-on to an existing framework or be a toolkit.