F128 What Are We Missing About AI Development in Healthcare? (Casey Ross)

 

AI has huge potential to do good, but we're not quite there yet, says Casey Ross, a STAT News investigative reporter focused on AI development in healthcare.

There is no shortage of optimism and inspiring ideas about how AI could impact healthcare. The basic notion is simple: AI can crunch numbers and turn them into insights that improve one's health. In the future, with more data and more devices, we may have AI tools that come close to healing. Maybe someday that will be true. At the moment, there is still a lack of clarity around how useful the existing tools are and how they are regulated. We are in the early stages of AI development, especially in the diagnostics and decision support space.

“It is a very hard thing for clinicians to evaluate AI since there is a lot of positivity in the headlines about the use of the technology. It takes a lot of examining the research to see what is reported, what is not reported, whether or not there is any data on the racial composition of the dataset, the geographic distribution of the dataset, or the gender of participants. And if those questions aren't asked, and they're not answered, then I don't think there's going to be confidence in these products,” says Casey Ross, who among other projects spent several months digging into data reported in hundreds of pages of documents filed with the FDA over the last six years by companies that ultimately gained approval for products that rely on AI.

The challenges in the approval of AI tools

Apart from the lack of data about the racial composition of the samples, sample sizes varied widely: some included only 100 patients, others 15,000.

“Sometimes the differences make sense in that some products are meant to be used in clinical care and others by consumers. But even looking within specific product codes of products that are essentially designed to be used in the same way, there was a significant difference in the amount of data that was being applied in the validation study. And there just isn't any public-facing information that tells you what the threshold is. What is the power analysis that's being done on the data to determine whether the effect size can be achieved based on the data set?”

Who is competent to assess AI?

Casey Ross.


With the rising complexity of technology development, it is impossible for consumers and clinicians to gain the expertise needed for a valid assessment. That is why the role of regulators is that much more important. It also puts additional pressure on regulators to have the expertise needed for reliable and safe evaluation. The way the FDA works, explains Casey Ross, is that when it approves a product, it relies on follow-up studies with providers who agree to use the technology.

“Right now, there is a lot of pressure on providers to use and take up these products. And they really don't know that they're going to work. And the danger I think is really that without the confidence in the assessment, the FDA's approval begins to look more like a rubber stamp as opposed to a real meaningful certification of a product's value.”

To be fair, the FDA is asking a lot of important, valuable questions about how the products work, how AI is applied, and how the validation was done. “But we don't get to see the answers and we don't get to see that back and forth. So it is very hard to get any clear picture of what is actually being done. That was surprising to me. I expected to get a better sense of how that process was unfolding,” Casey Ross says.

Early days of studies about the use of AI in clinical practice

At the moment, most of the studying, testing, and validation is done on retrospective data sets; the tool is then plugged into a clinical setting and used in prospective care. “So you can't really tell if in a more complicated, messy clinical setting the introduction of the AI tool is actually gonna benefit patient outcomes. There isn't enough data on that as yet because AI development is just in the beginning in different domains.”

This issue is being addressed by introducing products in clinical settings and studying them. “The Massachusetts Institute of Technology has a breast cancer risk prediction tool, which is now being tested in countries all across the world, from Nordic European countries to Taiwan, to Brazil, to Africa, to African-American populations. It takes an awful lot of commitment to ensuring fairness, that it's gonna work for everybody,” Casey Ross adds.

How good should an AI tool be, if it outperforms doctors?

This opens up the question of the assessment criteria for technology. What if an AI tool outperforms clinicians but is not 100% accurate in all cases? That should be taken into account in technology assessment. “In so many different domains of our lives we use machines that work. And when you use a machine, you expect it's going to be perfect every time; it doesn't get tired. Artificial intelligence is trained by human beings on data with inherent biases within it. I think judging it on an absolute, no-fail scenario is just not the right way to look at it, because I don't think it can ever really pass that test.”

Cost savings? Unclear.

The potential of AI is, in theory, huge. However, this is not a free technology, and another open question is whether its use will bring down costs or just increase healthcare expenditure. Looking at history, the latter is not impossible. Casey Ross: “One story I think about all the time is from the late nineties, when an earlier generation of AI was put into use in breast imaging: CAD, computer-aided diagnosis technology. It was approved by the FDA in the late 1990s, it got reimbursement through CMS, and everybody started using it because everybody thought it would make care better. Seven years later, after huge studies were done, we found out that it did not improve outcomes. It did not improve care. It only added $400 million in cost a year. And that's the danger that we're facing right now.”

Tune in for the full episode.

Some questions addressed:

  • Casey, you are an investigative reporter covering the use of artificial intelligence in medicine and its underlying questions of safety, fairness, and privacy. So in your opinion, is AI more a good or more a bad thing? Are we on the right path to use it for good?

  • Recently you reported that only 7 of 161 AI products cleared by the FDA in recent years included any information about the racial composition of their datasets. Those devices were cleared to use AI for the diagnosis of a wide array of serious conditions, including heart disease, strokes, and respiratory illnesses. I was following some presentations of AI solutions at a digital health conference in the UK, and from the listener's perspective it's very difficult to be anything but impressed. Usually, the tools address very specific and detailed medical issues, and the solutions seem like a bright light at the end of the tunnel. Obviously, the results are usually compelling. So how can a clinician look at a specific solution critically when they don't have AI knowledge?

  • If we look at your research and insight into how different the datasets submitted to the FDA are, it seems that regulators lack clear guidelines as to what criteria they use to evaluate and approve AI solutions. The number of patients in the 161 AI products cleared by the FDA between 2012 and 2020 that you analyzed ranged from 100 to 15,000. I find that quite concerning. Should we be concerned, or are the discrepancies potentially unproblematic?

  • You covered the downfall of Watson and the revelation that the technology was, more than anything, well marketed. However, for years we lived under the impression that Watson was the next big thing in medicine. Why is it, in your opinion, that it takes years for the truth about fraud in disruptive technologies to be revealed? Is the fact that disruptive technologies are new and unconventional, and hence difficult to judge, part of it? (Watson was at the forefront of AI in healthcare, as was Theranos in biotech.)