What Exactly is Open Access To AI and Why We Are Not There Yet in Healthcare? (Bart De Witte)
In order for AI and algorithms to help improve the health of many, we should strive for algorithms to be open and transparent, says Bart De Witte, founder of HIPPO AI Foundation, a renowned expert on digital transformation in healthcare in Europe, who regularly speaks and posts about technology and innovation strategy, with a particular focus on the socioeconomic impact on healthcare.
After an intense race in AI development sparked by the release of ChatGPT at the end of 2022, two important things happened in the last week of March 2023. First, over 1,000 people from the tech world signed a public letter urging a pause on AI development until society decides how humans can control it. The signatories included Elon Musk, CEO of Tesla, Twitter and SpaceX; Steve Wozniak, co-founder of Apple; Yoshua Bengio, founder and scientific director at Mila, Turing Award winner and professor at the University of Montreal; and Stuart Russell, professor of computer science at Berkeley, director of the Center for Intelligent Systems, and co-author of the standard textbook “Artificial Intelligence: A Modern Approach”.
As written in the letter, “Powerful AI systems should be developed only once we are confident that their effects will be positive and their risks will be manageable.”
A day after this letter was published, UNESCO published a press release calling on all governments to immediately implement the global ethical framework, which was unanimously adopted by the 193 Member States of UNESCO. As UNESCO warned, we need to address the many concerning ethical issues raised by AI innovations, in particular discrimination and stereotyping, including the issue of gender inequality, but also the fight against disinformation, the right to privacy, the protection of personal data, and human and environmental rights. And the industry cannot self-regulate, the press release states.
Healthcare is moving from the era of gathering data through digitalized systems, EHRs, sensors, and wearables to the era of mining that data for better patient outcomes and operational efficiency. But how widely accessible are these innovations?
In this short discussion, recorded at the Vision Health Pioneers Demo Day on 28 March in Berlin, Bart explains why open and transparent AI is important for the greater good in healthcare and where global medical development is heading given different values and regulations around AI and data, and he comments on the upcoming European Health Data Space.
TRANSCRIPT
[00:03:52] Bart, what are some of the discussions about open AI that we are not having, but should be having?
[00:04:09] Bart: Well, we should all consider whether we are now dependent on one single institute or closed system that is influencing a hundred million brains at the moment. And we don't have any insight into that model, into the training data, or into anything that has been used. We should really consider if we really wanna go back to the Middle Ages, where open access to knowledge was not given and where there was one single institution that defined what morality was. There was an institution in Rome that had a secret library, called the Apostolic Library, in the Vatican, and we bought letters of indulgence. That was the Middle Ages. And are we going back to that? So we can decide if we still wanna enlighten ourselves and get open access to knowledge, or if we really wanna subordinate ourselves to a single AI that defines what the truth is.
[00:05:03] Where do you see the most resistance against open AI in medicine?
[00:05:11] Bart: Well, medicine is a bit like Game of Thrones, so nobody wants to really open; it's all about kingdoms that want to keep on having their authority. So I think the biggest resistance is culture and mindset.
I think people don't really understand what openness means and what the advantages are. I think one of the triggers is, well, the incentives in research: a researcher wants to be the first one to publish that paper. So he will not collaborate with a hundred other institutes to build a larger AI model, because then he needs to share his paper with a hundred other authors. I think we need to work on incentives, and we need to see how we can use all these resources in a smart way.
“OpenAI has listed a thousand employees; how many researchers do we have in Europe in public research institutes working on AI? Probably some 20,000. And we cannot compete because we are not aligned. And I think we need to align on a shared vision of what we really want to achieve in Europe.”
[00:06:18] When ChatGPT was released to the public, many saw this as a revolution that really democratized the use of AI. All you had to do, and all you still have to do, is go to the chat page and basically discuss things with the model, which learns from that as well. And many see this as very democratizing, but I don't think you share that opinion. Why is that?
[00:06:45] Bart: A democracy is not somebody deciding whether you get access. That's feudal. If you own territory and you have a gate, and you say who can come in and who cannot, and you define the rules, then that is feudalism; that is not democracy.
Modern democracy was defined in Lincoln's address, which said that government has to be of the people, by the people, and for the people. And what OpenAI does now is for the people, but not of the people and by the people. And truly democratized technology means that we have common ownership, not public ownership controlled by our government. Because that's also dangerous; that's what China is doing. I think we need to work on commonly owned things, and that's what we call AI Commons or Data Commons.
[00:07:35] It's still sometimes difficult to understand what a world where algorithms are open would look like. So given that you are also an advisor to digital health startups and companies across the world, what kind of business model would you argue they can have if they don't protect their IP and what they are developing internally? It takes a lot of resources to develop algorithms, to pay people, and to be competitive on the market.
[00:08:11] Bart: So what if you share these resources? You say it takes a lot of resources. What if you share that cost with 10 others and you start jointly building things together? You will have 10% of your R&D costs. It doesn't make sense that every single startup is buying data and training its own AI models. In the beginning, that really works, but over time they just wanna scale their product. They want to create a great customer experience. And there is this false equation on the market: people think that the price of a product is equal to... the closed AI and the value of those closed assets. No. That is in the interest of the investor. The investor wants to create assets, like financial securities. So what is happening here is that the financial market is capturing data and capturing AI models that we protect, not with IP but with trade secrets, and then the investor can exit that startup. That's what happens when you have an exit mindset: you want to capture as much value as possible and make things as closed as possible.
So if you have an investor that thinks that way and you want to get him on board, think twice about what your actual goal is. If you're going for a three-year run, or you wanna do an exit and quickly get some bucks, that's fine.
“But if you really wanna create truly great products, then a physician will trust an AI service that is open and peer-reviewable more than something that is closed.”
And I think if you have more trust in that solution, your solution will scale faster. You will get more clients, and you will get more trust in that market. And then you will become perhaps a market leader. I don't think all these things that we are developing today are really built for scaling a certain business.
They are built more with the investor mindset: to capture as much value as possible and to be able to sell that value quite quickly.
[00:10:08] Different cultures and different countries have different regulations when it comes to data privacy and data sharing, and different rules on what needs to be taken into account before you can use data to either train your models or develop new solutions. So from that perspective, and especially given that you lecture across the world and also in China, are we going in a direction where there's going to be a discrepancy in access to new medical and scientific findings because of these different rules?
[00:10:45] Bart: People think... Yesterday, strangely, I had a discussion with somebody from the Ministry of Health in Germany who thought IP on data was something really good. So what is the role of the ministry, I asked. It's crazy that he made that statement. I think there is this misconception that only when you close things are you able to get investments and innovate.
Look at 20 or 30 years of software development. 30 years ago, everybody was closed. You could not even learn how to program because you had to go to a company called Unix or IBM to learn how to code. Nothing was publicly available; everything was proprietary. And as everything got democratized, people could learn how to code. They can do it at home. They can read it. They can go into libraries, as an analogy to the books that we were reading: we can go to code libraries and read code and then replicate that code. And that created the whole ecosystem.
And people forget, here on the demo day today, that all these startups would not exist if software had not been democratized. Without a democratization of these tools, you can't build. I have a bit of a feeling that, the way people think today, if Gutenberg had invented his printing press today, his investor would probably advise him to patent every single printed letter of the alphabet.
And he would probably think that's a good business model, because you're gonna patent every single letter. That would've meant that everybody who wrote a book needed to license every single letter for writing that book. We probably would've seen only a thousand books. Luther would probably never have written his Bible and we probably would still be buying letters of indulgence.
It's like we haven't learned anything from history, from what openness did and what the Enlightenment did. And I think
“in healthcare, there was a lot of proof during Covid. All the barriers were broken down. The fact that we got the vaccine so quickly was based on open data sharing from the Chinese government on a platform called virological.org. There were so many open-source platforms and collaborations, and we saw the speed. Now we are pulling back; we are pulling back. We are fencing the walls and things are slower again.”
And I think we could see a glimpse ... I think it's an important moment in history that we suddenly had a joint enemy that was taking our freedoms, and because of the joint enemy, we started to collaborate. I think that the best way to trigger open source collaboration is to ask: what is your enemy? What is the joint enemy that makes you collaborate and start joining forces, to build something equally... performant as closed AI or any sort of solution?
[00:13:27] Europe has a very fragmented healthcare ecosystem because of different regulations and different languages, and different countries structure their healthcare systems differently. But at the same time, we are building the European Health Data Space, which aims to do a few things that point towards the open approach: anonymizing data and enabling researchers to get a broader pool of available data for research. What are some of the things you believe that we, as individual patients or consumers, should be mindful of when it comes to the European Health Data Space and the whole development that we expect to see?
[00:14:12] Bart: I oppose the idea of even setting any sort of paywall in front of data, because as soon as even one hospital starts to say "we need to pay for these resources, we need to clean the data and pay for infrastructure, so we're gonna ask for a fee," it's similar to what we did with public access to research papers when we shifted it to Elsevier and Springer. Now people need to pay 60 euros to get access to a research paper that was paid for by tax dollars, or euros in this case, and we can't access it. We are now doing the same with data. Today we have a paywall that is perhaps limited, but putting up a paywall is already problematic. Why? Because as soon as you equate data with capital, you allow larger, capital-rich organizations to access more data pools just because they have the resources.
And I think that is just going to accelerate the monopolization of these things. The second thing I criticize: there is one passage in the proposal for the European Health Data Space where it is written that data can still be protected by IP, which is legal bullshit, sorry for my words, because you cannot protect data with IP.
You can protect databases with IP, but not data, but it is written in that way. And it was written that certain data sets need to be protected. That means that within that common data space, there will be private organizations that will always keep their data and will not share it, while the public infrastructures are sharing their data. That is just going to create new asymmetries, because it's a pull from the public sphere into a private sphere.
And I think that is trying to do something good but then trying to make a compromise, and compromises in this case are not gonna lead to good results. And then the third thing is that in Germany, they started pushing for the opt-out, and I said, yes, opt-out is perhaps nice. But this week I used a dermatology app from a German startup, and I needed to answer a question: do I give consent to use my data for research?
And I asked them, before I answer the question, can you tell me whether my data is gonna be used to create closed AI protected by IP or trade secrets, or whether it is gonna be open source? I never got an answer, so I never used the app. And I think we need to start differentiating, because consent for research today, in the data science and AI industry, means that you give consent to data scientists, and that allows people to close off the knowledge they extract from the data and sell it back to you.
And I think that is just a wrong formulation. So I think if we can create an option where we have data trusts to which we can give our data, and these data trusts ensure that all our data will be used for open knowledge, I think that will be a better win. My preference would rather be that Europe really enforces open source standards, because then all the others will not be able to enter our market, because we say: well, we don't close these things.
You need to open it. And the market advantages of Google and others will disappear in healthcare, and perhaps we will have 20,000 small and medium enterprises spread across Europe, creating local jobs. I think that's a much more promising scenario than thinking that we need a Google in Europe. I think that's absolutely not consistent with what Europe actually is. Although people dream of these things.
Become a part of the community that works on open sourced AI: www.askpaper.ai.