Collaboration Will Offer Data to Train Machine Learning Tools
Researchers at the University of Iowa (UI) have received a $1 million grant from the National Science Foundation (NSF) to develop a machine learning platform to train algorithms with data from around the world.
The phase one grant will enable the UI team to lead a multi-university and industry collaboration and address concerns around patient privacy and data security in clinical AI development.
The researchers noted that although AI is widely used in healthcare, training effective machine learning algorithms requires thousands of samples annotated by doctors. Collecting that data can raise privacy and security issues, the team stated.
“Traditional methods of machine learning require a centralized database where patient data can be directly accessed for training a machine learning model,” said Stephen Baek, assistant professor of industrial and systems engineering at UI.
“Such methods are impacted by practical issues such as patient privacy, information security, data ownership, and the burden on hospitals which must create and maintain these centralized databases.”
The team will develop a decentralized, asynchronous solution called ImagiQ, which relies on an ecosystem of machine learning models so that institutions can select models that work best for their populations. Organizations will be able to upload and share the models, not patient data, with each other.
As each institution improves a model using its local patient data sets, the updated model will be uploaded back to a centralized server. This ensemble learning approach will allow the most reliable and efficient models to come to the forefront, resulting in a better AI system for analyzing images such as lung X-rays or CT scans to detect tumors.
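The model-sharing loop described above can be illustrated with a minimal Python sketch. It is not ImagiQ's implementation, which the article does not detail. Every name here (`SharedModel`, `ModelRegistry`, `local_update`, the two sites and their data) is hypothetical, and a one-parameter linear model stands in for a real imaging network. The sketch assumes institutions report a validation error for each model and that the pool is ranked by mean reported error.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class SharedModel:
    """Entry in the shared pool: only parameters travel, never patient data."""
    name: str
    weight: float  # toy one-parameter model: y = weight * x
    scores: dict = field(default_factory=dict)  # institution -> validation error

    def predict(self, x):
        return self.weight * x


class ModelRegistry:
    """Hypothetical central server that stores models uploaded by institutions."""

    def __init__(self):
        self.models = {}

    def upload(self, model):
        # Uploads may arrive at any time; there is no synchronized round.
        self.models[model.name] = model

    def report_score(self, name, institution, error):
        self.models[name].scores[institution] = error

    def top_models(self, k):
        # Rank by mean reported error; unscored models rank last.
        def avg_error(m):
            return mean(m.scores.values()) if m.scores else float("inf")
        return sorted(self.models.values(), key=avg_error)[:k]


def local_update(model, local_data, new_name, lr=0.05, epochs=50):
    """Fine-tune a copy of a shared model on data that stays on site."""
    w = model.weight
    for _ in range(epochs):
        # Gradient of mean squared error for y = w * x.
        grad = mean(2 * (w * x - y) * x for x, y in local_data)
        w -= lr * grad
    return SharedModel(name=new_name, weight=w)


def mean_squared_error(model, data):
    return mean((model.predict(x) - y) ** 2 for x, y in data)


def ensemble_predict(models, x):
    """Average the predictions of the selected models."""
    return mean(m.predict(x) for m in models)


# Made-up local data sets for two hypothetical institutions.
site_a = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
site_b = [(1.0, 2.2), (2.0, 4.4)]

registry = ModelRegistry()
registry.upload(SharedModel("base", weight=0.0))

# Each site pulls a model, improves it locally, and uploads the result.
model_a = local_update(registry.models["base"], site_a, "base+siteA")
registry.upload(model_a)
model_b = local_update(registry.models["base+siteA"], site_b, "base+siteA+siteB")
registry.upload(model_b)

# Sites evaluate every shared model locally and report only the error.
for m in list(registry.models.values()):
    registry.report_score(m.name, "siteA", mean_squared_error(m, site_a))
    registry.report_score(m.name, "siteB", mean_squared_error(m, site_b))

best = registry.top_models(2)
prediction = ensemble_predict(best, 1.0)
print([m.name for m in best], round(prediction, 3))
```

In this toy run the untrained "base" model ranks last and drops out of the top two, which loosely mirrors the idea that weaker models are phased out while stronger ones are selected by more institutions.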
The UI-led team includes researchers from Stanford University, the University of Chicago, Harvard University, Yale University, and Seoul National University.
Over the next nine months, the group will aim to develop a prototype of the system and will participate in the NSF Convergence Accelerator’s innovation curriculum to ensure the solution has societal impact. At the end of phase one, the team will participate in a pitch competition and proposal evaluation and, if selected, will proceed to phase two, with potential funding of up to $5 million over 24 months.
“ImagiQ will further federated learning by decentralizing the model updates and eliminating the synchronous update cycle,” said Baek. “We are going to create a whole ecosystem of machine learning models that will evolve and improve over time. High performing models will be selected by many institutions, while others are phased out, producing more reliable and trustworthy outputs.”
The research team is part of the AI-driven data and model sharing track of the 2020 cohort of the NSF Convergence Accelerator program, which is designed to leverage a convergence approach to transition basic research and discovery into practice. NSF is investing more than $27 million to support the phase one teams as they develop the solution groundwork for AI-Driven Data and Model Sharing.
The Convergence Accelerator’s AI-Driven Innovation via Data and Model Sharing topic involves 18 teams concentrating on solution development. These research teams will also address a variety of data- and model-related challenges and data types, including platform development to enable easy and efficient data matching and sharing.
“The quantum technology and AI-driven data and model sharing topics were chosen based on community input and identified federal research and development priorities,” said Douglas Maughan, head of the NSF Convergence Accelerator program. “This is the program’s second cohort and we are excited for these teams to use convergence research and innovation-centric fundamentals to accelerate solutions that have a positive societal impact.”