IRB-approved registries available for use by researchers
To expedite clinical data delivery, UF Health’s data experts are introducing ready-for-use, UF Institutional Review Board-approved patient record registries to help faculty and staff advance medical knowledge and the delivery of care.
A new dataset has just been released featuring details about more than 300,000 patients diagnosed with or suspected of having cancer at UF Health since Jan. 1, 2012. It is available for use by anyone within the UF and UF Health community. This follows broad use of the UF Health COVID-19 patient dataset, which includes records for 300,000-plus patients who have presented with COVID-19-like symptoms and been tested for COVID-19 at UF Health since Jan. 1, 2020. With pre-approval by the IRB, the protected patient information in these registries is de-identified and delivered quickly, bypassing the customary study-specific review.
“Our faculty have reaped the benefits of expedited access to the COVID-19 dataset,” said Gigi Lipori, M.T., M.B.A., chief information officer and senior vice president for UF Health. “UF researchers have published several research papers using these data, and we’ve heard from health science professors using the dataset in the classroom.”
The release of similar condition-specific datasets after the new cancer package will continue to support faculty endeavors. Datasets under consideration would cover patients with neurological disorders or pain conditions.
At every patient encounter in our hospitals and outpatient programs, valuable clinical information is collected for the patient electronic health record that feeds the UF Health Integrated Data Repository. This master database houses more than 2 billion observation facts from more than 2 million UF Health patients and is managed by staff in IDR Operations. The data are delivered to researchers by IDR Research Services, a joint program of UF Health IT and the UF Clinical and Translational Science Institute, led by Chris Harle, Ph.D., UF Health IT chief research information officer and professor of health outcomes and biomedical informatics.
“We’re here to support and expedite the work of our research team partners. High-quality, accessible, and reusable data are at the heart of innovation in the academic health environment, especially as UF emerges as a national leader in artificial intelligence,” Harle said. “The more we can create datasets and data pipelines for rapid access, the more we can support effective, high-quality patient care, innovative clinical research, and meaningful health education.”
Data are delivered in Observational Medical Outcomes Partnership, or OMOP, Common Data Model format. As a standard format supported by a global community of data scientists, OMOP makes it easy for analysts on UF research teams to organize and answer the teams’ research questions. IDR Research Services also provides a data dictionary describing each data element available for analysis.
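For analysts new to the format, the sketch below shows roughly what a question posed against OMOP-style tables can look like. It is a minimal illustration, not the IDR's actual delivery or tooling: the table and column names (person, condition_occurrence, condition_concept_id) follow the public OMOP Common Data Model, while the rows, the concept IDs 1001 and 2002, and both function names are invented for this example.

```python
import sqlite3


def build_toy_omop(conn):
    """Create two OMOP-style tables and load a few made-up rows."""
    conn.executescript(
        """
        CREATE TABLE person (
            person_id INTEGER PRIMARY KEY,
            year_of_birth INTEGER
        );
        CREATE TABLE condition_occurrence (
            condition_occurrence_id INTEGER PRIMARY KEY,
            person_id INTEGER REFERENCES person(person_id),
            condition_concept_id INTEGER,
            condition_start_date TEXT
        );
        """
    )
    # Fabricated toy patients, not real records.
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)",
        [(1, 1950), (2, 1985), (3, 1972)],
    )
    # Concept IDs 1001 and 2002 are placeholders, not real OMOP concepts.
    conn.executemany(
        "INSERT INTO condition_occurrence VALUES (?, ?, ?, ?)",
        [
            (1, 1, 1001, "2020-03-15"),
            (2, 1, 1001, "2020-04-01"),  # same patient, second record
            (3, 2, 1001, "2020-05-20"),
            (4, 3, 2002, "2021-01-10"),
        ],
    )


def count_patients_with_condition(conn, concept_id):
    """Count distinct patients with at least one record of the given condition."""
    row = conn.execute(
        """
        SELECT COUNT(DISTINCT p.person_id)
        FROM person AS p
        JOIN condition_occurrence AS c ON c.person_id = p.person_id
        WHERE c.condition_concept_id = ?
        """,
        (concept_id,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_toy_omop(conn)
    print(count_patients_with_condition(conn, 1001))
```

Because every OMOP dataset uses the same table and column names, a query written this way could in principle be reused across registries by changing only the concept ID, which is looked up in the accompanying data dictionary or vocabulary.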
According to Tanja Magoc, Ph.D., assistant director for IDR Research Services, “Clinical researchers might use the data to apply artificial intelligence or other analytic methods to develop predictive models of specific patient outcomes, to develop phenotypes, or to evaluate success of different medical treatments. Faculty, fellows, students
Important links: To access pre-approved data via these user-friendly self-serve tools, please make a request online at IDR.UFHealth.org. For more information on open access data banks, see the UF IRB’s Research Investigator Guidelines.
Courtesy Associate Professor of Health Outcomes & Biomedical Informatics | UF College of Medicine (Former Director of Artificial Intelligence and Decision-Making for Health Outcomes & Biomedical Informatics)
In which class did you use the COVID-19 dataset?
We used it for a key course in the graduate programs in biomedical informatics, titled “Foundations of Biomedical Informatics,” initially developed by Bill Hogan, M.D., M.S., director of biomedical informatics and data science.
The course I taught was project-based, and students worked in teams. Each team decided on a topic of interest, and one team was interested in exploring the COVID-19 dataset. Aside from class time, I met with each team weekly to fine-tune the question they were answering. The team working on COVID-19 moved pretty quickly, so the work with them was really to help them explore the dataset in a critical fashion. They worked quickly enough that their work, focused on COVID-19 and nutrition deficiencies, is currently under review for a journal publication.
What was helpful about the availability of this dataset?
The fact that the data didn’t require IRB approval made it practical to use in class. In 15 weeks, if we can’t get data in the first couple of weeks, there isn’t enough time to do something meaningful, so having a fully de-identified dataset that’s under an umbrella IRB makes it so much easier. Also, having a very thorough data guide and the data provided in OMOP format made the dataset very straightforward to use.
Do you teach students about the Integrated Data Repository?
Yes, and in my lectures I also cover working with the IDR Research Services UF Health i2b2 cohort discovery tool.
What advice would you offer faculty who may want to use similar datasets for teaching?
Just do it! It’s a lot of fun, and it gives your students real hands-on experience. You can use the dataset in various ways: for instance, to understand the importance of data representation, to present the OMOP data model and compare it with other models, or to use the dataset for analysis purposes.
Is there anything else you’d like to add?
I just want to thank the IDR Research Services team for developing these tools. They’ve made it possible for us to combine teaching and research very seamlessly. So, thanks Chris, Tanja and team!
Author: Kim Rose, Communications Manager, UF Health IT