Scientists supported by NIAID are helping an international research consortium harmonize HIV data from around the world.
The International Epidemiology Databases to Evaluate AIDS (IeDEA) collects observational data representing over 2.2 million people living with and at risk for HIV. This international research consortium, established by NIH in 2006, gathers data from clinical centers and research groups in seven geographic regions, which together span 44 countries across five continents.
IeDEA networks combine de-identified health data from regional databases in multiple parts of the world for approved multiregional analyses. This helps answer HIV research questions that individual studies cannot address.
However, harmonizing datasets from different regions presents many challenges. Datasets from each region may be in different formats or languages, and regions are subject to different data-sharing regulations.
When researchers request data for multiregional studies, data managers are tasked with selecting data that match the study’s inclusion and exclusion criteria and mapping the requested data to the IeDEA Data Exchange Standard (DES). This process has historically required significant effort, which can delay delivery of the data and complicate subsequent analysis of the standardized data.
Developing the Harmonist Toolkit
To make it easier to harmonize data from multiple regions, NIAID-supported informatics specialists developed the Harmonist Data Toolkit. The Harmonist Toolkit is a web-based application that checks for data quality and DES conformance, displays possible errors for data managers to address, and generates data reports. Once a dataset meets the requisite criteria, the Harmonist Toolkit can submit the dataset to the requesting researcher.
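To make the idea of a conformance check concrete, here is a minimal sketch in Python. It does not reproduce the Harmonist Toolkit, which is built in R/Shiny, and it does not use the actual IeDEA DES. The column names, allowed codes, and the `check_conformance` function are all hypothetical, chosen only to show how a dataset might be compared against a standard and how possible errors might be listed for a data manager to review.

```python
from datetime import date

# Hypothetical standard for illustration only; this is NOT the real IeDEA DES.
STANDARD = {
    "patient_id": {"required": True},
    "birth_date": {"required": True, "type": "date"},
    "sex": {"required": True, "allowed": {"1", "2", "9"}},
}


def parse_iso_date(value):
    """Return a date for a YYYY-MM-DD string, or None if it cannot be parsed."""
    try:
        return date.fromisoformat(value)
    except (TypeError, ValueError):
        return None


def check_conformance(rows, standard=STANDARD):
    """Return a list of (row_index, column, message) for each possible error."""
    errors = []
    for index, row in enumerate(rows):
        for column, rule in standard.items():
            value = row.get(column)
            # Missing or empty values are only an error for required columns.
            if value is None or value == "":
                if rule.get("required"):
                    errors.append((index, column, "missing value"))
                continue
            # Date columns must be in ISO format (an assumption of this sketch).
            if rule.get("type") == "date" and parse_iso_date(value) is None:
                errors.append((index, column, "invalid date"))
            # Coded columns must use one of the allowed codes.
            allowed = rule.get("allowed")
            if allowed is not None and value not in allowed:
                errors.append((index, column, "value not in allowed codes"))
    return errors


# Made-up example records, not real patient data.
rows = [
    {"patient_id": "A1", "birth_date": "1980-05-17", "sex": "1"},
    {"patient_id": "A2", "birth_date": "17/05/1980", "sex": "3"},
    {"patient_id": "", "birth_date": "1990-01-01", "sex": "2"},
]

report = check_conformance(rows)
for row_index, column, message in report:
    print(f"row {row_index}: {column}: {message}")
```

In this sketch the first record passes, the second is flagged for its date format and its sex code, and the third is flagged for a missing identifier. A real tool would presumably apply many more rules, such as checks across tables and checks on date order.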
Researchers worked with IeDEA’s regional data managers to develop and implement the Harmonist Toolkit, which launched in 2019. After a year of using the Toolkit, data managers and researchers reported that the Toolkit improved the quality of datasets, generated useful reports, and simplified the task of linking datasets to the DES. High data quality improves trust in study results, leading to greater impact on patient care and health policy.
The Harmonist Toolkit is built using the R/Shiny framework, but because it runs as a web-based application, data managers do not need coding knowledge to use it. It can be hosted on a cloud-based server or locally on a laptop or desktop computer.
Stephany Duda, Ph.D., an associate professor of biomedical informatics at the Vanderbilt University School of Medicine, is the principal investigator for the Harmonist project. She said that a key to building the Harmonist Toolkit was continually involving data managers at the different regional sites.
“Anything that I design should make their lives easier and should make it easier for them to develop datasets that are standardized and adhere to best practices,” Dr. Duda said.
In 2023, Dr. Duda, along with lead author Dr. Judith Lewis and other members of the Harmonist team, published a paper in the Journal of Biomedical Informatics describing the results of the project.
“The datasets are never perfect. That’s just the nature of clinical observational research data,” Dr. Duda said. “But we have seen, as reported in the paper, a substantial downward trend in the number of errors that we’ve detected in these datasets.”
Applying the Harmonist framework to other international research
The Harmonist team is continuing to improve the IeDEA Harmonist Data Toolkit while also exploring how to adapt it for use beyond the IeDEA consortium.
The project is currently supporting harmonization efforts for the Regional Prospective Observational Research for Tuberculosis (RePORT) International consortium, which studies tuberculosis (TB) in the context of HIV and is supported by NIAID. The team is also collaborating with other consortia to create a generalized version of the Toolkit code.
“There’s so much work that still needs to be done to make international research more accessible for everybody,” Dr. Duda said. “It’s rewarding to have this opportunity to build resources that support other researchers.”
Harmonist is supported by NIAID’s Division of AIDS. Learn more by visiting RePORTER.
IeDEA is supported by NIAID as well as the Eunice Kennedy Shriver National Institute of Child Health and Human Development, the National Cancer Institute, the National Institute of Mental Health, the National Institute on Drug Abuse, the National Heart, Lung, and Blood Institute, the National Institute on Alcohol Abuse and Alcoholism, the National Institute of Diabetes and Digestive and Kidney Diseases, the Fogarty International Center, and the National Library of Medicine. Learn more about NIAID’s support for IeDEA.