NIAID has awarded contracts to seven small businesses to develop new software or web services that make infectious- and immune-mediated disease (IID) data easier to find and reuse. Software developed by these projects may lay the groundwork for applications such as artificial intelligence (AI).
Each year, NIH solicits research proposals from small businesses through a joint Small Business Innovation Research (SBIR) solicitation with the Centers for Disease Control and Prevention (CDC). The SBIR solicitation, typically released in August, funds qualifying U.S. small businesses to engage in research and development that supports NIAID’s public health mission.
The seven Phase I contracts were awarded in response to the 2024 solicitation. They fall under two priority topic areas identified by the NIAID Office of Data Science and Emerging Technologies (ODSET) in close consultation with the NIAID Division of Allergy, Immunology, and Transplantation; the Division of Microbiology and Infectious Diseases; and the Division of AIDS.
SBIR contracts to support small businesses are distinct from SBIR grants; visit the NIAID Grants and Contracts site to learn more about the different NIAID small business programs.
Automating metadata enrichment
The SBIR topic “Software or web services to automate metadata enrichment and standardization for data on infectious and immune-mediated diseases” seeks to streamline and enhance the accuracy and consistency of metadata for data related to IID.
Metadata provides key information about other data. High-quality metadata makes datasets more accessible and machine-readable, which could accelerate discovery with the help of AI models.
However, creating high-quality, rich, standardized metadata consistent with FAIR (Findable, Accessible, Interoperable, and Reusable) guiding principles can be time-consuming work. That is why NIAID is funding a small-business project to automate parts of that process.
The goal is to develop software that will automate the process of “enriching” metadata. Rich metadata contains the information researchers need to determine a dataset’s potential value for reuse. Along with common administrative information, such as the author or creator of a dataset, rich metadata also includes a description of the data content, the scientific methods used to create the data, and the data’s provenance.
The small business contract recipient is tasked with developing software that automatically generates metadata that aligns with FAIR guiding principles, based on ontologies and standard vocabularies. The metadata should comply with schemas used by major data repositories so that researchers can submit it to those repositories after using the software. The software will incorporate feedback from the IID research community and be tested by researchers.
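To make the idea of automated enrichment concrete, the following is a minimal sketch in Python, not the contract recipient's actual approach. It assumes a tiny hand-written vocabulary (`VOCAB`) standing in for a real ontology service, and a hypothetical dataset record with `name` and `description` fields. The `enrich` function and the `about` field are illustrative names, not part of any repository schema.

```python
import re

# Hypothetical mini-vocabulary standing in for a real ontology lookup.
# Keys are lowercase synonyms; values are (term ID, preferred label).
VOCAB = {
    "human": ("NCBITaxon:9606", "Homo sapiens"),
    "homo sapiens": ("NCBITaxon:9606", "Homo sapiens"),
    "sars-cov-2": ("NCBITaxon:2697049",
                   "Severe acute respiratory syndrome coronavirus 2"),
}


def enrich(record):
    """Return a copy of record with standardized terms found in its text.

    Only whole-word matches count, so "humanized" does not match "human".
    """
    text = " ".join(
        str(record.get(field, "")) for field in ("name", "description")
    ).lower()
    found = {}
    for synonym, (term_id, label) in VOCAB.items():
        pattern = r"(?<!\w)" + re.escape(synonym) + r"(?!\w)"
        if re.search(pattern, text):
            found[term_id] = label
    enriched = dict(record)
    # "about" is an assumed field name used here for the standardized terms.
    enriched["about"] = [
        {"id": term_id, "label": found[term_id]} for term_id in sorted(found)
    ]
    return enriched


# Hypothetical input record with only free-text metadata.
record = {
    "name": "Antibody titers in human serum",
    "description": "Samples collected from SARS-CoV-2 patients.",
}
enriched = enrich(record)
```

A production tool would likely replace the dictionary lookup with calls to curated ontologies and would validate the output against the target repository's schema; the sketch only shows the shape of the transformation from free text to standardized terms.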
Building knowledge graphs
Six of the seven projects were awarded in the second SBIR topic area, “Software or web services to re-represent existing scientific data and knowledge into a knowledge graph format.”
Knowledge graphs are data models that organize relationships between data entities in a “semantically rich” way, that is, in a way that provides the context users need to understand the data’s relevance. A familiar use case of a knowledge graph is a map of collaborations between authors based on published papers. In this example, a knowledge graph can help the user understand relationships between scientific collaborators and how they build upon each other’s published research.
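The co-authorship example can be sketched as a small set of subject-predicate-object statements (often called triples), one common way to represent a knowledge graph. The paper and author identifiers below are invented for illustration, as are the predicate names `authored` and `collaborated_with`.

```python
from itertools import combinations

# Hypothetical input: each paper mapped to its list of authors.
papers = {
    "paper:1": ["author:A", "author:B"],
    "paper:2": ["author:B", "author:C"],
}


def build_triples(papers):
    """Turn a paper-to-authors mapping into (subject, predicate, object) triples."""
    triples = set()
    for paper, authors in papers.items():
        for author in authors:
            triples.add((author, "authored", paper))
        # Every pair of co-authors on a paper is linked in both directions.
        for first, second in combinations(sorted(authors), 2):
            triples.add((first, "collaborated_with", second))
            triples.add((second, "collaborated_with", first))
    return triples


def collaborators(triples, author):
    """List everyone the given author has shared a paper with."""
    return sorted(
        obj for subj, pred, obj in triples
        if subj == author and pred == "collaborated_with"
    )


triples = build_triples(papers)
shared_with_b = collaborators(triples, "author:B")
```

Here `shared_with_b` contains both `author:A` and `author:C`, while authors A and C are not directly linked because they never appear on the same paper. The relationships are explicit in the data rather than implied by the layout of a table.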
Knowledge graphs also enable computational tools, such as AI models, to retrieve and analyze data and have shown great promise for data management and knowledge discovery. Knowledge graph technology can represent many types of scientific information and relationships to accelerate research and discovery.
Currently, transforming and reformatting existing data into a knowledge graph-compatible format is a major obstacle. The six projects funded under this topic will pursue new software to make this process easier by automating the steps of extracting, transforming, and loading existing data into knowledge graph-compatible formats. This may include extracting facts and findings from published research, re-representing commonly produced scientific data types, and adding deep semantic information to scientific data in knowledge graph formats.
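The extract, transform, and load steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not a description of any funded project's software: the CSV columns (`sample_id`, `organism`, `assay`), the `sample:` prefix, and the in-memory set used as a graph store are all hypothetical.

```python
import csv
import io

# Hypothetical tabular data of the kind a lab might already have on hand.
RAW = """sample_id,organism,assay
S1,Homo sapiens,ELISA
S2,Mus musculus,flow cytometry
"""


def extract(text):
    """Extract: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: turn each non-empty cell into a (subject, predicate, object) triple."""
    triples = []
    for row in rows:
        subject = "sample:" + row["sample_id"].strip()
        for column, value in row.items():
            if column == "sample_id" or not value:
                continue
            triples.append((subject, column, value.strip()))
    return triples


def load(triples, store=None):
    """Load: add triples to a store (a plain set stands in for a graph database)."""
    store = set() if store is None else store
    store.update(triples)
    return store


graph = load(transform(extract(RAW)))
```

Real pipelines would also map column names and values to shared ontology terms, which is where the deep semantic information mentioned above comes in; this sketch stops at the structural conversion.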
The goal of these projects is to enable scale-up of knowledge graph applications for IID research. The software produced by the six small business contract recipients should be usable by a broad range of researchers.
Contract Recipients
Software or web services to automate metadata enrichment and standardization for data on infectious and immune-mediated diseases
John Snow Labs
Principal Investigator: Hasham Ul Haq
Learn more by visiting RePORTER.
Software or web services to re-represent existing scientific data and knowledge into a knowledge graph format
Infotech
Principal Investigator: Samantha Sabatino
Insilicom
Principal Investigator: Jinfeng Zhang
Kitware
Principal Investigator: Jeff Baumes
MyOwnMed
Principal Investigator: Vicki Seyfert-Margolis
OmniSync
Principal Investigator: Norman Huang
Predictive
Principal Investigator: Kevin Causey