Note: Single-source report; awaiting corroboration.
The National Institutes of Health (NIH) announced that the All of Us Research Program has become the world’s largest integrated genomics and electronic health record (EHR) database, with data from more than 747,000 participants now available to scientists. The dataset includes over 535,000 whole genome sequences linked to nearly 482,000 EHRs, pairing extensive genomic data with clinical information at an unprecedented scale.
The latest data release adds more than 114,000 participants since the previous update, bringing total enrollment to over 883,000. It contains over 1.3 billion genetic variants, 553,000 genotyping arrays, 96,000 structural variant records, 600,000 physical measurements, and 747,000 survey responses covering social circumstances, behaviors, and environments.
EHR data increased by 22%, supported by broader participant-mediated EHR submissions and health information exchange sources. The program’s diversity is notable: more than 645,000 participants—86%—are from groups historically underrepresented in biomedical research, including older adults, women, people with disabilities, individuals of varied racial and ethnic backgrounds, and residents of rural or non-metropolitan areas. Participants represent all 50 states and territories and over 98% of U.S. three-digit ZIP codes.
For the first time, the All of Us dataset includes multiomics data, such as proteomics from nearly 10,000 participants, RNA sequencing data from nearly 9,000, and long-read whole genome sequencing. These additions expand the available resources beyond genomic and clinical data.