Ten years ago, the American Association for Cancer Research (AACR) launched Project GENIE (Genomics Evidence Neoplasia Information Exchange) to give cancer researchers access to real-world clinico-genomic data from multiple institutions that they couldn’t find in any one place: the type of data many researchers need to study rare genetic variants.
“Our initial founding principle was the idea that we had to combine data because no single institution would’ve had enough information on rare variants or rare cancers to really make a meaningful impact using just their own data,” said Shawn M. Sweeney, PhD, senior director of the AACR Project GENIE Coordinating Center.
By January 2017, eight institutions had contributed to this new publicly accessible data registry. Today, 20 institutions worldwide have contributed more than 200,000 total samples to AACR Project GENIE, with new releases of additional sequencing data announced every six months.
“I don’t think there is any other dataset that’s collected as many genomic sequenced samples as AACR Project GENIE has,” said Kenneth L. Kehl, MD, MPH, from Dana-Farber Cancer Institute, the current chair of AACR Project GENIE. “And that enables a number of different types of analyses that wouldn’t otherwise be possible.”
At the AACR Annual Meeting 2025, held April 25-30, such possibilities were on full display. Researchers showed how they had used AACR Project GENIE data in several ways: to identify potential therapeutic opportunities for a rare cancer, test an artificial intelligence (AI) platform that provides insights on colorectal cancer, develop a new targeted tumor-agnostic therapy, and understand the differences between early-onset and late-onset gastroesophageal cancers.
A Deeper Understanding of a Rare Kidney Cancer
While targeted therapies and immunotherapies have shown promise for common forms of kidney cancer, options are much more limited for patients with a rare and aggressive kidney cancer subtype called collecting duct carcinoma (CDC).
A deeper understanding of the molecular underpinnings of CDC is needed to identify new therapeutic opportunities, said Xiaofan Lu, PhD, of the University of Strasbourg.
“There is limited data published for rare cancers,” said Lu. “But we found that there’s a lot of data about rare cancers in the [AACR Project GENIE] database, which was very useful for our study to validate the findings.”
In the study he presented, Lu performed whole exome sequencing, RNA sequencing, and DNA methylation profiling to characterize the gene mutation and expression patterns of 22 cases of CDC. He then validated the findings against 25 CDC samples housed in the AACR Project GENIE database.
Lu’s analyses revealed that mutations in the NF2, LZTR1, and SMARCB1 genes, found on chromosome 22q, were common in CDC and that copy number loss of chromosome 22q occurred in more than half of the cases examined. In addition, Hippo signaling was frequently dysregulated due to mutations in the NF2, SAV1, and WWC1 genes, leading to overexpression of the YAP gene signature. Further, CDC was found to have upregulation of cell cycle and immune cell enrichment signatures that were not found in other kidney cancer subtypes.
Lu also examined the tumor microenvironments of CDC, finding that immunologically “hot” tumor microenvironments were enriched for CD8 T cells, B cells, tertiary lymphoid structure signatures, and mutations in DNA damage response genes. Conversely, immunologically “cold” tumors frequently had copy number loss of chromosome 22q and overexpression of the YAP gene signature.
He also demonstrated that the loss of chromosome 22q and mutations in DNA damage response genes were effective biomarkers to predict tumor response to immune checkpoint inhibition.
“We added to the knowledge of pathogenesis of this disease, including, for example, chromosome 22q loss … which was never reported before,” said Lu. “We also identified … a subset of [CDC] that might respond to immune checkpoint inhibitors. We think our study would be beneficial to those patients diagnosed with this challenging disease.”
Validating a Conversational AI System
Enrique Velazquez-Villarreal, MD, PhD, MPH, of City of Hope, had a wish: a conversational AI platform that researchers could use to query datasets with simple natural language to help in their quests to advance precision medicine. But using publicly available platforms resulted in issues, including hallucinations. One example he provided: asking about the TAILORx clinical trial and getting a citation about Taylor Swift. So, Velazquez-Villarreal and his team built a new platform from scratch.
Called the Precision Medicine AI Agent (PM-AI), their model turns clinical, social determinants of health, and genomic data into tokens that can be more easily analyzed using AI. Users can then conduct integrative data analysis by asking about topics such as potential molecular biomarkers, how patients from different socioeconomic backgrounds or ethnic groups react to different therapeutics, or treatment strategies for patients with different KRAS mutations. The program can supply analysis within hours, compared to the weeks or months it can take to get a similar analysis using other platforms.
To validate the platform, Velazquez-Villarreal’s team conducted a case-control study using data from The Cancer Genome Atlas (TCGA), cBioPortal for Cancer Genomics, and AACR Project GENIE. PM-AI was asked to compare how patients with colorectal cancer respond to FOLFOX (folinic acid, 5-fluorouracil, and oxaliplatin) depending on whether they have RAS mutations. The platform identified KRAS mutations as a predictor of poorer response and increased risk of recurrence, which is consistent with published findings.
Velazquez-Villarreal said that in addition to publicly available data sources, researchers can also use PM-AI to query their own data.
“We want to help researchers take advantage of this AI revolution,” Velazquez-Villarreal explained. “By using the AACR Project GENIE dataset, we are able to provide researchers with an opportunity to interact with this AI agent for the first time and feel confident about how the technology works before they work to adopt or improve upon this open-source model for their own purposes.”
Reining in a Rogue Genome Guardian
As the guardian of the genome, the p53 tumor suppressor pathway is one of the body’s most important defenses against cancer. When cells are damaged, p53 springs into action to help them fix themselves or, if the damage is too severe, to trigger them to self-destruct. But when p53 itself mutates, cells lose their brakes and threaten runaway malignancy. One such variation results from the R175H missense mutation that changes the shape and function of the p53 protein.
With the aid of AACR Project GENIE, a team at Clasp Therapeutics led by Kristen McEachern, PhD, analyzed genomic data from over 180,000 tumor samples to determine the frequency of this mutation that cripples p53’s protective power. Overall, the R175H mutation occurred in about 2% of all tumors and more often in tough-to-treat diseases like colorectal, pancreatic, and ovarian cancer. This allowed them to estimate how many patients might benefit from new targeted therapies like CLSP-1025, thus defining an addressable population.
CLSP-1025 is a bispecific T-cell engager designed to direct immune assassins to cancer cells that possess this altered version of p53.
One end of the diabody drug recognizes a peptide fragment unique to the R175H mutant, so long as it is presented in the context of a complex called HLA-A*02:01, which about 41% of people in the United States carry, according to the researchers. The other end of CLSP-1025 binds and activates T cells, triggering their cancer-killing function.
Going after any form of mutant p53 has been hard because the full protein is expressed only inside cells. But by using an antibody that connects the immune cells to cancer cells that express the mutant p53-HLA combo on their surface, this treatment turns this previously “undruggable” protein into a precision immunotherapy target. Thus far, the approach has shown promise preclinically, and the investigators launched GUARDIAN-101, the first-in-human trial of CLSP-1025, earlier this year in patients with solid cancers.
McEachern explained that the GENIE database “helped us to define a clear patient population for our ongoing phase I trial and support a tumor agnostic approach so that we can bring this important therapy to as many patients as possible.”
Understanding the Molecular Landscape of Early-onset Cancers
With cancer becoming increasingly common in younger populations, identifying the factors underlying early-onset cancers will be critical to developing successful prevention and therapeutic strategies.
To this end, Ronan McLaughlin, MBBCh, BAO, of Princess Margaret Cancer Centre, used data from AACR Project GENIE to understand how molecular and clinical features differ between early-onset and late-onset gastroesophageal cancers (GEC). He examined data from 5,897 patients, 943 (16%) of whom had been diagnosed with GEC at age 50 or younger.
The analysis benefited from data collected across 17 different institutions, McLaughlin noted, and it identified 26 genes that were differentially mutated between early-onset and late-onset GEC. All but two of these genes were more frequently mutated in late-onset GEC and included potentially actionable mutations, such as those in the CDKN2A, KRAS, FGF4, and FGFR3 genes.
The genes CDH1 and CCNE1 were significantly more likely to be mutated in early-onset GEC than in late-onset cases. McLaughlin noted that CDH1 and CCNE1 mutations were previously found to be associated with hereditary gastric cancer syndromes, aggressive disease, and poor survival.
Compared with the late-onset GEC cohort, the early-onset cohort had a higher proportion of female patients and a higher percentage of Asian patients. Survival analyses did not reveal statistically significant differences between the cohorts; however, patients with early-onset GEC had numerically shorter progression-free survival.
“Looking at somatic mutations has allowed us to identify two mutations that are identified more commonly and enriched in that younger-onset group,” McLaughlin said. “[The findings] will allow us to develop targeted therapies, design future clinical trials, and ultimately [impact] patient outcomes.”