Top AI models show high accuracy in breast cancer screening


A major challenge study finds top-performing AI models can detect breast cancer on screening mammograms with accuracy comparable to experienced radiologists.

Top AI models show high accuracy in breast cancer screening | Image Credit: © Valerii Apetroaiei - stock.adobe.com.

Artificial intelligence (AI) algorithms have shown high efficacy for identifying breast cancer on mammography images, according to a recent study published in Radiology on August 12, 2025.1

These algorithms were submitted to the Radiological Society of North America’s (RSNA) AI Challenge, a crowdsourced competition with over 1500 participating teams. Of the submissions, the 10 best performed close to the level of the average screening radiologist in Europe and Australia.

“We were overwhelmed by the volume of contestants and the number of AI algorithms that were submitted as part of the Challenge,” said Yan Chen, PhD, professor in cancer screening at the University of Nottingham. “…We were also impressed by the performance of the algorithms given the relatively short window allowed for algorithm development.”

RSNA AI challenge framework

The challenge was conducted to evaluate the efficacy of algorithms submitted for breast cancer detection.2 The RSNA allowed submissions between November 2022 and February 2023, publicly announcing results in May 2023. There were 5415 breast screening cases included in the evaluation set, none of which were interval cancer cases.

Pathologic examination was used to identify cancer and benign lesions. Additional data collected included cancer laterality, patient age, invasive status, and whether a biopsy was performed. Breast density was also included when available.

Participants could develop AI algorithms using any architecture. However, the algorithms were run on the Kaggle server for assessment and were therefore limited by the server’s technical specifications. The algorithms produced breast-level predictions, with some reporting binary predictions and others reporting results on a continuous probability scale.

Cancer prevalence and performance scores

Women were aged 59 years at the time of their screening mammograms, and 2% of breasts had biopsy-proven cancer. Of noncancer cases, 3.2% were confirmed benign through biopsy. Density data were unavailable for 52% of breasts, while 5.2% were classified as density category A, 20% as B, 20.1% as C, and 2.6% as D.

There were 1537 AI algorithms included in the final analysis, of which 1072 provided a binary prediction and 465 provided a predicted probability. When algorithms were ranked using a probabilistic F1 metric, the top algorithm scored 0.555.
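A probabilistic F1 score lets binary and probability-based submissions be ranked on the same scale. The following is a minimal Python sketch of one commonly described formulation, in which predicted probabilities are summed in place of hard true-positive and false-positive counts. The function name and the sample values are hypothetical, and the article does not reproduce the challenge's exact scoring code, so treat this as an illustration rather than the official metric.

```python
def probabilistic_f1(labels, probs):
    """F1 computed from summed probabilities rather than hard counts.

    labels: iterable of 0/1 ground-truth values (1 = cancer).
    probs:  iterable of predictions in [0, 1]; binary 0/1 predictions also work.
    """
    labels = list(labels)
    probs = list(probs)
    if len(labels) != len(probs):
        raise ValueError("labels and probs must have the same length")

    # Probability mass assigned to true cancers and to noncancers.
    p_tp = sum(p for y, p in zip(labels, probs) if y == 1)
    p_fp = sum(p for y, p in zip(labels, probs) if y == 0)
    n_pos = sum(1 for y in labels if y == 1)

    if n_pos == 0 or p_tp == 0:
        return 0.0

    p_precision = p_tp / (p_tp + p_fp)
    p_recall = p_tp / n_pos
    return 2 * p_precision * p_recall / (p_precision + p_recall)


# Hypothetical toy data: two cancers, two noncancers.
toy_labels = [1, 0, 1, 0]
print(probabilistic_f1(toy_labels, [0.8, 0.2, 0.4, 0.0]))  # probabilistic submission
print(probabilistic_f1(toy_labels, [1, 1, 0, 0]))          # binary submission
```

With binary inputs the function reduces to the ordinary F1 score, which is why both kinds of submission can share one leaderboard under this formulation.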

A median recall rate of 1.7% was reported for the algorithms, alongside a median cancer detection rate of 0.54%. For the highest ranked algorithm, these rates were 1.5% and 1%, respectively, vs 2.4% and 1.2%, respectively, for the top 3, and 3.5% and 1.3%, respectively, for the top 10.
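Both rates above are expressed as a share of everything screened. As a minimal Python sketch, assuming the usual screening definitions (recall rate is the share of cases flagged positive, and cancer detection rate is the share flagged positive and confirmed as cancer), they can be computed from confusion-matrix counts. The class name, function names, and counts below are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    """Breast-level confusion-matrix counts for one algorithm."""
    tp: int  # flagged, cancer confirmed
    fp: int  # flagged, no cancer
    tn: int  # not flagged, no cancer
    fn: int  # not flagged, cancer present

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.tn + self.fn


def recall_rate(c: ScreeningCounts) -> float:
    """Share of all screened breasts flagged as positive."""
    return (c.tp + c.fp) / c.total


def cancer_detection_rate(c: ScreeningCounts) -> float:
    """Share of all screened breasts flagged positive and confirmed as cancer."""
    return c.tp / c.total


# Hypothetical counts for 10,000 breasts, for illustration only.
example = ScreeningCounts(tp=97, fp=53, tn=9747, fn=103)
print(f"recall rate: {recall_rate(example):.2%}")
print(f"cancer detection rate: {cancer_detection_rate(example):.2%}")
```

Because both rates share the same denominator, dividing the cancer detection rate by the recall rate gives the fraction of flagged cases that were truly cancer.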

Predictive values and sensitivity

The algorithms had a median positive predictive value (PPV) of 36.9% and a median negative predictive value (NPV) of 98.5%. For the top ranked algorithm, the top 3, and the top 10, PPVs were 64.6%, 49.2%, and 37.9%, respectively, while NPVs were 99%, 99.2%, and 99.4%, respectively.

The sensitivity and specificity of the algorithms were also reported, with medians of 27.6% and 98.7%, respectively. For the highest ranked algorithm, these rates were 48.6% and 99.5%, respectively. Cancer type, acquisition site, and equipment manufacturer were all found to influence sensitivity.
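Predictive values depend on cancer prevalence as well as on sensitivity and specificity. The sketch below applies Bayes' rule to the rounded figures reported for the top-ranked algorithm (48.6% sensitivity, 99.5% specificity) and the roughly 2% prevalence in the evaluation set. The function name is hypothetical, and because the inputs are rounded, the output should be expected to land near the published PPV and NPV rather than match them exactly.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) derived from sensitivity, specificity, and prevalence."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")

    # Expected share of the screened population in each confusion-matrix cell.
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    tn = specificity * (1.0 - prevalence)
    fp = (1.0 - specificity) * (1.0 - prevalence)

    ppv = tp / (tp + fp) if (tp + fp) > 0 else float("nan")
    npv = tn / (tn + fn) if (tn + fn) > 0 else float("nan")
    return ppv, npv


# Rounded figures from the article for the top-ranked algorithm.
ppv, npv = predictive_values(sensitivity=0.486, specificity=0.995, prevalence=0.02)
print(f"PPV: {ppv:.1%}, NPV: {npv:.1%}")
```

At a prevalence this low, even a false-positive rate of half a percent yields a count of false alarms comparable to the number of detected cancers, which is why PPV sits well below specificity while NPV stays near 99%.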

For the top-performing model, sensitivities of 56.7% and 29.7% were reported for invasive and noninvasive cancer cases, respectively. Additionally, sensitivity was lower at site 1 than at site 2, and the use of Hologic equipment was linked to reduced sensitivity.

Implications

These results highlighted the efficacy of the submitted AI algorithms for detecting cancers on screening mammograms. Investigators concluded that these algorithms could be used to increase sensitivity while maintaining clinically realistic recall rates.

“By releasing the algorithms and a comprehensive imaging dataset to the public, participants provide valuable resources that can drive further research and enable the benchmarking that is required for the effective and safe integration of AI into clinical practice,” said Chen.1

References

  1. RSNA AI challenge models can independently interpret mammograms. Radiological Society of North America. August 12, 2025. Accessed August 14, 2025. https://www.eurekalert.org/news-releases/1093699.
  2. Chen Y, Partridge GJW, Vazirabad M. Performance of algorithms submitted in the 2023 RSNA screening mammography breast cancer detection AI challenge. Radiology. 2025;316(2). doi:10.1148/radiol.241447

© 2025 MJH Life Sciences

All rights reserved.