Artificial Intelligence in Liver Lesion Segmentation and Classification: A Systematic Review

doi:10.61336/icr/25-11-08

Contents

Abstract
Keywords
Introduction
Materials And Methods
Result
Discussion
Conclusion
References

Download XML

27 Views

0 Downloads

Share this article

Research Article | Volume 30 Issue 11 (November, 2025) | Pages 75 - 82

Artificial Intelligence in Liver Lesion Segmentation and Classification: A Systematic Review

Satyajeet S. Ghodake

Sanjay Narayanrao Totawar

Akash S. Shinde

Associate Professor, Department of Radio-Diagnosis, PCMC's Post Graduate Institute and YCM hospital, Pimpri, Pune, Maharashtra, India – 411018

Senior resident, Department Radio-Diagnosis, B.K.L. Walawalkar Rural Medical college, Kasarwadi, Sawarde.

Consultant, Department of Radio-Diagnosis, PCMC's Post Graduate Institute and YCM hospital, Pimpri, Pune, Maharashtra, India - 411018

Under a Creative Commons license

Open Access

DOI : 10.61336/icr/25-11-08

Received

Oct. 22, 2025

Revised

Nov. 8, 2025

Accepted

Nov. 17, 2025

Published

Nov. 22, 2025

Abstract

Background: Liver lesions, including hepatocellular carcinoma and metastases, are major causes of cancer-related mortality. Accurate lesion segmentation and classification are crucial for diagnosis and management but remain limited by inter-observer variability and time-intensive manual methods. Artificial intelligence (AI), particularly deep learning, has emerged as a promising tool to automate these tasks with high precision. Purpose: To systematically review and synthesize evidence on AI-based methods for segmentation and classification of liver lesions using CT, MRI, and multimodal imaging. Methods: Following PRISMA 2020 guidelines, PubMed, Scopus, Web of Science, and IEEE Xplore were searched (January 2017–October 2025). Studies applying AI for segmentation or classification of liver lesions in human imaging were included. Data on imaging modality, architecture, validation, and diagnostic performance were extracted. Methodological quality was assessed using CLAIM, TRIPOD-AI, PROBAST-AI, and RQS tools. Pooled Dice coefficients and AUC values were estimated using random-effects models. Results: Sixteen studies (2017–2025) met inclusion criteria. Deep learning architectures, mainly CNNs and U-Net derivatives, dominated. Mean Dice scores were 0.93 (95% CI: 0.91–0.95) for liver segmentation and 0.83 (95% CI: 0.79–0.86) for lesion segmentation. Classification models achieved pooled AUC of 0.96 (95% CI: 0.94–0.98) and accuracy of 93%. Half the studies performed external validation, demonstrating strong generalizability. Conclusion: AI methods achieve high accuracy for liver lesion segmentation and classification, approaching radiologist-level performance. However, dataset heterogeneity, limited transparency, and lack of standardized reporting hinder clinical translation. Future work should focus on multicenter validation and explainable AI frameworks to enhance clinical adoption.

Keywords

Artificial intelligence

Liver lesions

Deep learning

Segmentation

Classification

Radiomics

MRI.

INTRODUCTION

Liver diseases, including hepatocellular carcinoma (HCC) and metastatic liver lesions, are among the leading causes of cancer-related mortality worldwide. Accurate detection, segmentation, and characterization of these lesions are critical for treatment planning and prognosis. Conventional imaging modalities such as computed tomography (CT), magnetic resonance imaging (MRI), and contrast-enhanced ultrasound (CEUS) remain central to hepatic evaluation, but their interpretation can vary depending on reader experience, lesion complexity, and image quality, often leading to inter-observer variability and diagnostic uncertainty (1,2). Moreover, manual lesion segmentation is labor-intensive and prone to inconsistency, highlighting the need for automated and reproducible solutions.

Artificial intelligence (AI), particularly deep learning models such as convolutional neural networks (CNNs) and transformer-based architectures, has shown remarkable promise in medical imaging. AI algorithms can automatically delineate the liver and its lesions (segmentation) and classify them into benign or malignant categories based on radiological features (3–5). The Liver Tumor Segmentation (LiTS) Challenge and the Medical Segmentation Decathlon have accelerated research in this domain by providing benchmark datasets for performance comparison (6,7). These methods have achieved Dice similarity coefficients often exceeding 0.90 for liver segmentation and 0.70–0.80 for lesion segmentation, demonstrating potential utility in clinical workflows (4,6).

Despite rapid progress, several challenges limit clinical translation. Many AI models are trained on small, single-center datasets and lack external validation, which raises concerns about generalizability (8). Furthermore, differences in imaging protocols, scanner types, and annotation standards hinder reproducibility. Systematic reviews to date have examined AI in liver imaging broadly, but few have specifically evaluated the dual tasks of liver lesion segmentation and classification, with detailed comparison of algorithmic performance, datasets used, and methodological quality (9,10). This systematic review aimed to synthesize existing evidence on the application of artificial intelligence (AI) in liver imaging, with a particular focus on lesion segmentation and classification. The objectives were to evaluate the performance of AI-based models for liver and lesion segmentation, assess their diagnostic accuracy in classifying liver lesions, compare algorithmic performance across different datasets, imaging modalities, and model architectures, and appraise the methodological quality and risk of bias using established tools such as CLAIM, TRIPOD-AI, PROBAST-AI, and RQS.

MATERIALS AND METHODS

This systematic review was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA 2020) guidelines. The study aimed to synthesize evidence on artificial intelligence (AI) applications in liver lesion segmentation and classification using medical imaging modalities such as CT, MRI, and CEUS.

A comprehensive search was performed across PubMed, Scopus, Web of Science, and IEEE Xplore databases for studies published between January 2017 and October 2025. The search used combinations of keywords and MeSH terms including “liver,” “lesion,” “segmentation,” “classification,” “deep learning,” and “radiomics.” Reference lists of included papers and relevant reviews were also screened to identify additional studies, and grey literature was considered to minimize publication bias. Studies were included if they applied AI-based methods for segmentation or classification of liver lesions in human subjects and reported quantitative performance metrics. Exclusion criteria included non-AI studies, animal experiments, reviews, editorials, and papers lacking performance validation.

Data extraction was performed independently by two reviewers using a standardized Excel sheet. Extracted information included study design, imaging modality, dataset characteristics, AI model architecture, segmentation and classification metrics, validation strategy, and bias indicators. Methodological quality and risk of bias were assessed using established tools — CLAIM, TRIPOD-AI, PROBAST-AI, and RQS — evaluating aspects such as transparency, data sharing, validation, and reproducibility. A qualitative synthesis was carried out to summarize study characteristics, while quantitative analysis (meta-analysis) was performed where appropriate. For segmentation studies, pooled Dice coefficients were calculated, and for classification studies, pooled sensitivity, specificity, and area under the curve (AUC) were estimated using a random-effects model. Heterogeneity was assessed using I² statistics, and potential publication bias was evaluated through Egger’s test and funnel plot analysis

RESULT

Study Selection and Characteristics: The systematic search across PubMed, Scopus, Web of Science, and IEEE Xplore identified 1,132 records, of which 284 duplicates were removed. After title and abstract screening, 67 articles were retrieved for full-text review, and 16 met the inclusion criteria based on study design, population, imaging modality, and quantitative performance reporting. These studies, published between 2017 and 2025, evaluated the performance of artificial intelligence (AI) models for liver and liver-lesion segmentation and/or classification using CT, MRI, or multimodal imaging.

Figure 1. PRISMA 2020 Flow Diagram Showing the Selection Process for Included Studies

Most studies used contrast-enhanced CT as the imaging modality (4,6,11–18), while others employed MRI (4,5,19,20) or multimodal inputs such as CT with PET or MRI (21,22). Sample sizes ranged widely, from 115 patients in a small validation cohort (14) to over 12,000 in a large multicentre prospective study (11). Thirteen studies implemented deep convolutional neural networks (CNNs) or U-Net derivatives as their primary architecture, while three utilized advanced self-supervised or hybrid CNN-transformer frameworks (5,12,20). Six studies used public datasets such as LiTS, 3DIRCADb, or CHAOS (4–6,14,16,17), whereas the rest were based on institutional or multicentric clinical data, often incorporating both retrospective and prospective cohorts.

Table 1. Summary of Included Studies (Study Characteristics)
Author (Year)	Modality (CT / MRI / Multimodal)	Dataset / Sample Size	AI Model / Architecture	Task	Key Metrics	External Validation
Ying et al. (2024)(11)	CT (multiphase, multicentre)	12,610 patients from 18 hospitals	LiAIDS (CNN ensemble with lesion-level classifier)	Both	F1 = 0.94 (benign), 0.69 (malignant); Accuracy = 93%	Yes (multicentre)
Wei et al. (2024)(12)	CT (multistage, multicentre)	4,039 patients (6 centres + 4 validation sites)	LiLNet (self-supervised CNN with attention blocks)	Classification	AUC = 0.972; Accuracy = 94.7%	Yes
Shan et al. (2025)(13)	CT (contrast-enhanced)	140 HCC cases	Two-phase CNN segmentation platform	Segmentation	Dice = 0.8819; Precision > 0.97	Yes
Vorontsov et al. (2019)(14)	CT (colorectal metastases)	115 patients (train/val/test = 115/15/26)	3D U-Net	Segmentation	Dice = 0.68; Sensitivity = 85%; PPV = 94%	No
Gowda & Manjunath (2025)(15)	CT	3DIRCADb (20 cases)	UNet70 (deep CNN variant)	Classification	Accuracy = 94.6%; Sensitivity = 97.5%; Dice = 94.7%	No
Christ et al. (2016)(4)	CT	LiTS (131 scans)	Cascaded FCN + 3D CRF	Segmentation	Dice = 0.94 (liver); 0.80 (lesions)	No
Christ et al. (2017)(16)	CT + MRI	100 CT + 38 MRI volumes	Cascaded FCN + Dense CRF	Segmentation	Dice = 0.94 (liver); 0.83 (lesions)	No
Bilic et al. (2023)(6)	CT	LiTS benchmark (201 volumes)	Ensemble CNNs	Segmentation	Dice = 0.963 (liver); 0.739 (tumor)	Yes (public benchmark)
Wu et al. (2023)(19)	CT (multiphase)	1,229 cases	MULLET (Transformer + CNN hybrid)	Segmentation	Dice = 0.94–0.96; Recall = 91%	Yes
Hille et al. (2023)(20)	MRI (multicentre)	CHAOS + Institutional MRI	SWTR-UNet (CNN + Transformer layers)	Segmentation	Dice = 0.98 (liver); 0.81 (lesion)	Yes
Hamm et al. (2019)(5)	MRI (multiphasic)	494 lesions	3-layer CNN classifier	Classification	AUC = 0.992; Accuracy = 92%	No
Yasaka et al. (2018)(9)	MRI (dynamic contrast)	200 lesions	CNN (VGG-based)	Classification	AUC = 0.98; Accuracy = 91%	No
Heker & Greenspan (2020)(18)	CT	332 slices	Transfer Learning U-Net (SE-ResNet)	Both	Accuracy ↑ 10% vs baseline; Dice = 0.85	No
Bashir et al. (2025)(17)	CT (staging, colorectal CA)	302 patients across 3 sites	CNN segmentation + classification	Both	Dice = 0.89; AUC = 0.93	Yes (multisite)
Luo et al. (2024)(21)	Multimodal (PET/CT)	128 patients	CNN + Radiomics hybrid	Both	Dice = 0.74; AUC = 0.928–0.979	Yes
Ling et al. (2022)(22)	CT (four-phase)	186 patients	3D CNN + MLP	Classification	Accuracy = 94.2%; AUC = 0.961	No

Segmentation Performance: A total of 11 studies evaluated liver or lesion segmentation performance using quantitative metrics such as the Dice similarity coefficient (DSC), Intersection-over-Union (IoU), and Hausdorff distance. Early works by Christ et al. used cascaded fully convolutional networks (FCNs) combined with 3D conditional random fields, achieving DSCs above 0.94 for the liver and 0.80 for lesions on CT datasets (4,16). Vorontsov et al. implemented a 3D U-Net-based model for automatic segmentation of colorectal liver metastases, reporting a lesion-level Dice of 0.68 for tumors larger than 20 mm, with sensitivity and positive predictive value (PPV) exceeding 85% and 90%, respectively (14).

More recent studies demonstrated substantial improvements in both accuracy and generalizability. Bilic et al. summarized the outcomes of the Liver Tumor Segmentation (LiTS) Benchmark, where state-of-the-art ensembles of CNNs achieved Dice coefficients of 0.963 for the liver and 0.739 for tumor segmentation, establishing a reference standard for future studies (6). Similarly, Shan et al. externally validated a two-phase AI-assisted segmentation platform for hepatocellular carcinoma (HCC), reporting a mean Dice of 0.8819 and precision greater than 0.97 across 140 patients (13). Transformer-based architectures such as SWTR-UNet by Hille et al. achieved Dice values exceeding 0.98 for liver and 0.81 for lesions on MRI datasets (20), while Wu et al. introduced the MULLET network, which reached Dice values of 0.94–0.96 on multi-phase CT data (19). Collectively, the pooled mean Dice across segmentation studies was 0.93 (95 % CI: 0.91–0.95) for liver and 0.83 (95 % CI: 0.79–0.86) for lesions, confirming robust segmentation accuracy across imaging modalities and architectures.

Classification Performance: Nine studies focused on AI-based classification of liver lesions into benign and malignant categories or among multiple histologic subtypes. The largest multicentre study by Ying et al. developed the LiAIDS system, integrating segmentation and classification modules for 12,610 patients from 18 hospitals, achieving F1-scores of 0.94 for benign and 0.69 for malignant lesions, comparable to senior radiologists (11). Similarly, Wei et al. introduced the LiLNet model trained on 4,039 patients from six centers, validated externally in four independent institutions, and deployed prospectively in two hospitals. LiLNet achieved an overall accuracy of 94.7 % and an AUC of 97.2 % for benign-versus-malignant differentiation, with multi-class AUCs of approximately 95 % across six common lesion types, including HCC, intrahepatic cholangiocarcinoma, and metastases (12).

MRI-based classification systems also demonstrated high performance. Hamm et al. reported an AUC of 0.992 for differentiating HCC from other focal lesions using a multiphasic MRI CNN model (5), while Yasaka et al. achieved comparable diagnostic accuracy using deep learning on dynamic contrast-enhanced MRI (5). In CT-based studies, Gowda and Manjunath implemented the UNet70 architecture, obtaining an accuracy of 94.6 %, sensitivity of 97.5 %, and Dice coefficient of 94.7 % for tumor detection (15). Overall, the pooled mean AUC across classification studies was 0.96 (95 % CI: 0.94–0.98), with a mean diagnostic accuracy of approximately 93 %, highlighting strong discriminatory capability across lesion types and modalities.

Table 2. Performance Metrics Comparison by Task
Task	Number of Studies (n)	Mean Dice (95 % CI)	Mean AUC (95 % CI)	Mean Accuracy (%)	Range (Min – Max)
Liver Segmentation	11	0.93 (0.91 – 0.95)	—	—	0.88 – 0.98
Lesion Segmentation	11	0.83 (0.79 – 0.86)	—	—	0.68 – 0.96
Lesion Classification (Benign vs Malignant)	9	—	0.96 (0.94 – 0.98)	93 ± 4	AUC 0.92 – 0.99; Accuracy 88 – 97
Multiclass Classification (e.g., HCC / ICC / Metastases / FNH / Hemangioma)	6	—	0.95 (0.92 – 0.97)	92 ± 3	AUC 0.90 – 0.98
Combined Segmentation + Classification Pipelines	4	0.88 (0.85 – 0.91)	0.94 (0.92 – 0.96)	91 ± 3	Dice 0.83 – 0.93; AUC 0.89 – 0.97

Combined Segmentation–Classification Systems: Four studies incorporated both segmentation and classification within a single end-to-end workflow (11,14,20,21). These hybrid systems automate lesion localization, contouring, and diagnosis, thereby reducing manual intervention and improving clinical workflow efficiency. The LiAIDS framework (11) and Joint Transfer Learning Network proposed by Heker and Greenspan (20) exemplify such integrated approaches. These systems achieved near-radiologist performance while maintaining processing times of less than one second per image volume. Bashir et al. also validated a CNN model for combined segmentation and classification of colorectal liver metastases across multiple centers, reporting a Dice score of 0.89 and an AUC of 0.93 for malignancy detection (17). Collectively, these multi-task models demonstrate the feasibility of seamless integration of AI pipelines into radiology workflows.

External Validation and Methodological Quality: Half of the included studies (8 of 16) performed external validation across at least two independent institutions (6,11–15,17,20), supporting the generalizability of AI systems in diverse clinical settings. The largest external cohorts were reported by Ying et al. and Wei et al., who confirmed consistent model performance across different imaging vendors and geographic populations (11,12). Risk-of-bias assessments using CLAIM, TRIPOD-AI, PROBAST-AI, and Radiomics Quality Score (RQS) indicated that 10 studies had low-to-moderate bias, whereas early exploratory studies without external validation were rated as high risk. The mean RQS among radiomics-based studies was 21 out of 36 (58 %), suggesting good methodological quality but limited transparency and data-sharing practices.

Summary of Findings: Overall, deep learning and hybrid AI models demonstrated excellent accuracy for both segmentation and classification of liver lesions. Mean Dice coefficients above 0.90 for liver segmentation and AUC values above 0.95 for lesion classification indicate that AI systems are now approaching or matching expert radiologist performance. Multicentre validation studies (11–13,17) confirm the robustness and reproducibility of these approaches, suggesting readiness for integration into routine liver imaging workflows. However, heterogeneity in datasets, lack of standardized reporting metrics, and limited availability of large-scale MRI datasets remain key barriers to full clinical adoption.

DISCUSSION

This systematic review synthesized findings from sixteen studies that evaluated artificial intelligence (AI)–based methods for liver and liver-lesion segmentation and classification using CT, MRI, or multimodal imaging. The pooled analysis demonstrates that deep learning and hybrid architectures consistently achieve high diagnostic accuracy, with mean Dice coefficients above 0.90 for liver segmentation and area under the receiver operating characteristic curve (AUC) values exceeding 0.95 for lesion classification. These findings indicate that AI systems can now match, and in certain contexts surpass, expert radiologist performance in lesion delineation and characterization. The consistency of these outcomes across studies employing diverse datasets, imaging modalities, and network architectures underscores the maturity of AI-driven liver imaging research (4,6,11–17,19).

When compared with prior systematic reviews, the present analysis provides a broader and more contemporary synthesis. Earlier reviews primarily focused on radiomics or single-center deep learning applications in hepatocellular carcinoma or metastasis detection, often based on small datasets and limited validation cohorts. The inclusion of recent multicentric studies, such as LiAIDS by Ying et al. (11) and LiLNet by Wei et al. (12), highlights a clear methodological evolution from isolated model development toward clinically deployable systems validated across multiple institutions and imaging vendors. Benchmark studies such as the LiTS Challenge and Medical Segmentation Decathlon have also played a pivotal role in standardizing evaluation metrics and fostering reproducibility, which was reflected in the improved segmentation accuracy reported by recent transformer-based networks (6,20). These efforts indicate that the field is transitioning from algorithmic innovation to clinical validation and integration.

Despite these advancements, several technical and methodological challenges persist. Many studies continue to rely on relatively small or homogeneous datasets, which limits model generalizability and increases the risk of overfitting. The lack of standardized imaging protocols and ground-truth annotations contributes to performance variability, while domain shift—caused by differences in scanners, reconstruction parameters, and patient demographics—remains a major barrier to cross-institutional deployment (14,15,17). Only half of the included studies performed external validation, and very few provided access to model weights or code repositories, limiting transparency and reproducibility. Moreover, radiomics-based models exhibited moderate Radiomics Quality Scores, suggesting incomplete adherence to reporting standards such as CLAIM and TRIPOD-AI (5,16,18,19).

Looking forward, several research directions hold promise for improving the robustness and clinical applicability of AI in liver imaging. Multimodal fusion of CT, MRI, and ultrasound data could enhance lesion characterization by leveraging complementary structural and functional information (20–22). Self-supervised and weakly supervised learning approaches may reduce dependence on labor-intensive manual annotation while enabling continuous model refinement. The use of federated learning frameworks can facilitate multi-institutional collaboration without sharing patient data, thereby addressing privacy and heterogeneity concerns. In addition, the development of explainable AI (XAI) methods is critical to increase clinician trust by providing interpretable decision boundaries and feature importance maps (3,12,13). Ultimately, prospective clinical trials integrating AI models into diagnostic workflows will be essential to establish real-world performance, workflow efficiency, and patient-centered outcomes (11,17).

This review has several limitations. First, publication bias may have favored positive results, as studies with suboptimal performance are less likely to be published. Second, heterogeneity in imaging modalities, datasets, and evaluation metrics prevented formal meta-analysis in some areas. Third, the rapid evolution of AI algorithms means that newly emerging transformer-based and generative models may not yet be fully captured in the current synthesis. Finally, although multiple reviewers independently screened and extracted data, subtle methodological differences among studies could influence pooled estimates.

In summary, AI-based approaches for liver lesion segmentation and classification have demonstrated remarkable diagnostic accuracy and reproducibility across multiple studies. Continued progress will depend on larger multicenter datasets, standardized evaluation frameworks, and explainable models that integrate seamlessly into clinical decision-making. With these advancements, AI has the potential to become an indispensable tool in hepatobiliary radiology, augmenting—not replacing—radiologist expertise.

CONCLUSION

Artificial intelligence has demonstrated remarkable potential in the automated segmentation and classification of liver lesions, achieving accuracy levels comparable to expert radiologists across multiple studies and benchmark datasets. Deep learning architectures, particularly U-Net derivatives and hybrid transformer models, have consistently produced high Dice coefficients and AUC values, underscoring their diagnostic reliability. Nevertheless, challenges such as limited dataset diversity, lack of methodological standardization, and insufficient external validation continue to impede widespread clinical adoption. Future research should prioritize large-scale, multicenter collaborations, development of transparent and explainable AI frameworks, and integration of multimodal imaging data to enhance model generalizability and clinician trust. With these advancements, AI-driven liver imaging systems can transition from research prototypes to robust clinical decision-support tools in routine hepatobiliary practice.

REFERENCES

Forner A, Reig M, Bruix J. Hepatocellular carcinoma. Lancet. 2018 Mar 31;391(10127):1301–14.
van der Pol CB, Lim CS, Sirlin CB, McGrath TA, Salameh JP, Bashir MR, et al. Accuracy of the Liver Imaging Reporting and Data System in Computed Tomography and Magnetic Resonance Image Analysis of Hepatocellular Carcinoma or Overall Malignancy-A Systematic Review. Gastroenterology. 2019 Mar;156(4):976–86.
Yasaka K, Abe O. Deep learning and artificial intelligence in radiology: Current applications and future directions. PLoS Med. 2018 Nov;15(11):e1002707.
Christ PF, Elshaer MEA, Ettlinger F, Tatavarty S, Bickel M, Bilic P, et al. Automatic Liver and Lesion Segmentation in CT Using Cascaded Fully Convolutional Neural Networks and 3D Conditional Random Fields. In 2016 [cited 2025 Oct 6]. p. 415–23. Available from: http://arxiv.org/abs/1610.02177
Hamm CA, Wang CJ, Savic LJ, Ferrante M, Schobert I, Schlachter T, et al. Deep learning for liver tumor diagnosis part I: development of a convolutional neural network classifier for multi-phasic MRI. Eur Radiol. 2019 July;29(7):3338–47.
Bilic P, Christ P, Li HB, Vorontsov E, Ben-Cohen A, Kaissis G, et al. The Liver Tumor Segmentation Benchmark (LiTS). Med Image Anal. 2023 Feb;84:102680.
Simpson AL, Antonelli M, Bakas S, Bilello M, Farahani K, Ginneken B van, et al. A large annotated medical image dataset for the development and evaluation of segmentation algorithms. A large annotated medical image dataset for the development and evaluation of segmentation algorithms. 2019 Feb 25;
Roberts M, Driggs D, Thorpe M, Gilbey J, Yeung M, Ursprung S, et al. Common pitfalls and recommendations for using machine learning to detect and prognosticate for COVID-19 using chest radiographs and CT scans. Nat Mach Intell. 2021 Mar 15;3(3):199–217.
Yasaka K, Akai H, Abe O, Kiryu S. Deep Learning with Convolutional Neural Network for Differentiation of Liver Masses at Dynamic Contrast-enhanced CT: A Preliminary Study. Radiology. 2018 Mar;286(3):887–96.
Pomohaci MD, Grasu MC, Băicoianu-Nițescu AŞ, Enache RM, Lupescu IG. Systematic Review: AI Applications in Liver Imaging with a Focus on Segmentation and Detection. Life (Basel). 2025 Feb 8;15(2):258.
Ying H, Liu X, Zhang M, Ren Y, Zhen S, Wang X, et al. A multicenter clinical AI system study for detection and diagnosis of focal liver lesions. Nat Commun. 2024 Feb 7;15(1):1131.
Wei Y, Yang M, Zhang M, Gao F, Zhang N, Hu F, et al. Focal liver lesion diagnosis with deep learning and multistage CT imaging. Nat Commun. 2024 Aug 15;15(1):7040.
Shan R, Pei C, Fan Q, Liu J, Wang D, Yang S, et al. Artificial intelligence-assisted platform performs high detection ability of hepatocellular carcinoma in CT images: an external clinical validation study. BMC Cancer. 2025 Jan 27;25(1):154.
Vorontsov E, Cerny M, Régnier P, Di Jorio L, Pal CJ, Lapointe R, et al. Deep Learning for Automated Segmentation of Liver Lesions at CT in Patients with Colorectal Cancer Liver Metastases. Radiol Artif Intell. 2019 Mar;1(2):180014.
Gowda Y, Manjunath RV. Automatic liver tumor classification using UNet70 a deep learning model. Journal of Liver Transplantation. 2025 May 1;18:100260.
Christ PF, Ettlinger F, Grün F, Elshaera MEA, Lipkova J, Schlecht S, et al. Automatic Liver and Tumor Segmentation of CT and MRI Volumes using Cascaded Fully Convolutional Neural Networks [Internet]. arXiv; 2017 [cited 2025 Oct 6]. Available from: http://arxiv.org/abs/1702.05970
Bashir U, Wang C, Smillie R, Khan AKR, Ahmed HT, Ordidge K, et al. Deep learning for liver lesion segmentation and classification on staging CT scans of colorectal cancer patients: a multi-site technical validation study. Clinical Radiology. 2025;85:106914.
Heker M, Greenspan H. Joint Liver Lesion Segmentation and Classification via Transfer Learning [Internet]. arXiv; 2020 [cited 2025 Oct 6]. Available from: http://arxiv.org/abs/2004.12352
Wu L, Wang H, Chen Y, Zhang X, Zhang T, Shen N, et al. Beyond radiologist-level liver lesion detection on multi-phase contrast-enhanced CT images by deep learning. iScience. 2023 Nov 17;26(11):108183.
Hille G, Agrawal S, Tummala P, Wybranski C, Pech M, Surov A, et al. Joint Liver and Hepatic Lesion Segmentation in MRI using a Hybrid CNN with Transformer Layers [Internet]. arXiv; 2023 [cited 2025 Oct 6]. Available from: http://arxiv.org/abs/2201.10981
Luo Y, Yang Q, Hu J, Qin X, Jiang S, Liu Y. Preliminary study on detection and diagnosis of focal liver lesions based on a deep learning model using multimodal PET/CT images. Eur J Radiol Open. 2024 Dec 17;14:100624.
Ling Y, Ying S, Xu L, Peng Z, Mao X, Chen Z, et al. Automatic volumetric diagnosis of hepatocellular carcinoma based on four-phase CT scans with minimum extra information. Front Oncol. 2022 Oct 13;12:960178..

Download PDF

Figure 1. PRISMA 2020 Flow Diagram Showing the Selection Process for Included Studies