A comparison of vendor artificial intelligence solutions for automated post-processing of short-axis cine images in cardiovascular magnetic resonance imaging
Authors
- Thomas Hadler
- Clemens Ammann
- Hadil Saad
- Yashraj Bhoyroo
- Jana Veit
- Teodora Chitiboi
- Jens Wetzl
- Christian Geppert
- Jeanette Schulz-Menger
Journal
- Scientific Reports
Citation
- Sci Rep 16 (1): 17154
Abstract
Automated segmentation of cardiac magnetic resonance (CMR) imaging is integrated into clinical workflows, yet comparative performance across vendor AI solutions remains insufficiently characterized. This study assessed three models (two commercial, one research) for short-axis cine segmentation in a diverse cohort of 346 cases, including dilated cardiomyopathy (DCM), left ventricular hypertrophy (LVH), healthy volunteers, and other cardiac diseases. Clinical parameter agreement between AI-derived and expert-derived ventricular volumes and left ventricular mass (LVM) was evaluated using correlations and mean differences, segmentation agreement with Dice coefficient, and slice detection was characterized with false positive and negative rates (FPR/FNR). Papillary muscle (PM) inclusion was examined with subgroup analyses. AI-derived clinical parameters agreed strongly with expert measurements (r > 0.8). Nevertheless, inter-model biases included differing ventricular volume estimates. Midventricular segmentation was reliable (Dice > 80%), whereas apical slices were poor (Dice < 65%) with minor area impact (< 1cm(2)). Basal slice detection varied substantially, with AI1 and AI2 over- and AI3 under-detecting slices (e.g. RV FPR: AI1 24%, AI2 14%, AI3 FNR: 32%), producing large area differences. Due to PM exclusion AI2 overestimated volumes and underestimated LVM - particularly LVH-cases. While AI-expert agreement is high, AI solutions are not interchangeable and produce clinically relevant differences to experts across cardiac regions and disease groups.