Impact of training data composition on the generalizability of CNN aortic cross section segmentation in 4D Flow MRI

Authors

  • C. Manini
  • M. Hüllebrand
  • L. Walczak
  • S. Nordmeyer
  • L. Jarmatz
  • T. Kuehne
  • H. Stern
  • C. Meierhofer
  • A. Harloff
  • J. Erley
  • S. Kelle
  • P. Bannas
  • R.F. Trauzeddel
  • J. Schulz-Menger
  • A. Hennemuth

Journal

  • Journal of Cardiovascular Magnetic Resonance

Citation

  • J Cardiovasc Magn Reson 101081

Abstract

  • BACKGROUND: Time-resolved, three-dimensional phase-contrast magnetic resonance imaging (4D flow MRI) plays an important role in assessing cardiovascular diseases. However, the manual or semi-automatic segmentation of aortic vessel boundaries in 4D flow data introduces variability and limits reproducibility of aortic hemodynamics visualization and quantitative flow-related parameter computation. This paper explores the potential of deep learning to improve 4D flow MRI segmentation by developing models for automatic segmentation and analyzes the impact of the training data on the generalization of the model across different sites, scanner vendors, sequences, and pathologies.
    METHODS: The study population consists of 260 4D flow MRI datasets, including subjects without known aortic pathology, healthy volunteers, and patients with bicuspid aortic valve (BAV) examined at different hospitals. The dataset was split to train segmentation models on subsets with different representations of characteristics such as pathology, gender, age, scanner model, vendor, and field strength. An enhanced 3D U-net convolutional neural network (CNN) architecture with residual units was trained for 2D+t aortic cross-sectional segmentation. The model performance was evaluated using Dice score, Hausdorff distance, and average symmetric surface distance on test data, datasets with characteristics not represented in the training set (model-specific), and an overall evaluation set. Standard diagnostic flow parameters were computed and compared with manual segmentation results using Bland-Altman analysis and interclass correlation.
    RESULTS: The representation of technical factors such as scanner vendor and field strength in the training dataset had the strongest influence on the overall segmentation performance. Age had a greater impact than gender. Models solely trained on BAV patients' datasets performed well on datasets of healthy subjects but not vice versa.
    CONCLUSION: This study highlights the importance of considering a heterogeneous dataset for the training of widely applicable automatic CNN segmentations in 4D flow MRI, with a particular focus on the inclusion of different pathologies and technical aspects of data acquisition.
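
For readers unfamiliar with two of the evaluation measures named in the abstract, the following is a minimal sketch of how a Dice score and Bland-Altman limits of agreement are commonly computed. It is not the authors' implementation: the function names `dice_score` and `bland_altman`, the flat 0/1 mask representation, the 1.96 multiplier for 95% limits, and the convention that two empty masks score 1.0 are all assumptions made for illustration.

```python
from statistics import mean, stdev


def dice_score(pred, ref):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labelled as vessel in both the predicted and the reference mask.
    overlap = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def bland_altman(auto, manual):
    """Return (bias, lower limit, upper limit) of agreement for paired values."""
    if len(auto) != len(manual):
        raise ValueError("measurements must be paired")
    diffs = [a - m for a, m in zip(auto, manual)]
    bias = mean(diffs)
    # 95% limits of agreement, assuming roughly normal differences.
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread
```

In this sketch, `dice_score` would be applied per cross section to an automatic and a manual mask, and `bland_altman` to paired flow parameters (for example, values derived from automatic versus manual contours). Hausdorff distance and average symmetric surface distance need contour point sets and are not covered here.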


DOI

doi:10.1016/j.jocmr.2024.101081