Improving Training Stability of Deep Neural Networks for Multi-Institutional Medical Image Segmentation
Abstract
Deep neural networks have become standard tools for medical image segmentation, yet their training stability remains sensitive to data imbalance, annotation variability, and scanner heterogeneity. This study evaluates segmentation performance on multi-institutional datasets comprising 18,700 CT and MRI scans across five diagnostic categories. Variations in contrast protocols and labeling standards caused unstable convergence in baseline models. A hybrid normalization and uncertainty-aware sampling strategy was therefore applied during training. Dice coefficients improved by 6%–9% across most categories, and inter-run variance decreased substantially. Nevertheless, rare pathological cases remained difficult to segment reliably, and model robustness continues to be constrained by annotation consistency and domain diversity.
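The abstract does not specify how the uncertainty-aware sampling is implemented. As an illustration only, the sketch below shows one common way such a scheme can be realized in PyTorch: scoring each training case by the mean voxel-wise predictive entropy of a preliminary model pass and oversampling high-entropy cases. The names seg_model and train_set, and all parameters, are hypothetical stand-ins rather than details from the study.

    import torch
    from torch.utils.data import DataLoader, WeightedRandomSampler

    def entropy_weights(model, dataset, device="cpu", eps=1e-8):
        """Score each case by mean voxel-wise predictive entropy.

        Higher entropy (more model uncertainty) yields a larger sampling
        weight, so uncertain cases are drawn more often next epoch.
        This is an illustrative sketch, not the study's implementation.
        """
        model.eval()
        weights = []
        with torch.no_grad():
            for image, _ in DataLoader(dataset, batch_size=1):
                probs = torch.softmax(model(image.to(device)), dim=1)
                entropy = -(probs * (probs + eps).log()).sum(dim=1).mean()
                weights.append(entropy.item())
        return torch.tensor(weights)

    # Hypothetical usage: oversample uncertain cases in the next epoch.
    # weights = entropy_weights(seg_model, train_set)
    # sampler = WeightedRandomSampler(weights, num_samples=len(train_set), replacement=True)
    # loader = DataLoader(train_set, batch_size=4, sampler=sampler)

A weighting scheme of this kind is one plausible reading of "uncertainty-aware sampling"; other choices, such as loss-based or disagreement-based scoring, would fit the same description.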