Workload-Aware Performance Analysis of Distributed Storage Systems for Scientific Computing

Brent Calloway
Aoyu Zhang

Abstract

Distributed storage infrastructures support large-scale scientific simulations and data analysis workflows. Despite advances in parallel file systems, uneven access patterns frequently lead to performance bottlenecks. A shared storage cluster supporting climate modeling and materials simulation workloads was monitored over twelve months. Access traces from 420 user projects show that fewer than 9% of files accounted for more than half of total I/O requests during peak periods. Metadata servers experienced recurring saturation when checkpoint operations overlapped. Several mitigation techniques, including dynamic data placement and burst-buffer scheduling, were evaluated. While average throughput increased by 18%, workload-specific tuning remained necessary. Effective storage management requires continuous workload characterization rather than static provisioning.
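The access-concentration figure quoted in the abstract (a small share of files receiving more than half of all I/O requests) can be computed from an access trace along the following lines. This is only a minimal sketch, not the authors' analysis pipeline: the function name `hot_file_fraction`, the one-path-per-request trace format, and the sample data are all hypothetical and chosen for illustration.

```python
from collections import Counter


def hot_file_fraction(accessed_paths, total_files, share=0.5):
    """Smallest fraction of files whose requests cover `share` of all I/O.

    accessed_paths: iterable with one file path per I/O request
                    (hypothetical trace format, assumed for this sketch).
    total_files:    number of files in the namespace, including files
                    that were never accessed during the trace window.
    share:          target share of total requests, between 0 and 1.
    """
    counts = Counter(accessed_paths)
    total_requests = sum(counts.values())
    if total_requests == 0 or total_files <= 0:
        raise ValueError("need at least one request and one file")

    covered = 0
    # Walk files from most to least requested until the target is reached.
    for n_hot, (_path, n_requests) in enumerate(counts.most_common(), start=1):
        covered += n_requests
        if covered >= share * total_requests:
            return n_hot / total_files
    # Only reachable if share > 1; report every accessed file as hot.
    return len(counts) / total_files


# Hypothetical toy trace: 10 requests spread over 4 of 10 existing files.
trace = ["a"] * 6 + ["b"] * 2 + ["c", "d"]
print(hot_file_fraction(trace, total_files=10))
```

In this toy trace a single file out of ten receives six of the ten requests, so one tenth of the files already covers half of the I/O. Recomputing the statistic over sliding time windows, rather than once for the whole trace, would be one way to carry out the continuous workload characterization the abstract argues for.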
