When Faster GPUs Do Not Help: An Empirical Study of Edge Video Analytics

Jonathan Reilly

Abstract

Recent advances in accelerator hardware have significantly reduced inference latency for deep learning models deployed at the network edge. Nevertheless, practical deployments frequently fail to achieve proportional improvements in end-to-end service performance. We present an empirical analysis of a video analytics pipeline deployed on a small-scale edge cluster; the system processes live camera streams for face recognition and access monitoring. Detailed tracing reveals that, once GPU acceleration is introduced, data decoding, frame transmission, and storage access dominate response time. We evaluate several system-level optimizations, including task co-location, buffer reuse, and pipeline reordering. While these measures improve overall throughput, the study demonstrates that infrastructure-level constraints remain the primary performance bottleneck. The results suggest that accelerator-centric optimization strategies must be complemented by holistic system redesign.
