Understanding Non-Computational Overheads in Accelerator-Based Video Analytics

Marko Petrovic

Abstract

Edge-based video analytics platforms increasingly rely on hardware accelerators to reduce inference latency. Nevertheless, practical deployments often fail to achieve proportional improvements in service response time. A face recognition pipeline deployed on a multi-node edge cluster was instrumented to capture fine-grained resource utilization. Measurements show that data decoding, storage access, and network transmission dominate latency once accelerator performance reaches saturation. As a result, further increases in computing power yield diminishing returns. Several infrastructure-level optimizations were evaluated, including task co-location and pipeline restructuring. Although overall throughput improved, system balance remained a limiting factor. The study demonstrates that performance optimization must consider the entire processing chain rather than isolated components.
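The diminishing-returns effect described in the abstract follows an Amdahl's-law pattern: once decoding, storage, and network costs dominate, speeding up only the compute stage barely moves end-to-end latency. The following sketch illustrates this with hypothetical stage timings (the 40 ms inference and 60 ms overhead figures are illustrative assumptions, not measurements from the study):

```python
# Illustrative model of end-to-end latency in an accelerated video pipeline.
# All stage timings are hypothetical, chosen only to demonstrate the trend.

def end_to_end_latency(compute_ms: float, speedup: float, overhead_ms: float) -> float:
    """Total request latency when only the compute stage is accelerated.

    compute_ms  -- baseline inference time on the accelerator's predecessor
    speedup     -- factor by which the accelerator shortens compute
    overhead_ms -- fixed non-compute cost (decode + storage + network)
    """
    return compute_ms / speedup + overhead_ms

# Hypothetical baseline: 40 ms inference, 60 ms of non-compute overhead.
COMPUTE_MS, OVERHEAD_MS = 40.0, 60.0

for speedup in (1, 2, 4, 8, 16):
    total = end_to_end_latency(COMPUTE_MS, speedup, OVERHEAD_MS)
    print(f"{speedup:2d}x accelerator -> {total:6.1f} ms end-to-end")
```

Under these assumed numbers, a 16x compute speedup improves end-to-end latency only from 100 ms to 62.5 ms (about 1.6x), since the 60 ms of non-compute overhead is untouched, which is the imbalance the article attributes to decoding, storage access, and network transmission.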