AI Benchmarks Fail to Capture Real-World Performance Challenges, F5 Suggests Solution
Enterprise AI teams have historically focused on compute, GPU allocation, and training throughput, often assuming the storage-to-compute data path would keep pace. However, real-world conditions introduce latency, network jitter, and node degradation, which controlled benchmarks typically fail to capture. As a result, AI pipelines that perform well in the lab can stall in production, leaving GPUs underutilized and degrading AI outputs. Testing by F5 and MinIO showed significant S3 throughput degradation even with modest latency. As a solution, F5 advocates what it calls AI data delivery: deploying Application Delivery Controllers (ADCs) or Application Delivery and Security Platforms (ADSPs) to manage the data path as a resilient and secure control point that keeps data flowing efficiently to GPUs.

Enterprise AI teams often dedicate resources to building compute capacity, securing GPU allocations, and optimizing training throughput. This approach commonly rests on the assumption that the data path between storage and compute will maintain adequate performance. In production environments, that assumption frequently does not hold.
Real-world traffic introduces factors such as latency spikes, network jitter, and node degradation, which are typically not replicated in controlled benchmark settings. This discrepancy can lead to AI pipelines that demonstrate high performance in laboratory conditions but experience stalls and inefficiencies once deployed.
According to Paul Pindell, principal solutions architect for technology alliances at F5, standard benchmark methodologies often aim for optimal performance rather than realistic scenarios. He notes that S3 performance is known to degrade with latency, yet most benchmark environments do not consistently introduce latency. As a result, the performance data used for infrastructure decisions often comes from conditions that production systems will not replicate.
F5 and MinIO conducted throughput testing under degraded network conditions to investigate this issue. The results indicated a rapid decline in S3 throughput when latency was introduced, with even modest latency significantly impacting performance. Latency was found to be a far greater contributor to throughput loss than jitter, a finding that inverted the team's initial expectations. This suggests that S3 object storage deployments require engineering for degraded network conditions rather than idealized environments.
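Why latency weighs so heavily can be illustrated with a toy model. The sketch below is not the F5 and MinIO test methodology, and its numbers are illustrative rather than measured results. It assumes a single client that issues requests one after another, where each request waits one round trip before the object transfers at full link speed. The function name and parameters are hypothetical.

```python
def sequential_throughput(object_mb: float, link_mb_per_s: float, rtt_ms: float) -> float:
    """Estimate throughput (MB/s) for back-to-back object requests.

    Toy model (an assumption, not a measured result): every request
    pays one network round trip before its payload is transferred.
    """
    transfer_s = object_mb / link_mb_per_s        # time spent moving bytes
    request_s = rtt_ms / 1000.0 + transfer_s      # round trip plus transfer
    return object_mb / request_s


if __name__ == "__main__":
    # Illustrative values only: 1 MB objects over a 1000 MB/s link.
    for rtt in (0.0, 1.0, 9.0):
        rate = sequential_throughput(1.0, 1000.0, rtt)
        print(f"rtt={rtt:>4} ms -> about {rate:.0f} MB/s")
```

In this model, once the round-trip time matches the transfer time of a single object, throughput is cut in half, which is consistent with the observation that even modest latency matters. Real S3 clients issue requests in parallel, so actual behavior will differ.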
Tanu Mutreja, senior director of product management at F5, highlights that while GPUs are a visible and expensive resource in AI infrastructure, their value in production depends entirely on the efficiency of the data path feeding them. A degraded data path can lead to GPU underutilization, reduced inference performance, lower-quality AI outputs, increased egress costs from unnecessary data replication, and heightened operational complexity. AI workloads are particularly susceptible to these failures because they are highly parallel and lack the protection mechanisms found in traditional enterprise applications.
F5 proposes AI data delivery, which involves deploying an application delivery controller (ADC) or application delivery and security platform (ADSP) in front of storage. Hunter Smit, senior manager of product marketing at F5, states that provisioning addresses capacity but not delivery, which is where current constraints lie. F5's integration with MinIO places its BIG-IP ADSP in the data path, where it monitors storage node health and directs requests efficiently. This is particularly crucial when nodes degrade in distributed storage clusters.
This approach aims to treat the storage-to-compute path as an engineered control point rather than something assumed to work. Smit explains that inserting a full-proxy ADC makes the path observable, programmable, and failure-aware, enabling health-based routing, quality of service, and inline security. This turns data delivery into a disciplined process that keeps GPUs productive even under adverse conditions.
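The idea of health-based routing can be sketched in a few lines. This is a generic illustration and not BIG-IP configuration or any F5 or MinIO API. The `StorageNode` class, the `pick_node` function, and the 50 ms latency threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class StorageNode:
    """Hypothetical view of one storage node as seen by a proxy."""
    name: str
    healthy: bool        # result of the most recent health check
    latency_ms: float    # most recently observed response latency


def pick_node(nodes: Sequence[StorageNode],
              max_latency_ms: float = 50.0) -> Optional[StorageNode]:
    """Return the fastest node that is healthy and under the latency limit.

    Returns None when no node qualifies, so the caller can decide
    whether to retry, queue, or fail the request.
    """
    candidates = [n for n in nodes
                  if n.healthy and n.latency_ms <= max_latency_ms]
    if not candidates:
        return None
    return min(candidates, key=lambda n: n.latency_ms)


if __name__ == "__main__":
    cluster = [
        StorageNode("node-a", healthy=True, latency_ms=4.0),
        StorageNode("node-b", healthy=False, latency_ms=1.0),   # failed check
        StorageNode("node-c", healthy=True, latency_ms=120.0),  # degraded
    ]
    chosen = pick_node(cluster)
    print(chosen.name if chosen else "no node available")
```

Here the failed node and the degraded node are both skipped, so requests go only to the node that is healthy and responsive. A production proxy would also refresh health data continuously and spread load across all qualifying nodes.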
Challenges multiply at scale, especially when AI pipelines span multiple locations, clouds, or edge environments. Smit notes that across regions and clouds, control and digital sovereignty become paramount design constraints that shape architecture before speed is even considered. This pressure is driving enterprises to repatriate AI workloads from public clouds to infrastructure they own and govern. The proposed architecture decouples applications from any single storage location and establishes a unified control point for consistent policy enforcement across all environments. (Source: VentureBeat)